To implement this algorithm, the investigator needs R and Perl
installed (the program can be run without Perl if the investigator is
willing to edit the .xml files manually). The following files are also
required:
- An .xml file exported from Ciphergen software -- either Ciphergen
Express or Ciphergen ProteinChip Software should be able to export a
spectrum as an .xml file.
- A .csv file with two columns containing the mass locations
of a small number of peaks in the reference spectrum and the spectrum
to be adjusted.
  Reference Values (m_i) | Misaligned Values (p_i)
  3247.2                 | 3238.8
  5510.9                 | 5496.0
  7727.9                 | 7708.5
  11034.9                | 11009.5
  13831.7                | 13800.7
This table is available in .csv form as peaks-quadratic.csv. The
wording of the header row does not matter as long as the file is
comma-delimited.
- Perl code (called quadratic.pl).
This code 1) opens the original .xml file, extracts the a, b, and t0
parameters, and writes them to a file, Rinput.csv; 2) calls an R
program (quadratic-r-program.r) that reads Rinput.csv, computes the new
a, b, and t0 parameters, and writes them to Routput.csv; and 3) reads
the parameters from Routput.csv and puts them into a new .xml file.
- R code (called
quadratic-r-program.r) that reads the data in Rinput.csv, calculates
the new parameters and writes them out to Routput.csv.
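As an illustration of the peak file format described above, the following minimal Python sketch (not part of the distributed Perl and R programs) reads a two-column, comma-delimited file, skips the header row, and reports the offset between each reference and misaligned value. The function name `read_peak_pairs` is hypothetical, and the sketch assumes the reference column comes first, as in the table.

```python
import csv
import io

def read_peak_pairs(handle):
    """Return (reference, misaligned) pairs from a two-column CSV with one header row."""
    reader = csv.reader(handle)
    next(reader)  # header text is ignored; only the comma-delimited layout matters
    pairs = []
    for row in reader:
        if len(row) < 2 or not row[0].strip():
            continue  # skip blank or incomplete lines
        pairs.append((float(row[0]), float(row[1])))
    return pairs

# Inline copy of the table above, standing in for peaks-quadratic.csv.
SAMPLE = """Reference Values (mi),Misaligned Values (pi)
3247.2,3238.8
5510.9,5496.0
7727.9,7708.5
11034.9,11009.5
13831.7,13800.7
"""

pairs = read_peak_pairs(io.StringIO(SAMPLE))
for m, p in pairs:
    print(f"reference {m:9.1f}  misaligned {p:9.1f}  offset {m - p:5.1f}")
```

To read a file on disk instead, pass `open("peaks-quadratic.csv", newline="")` in place of the `io.StringIO` object.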
quadratic-r-program.log is a log file produced by the program that will
show error messages if something goes wrong.

There is some question whether fitting the quadratic equation (equation
1 in the paper) might 'overfit' the data if the b parameter is allowed
to be non-zero. Some discussion of this is presented in the context of
choosing appropriate calibrants and m_i and p_i values. An alternative R
program is provided that fits quadratic equations with b = 0. To use
it, change the line

  $Rprogram = "quadratic-r-program.r";

in quadratic.pl to

  $Rprogram = "quadratic-rb0-program.r";
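To make the b = 0 case concrete, here is a minimal Python sketch. It is not the distributed quadratic-rb0-program.r and may differ from what that program does. It assumes the calibration equation has the common time-of-flight form m = a(t - t0)^2 + b, which should be checked against equation 1 in the paper. Under that assumption, setting b = 0 gives sqrt(m) = sqrt(a)(t - t0), so new values of a and t0 follow from an ordinary straight-line fit of the square roots of the reference masses against the flight times. The old parameter values `A_OLD` and `T0_OLD` are hypothetical placeholders, not values from a real spectrum; the peak pairs are those in the table above.

```python
import math

# Peak pairs (reference, misaligned) from the table above.
PAIRS = [(3247.2, 3238.8), (5510.9, 5496.0), (7727.9, 7708.5),
         (11034.9, 11009.5), (13831.7, 13800.7)]

# Hypothetical old calibration parameters (placeholders; old b assumed 0).
A_OLD, T0_OLD = 2.0e-4, 0.5

def flight_time(mass, a, t0):
    """Invert the assumed form m = a * (t - t0)**2 for t > t0."""
    return t0 + math.sqrt(mass / a)

def fit_b0(times, ref_masses):
    """Least-squares line sqrt(m) = s * t + c, returned as (a, t0) with b = 0."""
    y = [math.sqrt(m) for m in ref_masses]
    n = len(times)
    t_bar, y_bar = sum(times) / n, sum(y) / n
    s = (sum((t - t_bar) * (v - y_bar) for t, v in zip(times, y))
         / sum((t - t_bar) ** 2 for t in times))
    c = y_bar - s * t_bar
    return s * s, -c / s  # a = s**2, t0 = -c / s

# Flight times at which the misaligned peaks appear under the old parameters.
times = [flight_time(p, A_OLD, T0_OLD) for _, p in PAIRS]
a_new, t0_new = fit_b0(times, [m for m, _ in PAIRS])

for (m, p), t in zip(PAIRS, times):
    refit = a_new * (t - t0_new) ** 2
    print(f"reference {m:9.1f}  before {p:9.1f}  after {refit:9.1f}")
```

Because the line is fitted to sqrt(m), this sketch minimizes squared error in sqrt(m) rather than in m itself; that is a simplification chosen to keep the example short.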
The user can skip the Perl code and use only the R code if they prefer
to edit the .xml files by hand: putting the old parameters in
Rinput.csv and running the R code will produce Routput.csv containing
the new parameters.

Note that the a, b, and t0 parameters in the .xml files are not the
same as those shown within Ciphergen's ProteinChip software under the
Calibration / Current Equation menu. For some reason, Ciphergen has
multiplied these parameters by various powers of 10. Consequently,
changing the parameters through this menu does not have the same effect
as changing them in the .xml files.
Presently, the user gives the name of the .xml file to the Perl code
(e.g. input.xml) and a new .xml file is written with the same basename
but with "-new.xml" in place of ".xml" (e.g. input-new.xml). The user
is also prompted for the name of the .csv file listing the peaks in the
two spectra. To run the code, type "quadratic.pl" at the
"C:\appropriate directory" prompt after Perl and R have been
successfully installed.
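The naming rule for the output file can be written out explicitly. The following Python sketch only illustrates the rule stated above (quadratic.pl itself is Perl, and its actual code is not shown here); the function name `new_xml_name` is hypothetical.

```python
import os

def new_xml_name(xml_name):
    """Replace a trailing ".xml" with "-new.xml", keeping the basename."""
    base, ext = os.path.splitext(xml_name)
    if ext.lower() != ".xml":
        raise ValueError(f"expected an .xml file name, got {xml_name!r}")
    return base + "-new.xml"

print(new_xml_name("input.xml"))  # input-new.xml
```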
Currently it is assumed that the program and data files are all in the
same directory. Because of their roots in UNIX, R and Perl have some
peculiarities in how directories are designated if you are used to
working in a Windows/DOS world. If you want to put things in different
directories, you will likely need to change the R code and perhaps the
Perl code. With both R and Perl it is probably easiest to use forward
slashes rather than backslashes to designate directories, e.g.
"c:/alignment/test.xml" rather than "c:\alignment\test.xml". Also, some
Perl commands may not work on UNIX/Linux machines or other operating
systems, for instance the way in which Perl calls the R program. Those
using other operating systems will need to make some minor changes to
the Perl script. Future versions of these programs may introduce more
flexibility in this respect.
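One reason forward slashes are safer is that a backslash inside a double-quoted string is an escape character in R and Perl (as it is in Python), so the "\a" and "\t" in "c:\alignment\test.xml" are read as a bell and a tab rather than as directory separators. The short Python sketch below is an illustration only, not part of the distributed programs. It shows the problem and converts a Windows-style path to the forward-slash form using the standard pathlib module.

```python
from pathlib import PureWindowsPath

# In an ordinary string literal, "\a" and "\t" are escape sequences,
# so this string no longer contains the intended backslashes.
broken = "c:\alignment\test.xml"

# A raw string keeps the backslashes exactly as typed.
raw = r"c:\alignment\test.xml"
print(len(broken), len(raw))  # the lengths differ because of the escapes

# Convert to the forward-slash form suggested above.
portable = PureWindowsPath(raw).as_posix()
print(portable)  # c:/alignment/test.xml
```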