Software supporting the algorithm for Ciphergen data

To implement this algorithm, the investigator needs R and Perl installed (the program can be run without Perl if the investigator is willing to edit the .xml files manually). The following files are required:
  1. An .xml file exported from Ciphergen software -- either Ciphergen Express or Ciphergen ProteinChip Software should be able to export a spectrum as an .xml file.
  2. A .csv file with two columns containing the mass locations of a small number of peaks in the reference spectrum and the spectrum to be adjusted.  
    Reference Values (mi)    Misaligned Values (pi)
    3247.2                   3238.8
    5510.9                   5496.0
    7727.9                   7708.5
    11034.9                  11009.5
    13831.7                  13800.7
    The .csv form of this file (called peaks-quadratic.csv) is here.  The content of the header information is not important as long as it is comma-delimited.
  3. Perl code (called quadratic.pl).  This code 1) opens the original .xml file, extracts the a, b, and t0 parameters, and writes them to a file, Rinput.csv, 2) calls an R program (quadratic-r-program.r) that reads Rinput.csv, computes the new a, b, and t0 parameters, and writes them to Routput.csv, and 3) reads the parameters from Routput.csv and puts them into a new .xml file.
  4. R code (called quadratic-r-program.r) that reads the data in Rinput.csv, calculates the new parameters, and writes them to Routput.csv.  The program also produces a log file, quadratic-r-program.log, which will show error messages if something goes wrong.  There is some question whether fitting the quadratic equation (equation 1 in the paper) might overfit the data if the b parameter is allowed to be non-zero.  Some discussion of this is presented in the context of choosing appropriate calibrants, mi, and pi values.  An alternative R program is provided that fits the quadratic equation with b = 0.  To use it, change the line $Rprogram = "quadratic-r-program.r"; in quadratic.pl to $Rprogram = "quadratic-rb0-program.r";
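
The difference between the two fitting options in item 4 can be illustrated with a short sketch. This is Python, not the author's R code, and it rests on an assumption that is not stated in this text: that equation 1 has the form m = a*(t - t0)^2 + b, relating a mass m to a flight time t. The function names and the flight times are hypothetical, and any scaling Ciphergen applies to the stored parameters is ignored.

```python
import math

def fit_b_zero(times, masses):
    """Fit m = a*(t - t0)**2 with b fixed at 0 (assumed form of equation 1).

    With b = 0, sqrt(m) = sqrt(a)*(t - t0) is a straight line in t,
    so ordinary least squares on (t, sqrt(m)) is enough.
    """
    n = len(times)
    roots = [math.sqrt(m) for m in masses]
    t_mean = sum(times) / n
    r_mean = sum(roots) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (r - r_mean) for t, r in zip(times, roots))
    slope = sxy / sxx                    # estimates sqrt(a)
    intercept = r_mean - slope * t_mean  # estimates -sqrt(a)*t0
    return slope ** 2, 0.0, -intercept / slope

def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    rows = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, 3):
            factor = rows[r][col] / rows[col][col]
            for k in range(col, 4):
                rows[r][k] -= factor * rows[col][k]
    solution = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        tail = sum(rows[r][k] * solution[k] for k in range(r + 1, 3))
        solution[r] = (rows[r][3] - tail) / rows[r][r]
    return solution

def fit_b_free(times, masses):
    """Fit m = a*(t - t0)**2 + b with all three parameters free.

    In the centred time u = t - mean(t) the model is an ordinary
    quadratic c0 + c1*u + c2*u**2, fitted through the normal equations.
    Needs at least three distinct times.
    """
    n = len(times)
    t_mean = sum(times) / n
    u = [t - t_mean for t in times]
    power_sums = [sum(x ** k for x in u) for k in range(5)]
    # Unknowns are ordered [c0, c1, c2].
    matrix = [[power_sums[i + j] for j in range(3)] for i in range(3)]
    rhs = [sum(m * x ** i for x, m in zip(u, masses)) for i in range(3)]
    c0, c1, c2 = solve3(matrix, rhs)
    a = c2
    shift = -c1 / (2.0 * c2)  # vertex of the parabola in centred time
    return a, c0 - a * shift ** 2, shift + t_mean

# Hypothetical flight times and noise-free masses, for illustration only.
times = [20.0, 26.0, 31.0, 37.0, 41.0]
masses = [8.0 * (t - 0.1) ** 2 + 2.0 for t in times]
a_free, b_free, t0_free = fit_b_free(times, masses)
a_zero, b_zero, t0_zero = fit_b_zero(times, masses)
```

With only five peaks, the three-parameter fit leaves two residual degrees of freedom while the b = 0 fit leaves three, which is one way to see why the extra parameter raises the overfitting concern.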


The user can skip the Perl code and use only the R code by editing the .xml files manually: putting the old parameters in Rinput.csv and running the R code will produce Routput.csv with the new parameters, which can then be entered into the .xml file by hand.  Note that the a, b, and t0 parameters in the .xml files are not the same as those shown in Ciphergen's ProteinChip software under the Calibration / Current Equation menu.  For some reason, Ciphergen has multiplied these parameters by various powers of 10.  Consequently, changing the parameters through this menu does not have the same effect as changing them in the .xml files.

Presently, the user gives the Perl code the name of the .xml file (e.g. input.xml), and a new .xml file is written with the same basename but "-new.xml" in place of ".xml" (e.g. input-new.xml).  The user is also prompted for the name of the .csv file listing the peaks in the two spectra.  To run the code, type "quadratic.pl" at the "C:\appropriate directory" prompt after Perl and R have been successfully installed.
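
The naming rule and the peak file format described above can be sketched in a few lines. This is an illustrative Python sketch, not the Perl script itself; the function names are hypothetical, and it assumes the .csv file has exactly one header line followed by rows of reference and misaligned masses.

```python
import csv
import io

def output_name(xml_name):
    """Return the output name, e.g. 'input-new.xml' for 'input.xml'."""
    if not xml_name.lower().endswith(".xml"):
        raise ValueError("expected a file name ending in .xml")
    return xml_name[:-len(".xml")] + "-new.xml"

def read_peaks(handle):
    """Read (reference, misaligned) mass pairs, skipping one header row."""
    rows = csv.reader(handle)
    next(rows, None)  # header content is ignored (assumed to be one line)
    reference, misaligned = [], []
    for row in rows:
        if len(row) < 2 or not row[0].strip():
            continue  # skip blank or incomplete lines
        reference.append(float(row[0]))
        misaligned.append(float(row[1]))
    return reference, misaligned

# An in-memory stand-in for the first two rows of peaks-quadratic.csv.
sample = io.StringIO(
    "Reference Values (mi),Misaligned Values (pi)\n"
    "3247.2,3238.8\n"
    "5510.9,5496.0\n"
)
m_values, p_values = read_peaks(sample)
new_name = output_name("input.xml")
```

For a real file, the same reader can be given the result of open(name, newline="") in place of the in-memory sample.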

Currently it is assumed that the program and data files are all in the same directory.  Because of their roots in UNIX, R and Perl have some peculiarities in how directories are designated that may be unfamiliar to those used to working in a Windows/DOS world.  If you want to put things in different directories you will likely need to change the R code and perhaps the Perl code.  With both R and Perl it is probably easiest to use forward slashes rather than backslashes to designate directories, e.g. "c:/alignment/test.xml" rather than "c:\alignment\test.xml".  Also, some Perl commands may not work on UNIX/Linux machines or other operating systems, for instance the way in which Perl calls the R program.  Those using other operating systems will need to make some minor changes to the Perl script.  Future versions of these programs may introduce more flexibility in this respect.
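
The two portability points above, forward slashes and the call to R, can be illustrated with a small Python sketch. It is not how quadratic.pl works; the function names are hypothetical, and it assumes R's standard Rscript front end is on the PATH, which may differ from the way the Perl script invokes R.

```python
import subprocess
from pathlib import PureWindowsPath

def forward_slashes(path):
    """Rewrite a Windows-style path with forward slashes."""
    return PureWindowsPath(path).as_posix()

def r_command(script, r_executable="Rscript"):
    """Build an argument list for running an R script.

    Passing a list (rather than one shell string) avoids most
    operating-system differences in quoting.
    """
    return [r_executable, forward_slashes(script)]

def run_r(script):
    """Run the R script and raise an error if R exits with a failure code."""
    subprocess.run(r_command(script), check=True)

xml_path = forward_slashes(r"c:\alignment\test.xml")
command = r_command("quadratic-r-program.r")
```

Here run_r is defined but not called, so the sketch can be read and tried without R installed.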

Here is a link to view all the files in this directory.