Software supporting the algorithm for general data

To implement this algorithm, it is assumed that the investigator has:
  1. A .csv file with two columns, the first containing putative m/z values and the second containing the associated intensities.  It is assumed that the file's first row contains header information.  This .csv file contains all the mass/intensity data over the entire range of interest, e.g. 2,000 - 100,000 Daltons.  The beginning of such a file might look like this:
    M/Z        Intensity
    2000.21     .02701
    2000.68    -.02636
    2001.16    -.04542
    2001.63    -.00593
    2002.10     .08831
    The negative intensities are produced by baseline subtraction.  An example of a complete file (called alldata.csv) is here.
  2. A second .csv file with two columns containing the mass locations of a small number of peaks, with their locations in the reference spectrum (mi) in the first column and their locations in the spectrum to be adjusted (pi) in the second.
    Reference Values (mi)    Misaligned Values (pi)
    3247.2                   3238.8
    5510.9                   5496.0
    7727.9                   7708.5
    11034.9                  11009.5
    13831.7                  13800.7
    The .csv form of this file (called peaks-spline.csv) is here.  The content of the header information is not important as long as it is comma-delimited.
  3. R code (called cubic-r-program.r) that reads and processes the information within these first two files.
Output from the R code is a third .csv file (alldata-new.csv) in the same format as the first file, with new intensity values for the same m/z values.  As the R code is presently written, files 1, 2, and 3 should all be in the same directory.  Rather than changing the file names in the R code, it may be easier to make copies of the particular .csv files and rename them alldata.csv and peaks-spline.csv, the names used in the R code.  The output file, named alldata-new.csv by default, can then be renamed as well.
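
For readers who want to see the shape of the computation without opening the R program, the workflow described above can be sketched roughly as follows.  This is not a transcription of cubic-r-program.r: it is a standard-library Python illustration, and several choices in it are assumptions, namely the natural boundary conditions of the spline, the linear extrapolation outside the range of the supplied peaks, and the use of linear interpolation to put the corrected intensities back on the original m/z values.  All function names are hypothetical.

```python
import bisect
import csv


def natural_cubic_spline(xs, ys):
    """Return f(x), the natural cubic spline through (xs, ys); xs strictly increasing."""
    n = len(xs)
    h = [xs[i + 1] - xs[i] for i in range(n - 1)]
    # Tridiagonal system for the second derivatives m; the first and last
    # rows force m[0] = m[-1] = 0 (natural ends, an assumption of this sketch).
    a, b, c, d = [0.0] * n, [1.0] * n, [0.0] * n, [0.0] * n
    for i in range(1, n - 1):
        a[i], b[i], c[i] = h[i - 1], 2.0 * (h[i - 1] + h[i]), h[i]
        d[i] = 6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
    for i in range(1, n):  # forward elimination (Thomas algorithm)
        w = a[i] / b[i - 1]
        b[i] -= w * c[i - 1]
        d[i] -= w * d[i - 1]
    m = [0.0] * n
    m[-1] = d[-1] / b[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        m[i] = (d[i] - c[i] * m[i + 1]) / b[i]

    def f(x):
        if x <= xs[0]:  # linear extrapolation with the slope at the first knot
            s = (ys[1] - ys[0]) / h[0] - h[0] * (2.0 * m[0] + m[1]) / 6.0
            return ys[0] + s * (x - xs[0])
        if x >= xs[-1]:  # linear extrapolation with the slope at the last knot
            s = (ys[-1] - ys[-2]) / h[-1] + h[-1] * (m[-2] + 2.0 * m[-1]) / 6.0
            return ys[-1] + s * (x - xs[-1])
        i = bisect.bisect_right(xs, x) - 1
        t0, t1 = xs[i + 1] - x, x - xs[i]
        return ((m[i] * t0 ** 3 + m[i + 1] * t1 ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * t0
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * t1)

    return f


def interp_linear(x, xs, ys):
    """Linearly interpolate (xs, ys) at x, holding the end values outside the range."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect.bisect_right(xs, x) - 1
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


def align_spectrum(mz, intensity, ref_peaks, mis_peaks):
    """Return new intensities on the original m/z values after the spline correction."""
    warp = natural_cubic_spline(mis_peaks, ref_peaks)  # maps pi onto mi
    corrected_mz = [warp(x) for x in mz]               # assumed to stay increasing
    return [interp_linear(x, corrected_mz, intensity) for x in mz]


def read_two_columns(path):
    """Read a two-column .csv file, skipping the header row."""
    with open(path, newline="") as fh:
        rows = csv.reader(fh)
        next(rows)  # header content is ignored
        pairs = [(float(r[0]), float(r[1])) for r in rows if r]
    return [p[0] for p in pairs], [p[1] for p in pairs]


def run(data_path, peaks_path, out_path):
    """Read files 1 and 2 and write a file in the format of file 1."""
    mz, intensity = read_two_columns(data_path)
    ref_peaks, mis_peaks = read_two_columns(peaks_path)  # columns: mi, pi
    new_intensity = align_spectrum(mz, intensity, ref_peaks, mis_peaks)
    with open(out_path, "w", newline="") as fh:
        out = csv.writer(fh)
        out.writerow(["M/Z", "Intensity"])
        out.writerows(zip(mz, new_intensity))
```

With the three files in one directory, a call such as run("alldata.csv", "peaks-spline.csv", "alldata-new.csv") would mirror the file naming described above.  Because only a handful of peaks define the spline, the behaviour far outside their range (here, above roughly 14,000 Daltons) depends heavily on the extrapolation rule, so results there should be checked against the R program rather than taken from this sketch.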

If, for some reason, an investigator would like to use a data transformation that is simpler than a cubic spline, one alternative is to estimate a constant shift on the time scale (as opposed to the m/z scale).  In this case the same approach of finding the mi and pi would be used, but with time-of-flight values instead of m/z values.  Once the mi and pi are found, one can calculate the average difference between them and use this average as a shift correction for the misaligned times.  The time scale is suggested because, empirically, a fixed shift in time leads to a non-linear change in m/z.  Though the data are not presented here, this approach yields results that are nearly identical to those obtained with the spline correction.  It is unclear, however, whether such a simple shift would be appropriate when differences between spectra arise from different machines or from other factors that may not be captured by a linear shift on the time or m/z scale.
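
A minimal sketch of the constant-shift alternative is given below, again in Python and with hypothetical function names.  The shift is the average of mi - pi computed on time-of-flight values and is added to every misaligned time.  The helper mz_from_time assumes a simplified quadratic calibration, m/z = a(t - t0)^2, which is not taken from this page; it is included only to illustrate why a fixed shift in time produces a change in m/z that grows with mass.

```python
def mean_shift(ref_times, mis_times):
    """Average difference mi - pi between matched peak times."""
    diffs = [r - p for r, p in zip(ref_times, mis_times)]
    return sum(diffs) / len(diffs)


def shift_times(times, delta):
    """Apply the same constant shift to every time-of-flight value."""
    return [t + delta for t in times]


def mz_from_time(t, a=1.0, t0=0.0):
    """Assumed calibration m/z = a * (t - t0)**2; a and t0 are placeholders."""
    return a * (t - t0) ** 2


# Matched peak times in arbitrary units (illustrative numbers, not real data).
delta = mean_shift([10.0, 20.0, 30.0], [9.0, 19.0, 29.0])
corrected = shift_times([9.0, 19.0, 29.0], delta)

# The same shift in time moves a later (heavier) peak further in m/z.
change_early = mz_from_time(10.0 + delta) - mz_from_time(10.0)
change_late = mz_from_time(20.0 + delta) - mz_from_time(20.0)
```

Under the assumed calibration the change in m/z for a shift delta is a(2(t - t0)delta + delta^2), so it increases with flight time; an instrument's actual calibration equation would need to be substituted before using this on real spectra.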

Here is a link to view all the files in this directory.