Importance of appropriate calibrant and alignment points

In checking the 2 spectra from each day provided on the author's web site (which is very nicely organized and helpful), I noted from the xml files that while the scans may have been optimized for the range from 2000 to 20000 Daltons (in the spotProtocolInstructions field), the calibration was not suited for this range. This is because the calibrants used (as supplied in massCalibrationInfo) are at m/z values of 12360.2+H, 16951.5+H, 35688+H, 66433+H, and 116351+H, so that the location of m/z 5300 or so shown in figure 2 must be found by extrapolation as opposed to interpolation. By contrast, the peaks that the author uses for calibrating the spectra (taken from peaks-quadratic.csv) have actual values of 2169.8, 5363.4, 7782.9, 9298.7, and 13866.9, nicely bracketing the region of interest. I strongly suspect that the Ciphergen plots would look much better if these peaks were used.

This is the case, and it is an important point. On page 7, column 1, I included a discussion of the importance of choosing calibrants that cover and bracket the range of interest. In our case we were inexperienced and were guided by a Ciphergen representative in our choice of calibrants. In retrospect we should probably have chosen lower-range peptides.
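The bracketing point can be checked directly from the masses quoted in the reviewer's comment. The following is a minimal sketch; the function name `is_bracketed` and the variable names are hypothetical, the "+H" adducts are ignored, and m/z 5300 is taken as the target because it is the region the reviewer refers to in figure 2.

```python
# Calibrant masses (Da) quoted in the reviewer's comment; "+H" adducts ignored.
INSTRUMENT_CALIBRANTS = [12360.2, 16951.5, 35688.0, 66433.0, 116351.0]
# Peaks used for alignment (from peaks-quadratic.csv, as quoted by the reviewer).
ALIGNMENT_PEAKS = [2169.8, 5363.4, 7782.9, 9298.7, 13866.9]


def is_bracketed(mz, calibrants):
    """Return True if mz lies between the lowest and highest calibrant.

    True means the mass is found by interpolation; False means it must be
    found by extrapolation. (Hypothetical helper, for illustration only.)
    """
    return min(calibrants) <= mz <= max(calibrants)


target = 5300.0  # approximate m/z discussed for figure 2
print(is_bracketed(target, INSTRUMENT_CALIBRANTS))  # False: extrapolation
print(is_bracketed(target, ALIGNMENT_PEAKS))        # True: interpolation
```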

We also obtained data (not presented) from the same chips and spots using a high laser setting – in this case the calibrants we selected were more reasonable. Had we used just low-range peptides for calibration we would likely have obtained poor results for this high range. The instrument operator says she thinks it would be difficult to calibrate the machine for low and high ranges simultaneously – her impression is that the machine may be relatively precise in one or the other but not both. One way of testing this is to mix low-range peptide calibrants with high-range protein calibrants – it is my understanding that Ciphergen sells them separately. We plan to try this mixture in the future.

This (i.e. the effect of choice of calibrants) is a major issue, and the author should compare the behavior of his algorithms with the Ciphergen software when they are using the same set of calibrating peaks. I would suggest that this be done with both sets discussed above, (a) to see how good things are when the peaks bracket the target region, and (b) to see how bad things get when they are far away.

I attempted to address these questions in the following way. The spectra in the paper provided data over the 0 to 100,000 Dalton range. In tables 2 and 3 of the paper I compared unadjusted spectra (calibrated using high range calibrants) with these same spectra aligned using low-range m_i and p_i. Below are the c.v.s for peaks obtained in the 2,000-20,000 range. The results are all obtained using the PROcess preprocessing and peak-picking tools and reproduce what is in the paper (page 7 column 1).

		Distribution of c.v.s for 2-20KD
Method	25%	50%	Mean	75%	Number of Peaks
Unadjusted	37	44	63	61	89
Cubic spline	19	25	26	31	86
Quadratic (Ciphergen)	19	25	26	31	86

The quadratic results come from adjusting the Ciphergen XML files, exporting the data as CSV files and then using PROcess – as discussed in the text, the results are the same for the cubic spline and quadratic approaches.
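For readers who want to build a summary of this form from their own data, the sketch below shows one way the columns could be computed. It is not the PROcess code used for the tables. It assumes that each c.v. is the sample standard deviation of a peak's value across spectra divided by its mean, expressed in percent, and the function name `cv_summary` and the spectra-by-peaks input layout are hypothetical.

```python
import numpy as np


def cv_summary(values):
    """Summarize per-peak coefficients of variation (in percent).

    `values` is a 2-D array-like with one row per spectrum and one column
    per peak (an assumed layout, for illustration only).
    """
    x = np.asarray(values, dtype=float)
    # c.v. per peak: sample standard deviation over spectra divided by the mean.
    cv = 100.0 * x.std(axis=0, ddof=1) / x.mean(axis=0)
    q25, q50, q75 = np.percentile(cv, [25, 50, 75])
    return {
        "25%": float(q25),
        "50%": float(q50),
        "mean": float(cv.mean()),
        "75%": float(q75),
        "n_peaks": int(cv.size),
    }


# Toy input: three spectra, two peaks (made-up numbers, not data from the paper).
toy = [[10.0, 1.0],
       [10.0, 2.0],
       [10.0, 3.0]]
print(cv_summary(toy))
```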

Using the same spectra I then chose m_i and p_i values in the high range – near the range of the calibrants. The approximate locations of the p_i (they changed a bit from spectrum to spectrum) were 11.7 KD, 14.0 KD, 33.2 KD, 66.4 KD, and 79.1 KD, relatively close to the calibrants chosen for the instrument (12.3 KD, 17.0 KD, 35.7 KD, 66.4 KD, and 116.4 KD), with the exception of the last calibrant, which is outside the data range. Two versions of the quadratic method were run, one with b=0 and one with nonzero b.

The table below shows the results for peaks in the low range – i.e. all the methods are calibrated/aligned far from the region of interest.

		Distribution of c.v.s for 2-20KD
Method	25%	50%	Mean	75%	Number of Peaks
Unadjusted	37	44	63	61	89
Cubic spline	29	36	41	43	84
Quadratic (b nonzero)	39	73	88	113	88
Quadratic (b = 0)	25	31	33	38	80

The table shows that the cubic spline is still a clear improvement over the unadjusted data, but the quadratic method with nonzero b is terrible – even worse than the unadjusted results. Visual inspection of the spectra revealed that the extrapolation errors were quite substantial. I then processed the data with b constrained to 0 and the improvement was remarkable – surpassing even the cubic spline (which extrapolates linearly rather than quadratically). This supports the Ciphergen rep’s suggestion that overfitting could occur with nonzero b values if extrapolation is an issue, and it strengthens the concerns about extrapolating results outside the calibration/alignment range.
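The effect of the extra parameter on extrapolation can be illustrated with a toy calculation. This is a sketch under stated assumptions, not the Ciphergen algorithm or the R code on the webpage: it assumes a simplified model m = a t^2 + b with no time offset, arbitrary time units chosen so that the true a is 1, and a fixed, made-up 0.05% alternating error in the observed flight times of the five instrument calibrants. The function name `fit_quadratic` is hypothetical.

```python
import numpy as np

# Calibrant masses (Da) quoted in the reviewer's comment; "+H" adducts ignored.
CALIBRANTS = np.array([12360.2, 16951.5, 35688.0, 66433.0, 116351.0])
A_TRUE = 1.0  # assumed: time units chosen so that m = t**2 exactly


def fit_quadratic(t, m, fit_b=True):
    """Least-squares fit of the simplified model m = a * t**2 + b.

    With fit_b=False the intercept b is constrained to 0.
    """
    x = t ** 2
    if fit_b:
        design = np.column_stack([x, np.ones_like(x)])
    else:
        design = x[:, None]
    coef, *_ = np.linalg.lstsq(design, m, rcond=None)
    a = float(coef[0])
    b = float(coef[1]) if fit_b else 0.0
    return a, b


# Observed flight times with a fixed, made-up alternating error of 0.05%.
jitter = 5e-4 * np.array([1.0, -1.0, 1.0, -1.0, 1.0])
t_obs = np.sqrt(CALIBRANTS / A_TRUE) * (1.0 + jitter)

# A low-mass peak far below the lowest calibrant, observed without error.
m_low = 5300.0
t_low = np.sqrt(m_low / A_TRUE)

a_free, b_free = fit_quadratic(t_obs, CALIBRANTS, fit_b=True)
a_zero, b_zero = fit_quadratic(t_obs, CALIBRANTS, fit_b=False)

err_free = a_free * t_low ** 2 + b_free - m_low  # extrapolation error, b fitted
err_zero = a_zero * t_low ** 2 + b_zero - m_low  # extrapolation error, b = 0
print(f"b fitted: error at {m_low:.0f} Da = {err_free:+.2f} Da")
print(f"b = 0:    error at {m_low:.0f} Da = {err_zero:+.2f} Da")
```

In this toy setting the fitted intercept absorbs the timing errors of the high-mass calibrants and carries them down to the low-mass region, while the constrained fit can only rescale. A different error pattern would give different numbers, so the output should be read as an illustration of the mechanism, not as a reproduction of the tables.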

I then examined how these same alignments had adjusted the higher peaks – those near the instrument calibrants and the high-range m_i and p_i. The table below shows the c.v.s in the 10-100 KD range.

		Distribution of c.v.s for 10-100KD
Method	25%	50%	Mean	75%	Number of Peaks
Unadjusted	32	35	36	42	31
Cubic spline	29	33	34	39	36
Quadratic (b nonzero)	30	34	35	40	33
Quadratic (b = 0)	29	33	34	38	30

All the methods perform similarly in this range, with the adjusted spectra only marginally better than the unadjusted ones. This supports the reviewer’s suggestion that, had the calibration been appropriate for the low range, the results in Tables 2 and 3 would probably have been similar for the unadjusted, cubic spline, and quadratic methods.

The lessons I draw from this exercise are:
1) It is critical that calibration/alignment be performed over the range of interest. If two ranges are of interest then I suspect some type of alignment procedure will be necessary for at least one of the ranges.
2) Fitting the b parameter likely provides at best only modest improvements and may create additional problems if the calibration/alignment region is inappropriate. This has led to an option of fitting the model in equation (1) with b=0; the webpage discussing the Ciphergen algorithm has R code implementing this alternative.
3) While it may be true that good calibration may reduce the need for a separate alignment procedure when data are processed on a single machine within one lab over a short period of time, there will still likely be a need for such algorithms when data are compared across machines, centers, or long periods of time.