Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching

Cited by: 552
Authors
Du, Pan [1]
Kibbe, Warren A. [1]
Lin, Simon M. [1]
Affiliations
[1] Northwestern Univ, Robert H Lurie Comprehens Canc Ctr, Chicago, IL 60611 USA
Keywords
DOI
10.1093/bioinformatics/btl355
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: A major problem for current peak detection algorithms is that noise in mass spectrometry (MS) spectra gives rise to a high rate of false positives. The false positive rate is especially problematic in detecting peaks with low amplitudes. Usually, various baseline correction algorithms and smoothing methods are applied before attempting peak detection. This approach is very sensitive to the amount of smoothing and aggressiveness of the baseline correction, which contribute to making peak detection results inconsistent between runs, instrumentation and analysis methods.

Results: Most peak detection algorithms simply identify peaks based on amplitude, ignoring the additional information present in the shape of the peaks in a spectrum. In our experience, 'true' peaks have characteristic shapes, and providing a shape-matching function that provides a 'goodness of fit' coefficient should provide a more robust peak identification method. Based on these observations, a continuous wavelet transform (CWT)-based peak detection algorithm has been devised that identifies peaks with different scales and amplitudes. Transforming the spectrum into wavelet space simplifies the pattern-matching problem and, in addition, provides a powerful technique for identifying and separating the signal from the spike noise and colored noise. This transformation, with the additional information provided by the 2D CWT coefficients, can greatly enhance the effective signal-to-noise ratio. Furthermore, with this technique no baseline removal or peak smoothing preprocessing steps are required before peak detection, and this improves the robustness of peak detection under a variety of conditions. The algorithm was evaluated with SELDI-TOF spectra with known polypeptide positions. Comparisons with two other popular algorithms were performed. The results show the CWT-based algorithm can identify both strong and weak peaks while keeping the false positive rate low.
Availability: The algorithm is implemented in R and will be included as an open source module in the Bioconductor project.
Contact: s-lin2@northwestern.edu
Supplementary material: http://basic.northwestern.edu/publications/peakdetection/. Colour versions of the figures in this article can be found at Bioinformatics Online.
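
The idea summarized in the abstract (match peak shape across scales in wavelet space instead of thresholding raw amplitude) can be illustrated with a minimal Python sketch. This is not the authors' R implementation. It uses a Mexican hat wavelet and replaces any ridge-based analysis of the 2D coefficient matrix with a simple "maximum over scales" rule. The function names, the scale set, the SNR threshold, the window size, the noise estimate (95th percentile of the absolute smallest-scale coefficients) and the synthetic spectrum are all assumptions made for illustration only.

```python
import numpy as np

def mexican_hat(a, half_width=5):
    """Sampled Mexican hat wavelet at integer scale a (approximately unit energy)."""
    t = np.arange(-half_width * a, half_width * a + 1, dtype=float)
    u = t / a
    norm = 2.0 / (np.sqrt(3.0 * a) * np.pi ** 0.25)
    return norm * (1.0 - u ** 2) * np.exp(-0.5 * u ** 2)

def cwt(signal, scales):
    """Return the 2D coefficient matrix, shape (len(scales), len(signal))."""
    # The wavelet is symmetric, so convolution equals correlation here.
    return np.array([np.convolve(signal, mexican_hat(a), mode="same")
                     for a in scales])

def detect_peaks(signal, scales=(1, 2, 4, 6, 8, 12, 16), min_snr=3.0, window=20):
    """Hypothetical simplified detector; scales[0] must be the smallest scale."""
    margin = 5 * max(scales)  # skip positions affected by zero-padded edges
    if len(signal) <= 2 * margin:
        raise ValueError("signal too short for the chosen scales")
    coef = cwt(signal, scales)
    strength = coef.max(axis=0)  # best shape match over all scales
    # Assumed noise estimate: 95th percentile of |coefficients| at the smallest scale.
    noise = np.percentile(np.abs(coef[0, margin:-margin]), 95)
    peaks = []
    for i in range(margin, len(signal) - margin):
        local = strength[i - window:i + window + 1]
        if strength[i] >= min_snr * noise and strength[i] == local.max():
            peaks.append(i)
    return peaks

def synthetic_spectrum(n=1000, seed=0):
    """Made-up test signal: decaying baseline, one strong and one weak peak, noise."""
    rng = np.random.default_rng(seed)
    x = np.arange(n, dtype=float)
    baseline = 50.0 * np.exp(-x / 400.0)
    strong = 20.0 * np.exp(-0.5 * ((x - 300.0) / 8.0) ** 2)
    weak = 4.0 * np.exp(-0.5 * ((x - 700.0) / 8.0) ** 2)
    return baseline + strong + weak + rng.normal(0.0, 0.3, n)

if __name__ == "__main__":
    print(detect_peaks(synthetic_spectrum()))
```

Because the wavelet has (nearly) zero mean, a slowly varying baseline contributes little to the coefficients, which is why this sketch applies no baseline correction or smoothing before detection. The edge margin is a design choice of the sketch: `np.convolve` pads with zeros, so coefficients near the ends of the spectrum are unreliable and are ignored.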
Pages: 2059-2065
Number of pages: 7