Title: Improve peptide identification from tandem mass spectra
Speaker: Jiarui Ding
Department of Computer Science, University of British Columbia

Amino acids are the building blocks of peptides. Peptide identification, which aims to determine the amino acid sequences of peptides, is important for the diagnosis of many diseases, such as cancers. Tandem mass spectrometry is a powerful tool for peptide identification. In a typical experiment, a tandem mass spectrometer produces a large number of tandem mass spectra, which are then used to identify peptides. However, nearly all spectra are contaminated by noise: they contain not only peptide information but also non-peptide information. In addition, a majority of spectra cannot be identified because their quality is too poor, e.g., the non-peptide signal dominates the spectrum. Current peptide identification algorithms operate directly on these noisy spectra. As a result, much time is wasted trying to identify unidentifiable spectra, and the accuracy of identification may suffer from the noise in the spectra.

We will talk about how to improve peptide identification from tandem mass spectra, that is, how to make it both faster and more reliable. To achieve these goals, several kinds of methods are used to pre-process spectra before identification, e.g., removing noisy peaks from spectra and discarding spectra that are produced by chemical noise. The methods come from digital signal processing and machine learning. Since tandem mass spectra are one-dimensional signals, it is natural to apply algorithms from digital signal processing. Machine learning algorithms, as general-purpose data processing tools, are also quite effective for processing tandem mass spectral data.
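
As a rough illustration of the kind of pre-processing described above, the following minimal sketch keeps only the most intense peaks in each m/z window and then computes a crude quality score for the spectrum. It is not the speaker's method: the function names `filter_peaks` and `quality_score`, the window width, the `top_k` value, and the example peak list are all assumptions made for illustration.

```python
from collections import defaultdict


def filter_peaks(peaks, window=100.0, top_k=2):
    """Keep the top_k most intense peaks in each m/z window.

    peaks is a list of (mz, intensity) pairs. The window width and
    top_k are illustrative defaults, not values from the talk.
    """
    bins = defaultdict(list)
    for mz, intensity in peaks:
        # Assign each peak to a fixed-width m/z window.
        bins[int(mz // window)].append((mz, intensity))

    kept = []
    for bucket in bins.values():
        # Sort by intensity, highest first, and keep the strongest peaks.
        bucket.sort(key=lambda p: p[1], reverse=True)
        kept.extend(bucket[:top_k])
    return sorted(kept)  # return peaks ordered by m/z


def quality_score(peaks, kept):
    """Fraction of total intensity carried by the retained peaks.

    A hypothetical, crude proxy for spectrum quality: a spectrum whose
    intensity is spread over many weak peaks scores low.
    """
    total = sum(intensity for _, intensity in peaks)
    if total == 0:
        return 0.0
    return sum(intensity for _, intensity in kept) / total


# Made-up example spectrum: (m/z, intensity) pairs.
spectrum = [(101.0, 5.0), (150.2, 80.0), (175.1, 3.0),
            (199.9, 40.0), (250.5, 60.0), (260.0, 2.0)]
cleaned = filter_peaks(spectrum)
score = quality_score(spectrum, cleaned)
```

A spectrum whose score falls below some chosen threshold could be discarded before identification; picking that threshold, or replacing the score with a trained classifier, is where the machine learning methods mentioned above would come in.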