Math-Bio seminar: "Controlling the rate of false discoveries in tandem mass spectrum identifications"

Mon, 11/14/2016 - 16:00 - 17:00
Uri Keich, University of Sydney

A typical shotgun proteomics experiment produces thousands of tandem mass spectra. Each spectrum can be tentatively assigned a peptide by a database search procedure that looks for the peptide-spectrum match (PSM) that optimizes the score assigned to a matched pair. Some of the resulting PSMs will be correct while others will be false, and we have no way to verify which is which. The statistical problem we face is that of controlling the false discovery rate (FDR), that is, the expected proportion of false PSMs among all reported pairings. While there is a rich statistical literature on controlling the FDR in the multiple hypothesis testing context, in the PSM context the FDR is mostly controlled through a "home-grown" method called target-decoy competition (TDC). After a brief introduction to the problem of tandem mass spectrum identification, we will explore the reasons why the mass spec community has been using this non-standard approach to controlling the FDR. We will then discuss how calibration can increase the number of correct discoveries and offer an alternative method for controlling the FDR in the presence of calibrated scores. We will conclude by arguing that our analysis extends to a more general setup than the mass spectrum identification problem.
Joint work with Bill Noble (University of Washington)

318 Carolyn Lynch Laboratory