Math-Bio seminar: "Decoding of pairwise coalescent times and detection of recent adaptation in biobank-scale SNP array data sets"
Coalescent hidden Markov models (HMMs) such as the pairwise sequentially Markovian coalescent (PSMC; Li and Durbin, 2011) estimate the locus-specific posterior distribution of the time to most recent common ancestor (TMRCA) of a pair of haploid chromosomes when high-coverage sequencing data are available. I will present the “ascertained sequentially Markovian coalescent” (ASMC), a coalescent HMM that accurately estimates locus-specific TMRCA probabilities in widely available SNP array data. ASMC uses a highly efficient recursive formulation of the forward/backward HMM algorithm, which makes it possible to analyze very large data sets and reconstruct a detailed landscape of coalescent times along the genome. I will describe results from running ASMC in several cohorts, including ~120,000 unrelated British individuals from the UK Biobank data set, where we find that multiple loci underwent positive selection during the past ~200 generations. At deeper time scales, we detect widespread negative selection concentrated in regions enriched for heritability of several disease phenotypes.
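
For attendees unfamiliar with posterior decoding, the sketch below shows the textbook forward/backward algorithm on a toy HMM whose hidden states are discretized TMRCA bins. It is only an illustration under stated assumptions: it is the standard algorithm, quadratic in the number of states, and not the efficient recursive formulation used by ASMC. The function name `posterior_decode`, the three time bins, and all transition and emission values are hypothetical and chosen purely for illustration.

```python
def posterior_decode(init, trans, emit, obs):
    """Return per-site posterior probabilities over hidden states.

    init[k]     : prior probability of state k at the first site
    trans[j][k] : probability of moving from state j to state k between sites
    emit[k][o]  : probability of observing symbol o in state k
    obs         : observed symbols along the genome (integer indices)
    """
    n_states, n_sites = len(init), len(obs)
    if n_sites == 0:
        return []

    # Forward pass, rescaled at every site to avoid numerical underflow.
    fwd, prev = [], None
    for t, o in enumerate(obs):
        if t == 0:
            row = [init[k] * emit[k][o] for k in range(n_states)]
        else:
            row = [
                emit[k][o] * sum(prev[j] * trans[j][k] for j in range(n_states))
                for k in range(n_states)
            ]
        scale = sum(row)
        prev = [x / scale for x in row]
        fwd.append(prev)

    # Backward pass, rescaled in the same way.
    bwd = [None] * n_sites
    bwd[-1] = [1.0] * n_states
    for t in range(n_sites - 2, -1, -1):
        o_next = obs[t + 1]
        row = [
            sum(trans[j][k] * emit[k][o_next] * bwd[t + 1][k] for k in range(n_states))
            for j in range(n_states)
        ]
        scale = sum(row)
        bwd[t] = [x / scale for x in row]

    # Posterior at each site is proportional to forward * backward.
    post = []
    for f, b in zip(fwd, bwd):
        row = [fk * bk for fk, bk in zip(f, b)]
        total = sum(row)
        post.append([x / total for x in row])
    return post


# Hypothetical toy model: three TMRCA bins (recent, intermediate, ancient).
# Observation 0 = the two haplotypes match at a site, 1 = they differ.
# Older TMRCA bins are assumed to make a difference more likely.
toy_init = [1 / 3, 1 / 3, 1 / 3]
toy_trans = [[0.90, 0.05, 0.05], [0.05, 0.90, 0.05], [0.05, 0.05, 0.90]]
toy_emit = [[0.99, 0.01], [0.90, 0.10], [0.70, 0.30]]
toy_obs = [0, 0, 0, 1, 1, 0, 1]

toy_post = posterior_decode(toy_init, toy_trans, toy_emit, toy_obs)
for site, probs in enumerate(toy_post):
    print(site, [round(p, 3) for p in probs])
```

Each printed row is the posterior distribution over the three time bins at one site. In a real coalescent HMM the transition and emission probabilities would be derived from a demographic model, recombination and mutation rates, and (for array data) the SNP ascertainment scheme, rather than set by hand.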