Math-Bio seminar: "A flexible inference of complex population histories and recombination from multiple genomes"

Mon, 02/06/2017 - 16:00 - 17:00
Champak Reddy, CUNY Graduate Center

Analyzing whole-genome sequences provides unprecedented resolution of the historical demography of populations. However, most inferential methods either ignore or simplify the confounding effects of recombination and population history on the observed polymorphism. Going further, we build upon an existing analytic approach that partitions the genome into blocks of equal (and arbitrary) size and summarizes the polymorphism and linkage information as blockwise counts of SFS types (bSFS). We introduce a novel composite likelihood framework, using the bSFS, that jointly models demography and recombination and is explicitly designed to scale up to multiple whole-genome sequences. The flexible nature of our method further allows for arbitrarily complex population histories using unphased and unpolarized whole-genome sequences (https://github.com/champost/ABLE). We review the demographic history of the two known orangutan species using, for the first time, multiple genome sequences (over 160 Mbp in length) from each population. Our results indicate that the orangutan species diverged approximately 650,000-950,000 years ago. After speciation, secondary contact modelled as pulse admixture (∼300,000 years ago) is better supported than continuous gene flow, a scenario that corresponds to dispersal opportunity coupled with the periodic sea-level changes in Southeast Asia.
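To make the blockwise summary concrete, here is a minimal Python sketch of how a bSFS-like table could be tallied. It is an illustration under simplifying assumptions, not the ABLE implementation: it handles a single population, takes a hypothetical 0/1 allele matrix (one row per sampled sequence, one column per site), measures block size in sites, folds the spectrum by minor-allele count because the data are unpolarized, and drops any incomplete trailing block. The function name blockwise_sfs and the toy data are made up for this example.

```python
from collections import Counter

def blockwise_sfs(haplotypes, block_size):
    """Count how many blocks share each folded SFS vector.

    haplotypes: list of equal-length lists of 0/1 alleles (hypothetical input).
    block_size: number of sites per block (an assumption of this sketch).
    """
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    n_classes = n // 2  # folded spectrum: minor-allele counts 1..n//2
    counts = Counter()
    # Step through complete blocks only; a partial last block is ignored.
    for start in range(0, n_sites - block_size + 1, block_size):
        sfs = [0] * n_classes
        for site in range(start, start + block_size):
            ones = sum(h[site] for h in haplotypes)
            minor = min(ones, n - ones)  # folding: ancestral state unknown
            if minor > 0:  # skip monomorphic sites
                sfs[minor - 1] += 1
        counts[tuple(sfs)] += 1
    return counts

# Toy data: 4 sequences, 6 sites, split into two blocks of 3 sites.
example = [
    [0, 1, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
bsfs = blockwise_sfs(example, block_size=3)
```

Each key of the resulting table is one block configuration (the number of sites in each minor-allele class) and each value is the number of blocks showing it. A composite likelihood of the kind described above would compare such observed block counts with their expected frequencies under a candidate model; that step is not sketched here.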

318 Carolyn Lynch Laboratory