Errors and error-correction in plant and animal genomes
Current sequencing methods produce large amounts of data, but plant and animal genome assemblies based on these data are often woefully incomplete. The genomes are almost always draft-quality, with missing data, many gaps, and many errors in the published sequences. These incomplete and error-filled assemblies result in many annotation errors, especially in the number of genes present in a genome. In this talk I discuss the extent of such errors, showing that gene counts predicted from draft assemblies are frequently wrong, with more than half of all genes having the wrong number of copies in the draft genomes examined. I then present multiple approaches for dealing with these errors, from statistical corrections of existing data to entirely new methods for genome assembly. The work presented here suggests that many inferences based on published plant and animal genomes may be erroneous, but it also offers a way forward for future analyses.
Professor Matthew Hahn
Matthew Hahn received a BS degree from Cornell University and a PhD from Duke University under the mentorship of Professor Mark Rausher. From 2003 to 2005, he held a US National Science Foundation postdoctoral fellowship to work at the University of California, Davis with Professors Charles Langley and John Gillespie. He is a Professor in the Department of Biology and the Department of Computer Science at Indiana University, where he has held a faculty position since 2005. His research interests include bioinformatics, genomics, population genetics, and phylogenetics. He is an author or coauthor of more than 130 publications in these areas, as well as two books, Introduction to Computational Genomics (Cambridge University Press, 2007) and Molecular Population Genetics (Oxford University Press, 2018). He has received a US National Science Foundation CAREER award (2009) and a fellowship from the Alfred P. Sloan Foundation (2010-2012), and was recently elected a fellow of the American Association for the Advancement of Science (2018).