2014 New Investigator Grant
Nathan L. Clark, Ph.D., Assistant Professor, Department of Computational and Systems Biology, University of Pittsburgh
Co-evolutionary Signatures as a Novel Approach to Gene Discovery
Abstract
It is humbling for genetics researchers to acknowledge that, even a decade after the DNA sequence of the human genome was determined, we still do not understand the basic functional roles of most genes. This lack of knowledge severely hinders our ability to locate genes affecting important biological processes and those responsible for disease. One solution is to blindly screen all ~23,000 genes in the genome; however, in only a limited number of cases is such a screen scalable enough to be feasible. As an alternative, we propose to deploy our unique computational biology tools to prioritize the best candidate genes for any biological process of interest. Our innovative approach is based on a principle of evolutionary biology stating that environmental forces influence or “select” how our genes change over time. Because such forces will similarly affect all genes carrying out a particular biological function, those genes will co-evolve over time, and this correlative signature can be detected in DNA sequences. The central advantage of our novel tools is that they allow us to measure and exploit this co-evolutionary signal and thereby identify functional links between genes. This has permitted us to reveal previously unknown genes as contributors to specific biological functions, and we have published several proof-of-principle articles describing these successes. In this proposal, we use co-evolutionary signatures to discover novel functional links between important cardiac muscle proteins and the genes that regulate them. These new regulatory genes will advance the fields of cardiac physiology and protein trafficking and will suggest new gene targets for therapy. In addition, our project will hone and validate our co-evolutionary tools while simultaneously making them publicly available to all biological researchers through a webserver. The core datasets generated in this project will be valuable for prioritizing genetic work in many fields of basic biology, as well as in medical genetics.
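
As an illustration of how a co-evolutionary signature might be scored, the following is a minimal sketch and not the project's actual tools. It assumes that each gene has already been summarized as a list of relative evolutionary rates, one per branch of a shared species tree, and that co-evolution between two genes is scored as the Pearson correlation of those lists. The function names (`pearson`, `rank_candidates`), the gene labels, and all rate values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of branch rates."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("rate lists must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant rates")
    return cov / sqrt(var_x * var_y)


def rank_candidates(query_rates, candidate_rates):
    """Sort candidate genes by correlation with the query gene, highest first.

    candidate_rates maps a gene label to its list of per-branch rates,
    given in the same branch order as query_rates.
    """
    scores = {gene: pearson(query_rates, rates)
              for gene, rates in candidate_rates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical relative rates on five branches of a shared species tree.
    query = [1.2, 0.8, 1.5, 0.6, 1.0]
    candidates = {
        "geneA": [1.1, 0.9, 1.4, 0.7, 1.0],
        "geneB": [0.9, 1.1, 1.0, 1.0, 0.95],
    }
    for gene, score in rank_candidates(query, candidates):
        print(gene, round(score, 3))
```

Under these assumptions, genes whose rates rise and fall on the same branches as the query gene score near the top of the list and would be the first candidates to test experimentally.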