Sept. 22, 2017
Invention: Big-data approaches to precision medicine and drug development
Inventor: Olga Troyanskaya, professor of computer science and the Lewis-Sigler Institute for Integrative Genomics
Olga Troyanskaya and her team have developed techniques to comb large collections of genomic and other data to make fundamental discoveries and identify new therapeutic targets. These “big data” methods can be applied to numerous disorders for which people can have genetic susceptibilities, from cancer and chronic kidney disease to autism and neurodegenerative diseases such as Alzheimer’s.
Many of the disease-related genes identified so far were found by analyzing the genomes of affected individuals and their families. While powerful, this approach tends to reveal only genes that individually have a strong tie to the disease, and it can miss genes that contribute by acting together. This is a challenge for the study of autism, which has a strong genetic basis and may involve up to 1,000 genes. “It is like looking for your keys under the light post because that is the only place you have light,” said Troyanskaya, who is also appointed at the Simons Foundation’s Flatiron Institute.
Big-data methods allow Troyanskaya’s team to search the entire genome, including noncoding regions that are not genes but rather regulate gene expression. In a study published in 2016 in the journal Nature Neuroscience, the team used these computational methods to identify roughly 2,500 genes that are likely linked to autism among the almost 26,000 known human genes. This vastly reduces the number of genes that need to be sequenced for quantitative genetic studies, Troyanskaya said. “Instead we can focus on the set of genes that are likely to be autism-related, so we can screen a lot more families or use this information to help interpret whole-genome studies,” she said. “This approach can allow us to find the ‘keys’ that are very far from the ‘light.’”
That study used what was already known about autism-related genes to discover new ones, but Troyanskaya and her team have also created methods for finding disease-related genomic factors when no existing information is available. In a 2015 paper in Nature Methods, the researchers described a deep-learning approach called DeepSEA that mines large data sets to predict the effects of mutations in the noncoding regions of the genome. “We can use that information to give a single impact score that indicates whether or not a mutation appears to be a human-disease mutation and whether it is likely to be functional,” Troyanskaya said.
Team members: Arjun Krishnan, associate research scholar in the Lewis-Sigler Institute for Integrative Genomics; Ran Zhang, graduate student in the Department of Molecular Biology; Jian Zhou, graduate student in the Lewis-Sigler Institute for Integrative Genomics
Collaborators: Alex Lash, chief informatics officer, and Alan Packer, senior scientist, at the Simons Foundation Autism Research Initiative
Development status: Princeton is seeking outside interest for further development of this technology.
Funding sources: National Institutes of Health, Simons Foundation