Distilling the Data

By Michael Lovett, Ph.D.

The burgeoning field of bioinformatics allows the Hearing Restoration Project to analyze and compare large genomics datasets and identify the best genes for more testing. This sophisticated data analysis will help speed the way toward a cure for hearing loss and tinnitus.

 

Since its launch in 2011, the Hearing Restoration Project (HRP) has focused on identifying new therapies that will restore inner ear hair cell function, and hence hearing. Within the consortium, smaller research groups engage in separate projects over the course of the year, to move the science along more quickly.

Over the past decade my group, and the group led by my collaborator Mark Warchol, Ph.D., have worked to identify genes that are potential targets for drug development or for gene therapies to cure hearing loss. Our approach has been to determine the exact mechanisms that some vertebrates—in our case, birds—use to regenerate their hair cells and thus spontaneously restore their hearing. We have been comparing this genetic “tool kit” with the mechanisms that mammals normally use to make hair cells.

Unlike birds, mammals cannot regenerate adult hair cells when they are damaged, and this permanent loss of hair cells is a leading cause of human hearing and balance disorders. Our working hypothesis is that birds have regeneration mechanisms that mammals are missing, or that mammals have developed a repressive mechanism that prevents hair cell regeneration.

In either case, our strategy has been to get a detailed picture of what transpires during hair cell regeneration in birds by using cutting-edge technologies developed during the Human Genome Project (the international research collaboration whose goal was the complete mapping of all the nuclear DNA in humans). These next-generation (NextGen) DNA sequencing methods have allowed us to accurately measure changes in every single gene as chick sensory hair cells regenerate.

The good news is that this gives us, for the first time, an exquisitely detailed and accurate description of all of the genes that are potential players in the process. The bad news is that this is an enormous amount of information; thousands of genes change over the course of seven days of regeneration.

Some of these will be the crucially important—and possibly game-changing—genes that we want to explore in potential therapies, but most will be downstream effects of those upstream formative events. The challenge is to correctly identify the important causative needles in the haystack of later consequences.

We already know some important genetic players, but we are still far from understanding the genetic wiring of hair cell development or regeneration. For example, after decades of basic research, we know that certain signaling pathways, such as those termed Notch and Wnt, are important in specifying how hair cells develop. These chemical signaling pathways are made of multiple protein molecules, each of which is encoded by a single gene.

However, the Notch and Wnt pathways together comprise fewer than 100 genes and, despite being intensively studied for years, we do not completely understand every nuance of how they fit together.

It also may seem surprising that—more than a decade after the completion of the Human Genome Project and projects sequencing mouse, chick, and many other species’ genomic DNA—we still do not know the exact functions of many of the roughly 20,000 genes, mostly shared, that are found in each organism. This is partly because teasing out all of their interactions and biochemical properties is a painstaking process, and some of the genes exert subtly different effects in different organs. It is also because the genetic wiring diagram in different cells is a lot more complicated than a simple set of “on/off” switches.

All of this sounds a bit dire. Fortunately, we do have some tools for filtering the data deluge into groups of genes that are more likely to be top candidates. The first is to extract all of the information on “known” pathways, such as the Notch and Wnt pathways mentioned earlier. That is relatively trivial and can be accomplished by someone reasonably well versed in Microsoft Excel.
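For readers who prefer code to spreadsheets, the same filtering step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names and pathway memberships below are hypothetical placeholders, not data from our study, and the function name is invented for this example.

```python
def split_by_known_pathways(changed_genes, pathways):
    """Separate changed genes into known-pathway members and everything else."""
    known = {name: [] for name in pathways}
    unknown = []
    for gene in changed_genes:
        # Find every pathway whose member list contains this gene.
        hits = [name for name, members in pathways.items() if gene in members]
        if hits:
            for name in hits:
                known[name].append(gene)
        else:
            unknown.append(gene)
    return known, unknown


# Hypothetical placeholder data: real pathway lists would come from a curated database.
pathways = {
    "Notch": {"GENE_N1", "GENE_N2"},
    "Wnt": {"GENE_W1", "GENE_W2"},
}
changed_genes = ["GENE_N1", "GENE_X7", "GENE_W2", "GENE_X9"]

known, unknown = split_by_known_pathways(changed_genes, pathways)
print(known)    # genes that belong to a known pathway, grouped by pathway
print(unknown)  # genes left over for further analysis
```

The genes that land in the second list are the ones that need the heavier computational methods.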

That leaves us with the vast “unknown” world. Analyzing this requires computational, mathematical, and statistical methods that are collectively called bioinformatics. This burgeoning field has existed for a couple of decades and covers the computational analysis of very large datasets in all their forms. For example, we routinely use well-established bioinformatic methods to assemble and identify all of the gene sequences from our NextGen DNA sequence reads. These tasks would take many years if done by hand, but take only a matter of hours with computational methods.

In the case of our hair cell regeneration data, our major bioinformatic task is to identify the best genes for further experimental testing. One method is to computationally search the vast biological literature to see whether any of the genes can be connected into new networks or pathways. There are now numerous software tools for conducting these types of searches. However, this approach is not very helpful when searching through several thousand genes at once. The data must first be filtered another way to be useful.

We have used statistical pattern-matching tools called self-organizing maps to analyze all of our data across every time point of hair cell regeneration. In this way we can detect genes that show similar patterns of change and then drill down deeper into whether these genes are connected. This has provided us with an interesting “hit list” of genes that have strong supporting evidence of being good candidates for follow-up.
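To give a feel for how a self-organizing map groups genes by the shape of their expression changes, here is a minimal sketch in Python. It is a toy, not the software we used: the four gene names and their expression values are made up, the map has only two nodes, and the straight-line initialization and the decay schedules for the learning rate and neighborhood radius are simplifying assumptions chosen for this example.

```python
import math
import random


def best_node(nodes, profile):
    """Index of the node whose weights are closest (squared Euclidean) to the profile."""
    return min(
        range(len(nodes)),
        key=lambda j: sum((w - x) ** 2 for w, x in zip(nodes[j], profile)),
    )


def train_som(profiles, n_nodes=2, epochs=50, seed=0):
    """Train a tiny one-dimensional self-organizing map (illustrative only)."""
    rng = random.Random(seed)
    first, last = profiles[0], profiles[-1]
    steps = max(n_nodes - 1, 1)
    # Assumption: start the nodes on a straight line between the first and
    # last profiles, a crude stand-in for more careful initialization.
    nodes = [
        [a + (b - a) * j / steps for a, b in zip(first, last)]
        for j in range(n_nodes)
    ]
    for epoch in range(epochs):
        frac = epoch / epochs
        rate = 0.5 * (1.0 - frac)                      # learning rate decays
        radius = max(n_nodes / 2 * (1.0 - frac), 0.5)  # neighborhood shrinks
        for profile in rng.sample(profiles, len(profiles)):
            winner = best_node(nodes, profile)
            for j, weights in enumerate(nodes):
                # Nodes near the winner on the map are pulled more strongly.
                pull = rate * math.exp(-((j - winner) ** 2) / (2 * radius ** 2))
                for k, x in enumerate(profile):
                    weights[k] += pull * (x - weights[k])
    return nodes


# Made-up expression values at four time points (hypothetical genes).
expression = {
    "geneA": [0.0, 1.0, 2.0, 3.0],  # rises over time
    "geneB": [0.1, 1.1, 1.9, 3.1],  # rises over time
    "geneC": [3.0, 2.0, 1.0, 0.0],  # falls over time
    "geneD": [2.9, 2.1, 0.9, 0.1],  # falls over time
}

nodes = train_som(list(expression.values()))
clusters = {gene: best_node(nodes, profile) for gene, profile in expression.items()}
print(clusters)
```

Genes assigned to the same node share a similar pattern of change over time, which is the starting point for asking whether they are functionally connected. A real analysis would use many more nodes, thousands of genes, and normalized expression values.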

An additional approach is to compare our chick data to other datasets that the HRP consortium is collecting. The logic here is that we expect key genetic components to be shared across species. For example, we now know a great deal about which genes are used in zebrafish hair cell regeneration and which genes specify mouse hair cells during normal development. We can conduct computational comparisons across these big datasets to identify what is similar and what is different. Again, this has yielded a small and interesting collection of genes that is being experimentally tested.
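At its simplest, a cross-species comparison of this kind comes down to set operations, as in the Python sketch below. The gene identifiers are hypothetical placeholders, and the sketch assumes that genes from each species have already been mapped to a shared set of identifiers, which in practice is a substantial task in its own right.

```python
def shared_candidates(gene_sets):
    """Return the genes that appear in every dataset."""
    return set.intersection(*(set(genes) for genes in gene_sets.values()))


def unique_to(name, gene_sets):
    """Return the genes found only in the named dataset."""
    others = set().union(
        *(set(genes) for key, genes in gene_sets.items() if key != name)
    )
    return set(gene_sets[name]) - others


# Hypothetical placeholder identifiers, assumed already mapped across species.
gene_sets = {
    "chick_regeneration": {"GENE1", "GENE2", "GENE3", "GENE4"},
    "zebrafish_regeneration": {"GENE2", "GENE3", "GENE5"},
    "mouse_development": {"GENE2", "GENE4", "GENE6"},
}

shared = shared_candidates(gene_sets)
chick_only = unique_to("chick_regeneration", gene_sets)
print(sorted(shared))      # genes common to all three datasets
print(sorted(chick_only))  # genes seen only in the chick data
```

Genes shared across all datasets are candidates for conserved components, while genes unique to a regenerating species may hint at what mammals lack.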

Our final strategy has been to extract classes of genes that act as important switches in development. These genes encode transcription factors, which control other genetic circuits. We have identified all of the transcription factors that change during chick hair cell regeneration.

As a consortium, the HRP now has a collection of about 200 very good candidate genes for follow-up. However, software and high-speed computation are not going to do it all for us. We still need biologists to ask and answer the important questions and to direct the correct bioinformatics comparisons.

Hair cell regeneration is a plausible goal for the treatment of hearing and balance disorders. The question is not if we will regenerate hair cells in humans, but when. Your financial support will help to ensure we can continue this vital research and find a cure in our lifetime! Please help us accelerate the pace of hearing and balance research and donate today. Your HELP is OUR hope!

If you have any questions about this research or our progress toward a cure for hearing loss and tinnitus, please contact Hearing Health Foundation at info@hhf.org.

Michael Lovett, Ph.D., is a professor at the National Lung & Heart Institute in London and the chair in systems biology at Imperial College London.
