Data Made Visual
By Christopher Geissler, Ph.D.
Over the past several years, Hearing Health Foundation’s (HHF) Hearing Restoration Project (HRP) has generated a significant amount of data. Part of the challenge for HRP consortium members, as for many life scientists, comes not only from the amount of data they need to analyze but also from the need to examine multi-omic datasets.
“Multi-omic” refers to data related to any of the “-omes”: genome, epigenome, transcriptome, etc., and these data are key to providing a better understanding of the body and its disorders in their full complexity. The work of the HRP is made even more complicated because it is multispecies, with consortium members working on zebrafish, chick, and mouse models.
The wealth of data collected is also a result of scientists’ ability to work at ever smaller scales. HRP consortium member Ronna Hertzano, M.D., Ph.D., an associate professor at the University of Maryland School of Medicine, recalls that when she finished her doctorate in 2005, scientists were able to examine 1 microgram of material (1/1,000,000th of a gram). By the time the HRP was founded in 2011, researchers could isolate and examine 100 nanograms of material (or 1/10th of a microgram). Today Hertzano and colleagues can drill down to just 1 to 10 picograms of material, or 0.000001–0.00001 micrograms.
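For readers who want to check the arithmetic, the sample sizes above can be put on a single scale. The short Python sketch below is only an illustration: the function and variable names are invented for this example, and the only facts it relies on are the standard metric prefixes (micro = one millionth, nano = one billionth, pico = one trillionth of a gram). It shows that going from 1 microgram to 1 picogram is a millionfold reduction.

```python
# Power-of-ten exponent of each unit relative to one gram (standard SI prefixes).
EXPONENT = {
    "microgram": -6,
    "nanogram": -9,
    "picogram": -12,
}

def to_micrograms(amount, unit):
    """Convert an amount in the given unit to micrograms."""
    return amount * 10 ** (EXPONENT[unit] - EXPONENT["microgram"])

# Sample sizes quoted in the article.
samples = [
    ("2005", 1, "microgram"),
    ("2011", 100, "nanogram"),
    ("today (low end)", 1, "picogram"),
    ("today (high end)", 10, "picogram"),
]

for label, amount, unit in samples:
    print(f"{label}: {amount} {unit} = {to_micrograms(amount, unit):g} micrograms")

# How many times smaller is the smallest sample today than the 2005 sample?
fold_reduction = to_micrograms(1, "microgram") / to_micrograms(1, "picogram")
print(f"Reduction since 2005: about {fold_reduction:,.0f}-fold")
```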
Single Cell Information
Additionally, copious new data is produced due to the recently developed ability to analyze information from single cells. Scientists are no longer restricted to looking at averages across the ear as a whole.
Hertzano uses the metaphor of a classroom to explain single-cell analysis. Let’s say one individual student is the cell type in which you are interested. Before single-cell analysis was possible, to test whether the student’s performance improved after a new teacher joined the class, for example, you would have had to test all the students in the entire school, average the scores, and use that average to estimate how that particular student did.
Single-cell analysis is like being able to test each student separately, with their name attached to their exam, while following their individual progress over time. It facilitates a much more detailed view.
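The classroom metaphor can be made concrete with a few lines of Python. The names and scores below are entirely made up for illustration and do not come from HRP data. The sketch contrasts the "bulk" view (one average for the whole group) with the "single-cell" view (each student tracked by name), and shows how an average can hide a large change in the one student of interest.

```python
from statistics import mean

# Hypothetical exam scores before and after a new teacher joins the class.
before = {"Ana": 70, "Ben": 80, "Cal": 90, "Dee": 60}
after = {"Ana": 65, "Ben": 75, "Cal": 80, "Dee": 80}

def bulk_change(before, after):
    """Bulk view: compare only the group averages."""
    return mean(after.values()) - mean(before.values())

def per_student_change(before, after):
    """Single-cell view: follow each named student individually."""
    return {name: after[name] - before[name] for name in before}

print("Change in class average:", bulk_change(before, after))
for name, change in per_student_change(before, after).items():
    print(f"{name}: {change:+d}")
```

In this invented example the class average does not move at all, even though Dee's score rises by 20 points while the others slip. Only the per-student view reveals it, which is the advantage single-cell analysis offers over tissue-wide averages.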
Knowing how to detect specific cell types results in a fine-tuned picture of what happens and where things happen in the inner ear when changes are introduced (to stimulate hair cell regeneration, for example).
To help manage the data, since 2016 HHF has funded the development of the gEAR portal (umgear.org); the name stands for gene Expression Analysis Resource. It has since become the premier tool for data visualization and analysis for all researchers working in the hearing and balance field, not just those in the HRP. The online portal enables scientists to analyze multi-omic and multi-species datasets, from their own labs as well as those of colleagues.
For Hertzano, who leads the gEAR team, the idea came from feelings of frustration she experienced at conferences, when she realized there was no easy way to compare the data she and fellow researchers were presenting. “Scientists normally share data through journal publications, accessing large tables of raw numbers, which made it very difficult to find what you were looking for or even identify what you were interested in,” she says.
Discovering that the teams behind existing data analysis tools had no interest in expanding, Hertzano set about building a new one, starting with hand-drawn “cartoons” of what data visualization would be able to do and what the portal itself would look like. She recruited Anup Mahurkar and Joshua Orvis, colleagues at the University of Maryland Institute for Genome Sciences. “They were willing to embark on this visionary task,” she says.
Convinced the project would be invaluable to the field, Hertzano began looking for funders before she even had a working prototype. The HRP consortium and HHF were easily convinced that the gEAR had the potential to change how scientists would work by facilitating data analysis and data sharing.
HHF funded the majority of the gEAR’s development. “HHF advances discovery in the entire ear field by providing access to this data in this way,” Hertzano says, noting that additional funding came from the National Institute on Deafness and Other Communication Disorders. Data is shared far and wide beyond the HRP consortium. The portal has more than 730 registered users, and over 580 datasets have been uploaded (as of January 2020), 65 of which are organized in “thematic profiles,” or groups of datasets on related topics.
The thematic profile organization allows users to see how a gene behaves across experiments on related topics (e.g., development, regeneration, adult, brain). While the gEAR has not been officially published yet, it has been cited in over 20 publications as a tool for hypothesis generation, data comparison, data validation, and hypothesis testing.
Hertzano and her team hold several workshops every year. They hosted two at the January 2020 Association for Research in Otolaryngology (ARO) conference, teaching roughly 160 fellow scientists how to use the gEAR in their own work. (See the following story for more about the ARO.)
Data Visualization
Data sharing is one of the organizing principles behind the gEAR; data visualization is another. Data visualization provides graphical representations of data that would traditionally be displayed in tables. Instead of scouring rows and rows of numbers, for example, to find cells with similar characteristics or similar reactions to specific stimuli, plotting the data visually allows scientists to cluster these cell types together according to these characteristics or features.
The similarities are then more immediately evident. Above is a figure, taken from HRP research, showing the development of cells in a zebrafish embryo five days after fertilization. The figure illustrates the types of cells into which the embryo’s inner cells (stem cells) are being transformed. On the figure’s bottom right in darker green are those cells that have already formed into sensory hair cells. Referring to the graphic’s passing resemblance to North America, Peter G. Barr-Gillespie, Ph.D., the scientific director of the HRP, has jokingly noted that what researchers in hair cell regeneration are looking for is the cells that move “south” from “Florida” (4) to “Cuba” (2)—that is, that continue their development from progenitor cells into young hair cells.
The gEAR is also expanding data sharing and data visualization in other fields. It was recently cloned to create NeMO Analytics, or Neuroscience Multi-Omic Archive (nemoanalytics.org). Neuroscientists need to examine millions of cells, versus the tens of thousands of cells that hearing and balance researchers look at. “This is really exciting because we tend to think of tools and methods coming from bigger fields and being adopted by smaller fields,” Hertzano says. “The gEAR is an example of something developed in a smaller field and now being adopted and used productively in a much bigger field.”
Open Source
This type of cloning is possible and encouraged because the gEAR is based on open source code. The platform is malleable, and the gEAR team is always keen to work with researchers in other fields to promote access to data and support research across disciplines. Continued development of the gEAR itself, including new features, further supports scientists in and beyond the ear field.
The intent is eventually to make the gEAR itself fully open source, so that its code and basic structure will be available to anyone who wants them. HHF is proud to support the development of a tool that reflects its underlying mission: to promote research and the dynamic exchange of data and ideas to treat, prevent, and cure hearing loss.
Christopher Geissler, Ph.D., is HHF’s director of program and research support. HRP consortium member Ronna Hertzano, M.D., Ph.D., is an associate professor of otorhinolaryngology–head and neck surgery at the University of Maryland School of Medicine. For more, see hhf.org/hrp.