Variant View: Visualizing Sequence Variants in their Gene Context

Joel A. Ferstay, Cydney B. Nielsen, Tamara Munzner


The Variant View display.

Abstract

Scientists use DNA sequence differences between an individual's genome and a standard reference genome to study the genetic basis of disease. Such differences are called sequence variants, and determining their impact in the cell is difficult because it requires reasoning about both the type and location of the variant across several levels of biological context. In this design study, we worked with four analysts to design a visualization tool supporting variant impact assessment for three different tasks. We contribute data and task abstractions for the problem of variant impact assessment, and the carefully justified design and implementation of the Variant View tool. Variant View features an information-dense visual encoding that provides maximal information at the overview level, in contrast to the extensive navigation required by currently prevalent genome browsers. We provide initial evidence that the tool simplified and accelerated workflows for these three tasks through four case studies. Finally, we reflect on the lessons learned in creating and refining data and task abstractions that allow for concise overviews of sprawling information spaces that can reduce or remove the need for the memory-intensive use of navigation.

Paper

Variant View: Visualizing Sequence Variants in their Gene Context
To appear in IEEE Transactions on Visualization and Computer Graphics (TVCG)
Proceedings of IEEE Conference on Information Visualization, Atlanta, GA, USA. 2013.

[PDF] [BibTeX]

Talk

A talk will be given during the 10:30 EST InfoVis session on Thursday, October 17, at InfoVis 2013 in Atlanta, Georgia, USA.
Slides: [PDF] [PPT]

Video

The video explains Variant View's interface components and provides usage examples:
MP4 video (13.3 MB, with audio, tested on QuickTime 10.0)

Supplementary Materials

The supplementary materials contain additional figures referenced in the main paper.
supplementary_materials.pdf

Software

Variant View is available as open source at the Variant View Software Page.

Figures

Fig. 1. Sequence variants and their attributes shown in Variant View with respect to biological context annotations at multiple scales. This gene, whose name is anonymized, was identified by analysts as a putative cancer candidate gene using the tool.
Fig. 2. Biological context in which variants occur. The gene level is specified by genomic coordinates, the exon-containing transcript level is specified by transcript coordinates, and the protein level is specified by protein coordinates.
Fig. 3. The Variant View tool, annotated to indicate its three main views. The primary view (A) is the central overview for performing variant impact assessment; the reorderable gene list view (B) can sort genes alphabetically or by derived measures of variant importance; the secondary Variant Data table (C) contains peripheral information.
Fig. 4. Variant visual encoding.
Fig. 5. Comparison of the same variant data between different visual encoding schemes.
Fig. 6. Variant View allowed analysts to quickly confirm known results: known AML genes could be found near the top of the sorted lists, and the per-gene views clearly and immediately showed tell-tale structure. (a) IDH2. (b) FLT3. (c) Example gene without interesting structure near the list bottom.
Fig. 7. Comparison of patient data to a known-AML variant database. The immediate neighbors for each variant are shown.
Fig. 8. Debugging the bioinformatics pipeline.

Joel Ferstay
Last modified: August 1, 2013