SequenceJuxtaposer

Fluid Navigation For Large-Scale Sequence Comparison

Note: this project is only intended for alpha testing

Download

Summary

SequenceJuxtaposer is a sequence visualization tool for the exploration and comparison of biomolecular sequences. It allows users to fluidly stretch and shrink parts of the view, as if manipulating a rubber sheet with the borders tacked down. It guarantees that landmarks remain visible at every point during smooth transitions between a big-picture overview and a drilled-down view that shows details in context. Landmarks can include specified motifs or thresholded differences between bases across multiple sequences. We use and extend efficient progressive rendering algorithms from recent information visualization literature to provide a guaranteed high frame rate. We demonstrate the effectiveness of our approach on two large publicly available datasets, as a proof of concept of the power that fluid exploration of single and multiple sequences brings to biologists. SequenceJuxtaposer supports interaction at 20 frames per second when browsing a single sequence of over 1.2 million base pairs, or large collections of sequences totaling up to 2 million base pairs.
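To make "thresholded differences between bases" and "guaranteed visibility" more concrete, here is a minimal Python sketch. It is not SequenceJuxtaposer's code: the difference measure (share of sequences that disagree with a column's most common base), the function names `difference_landmarks` and `landmark_pixels`, and the uniform site-to-pixel mapping are all assumptions made for illustration. The tool's actual measure and its rubber-sheet rendering may differ.

```python
from collections import Counter


def difference_landmarks(aligned, threshold=0.5):
    """Return column indices whose disagreement is at least `threshold`.

    Assumption: disagreement is the share of sequences whose base differs
    from the most common base in that column. This is a hypothetical
    measure, not necessarily the one SequenceJuxtaposer uses.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    landmarks = []
    for col in range(length):
        column = [seq[col].upper() for seq in aligned]
        _, majority_count = Counter(column).most_common(1)[0]
        disagreement = 1.0 - majority_count / len(column)
        if disagreement >= threshold:
            landmarks.append(col)
    return landmarks


def landmark_pixels(landmarks, n_sites, n_pixels):
    """Map landmark columns to the pixels that must stay marked.

    Assumption: a uniform overview where many sites share one pixel.
    A pixel is marked if any landmark falls into it, so a landmark is
    never dropped merely because the view is zoomed out.
    """
    return sorted({col * n_pixels // n_sites for col in landmarks})


seqs = ["ACGTA", "ACGTT", "ACCTG", "ACGTC"]
marks = difference_landmarks(seqs, threshold=0.25)
print(marks)
print(landmark_pixels(marks, n_sites=5, n_pixels=2))
```

Lowering `threshold` marks more columns as landmarks. Marking a pixel whenever any landmark maps to it is one simple way to keep landmarks visible in an overview where a pixel covers many bases.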

People

Imager people

Other associates

Papers

Software

  • Read Me
  • Windows
  • Linux
    • All necessary files can be extracted from here: sj.tar.gz
    • Run "./installSJ" to install; see the Read Me for details on how to run sj.jar
  • Macs
    • Install this first: download gl4java for Mac OS X (courtesy of Dmitry Markman)
    • Second, download and run the jar archive (sj.jar, see Read Me for details)
  • Java archive (JAR) file
  • Sample data, in tar.gz format
    • SARS data sets

    • Murphy et al. data sets (from: Resolution of the early placental mammal radiation using Bayesian phylogenetics, Science, 294(5550):2348-51, 2001)     DNA, 44 sequences, 17000 sites per sequence, 764686 bytes

    • Onion yellows phytoplasma data set (from: Shin-ichi Miyata et al. Two different thymidylate kinase gene homologues, including one that has catalytic activity, are encoded in the onion yellows phytoplasma genome. Microbiology, 149:2243-2250, 2003.)
      • onion_killer2.fa     DNA, 1 sequence, 874976 sites per sequence, 874989 bytes
        • data set used for Figure 2 of paper with successive screenshots of increasing detail


    • Smaller test data sets
      • science_5seq.fa     DNA, 5 sequences, 17000 sites per sequence, 86897 bytes
      • diff1.fa     RNA, 3 sequences, 500 sites per sequence, 1591 bytes
      • test.fa     DNA, 8 sequences, 5 sites per sequence, 241 bytes
      • test2.fa     DNA, 3 sequences, 5 sites per sequence, 86 bytes
      • small.fa     DNA, 3 sequences, 4 sites per sequence, 83 bytes
      • otherSARS.fa     DNA, 9 sequences, unaligned, 234261 bytes
      • Caenor_test.fa     DNA, 3 sequences, unaligned, 46027 bytes
      • smaller.fa     DNA, 3 sequences, unaligned, 369 bytes
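
The per-file descriptions above (number of sequences, sites per sequence, or "unaligned") can be checked after download with a short script. The following Python sketch assumes the `.fa` files are plain FASTA, with `>` header lines followed by sequence lines. The helper names `read_fasta` and `summarize` are hypothetical and are not part of the SequenceJuxtaposer distribution.

```python
import io


def read_fasta(handle):
    """Parse FASTA text into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def summarize(records):
    """Describe a data set in the style of the listing above.

    Assumption: equal lengths are reported as "sites per sequence" and
    differing lengths as "unaligned". Equal length alone does not prove
    that the sequences are actually aligned.
    """
    lengths = {len(seq) for _, seq in records}
    if len(lengths) == 1:
        sites = f"{lengths.pop()} sites per sequence"
    else:
        sites = "unaligned"
    return f"{len(records)} sequences, {sites}"


# In-memory stand-in for a small file; sequence "b" spans two lines.
example = io.StringIO(">a\nACGTA\n>b\nACG\nTT\n>c\nACCTG\n")
print(summarize(read_fasta(example)))
```

To summarize a downloaded file such as test.fa, pass an open file handle instead of the `io.StringIO` object, for example `with open("test.fa") as fh: print(summarize(read_fasta(fh)))`.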

Related Projects

Support

  • US National Science Foundation (NSF/DEB-0121651/0121682)
  • Natural Sciences and Engineering Research Council of Canada (NSERC/RGPIN-262047-03)
  • German Academic Exchange Service