Concerning Multimaps

I sort of came to my conclusion about this already, but I'll spam it here anyway for posterity.

When faced with multimaps, there are three modes of resolution: randomly select one, report all, or report none.

Currently, it seems that by default I find all possible mappings, and only during the output phase do I filter down to one of the three cases above (in reality, only the latter two). This isn't very computationally efficient, so I suspect we'll have to adopt something like a report variable found in readaligner.
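
To make the efficiency point concrete, here is a minimal sketch of applying the report mode while the hits are still being produced rather than at output time. Everything in it is hypothetical: `ReportMode`, `resolve`, and the idea that the search yields candidate positions lazily are assumptions for illustration, not the library's existing API and not readaligner's actual report variable.

```python
import random
from enum import Enum
from typing import Iterable, List, Optional


class ReportMode(Enum):
    """Hypothetical names for the three resolution modes."""
    RANDOM_ONE = "random_one"  # pick one mapping uniformly at random
    ALL = "all"                # report every mapping
    NONE = "none"              # report nothing if the read multimaps


def resolve(hits: Iterable[int], mode: ReportMode,
            rng: Optional[random.Random] = None) -> List[int]:
    """Consume candidate positions lazily and apply the report mode."""
    if rng is None:
        rng = random.Random()
    it = iter(hits)
    first = next(it, None)
    if first is None:
        return []  # unmapped read: nothing to report in any mode
    if mode is ReportMode.ALL:
        return [first, *it]
    if mode is ReportMode.NONE:
        # A second hit is enough to know the read multimaps, so stop there
        # instead of enumerating every remaining mapping.
        return [first] if next(it, None) is None else []
    # RANDOM_ONE: reservoir sampling keeps a single candidate in memory.
    chosen = first
    for seen, pos in enumerate(it, start=2):
        if rng.randrange(seen) == 0:  # replace with probability 1/seen
            chosen = pos
    return [chosen]
```

With a lazy search, the "report none" mode never looks past the second hit, and "randomly select one" never has to store the full list of mappings. Whether the existing index can produce hits lazily is a separate question that this sketch does not answer.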

-- Main.jujubix - 21 May 2010

Concerning the Class Hierarchy

As the library starts to take shape, we have to decide upon a class hierarchy which the project will be built upon. I imagine that changing the hierarchy down the road will be difficult, so in hopes of avoiding that, let's commit ourselves to a single hierarchy.

Some history about the existing hierarchy directories:

  • Originally, there was only IO, alignment, and index
    • IO would read in the reference and reads
    • The index (Kmer) would return positions in the reference that matched the first k bases of a read
    • The aligner would align the entire read to the reference at the specified position
  • Then the index was swapped, and the aligner was completely replaced when searching for exact reads
    • The index would "locate" the position in the reference where the entire read was found
  • Inexact reads were supported, leading to the need for Mapper classes
    • Would "map" reads to the reference, but allowed some form of variation (e.g. mismatches, gaps, etc...)
    • Some required aligner classes, bringing back the need for them
  • To reduce the code seen in /tools/, Drivers were created
    • Essentially, took in a mapper, input and output classes, and ran through every read in the given file
  • Pairend classes were introduced to handle the post-processing needed to pair reads
    • These were fed into some specific Drivers, and work independently of the index and mappers

As you can see, the hierarchy wasn't carefully planned but rather extended whenever the need arose, so I wouldn't be surprised if there is room for improvement, or even for a complete restructuring.

Some personal concerns:

  • Some classes in IO are actually Types... this could be pulled out
  • The creation of every Mapper class requires the addition of a new Locate function in the Index class
    • Should the index simply be a container? And should the mapper classes take care of the actual "locating", using the index?
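
One way to picture the "index as container" option is the sketch below. It is only an illustration under stated assumptions: `KmerIndex`, `ExactMapper`, `MismatchMapper`, and their methods are made-up names, not the project's classes, and a toy k-mer dictionary stands in for whatever the real index is. The point is that adding a mapper adds no new function to the index.

```python
from collections import defaultdict
from typing import Dict, List


class KmerIndex:
    """Plain container (hypothetical): k-mer -> positions in the reference."""

    def __init__(self, reference: str, k: int) -> None:
        self.reference = reference
        self.k = k
        self._positions: Dict[str, List[int]] = defaultdict(list)
        for i in range(len(reference) - k + 1):
            self._positions[reference[i:i + k]].append(i)

    def lookup(self, kmer: str) -> List[int]:
        # The only query the index offers; it knows nothing about mappers.
        return self._positions.get(kmer, [])


class ExactMapper:
    """Locates reads that occur verbatim in the reference."""

    def __init__(self, index: KmerIndex) -> None:
        self.index = index

    def map(self, read: str) -> List[int]:
        ref, k = self.index.reference, self.index.k
        if len(read) < k:
            return []
        return [p for p in self.index.lookup(read[:k])
                if ref.startswith(read, p)]


class MismatchMapper:
    """Allows mismatches after an exactly matching seed of k bases."""

    def __init__(self, index: KmerIndex, max_mismatches: int) -> None:
        self.index = index
        self.max_mismatches = max_mismatches

    def map(self, read: str) -> List[int]:
        ref, k = self.index.reference, self.index.k
        hits = []
        for p in self.index.lookup(read[:k]):
            window = ref[p:p + len(read)]
            if len(window) < len(read):
                continue  # read would run off the end of the reference
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= self.max_mismatches:
                hits.append(p)
        return hits
```

For example, with `index = KmerIndex("ACGTACGTTT", 3)`, `ExactMapper(index).map("ACGTT")` finds only position 4, while `MismatchMapper(index, 1).map("ACGTT")` also accepts position 0. The trade-off is that each mapper now carries its own locating logic, so any index-specific optimisation has to be exposed through the container's generic queries.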

-- Main.jujubix - 26 May 2010

Topic revision: r2 - 2010-05-26 - jujubix
 