---++++ Concerning Multimaps

I sort of came to my conclusion about this already, but I'll spam it here anyway for posterity. When faced with multimaps, there are three modes of resolution: randomly select one, report all, or report none. Currently, it seems that by default I find all possible mappings, and only during the output phase do I filter down to one of those three cases (in reality, the latter two). This isn't very computationally efficient, so I suspect we'll have to adopt something like the =report= variable found in =readaligner=.

-- Main.jujubix - 21 May 2010

---++++ Concerning the Class Hierarchy

As the library starts to take shape, we have to decide upon a class hierarchy which the project will be built upon. I imagine that changing the hierarchy down the road will be difficult, so in hopes of avoiding that, let's commit ourselves to a single hierarchy.

Some history about the existing hierarchy directories:
   * Originally, there was only =IO=, =alignment=, and =index=
      * IO would read in the reference and reads
      * The index (Kmer) would return positions in the reference that matched the _first k bases_ of a read
      * The aligner would align the _entire_ read to the reference at the specified position
   * Then the index was swapped... the aligner was completely replaced when searching for exact reads
      * The index would "locate" the position in the reference where the _entire_ read was found
   * Inexact reads were supported, leading to the need for =Mapper= classes
      * Would "map" reads to the reference, but allowed some form of variation (e.g. mismatches, gaps, etc.)
      * Some required =aligner= classes, bringing back the need for them
   * To reduce the code seen in /tools/, =Drivers= were created
      * Essentially, took in =mapper=, =input= and =output= classes, and ran through every read in the given file
   * =Pairend= classes were introduced to handle the post-processing to make reads paired...
      * These were fed into some specific =Drivers=, and work independently of the index and mappers

As you can see, the entire hierarchy wasn't carefully planned, but rather extended when the need arose... so I wouldn't be surprised if there was room for improvement... or a complete restructuring.

Some personal concerns:
   * Some classes in =IO= are actually Types... these could be pulled out
   * The creation of every =Mapper= class requires the addition of a new =Locate= function in the Index class
      * Should the index simply be a container? And the mapper classes take care of the actual "locating", using the index?

-- Main.jujubix - 26 May 2010

<img src="%ATTACHURLPATH%/Drawing2.jpg" alt="Drawing2.jpg" width="484" height="554" />

   * The main core of any tool is a =Driver=
   * A typical driver requires:
      * =FastqFile= to read input
      * =SamFile= to write output
      * =Mapper= for mapping reads to the reference
         * Contains an =Index= holding the reference
            * The =Index= requires the reference from a =FastaFile=
         * May optionally use an =Aligner=
      * =MatchMaker= for post-processing paired-end reads, if applicable
         * Requires access to an =Index= and an =Aligner=
   * The typical way to build an aligner executable is as follows:
      1. use a =FastaFile= to read the reference
      2. create an =Index= by loading or building a new one
      3. create an =Aligner= with user-set penalty values
      4. create a =Mapper= using the above two objects
      5. create a =MatchMaker= only if dealing with _paired-end_ (PE) reads
      6. create a =FastqFile= (or two in PE), using files with reads
      7. create a =SamFile=
      8. create a =Driver= using the =Mapper=, =FastqFile= and =SamFile= (plus the =MatchMaker= for PE)
      9. Run() the =Driver=

I don't see anything horribly wrong with it... as it's built various usable aligners, but I don't see anything wonderful about it either, being built without much planning.

-- Main.jujubix - 26 May 2010

<img src="%ATTACHURLPATH%/uml.png" alt="Jay's take on the UML" width="800" height="560" />

Here's my take on the current UML.
Some points to note:
   * The Mappers are now specific to their index.
   * Construction of an aligner is still the same. You just have to be careful about which Mapper you use with which Index.

-- Main.jayzhang - 28 May 2010
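
To make the build order in the numbered steps concrete, here is a rough sketch in Python (chosen only for brevity; it is not the project's code). The class names =Index=, =Aligner= and =Driver= come from the lists above, but every constructor, method and parameter shown is an assumption, as are =ExactMapper= and the =Report= enum, which is a hypothetical stand-in for the =report= idea from the multimap note. Plain Python objects stand in for =FastaFile=, =FastqFile= and =SamFile=, and the =MatchMaker= step is skipped because the example is single-end.

```python
import random
from enum import Enum


class Report(Enum):
    """Hypothetical multimap policy (not the real readaligner option)."""
    RANDOM_ONE = 1
    ALL = 2
    NONE = 3


class Index:
    """Stand-in index: just a container around the reference string."""
    def __init__(self, reference):
        self.reference = reference


class Aligner:
    """Stand-in aligner; it only stores penalties and is unused below."""
    def __init__(self, mismatch_penalty=1, gap_penalty=2):
        self.mismatch_penalty = mismatch_penalty
        self.gap_penalty = gap_penalty


class ExactMapper:
    """Toy mapper: exact matching, with the multimap policy applied
    while searching instead of at output time."""
    def __init__(self, index, aligner=None, report=Report.ALL, rng=None):
        self.index = index
        self.aligner = aligner
        self.report = report
        self.rng = rng or random.Random(0)

    def map(self, read):
        ref = self.index.reference
        hits = []
        for pos in range(len(ref) - len(read) + 1):
            if ref.startswith(read, pos):
                hits.append(pos)
                if self.report is Report.NONE and len(hits) > 1:
                    return []  # already a multimap, so stop searching
        if len(hits) > 1 and self.report is Report.RANDOM_ONE:
            return [self.rng.choice(hits)]
        return hits


class Driver:
    """Runs every read through the mapper and appends to the output."""
    def __init__(self, mapper, reads, output):
        self.mapper = mapper
        self.reads = reads
        self.output = output

    def run(self):
        for name, seq in self.reads:
            self.output.append((name, self.mapper.map(seq)))
        return len(self.output)


reference = "ACGTACGTTT"                                  # step 1: stands in for a FastaFile
index = Index(reference)                                  # step 2
aligner = Aligner()                                       # step 3
mapper = ExactMapper(index, aligner, report=Report.NONE)  # step 4
reads = [("r1", "ACGT"), ("r2", "GTTT")]                  # step 6: stands in for a FastqFile
output = []                                               # step 7: stands in for a SamFile
driver = Driver(mapper, reads, output)                    # step 8
driver.run()                                              # step 9
# output is now [("r1", []), ("r2", [6])]: r1 hits twice and is dropped
```

Passing the policy to the mapper lets the "report none" case give up as soon as a second hit turns up. The "random one" case in this sketch still collects every hit before choosing, so it saves nothing over filtering at output time.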
Topic revision: r5 - 2010-05-28 - jayzhang