---++++Concerning Multimaps

I sort of came to my conclusion about this already, but I'll spam it here anyway for posterity.

When faced with multimaps, there are three modes of resolution: randomly select 1, report all, or report none. Currently, it seems that by default I find all possible mappings, and only during the output phase do I filter to one of the above three (in reality... the latter 2) cases. This isn't very computationally efficient, since every mapping is located even when it will be thrown away, so I suspect we'll have to adopt something like the =report= variable found in =readaligner=.

-- Main.jujubix - 21 May 2010

---++++Concerning the Class Hierarchy

As the library starts to take shape, we have to decide upon a class hierarchy which the project will be built upon. I imagine that changing the hierarchy down the road will be difficult, so in hopes of avoiding that, let's commit ourselves to a single hierarchy.

Some history about the existing hierarchy directories:
   * Originally, there was only =IO=, =alignment=, and =index=
      * IO would read in the reference and reads
      * The index (Kmer) would return positions in the reference that matched the _first k bases_ of a read
      * The aligner would align the _entire_ read to the reference at the specified position
   * Then the index was swapped... the aligner was completely replaced when searching for exact reads
      * The index would "locate" the position in the reference where the _entire_ read was found
   * Inexact reads were supported, leading to the need for =Mapper= classes
      * Would "map" reads to the reference, but allowed some form of variation (e.g. mismatches, gaps, etc...)
      * Some required =aligner= classes, bringing back the need for them
   * To reduce the code seen in /tools/, =Drivers= were created
      * Essentially, took in =mapper=, =input= and =output= classes, and ran through every read in the given file
   * =Pairend= classes were introduced to handle the post-processing to make reads paired...
      * These were fed into some specific =Drivers=, and work independently from index and mappers

As you can see, the entire hierarchy wasn't carefully planned, and was rather extended when the need arose... so I wouldn't be surprised if there was room for improvement... or a complete restructuring.

Some personal concerns:
   * Some classes in =IO= are actually Types... these could be pulled out
   * The creation of every =Mapper= class requires the addition of a new =Locate= function in the Index class
      * Should the index simply be a container? And the mapper classes take care of the actual "locating", using the index?

-- Main.jujubix - 26 May 2010

<img src="%ATTACHURLPATH%/Drawing2.jpg" alt="Drawing2.jpg" width="484" height="554" />

   * The main core of any tool is a =Driver=
   * A typical driver requires:
      * =FastqFile= to read input
      * =SamFile= to write output
      * =Mapper= for mapping reads to the reference
         * Contains an =Index= holding the reference
            * The =Index= requires the reference from a =FastaFile=
         * May optionally use an =Aligner=
      * =MatchMaker= for post-processing pair-end reads, if applicable
         * Requires access to an =Index= and an =Aligner=
   * The typical way to build an aligner executable is as follows:
      1. =FastaFile= to read the reference
      1. create an =Index= by loading or building a new one
      1. create an =Aligner= with user-set penalty values
      1. create a =Mapper= using the above two objects
      1. create a =MatchMaker= only if dealing with _pair end_ (PE) reads
      1. create a =FastqFile= (or two in PE), using files with reads
      1. create a =SamFile=
      1. create a =Driver= using the above three (or four, for PE)
      1. Run() the =Driver=

I don't see anything horribly wrong with it... as it has built various usable aligners, but I don't see anything wonderful about it either, being built without much planning.

-- Main.jujubix - 26 May 2010

<img src="%ATTACHURLPATH%/uml.png" alt="Jay's take on the UML" width="800" height="560" />

Here's my take on the current UML. Some points to note:
   * The Mappers are now specific to their index.
   * Construction of an aligner is still the same. You just have to be careful with which Mapper you choose to use with which Index.

-- Main.jayzhang - 28 May 2010
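
As a rough illustration of the construction steps listed above, combined with the idea that the Index is a plain container and each Mapper is tied to one Index type, here is a minimal Python sketch. Everything in it is hypothetical: =KmerIndex=, =ExactKmerMapper=, =Driver= and =build_and_run= only borrow names from this page and are not the library's actual classes or signatures. The reference, reads and output are in-memory stand-ins for =FastaFile=, =FastqFile= and =SamFile=, and the optional =Aligner= and the PE-only =MatchMaker= are left out.

```python
from typing import Dict, List, Tuple


class KmerIndex:
    """Hypothetical index: only a container mapping each k-mer to its positions."""

    def __init__(self, reference: str, k: int) -> None:
        self.reference = reference
        self.k = k
        self.positions: Dict[str, List[int]] = {}
        for i in range(len(reference) - k + 1):
            self.positions.setdefault(reference[i:i + k], []).append(i)


class ExactKmerMapper:
    """Hypothetical mapper tied to KmerIndex; it does the "locating" itself."""

    def __init__(self, index: KmerIndex) -> None:
        # Guard against pairing this mapper with the wrong kind of index.
        if not isinstance(index, KmerIndex):
            raise TypeError("ExactKmerMapper only works with a KmerIndex")
        self.index = index

    def map(self, read: str) -> List[int]:
        # Seed with the first k bases, then check the entire read at each seed.
        seeds = self.index.positions.get(read[:self.index.k], [])
        return [p for p in seeds if self.index.reference.startswith(read, p)]


class Driver:
    """Hypothetical driver: runs every read through the mapper."""

    def __init__(self, mapper: ExactKmerMapper,
                 reads: List[Tuple[str, str]],
                 output: List[Tuple[str, List[int]]]) -> None:
        self.mapper = mapper
        self.reads = reads
        self.output = output

    def run(self) -> None:
        for name, sequence in self.reads:
            self.output.append((name, self.mapper.map(sequence)))


def build_and_run(reference: str, reads: List[Tuple[str, str]], k: int = 3):
    index = KmerIndex(reference, k)          # steps 1-2: reference -> Index
    mapper = ExactKmerMapper(index)          # step 4: Mapper built on the Index
    output: List[Tuple[str, List[int]]] = []  # step 7: stand-in for SamFile
    Driver(mapper, reads, output).run()      # steps 8-9: build and Run() the Driver
    return output
```

With this arrangement a new Mapper does not need a new =Locate= function on the Index; the cost is that a Mapper and Index can be mismatched, which the constructor check above turns into an early error instead of a silent bug.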
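
On the multimap question at the top of this page, one way to avoid finding every mapping and filtering only at output time is to pass the resolution mode into the search itself. The sketch below is an assumption-laden illustration, not the =report= mechanism from =readaligner=: the =Report= enum, =find_hits= and =locate= are made-up names, =find_hits= uses a plain string search as a stand-in for an index lookup, and "report none" is interpreted here as "report nothing for a read that maps to more than one place".

```python
import random
from enum import Enum
from typing import Iterator, List, Optional


class Report(Enum):
    ALL = "all"                # report every mapping
    RANDOM_ONE = "random_one"  # report one mapping chosen at random
    NONE = "none"              # report nothing if the read is a multimap


def find_hits(reference: str, read: str) -> Iterator[int]:
    """Lazily yield each exact-match position (stand-in for an index lookup)."""
    pos = reference.find(read)
    while pos != -1:
        yield pos
        pos = reference.find(read, pos + 1)


def locate(reference: str, read: str, report: Report,
           rng: Optional[random.Random] = None) -> List[int]:
    rng = rng or random.Random()
    hits = find_hits(reference, read)

    if report is Report.ALL:
        return list(hits)

    if report is Report.NONE:
        first = next(hits, None)
        if first is None:
            return []
        # Stop as soon as a second hit proves the read is a multimap.
        return [] if next(hits, None) is not None else [first]

    # RANDOM_ONE: reservoir sampling keeps one uniformly chosen hit
    # without storing the full list of mappings.
    chosen = None
    for seen, pos in enumerate(hits, start=1):
        if rng.randrange(seen) == 0:
            chosen = pos
    return [] if chosen is None else [chosen]
```

In this sketch only the "report none" mode actually cuts the search short; "randomly select 1" still visits every hit to stay uniform, but never has to hold them all in memory.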
Topic revision: r5 - 2010-05-28 - jayzhang
Copyright © 2008-2025 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.