Line: 1 to 1 | ||||||||
---|---|---|---|---|---|---|---|---|
05/06/10I'm starting this journal 3 days late (whoops). Here's a brief overview of what happened during my first three days:
| ||||||||
Line: 172 to 172 | ||||||||
| ||||||||
Added: | ||||||||
> > | 05/26/10Spent a lot of time just thinking...Daniel and I were revisiting the class design for the aligner, and we encountered some design issues with theMapper and Index classes. Specifically, at this point, the Mapper classes feel too dependent on the specific Index implementation, a problem arising from using external libraries (readaligner). For example, I wanted to move the NGS_FMIndex::LocateWithMismatches and NGS_FMIndex::LocateWithGaps methods into their respective Mapper classes, but I found it troublesome, since the LocateWithX methods encapsulate the Query::align() method. The Query class is a readaligner class and must take in TextCollection objects to be instantiated. Those objects are outputted from the FM Index used in readaligner only, so if the Mappers were to instantiate their own Query classes, they would essentially have to be used with only the FM Index implementation.
Daniel and I have also decided to put an option in the Mapper classes for the user to choose the maximum number of results they want. This is a minor optimization that would allow us to terminate the alignment function once a set amount of results have been reached, rather than going through with the whole alignment. This might also allow us to use fixed-length arrays instead of vectors (or at least reserve the space in the vector beforehand to prevent reallocs ).
Another area I've investigated today was the cost of abstraction in our Index classes. I tried unabstracting the NGS_FMIndex and comparing runtimes. On 125k reads, I get about 9.8-10.1 seconds for both implementations, so the abstraction cost isn't really an issue. I just realized that Query objects are also abstracted those methods are called much more often. This might also be a good area to investigate.
Finally, I just realized that string 's can reserve space, too. I think this might be a good area to improve on, as I'm seeing a lot of calls to malloc from both string and vector . Considering both of these could get quite large in some cases, reserving![]() ![]() realloc 's.
To do:
|