Difference: JaysJournal (8 vs. 9)

Revision 92010-05-14 - jayzhang

Line: 1 to 1

META TOPICPARENT	name="NGSAlignerProject"

05/06/10

I'm starting this journal 3 days late (whoops). Here's a brief overview of what happened during my first three days:

Line: 66 to 66

Finish up my banded non-vectorized algorithm and test it
Try coming up with a vectorized version (may involve having to "flip" the matrix).

Added:

>
>

05/13/10

Had a discussion with Chris about implementing the banded local aligner. He told me that I have actually been doing my non-vectorized version right and I should go left-to-right, top-to-bottom through the array with artificial bounds instead of from 0 to max x. He also told me that I can pretty much assume the matrix will be square, since we will be passing output from the index into the aligner, and the index will find a reference that is roughly the same size as the read. So, that solves the slope issue. He also told me a general way to traverse the matrix with a vector: instead of going left-to-right, then starting at the beginning, the vector will follow a right-down-right-down pattern (so it will "bounce").

I came in today and realized my method of finding the bounding lines on the banded non-vectorized version was pretty complicated; instead of finding the equations for the two bounding lines given some k distance of separation from the diagonal (which involves finding the slope and y-intercept and has square roots in it), I instead take the diagonal line and calculate all cells that are a horizontal distance k from it. This is much easier, and works out to pretty much the same thing. It has the added advantage of letting me take in a horizontal k value for my band radius, which will let me guarantee a correct score given the number of gaps <= k.

I managed to finish up the non-vectorized version as well as make up a (I think) pretty comprehensive test suite for it. In the end, I passed all my own tests, though I did miss a "-1" in one line of code that had me banging my head for a while. I also managed to improve the memory efficiency of the score-only local alignment function in general based on a few tricks I learned while combing through the vectorized Smith-Waterman code, though I've only implemented them in my banded aligner for now. Improvements include:

Keeping only one read-sized row of scores instead of 2.
Keeping only one read-sized row to keep track of gaps in the x-axis instead of 2 rows.
Having just one int to keep track of the last match, instead of 2 read-sized rows.
Having just one int to keep track of the last gap in y, instead of 2 read-sized rows.

To do:

Start tackling the vectorized banded algorithm! That'll be pretty challenging and probably keep me busy all day.
(Much later) Add in my improvements to the original local aligner.
(Still much later) Polish up the different aligners and have CMake compile either the vectorized or non-vectorized versions of the aligners based on CPU support.

View topic | History: r73 < r72 < r71 < r70 | More topic actions...