Difference: JaysJournal (8 vs. 9)

Revision 92010-05-14 - jayzhang

Line: 1 to 1
 
META TOPICPARENT name="NGSAlignerProject"

05/06/10

I'm starting this journal 3 days late (whoops). Here's a brief overview of what happened during my first three days:
Line: 66 to 66
 
  • Finish up my banded non-vectorized algorithm and test it
  • Try coming up with a vectorized version (may involve having to "flip" the matrix).
Added:
>
>

05/13/10

Had a discussion with Chris about implementing the banded local aligner. He told me that I have actually been doing my non-vectorized version right and I should go left-to-right, top-to-bottom through the array with artificial bounds instead of from 0 to max x. He also told me that I can pretty much assume the matrix will be square, since we will be passing output from the index into the aligner, and the index will find a reference that is roughly the same size as the read. So, that solves the slope issue. He also told me a general way to traverse the matrix with a vector: instead of going left-to-right, then starting at the beginning, the vector will follow a right-down-right-down pattern (so it will "bounce").

I came in today and realized my method of finding the bounding lines on the banded non-vectorized version was pretty complicated; instead of finding the equations for the two bounding lines given some k distance of separation from the diagonal (which involves finding the slope and y-intercept and has square roots in it), I instead take the diagonal line and calculate all cells that are a horizontal distance k from it. This is much easier, and works out to pretty much the same thing. It has the added advantage of letting me take in a horizontal k value for my band radius, which will let me guarantee a correct score given the number of gaps <= k.

I managed to finish up the non-vectorized version as well as make up a (I think) pretty comprehensive test suite for it. In the end, I passed all my own tests, though I did miss a "-1" in one line of code that had me banging my head for a while. I also managed to improve the memory efficiency of the score-only local alignment function in general based on a few tricks I learned while combing through the vectorized Smith-Waterman code, though I've only implemented them in my banded aligner for now. Improvements include:

  • Keeping only one read-sized row of scores instead of 2.
  • Keeping only one read-sized row to keep track of gaps in the x-axis instead of 2 rows.
  • Having just one int to keep track of the last match, instead of 2 read-sized rows.
  • Having just one int to keep track of the last gap in y, instead of 2 read-sized rows.

To do:

  • Start tackling the vectorized banded algorithm! That'll be pretty challenging and probably keep me busy all day.
  • (Much later) Add in my improvements to the original local aligner.
  • (Still much later) Polish up the different aligners and have CMake compile either the vectorized or non-vectorized versions of the aligners based on CPU support.
 
This site is powered by the TWiki collaboration platform Powered by PerlCopyright © 2008-2025 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback