I'm starting this journal 3 days late (whoops). Here's a brief overview of what happened during my first three days:
Hopefully get that to pass all the tests (i.e. make the behaviour the same as the non-vectorized version)
Take a look at optimizing for linear gap penalty if I have time.
05/12/10
Chris told me that the full alignment was a lower priority, since the score-only alignment is used much more often. So, I've tried optimizing for a linear gap penalty instead. I managed to improve performance by skipping a few calculations that are only required in the affine case, and also improved the memory usage slightly (by not having to use one genome-length array).
I did a few preliminary tests using randomly generated genome and read sequences of length 1000 each. Over 1000 iterations, my results were:
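To make the difference concrete, here is a minimal, non-vectorized Python sketch of score-only local alignment with each kind of gap penalty. It is not the actual vectorized code: the function names, the scoring values, and the convention that the first gap column costs `gap_open` and each further one costs `gap_extend` are all assumptions made for illustration. The affine version has to carry an extra genome-length array (`prev_e`) and do two extra max operations per cell, which is the kind of work the linear version gets to skip.

```python
def sw_score_linear(genome, read, match=2, mismatch=-1, gap=-2):
    """Score-only local alignment, linear gaps: every gap column costs `gap`."""
    n = len(genome)
    prev = [0] * (n + 1)
    best = 0
    for r in read:
        curr = [0] * (n + 1)
        for j, g in enumerate(genome, start=1):
            diag = prev[j - 1] + (match if r == g else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def sw_score_affine(genome, read, match=2, mismatch=-1,
                    gap_open=-3, gap_extend=-1):
    """Score-only local alignment, affine gaps (assumed convention:
    first gap column costs gap_open, each further one gap_extend)."""
    neg_inf = float("-inf")
    n = len(genome)
    prev_h = [0] * (n + 1)
    prev_e = [neg_inf] * (n + 1)  # extra genome-length array: gaps entered from the row above
    best = 0
    for r in read:
        curr_h = [0] * (n + 1)
        curr_e = [neg_inf] * (n + 1)
        f = neg_inf  # gap running along the current row
        for j, g in enumerate(genome, start=1):
            curr_e[j] = max(prev_h[j] + gap_open, prev_e[j] + gap_extend)
            f = max(curr_h[j - 1] + gap_open, f + gap_extend)
            diag = prev_h[j - 1] + (match if r == g else mismatch)
            curr_h[j] = max(0, diag, curr_e[j], f)
            best = max(best, curr_h[j])
        prev_h, prev_e = curr_h, curr_e
    return best
```

With `gap_open` equal to `gap_extend`, the affine version should return the same score as the linear one, which makes for a cheap cross-check between the two.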
38.836576365 seconds using affine gap penalty
28.907083597 seconds using linear gap penalty
So I get a roughly 25% decrease in calculation time (about 25.6% going by the numbers above). Tests using more or fewer iterations and different genome/read lengths show more or less the same improvement.
Now it's on to doing the "banded" algorithm. I'm going to try making a non-vectorized version first, then move on to vectorized. I'm not really sure how I'm going to test it though, since the results are going to be a little bit different...
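As a starting point, here is a rough non-vectorized sketch of what a banded score-only version could look like, again in Python and with a linear gap penalty. The function name, the `band` parameter (the maximum allowed distance from the main diagonal), and the scoring values are assumptions for illustration, not the real implementation.

```python
def sw_score_banded(genome, read, band, match=2, mismatch=-1, gap=-2):
    """Score-only local alignment restricted to cells with |j - i| <= band.

    Cells outside the band are never filled and stay at 0. For a local
    alignment that is harmless: stepping in from a 0 cell through a gap
    is always worse than the 0 floor every cell already has.
    """
    n = len(genome)
    prev = [0] * (n + 1)
    best = 0
    for i, r in enumerate(read, start=1):
        curr = [0] * (n + 1)
        lo = max(1, i - band)   # first genome column inside the band
        hi = min(n, i + band)   # last genome column inside the band
        for j in range(lo, hi + 1):
            diag = prev[j - 1] + (match if r == genome[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

One possible way to test it despite the different results: a band at least as wide as the longer sequence covers the whole matrix, so the score should match the unbanded version exactly, and any narrower band can only give an equal or lower score, since it only removes candidate alignments.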