Genome-wide selection of unique and valid oligonucleotides
Heikki Hyyrö, Martti Juhola and Mauno Vihinen
Nucleic Acids Research 2005 33(13):e115
Functional genomics methods are used to investigate the huge amount of information contained in genomes. Numerous experimental methods rely on oligo- or polynucleotides, and nucleotide strand hybridization is the underlying principle of these methods. For all of these techniques, the probes should be unique to the analyzed genes. In addition to being unique, the probes should fulfill a large number of criteria to be usable and valid. The criteria include, for example, avoidance of self-annealing, a suitable melting temperature and a suitable nucleotide composition. We developed a method for finding unique and valid oligonucleotides, or probes, for genes such that not even a similar (approximate) occurrence exists at any other location in the whole genome. Using a probe size of 25, we analyzed 17 complete genomes representing a wide range of both prokaryotic and eukaryotic organisms. More than 92% of all the genes in the investigated genomes contained valid oligonucleotides. Extensive statistical tests were performed to characterize the properties of unique and valid oligonucleotides. These oligonucleotides were relatively evenly distributed within genes, except for the beginning and end of genes, which were somewhat overrepresented. The flanking regions in eukaryotes were clearly underrepresented among suitable oligonucleotides. In addition to distributions within genes, the effects on codon and amino acid usage were also studied.
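The kinds of criteria named above (uniqueness, nucleotide composition, melting temperature, avoidance of self-annealing) can be illustrated with a minimal Python sketch. It is not the method of the paper: uniqueness is checked here only by exact matching on one strand, whereas the paper also excludes approximate occurrences. The function names, the GC and melting temperature windows, the 6-nucleotide stem length and the simple GC-based melting temperature formula are all assumptions chosen for illustration; only the probe size of 25 comes from the text.

```python
from collections import Counter
from typing import List, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C nucleotides in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def melting_temperature(seq: str) -> float:
    """Rough Tm estimate in degrees C from GC content (an assumed formula,
    commonly used for oligos longer than about 13 nt)."""
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def may_self_anneal(seq: str, stem: int = 6) -> bool:
    """True if some stem-length substring has its reverse complement
    somewhere in the same sequence (a crude self-annealing test)."""
    kmers = {seq[i:i + stem] for i in range(len(seq) - stem + 1)}
    return any(reverse_complement(k) in kmers for k in kmers)


def select_probes(
    gene: str,
    genome: str,
    size: int = 25,                               # probe size from the text
    gc_range: Tuple[float, float] = (0.4, 0.6),   # assumed window
    tm_range: Tuple[float, float] = (55.0, 75.0), # assumed window
) -> List[Tuple[int, str]]:
    """Return (position, probe) pairs from gene that occur exactly once
    in genome and pass the simple validity filters above."""
    # Count every window of the probe size in the genome (exact matches only).
    counts = Counter(genome[i:i + size] for i in range(len(genome) - size + 1))
    selected = []
    for pos in range(len(gene) - size + 1):
        probe = gene[pos:pos + size]
        if set(probe) - set("ACGT"):
            continue  # skip ambiguous or non-DNA characters
        if counts[probe] != 1:
            continue  # absent from the genome or repeated in it
        if not gc_range[0] <= gc_fraction(probe) <= gc_range[1]:
            continue
        if not tm_range[0] <= melting_temperature(probe) <= tm_range[1]:
            continue
        if may_self_anneal(probe):
            continue
        selected.append((pos, probe))
    return selected
```

Counting every window with a `Counter` is only practical for small inputs. A genome-wide search that also rules out approximate occurrences, as described above, needs an index or an approximate string matching algorithm instead.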