Subject: Analyzing the Genome
Presenter: Andrew Kwon
Paper: "Factors influencing the identification of transcription factor binding sites by cross-species comparison."
  by McCue LA, Thompson W, Carmack CS, and Lawrence CE.
Abstract:
As the number of sequenced genomes has grown, the questions of which
species are most useful and how many genomes are sufficient for comparison
have become increasingly important for comparative genomics studies. We
have systematically addressed these questions with respect to phylogenetic
footprinting of transcription factor (TF) binding sites in the
gamma-proteobacteria, and have evaluated the statistical significance of
our motif predictions. We used a study set of 166 Escherichia coli genes
that have experimentally identified TF binding sites upstream of the gene,
with orthologous data from nine additional gamma-proteobacteria for
phylogenetic footprinting. Just three species were sufficient for ~74.0%
of the motif predictions to correspond to the experimentally reported E.
coli sites, and important characteristics to consider when choosing
species were phylogenetic distance, genome size, and natural habitat. We
also performed simulations using randomized data to determine the critical
maximum a posteriori probability (MAP) values for statistical significance
of our motif predictions (P = 0.05). Approximately 60% of motif
predictions containing sites from just three species had average MAP
values above these critical MAP values. The inclusion of a species very
closely related to E. coli increased the number of statistically
significant motif predictions, despite substantially increasing the
critical MAP value. [Supplemental material is available online at
http://www.genome.org. In addition, our motif predictions for the study
set and the entire E. coli genome are available online at
http://www.wadsworth.org/resnres/bioinfo/.]
Reference: McCue LA, Thompson W, Carmack CS, Lawrence CE.
Factors influencing the identification of transcription factor binding
sites by cross-species comparison. Genome Res. 2002 Oct; 12(10): 1523-32.