Improved integration time estimation of endogenous retroviruses with phylogenetic data

H Martins, P Villesen - PLoS ONE, 2011 - journals.plos.org
Background
Endogenous retroviruses (ERVs) are genetic fossils of ancient retroviral integrations that remain in the genomes of many organisms. Most loci are rendered non-functional by mutations, but several intact retroviral genes are known in mammalian genomes. Some have been adopted by the host species, while the beneficial roles of others remain unclear. Besides the possible immunogenic impact of transcribing intact viral genes, endogenous retroviruses have also become a useful tool for studying phylogenetic relationships. The integration time of these viruses has been determined on the assumption that the 5′ and 3′ long terminal repeat (LTR) sequences are identical at the time of integration but evolve separately afterwards. Previous approaches have used either a constant evolutionary rate or a range of rates for these viral loci, and only single-species data. Here we show the advantages of using different approaches.
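The fixed-rate dating described above can be illustrated with a minimal sketch. It assumes the conventional relation T = d / (2r), where d is the divergence between the 5′ and 3′ LTRs of one locus and r is a substitution rate per site per year; the factor 2 reflects both LTRs accumulating substitutions independently. The function names, the Jukes-Cantor correction, and the example rate are illustrative assumptions, not details taken from the paper.

```python
import math


def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor (JC69) corrected distance between two aligned sequences.

    Columns containing anything other than A, C, G or T (gaps, N) are skipped.
    """
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites in the alignment")
    p = sum(a != b for a, b in pairs) / len(pairs)  # observed proportion of differences
    if p >= 0.75:
        raise ValueError("sequences too divergent for the JC69 correction")
    return -0.75 * math.log(1 - 4 * p / 3)


def integration_time(ltr_distance, rate_per_site_per_year):
    """Fixed-rate estimate T = d / (2r), in years."""
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return ltr_distance / (2 * rate_per_site_per_year)


if __name__ == "__main__":
    # Toy aligned LTR fragments (hypothetical data, one mismatch in ten sites).
    ltr5 = "ACGTACGTAC"
    ltr3 = "ACGTACGTAT"
    d = jukes_cantor_distance(ltr5, ltr3)
    # The rate below is an assumed placeholder, not a value from the paper.
    print(integration_time(d, 1e-9))
```

Because the estimate scales inversely with r, any error in the assumed rate carries over directly into the estimated integration time, which is the sensitivity the abstract goes on to address.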
Results
We show that there are strong advantages in using multiple-species data and state-of-the-art phylogenetic analysis. We incorporate both simple phylogenetic information and Markov chain Monte Carlo (MCMC) methods to date the integrations of these viruses, based on a relaxed molecular clock approach over a Bayesian phylogeny model, and apply them to several selected ERV sequences in primates. These methods treat each ERV locus as having a distinct evolutionary rate for each LTR, and use consensus speciation time intervals between primates to calibrate the relaxed molecular clocks.
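One way a speciation time interval could calibrate a locus-specific rate is sketched below. This is an assumption-laden illustration and not necessarily the authors' procedure: it supposes that the same LTR is present as an ortholog in two species, estimates the rate of that LTR from the ortholog divergence and the species split time, and then reuses that rate to date the integration from the 5′/3′ LTR divergence. All function names and numbers are hypothetical.

```python
def locus_rate(ortholog_distance, split_time_years):
    """Rate per site per year implied by the divergence of orthologous LTRs
    in two species that split `split_time_years` ago (r = d / (2T))."""
    if split_time_years <= 0:
        raise ValueError("split time must be positive")
    return ortholog_distance / (2 * split_time_years)


def integration_time_interval(ltr_distance, ortholog_distance, split_min, split_max):
    """Integration time range (years) using a rate calibrated on this locus.

    An older assumed split implies a slower rate and hence an older
    integration, so the interval endpoints map onto each other directly.
    """
    if ortholog_distance <= 0:
        raise ValueError("ortholog distance must be positive to calibrate a rate")
    if not 0 < split_min <= split_max:
        raise ValueError("expected 0 < split_min <= split_max")
    fastest = locus_rate(ortholog_distance, split_min)
    slowest = locus_rate(ortholog_distance, split_max)
    return (ltr_distance / (2 * fastest), ltr_distance / (2 * slowest))


if __name__ == "__main__":
    # Placeholder values: 5'/3' LTR distance 0.06, ortholog distance 0.03,
    # and an assumed species split between 6 and 8 million years ago.
    print(integration_time_interval(0.06, 0.03, 6e6, 8e6))
```

The design point is that the rate is derived per locus from multi-species data rather than fixed in advance, so a fast- or slow-evolving LTR does not bias its own date. A relaxed-clock MCMC analysis goes further by sampling rates and dates jointly, which this sketch does not attempt.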
Conclusions
The use of a fixed rate produces results that vary considerably with ERV family and the actual evolutionary rate of the sequence, and should be avoided whenever multi-species phylogenetic data are available. For genome-wide studies, the simple phylogenetic approach constitutes a better alternative, while still being computationally feasible.