Sequencing projects have allowed diverse retroviruses and LTR retrotransposons from different eukaryotic organisms to be characterized. It is known that retroviruses and other retro-transcribing viruses evolve from LTR retrotransposons and that this whole system clusters into five families: Ty3/Gypsy, Retroviridae, Ty1/Copia, Bel/Pao and Caulimoviridae. Phylogenetic analyses usually show that these split into multiple distinct lineages but what is yet to be understood is how deep evolution occurred in this system.
We combined phylogenetic and graph analyses to investigate the history of LTR retroelements both as a tree and as a network. We used 268 non-redundant LTR retroelements, many of them introduced for the first time in this work, to elucidate all possible LTR retroelement phylogenetic patterns. These were superimposed over the tree of eukaryotes to investigate the dynamics of the system, at distinct evolutionary times. Next, we investigated phenotypic features such as duplication and variability of amino acid motifs, and several differences in genomic ORF organization. Using this information we characterized eight reticulate evolution markers to construct phenotypic network models.
The evolutionary history of LTR retroelements can be traced as a time-evolving network that depends on phylogenetic patterns, epigenetic host-factors and phenotypic plasticity. The Ty1/Copia and the Ty3/Gypsy families represent the oldest patterns in this network that we found mimics eukaryotic macroevolution. The emergence of the Bel/Pao, Retroviridae and Caulimoviridae families in this network can be related with distinct inflations of the Ty3/Gypsy family, at distinct evolutionary times. This suggests that Ty3/Gypsy ancestors diversified much more than their Ty1/Copia counterparts, at distinct geological eras. Consistent with the principle of preferential attachment, the connectivities among phenotypic markers, taken as network-represented combinations, are power-law distributed. This evidences an inflationary mode of evolution where the system diversity; 1) expands continuously alternating vertical and gradual processes of phylogenetic divergence with episodes of modular, saltatory and reticulate evolution; 2) is governed by the intrinsic capability of distinct LTR retroelement host-communities to self-organize their phenotypes according to emergent laws characteristic of complex systems.
This article was reviewed by Eugene V. Koonin, Eric Bapteste, and Enmanuelle Lerat (nominated by King Jordan).