Robust inference of positive selection from recombining coding sequences.
Bioinformatics. 2006 Oct 15; 22(20):2493-9.

Abstract

MOTIVATION

Accurate detection of positive Darwinian selection can provide important insights to researchers investigating the evolution of pathogens. However, many pathogens (particularly viruses) undergo frequent recombination and the phylogenetic methods commonly applied to detect positive selection have been shown to give misleading results when applied to recombining sequences. We propose a method that makes maximum likelihood inference of positive selection robust to the presence of recombination. This is achieved by allowing tree topologies and branch lengths to change across detected recombination breakpoints. Further improvements are obtained by allowing synonymous substitution rates to vary across sites.
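The core idea of letting the tree change across recombination breakpoints can be illustrated with a small Python sketch. It is not the authors' HyPhy batch language implementation: the function names, the 0-based nucleotide coordinates, and the choice to snap each breakpoint down to the nearest codon boundary are all assumptions made for illustration. The sketch only splits a codon alignment into segments; each segment would then be given its own tree topology and branch lengths by whatever phylogenetic software is used.

```python
from typing import Dict, List, Tuple


def codon_partitions(alignment_length: int, breakpoints: List[int]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) nucleotide ranges between breakpoints.

    Assumption: breakpoints are 0-based nucleotide positions and are snapped
    down to a codon boundary so that no codon is split across segments.
    """
    if alignment_length % 3 != 0:
        raise ValueError("alignment length must be a multiple of 3")
    snapped = sorted(
        {(bp // 3) * 3 for bp in breakpoints if 0 < bp < alignment_length}
    )
    # A breakpoint inside the first codon snaps to 0 and creates no new segment.
    snapped = [b for b in snapped if b > 0]
    edges = [0] + snapped + [alignment_length]
    return list(zip(edges[:-1], edges[1:]))


def split_alignment(alignment: Dict[str, str], breakpoints: List[int]) -> List[Dict[str, str]]:
    """Split an aligned set of coding sequences into one sub-alignment per segment."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    ranges = codon_partitions(lengths.pop(), breakpoints)
    return [
        {name: seq[start:end] for name, seq in alignment.items()}
        for start, end in ranges
    ]


# Toy alignment (made-up sequences) with hypothetical breakpoints at 4 and 9.
toy = {"seqA": "ATGAAACCCGGG", "seqB": "ATGAAGCCCGGA"}
segments = split_alignment(toy, [4, 9])
```

Each element of `segments` is a sub-alignment over one recombination-free region, which is the unit on which a separate tree would be estimated.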

RESULTS

Using simulation we show that, even for extreme cases where recombination causes standard methods to reach false positive rates >90%, the proposed method decreases the false positive rate to acceptable levels while retaining high power. We applied the method to two HIV-1 datasets for which we have previously found that inference of positive selection is invalid owing to high rates of recombination. In one of these (env gene) we still detected positive selection using the proposed method, while in the other (gag gene) we found no significant evidence of positive selection.
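For readers unfamiliar with the terms, the false positive rate and power quoted above are both rejection rates over simulated replicates: the first over data simulated without positive selection, the second over data simulated with it. A minimal Python sketch, assuming each replicate yields a p-value from a test for positive selection; the function name, the 0.05 threshold, and the p-values are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def rejection_rate(p_values: Sequence[float], alpha: float = 0.05) -> float:
    """Fraction of simulated replicates in which the null hypothesis is rejected."""
    if not p_values:
        raise ValueError("at least one replicate is required")
    return sum(p < alpha for p in p_values) / len(p_values)


# Made-up p-values for illustration only.
null_replicates = [0.01, 0.20, 0.04, 0.50]      # simulated without positive selection
selected_replicates = [0.001, 0.03, 0.20, 0.01]  # simulated with positive selection

false_positive_rate = rejection_rate(null_replicates)
power = rejection_rate(selected_replicates)
```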

AVAILABILITY

A HyPhy batch language implementation of the proposed methods and the HIV-1 datasets analysed are available at http://www.cbio.uct.ac.za/pub_support/bioinf06. The HyPhy package is available at http://www.hyphy.org, and it is planned that the proposed methods will be included in the next distribution. RDP2 is available at http://darwin.uvigo.es/rdp/rdp.html

Authors

Scheffler K: Computational Biology Group, Institute of Infectious Disease and Molecular Medicine, University of Cape Town, Private Bag, Rondebosch 7701, South Africa. konrad@cbio.uct.ac.za
Martin DP: No affiliation info available
Seoighe C: No affiliation info available

Pub Type(s)

Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

Language

eng

PubMed ID

16895925

Citation

Scheffler, Konrad, et al. "Robust Inference of Positive Selection From Recombining Coding Sequences." Bioinformatics (Oxford, England), vol. 22, no. 20, 2006, pp. 2493-9.
Scheffler K, Martin DP, Seoighe C. Robust inference of positive selection from recombining coding sequences. Bioinformatics. 2006;22(20):2493-9.
Scheffler, K., Martin, D. P., & Seoighe, C. (2006). Robust inference of positive selection from recombining coding sequences. Bioinformatics (Oxford, England), 22(20), 2493-9.
Scheffler K, Martin DP, Seoighe C. Robust Inference of Positive Selection From Recombining Coding Sequences. Bioinformatics. 2006 Oct 15;22(20):2493-9. PubMed PMID: 16895925.
TY - JOUR
T1 - Robust inference of positive selection from recombining coding sequences.
AU - Scheffler,Konrad
AU - Martin,Darren P
AU - Seoighe,Cathal
Y1 - 2006/08/07/
PY - 2006/8/10/pubmed
PY - 2006/11/7/medline
PY - 2006/8/10/entrez
SP - 2493
EP - 9
JF - Bioinformatics (Oxford, England)
JO - Bioinformatics
VL - 22
IS - 20
SN - 1367-4811
UR - https://www.unboundmedicine.com/medline/citation/16895925/Robust_inference_of_positive_selection_from_recombining_coding_sequences_
L2 - https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/btl427
DB - PRIME
DP - Unbound Medicine
ER -