When trees grow too long: investigating the causes of highly inaccurate Bayesian branch-length estimates.
Syst Biol. 2010 Mar; 59(2):145-61.

Abstract

A surprising number of recent Bayesian phylogenetic analyses contain branch-length estimates that are several orders of magnitude longer than corresponding maximum-likelihood estimates. The levels of divergence implied by such branch lengths are unreasonable for studies using biological data and are known to be false for studies using simulated data. We conducted additional Bayesian analyses and studied approximate-posterior surfaces to investigate the causes underlying these large errors. We manipulated the starting parameter values of the Markov chain Monte Carlo (MCMC) analyses, the moves used by the MCMC analyses, and the prior-probability distribution on branch lengths. We demonstrate that inaccurate branch-length estimates result from either 1) poor mixing of MCMC chains or 2) posterior distributions with excessive weight at long tree lengths. Both effects are caused by a rapid increase in the volume of branch-length space as branches become longer. In the former case, both an MCMC move that scales all branch lengths in the tree simultaneously and the use of overdispersed starting branch lengths allow the chain to accurately sample the posterior distribution and should be used in Bayesian analyses of phylogeny. In the latter case, branch-length priors can have strong effects on resulting inferences and should be carefully chosen to reflect biological expectations. We provide a formula to calculate an exponential rate parameter for the branch-length prior that should eliminate inference of biased branch lengths in many cases. In any phylogenetic analysis, the biological plausibility of branch-length output must be carefully considered.
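Two ideas in the abstract can be made concrete with a small sketch. This is not the authors' code, and it does not reproduce the paper's formula for the exponential rate parameter, which the abstract does not state. It relies only on standard facts: the sum of k independent Exponential(rate) branch lengths follows a Gamma(k, rate) distribution, and a multiplier proposal that scales m values by one log-uniform factor c has Hastings ratio c^m. The function names, the tuning value, and the rate of 10 are illustrative assumptions.

```python
import math
import random


def num_branches(n_taxa: int) -> int:
    """Number of branches in an unrooted, fully bifurcating tree."""
    if n_taxa < 3:
        raise ValueError("an unrooted bifurcating tree needs at least 3 taxa")
    return 2 * n_taxa - 3


def prior_tree_length(n_taxa: int, rate: float) -> tuple[float, float]:
    """Prior mean and standard deviation of total tree length.

    Assumes independent Exponential(rate) priors on every branch, so the
    tree length is Gamma(shape=k, rate=rate) with k = number of branches.
    """
    k = num_branches(n_taxa)
    return k / rate, math.sqrt(k) / rate


def scale_all_branches(branch_lengths, tuning, rng):
    """Propose new branch lengths by scaling all of them by one factor.

    The factor is c = exp(tuning * (u - 0.5)) with u ~ Uniform(0, 1), so
    log(c) is uniform and the log Hastings ratio is m * log(c) for m
    branches. Returns (proposed_lengths, log_hastings_ratio).
    """
    c = math.exp(tuning * (rng.random() - 0.5))
    proposal = [b * c for b in branch_lengths]
    log_hastings = len(branch_lengths) * math.log(c)
    return proposal, log_hastings


if __name__ == "__main__":
    # Illustrative values only: 100 taxa and rate 10 are assumptions.
    mean, sd = prior_tree_length(n_taxa=100, rate=10.0)
    print(f"prior tree length: mean={mean:.2f}, sd={sd:.2f}")

    rng = random.Random(42)
    lengths = [0.05, 0.10, 0.02, 0.08, 0.01]
    proposed, log_h = scale_all_branches(lengths, tuning=0.5, rng=rng)
    print(proposed, log_h)
```

With many branches, the prior mean tree length k / rate grows linearly in the number of taxa, which is one way to see how a fixed per-branch prior can place substantial weight on long trees. The scaling move changes the whole tree length in a single step, whereas updating one branch at a time moves it only slowly.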

Authors and Affiliations

Jeremy M. Brown, Shannon M. Hedtke, Alan R. Lemmon, Emily Moriarty Lemmon

Section of Integrative Biology and Center for Computational Biology and Bioinformatics, University of Texas at Austin, 1 University Station C0930, Austin, TX 78712, USA. jembrown@berkeley.edu
No affiliation info available for the other authors.

Pub Type(s)

Journal Article
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

Language

eng

PubMed ID

20525627

Citation

Brown, Jeremy M., et al. "When Trees Grow Too Long: Investigating the Causes of Highly Inaccurate Bayesian Branch-length Estimates." Systematic Biology, vol. 59, no. 2, 2010, pp. 145-61.
Brown JM, Hedtke SM, Lemmon AR, et al. When trees grow too long: investigating the causes of highly inaccurate bayesian branch-length estimates. Syst Biol. 2010;59(2):145-61.
Brown, J. M., Hedtke, S. M., Lemmon, A. R., & Lemmon, E. M. (2010). When trees grow too long: investigating the causes of highly inaccurate bayesian branch-length estimates. Systematic Biology, 59(2), 145-61. https://doi.org/10.1093/sysbio/syp081
Brown JM, et al. When Trees Grow Too Long: Investigating the Causes of Highly Inaccurate Bayesian Branch-length Estimates. Syst Biol. 2010;59(2):145-61. PubMed PMID: 20525627.
TY  - JOUR
T1  - When trees grow too long: investigating the causes of highly inaccurate bayesian branch-length estimates.
AU  - Brown, Jeremy M
AU  - Hedtke, Shannon M
AU  - Lemmon, Alan R
AU  - Lemmon, Emily Moriarty
Y1  - 2009/12/10/
PY  - 2010/6/8/entrez
PY  - 2010/6/9/pubmed
PY  - 2010/10/13/medline
SP  - 145
EP  - 61
JF  - Systematic biology
JO  - Syst Biol
VL  - 59
IS  - 2
N2  - A surprising number of recent Bayesian phylogenetic analyses contain branch-length estimates that are several orders of magnitude longer than corresponding maximum-likelihood estimates. The levels of divergence implied by such branch lengths are unreasonable for studies using biological data and are known to be false for studies using simulated data. We conducted additional Bayesian analyses and studied approximate-posterior surfaces to investigate the causes underlying these large errors. We manipulated the starting parameter values of the Markov chain Monte Carlo (MCMC) analyses, the moves used by the MCMC analyses, and the prior-probability distribution on branch lengths. We demonstrate that inaccurate branch-length estimates result from either 1) poor mixing of MCMC chains or 2) posterior distributions with excessive weight at long tree lengths. Both effects are caused by a rapid increase in the volume of branch-length space as branches become longer. In the former case, both an MCMC move that scales all branch lengths in the tree simultaneously and the use of overdispersed starting branch lengths allow the chain to accurately sample the posterior distribution and should be used in Bayesian analyses of phylogeny. In the latter case, branch-length priors can have strong effects on resulting inferences and should be carefully chosen to reflect biological expectations. We provide a formula to calculate an exponential rate parameter for the branch-length prior that should eliminate inference of biased branch lengths in many cases. In any phylogenetic analysis, the biological plausibility of branch-length output must be carefully considered.
SN  - 1076-836X
UR  - https://www.unboundmedicine.com/medline/citation/20525627/When_trees_grow_too_long:_investigating_the_causes_of_highly_inaccurate_bayesian_branch_length_estimates_
L2  - https://academic.oup.com/sysbio/article-lookup/doi/10.1093/sysbio/syp081
DB  - PRIME
DP  - Unbound Medicine
ER  -