Creator |
|
Abstract |
- BACKGROUND: As it becomes increasingly possible to obtain DNA sequences of orthologous genes from diverse sets
of taxa, species trees are frequently being inferred from multilocus data. However, the behavior of many methods for
performing this inference has remained largely unexplored. Some methods have been proven to be consistent given
certain evolutionary models, whereas others rely on criteria that, although appropriate for many parameter values,
have peculiar zones of the parameter space in which they fail to converge on the correct estimate as data sets
increase in size.
RESULTS: Here, using North American pines, we empirically evaluate the behavior of 24 strategies for species tree
inference using three alternative outgroups (72 strategies total). The data consist of 120 individuals sampled in eight
ingroup species from subsection Strobus and three outgroup species from subsection Gerardianae, spanning ∼47
kilobases of sequence at 121 loci. Each “strategy” for inferring species trees consists of three features: a species tree
construction method, a gene tree inference method, and a choice of outgroup. We use multivariate analysis
techniques such as principal components analysis and hierarchical clustering to identify tree characteristics that are
robustly observed across strategies, as well as to identify groups of strategies that produce trees with similar features.
We find that strategies that construct species trees using only topological information cluster together and that
strategies that use additional non-topological information (e.g., branch lengths) also cluster together. Strategies that
utilize more than one individual within a species to infer gene trees tend to produce estimates of species trees that
contain clades present in trees estimated by other strategies. Strategies that use the minimize-deep-coalescences
criterion to construct species trees tend to produce species tree estimates that contain clades that are not present in
trees estimated by the Concatenation, RTC, SMRT, STAR, and STEAC methods, and that in general are more balanced
than those inferred by these other strategies.
CONCLUSIONS: For researchers constructing a species tree from a multilocus set of sequences, our observations provide a basis
for interpreting differences in species tree estimates obtained via different approaches that have a two-stage structure
in common, one step for gene tree estimation and a second step for species tree estimation. The methods explored
here employ a number of distinct features of the data, and our analysis suggests that recovery of the same results
from multiple methods that tend to differ in their patterns of inference can be a valuable tool for obtaining reliable
estimates.
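As an illustration of the kind of comparison described in the RESULTS, the sketch below scores how much the species trees from different strategies disagree by counting the clades present in exactly one of two trees. It is a minimal example under stated assumptions, not the authors' pipeline: the strategy names, the taxon labels A to D, the tree shapes, and the helper functions `clades` and `clade_distance` are all hypothetical. A pairwise distance table of this sort is one possible input to hierarchical clustering of strategies.

```python
from itertools import combinations

# Hypothetical rooted species trees written as nested tuples.
# Taxon labels A-D and strategy names are placeholders, not data from the study.
trees = {
    "strategy_1": ((("A", "B"), "C"), "D"),
    "strategy_2": ((("A", "B"), "C"), "D"),
    "strategy_3": (("A", "B"), ("C", "D")),
}


def clades(tree):
    """Return the non-trivial clades of a rooted tree as frozensets of leaf labels."""
    found = set()

    def leaves(node):
        # A leaf is a bare label; an internal node is a tuple of children.
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    root = leaves(tree)
    found.discard(root)  # the root clade is shared by every tree on the same taxa
    return found


def clade_distance(t1, t2):
    """Count the clades present in exactly one of the two trees."""
    return len(clades(t1) ^ clades(t2))


# Pairwise disagreement between strategies, keyed by sorted name pairs.
distances = {
    (a, b): clade_distance(trees[a], trees[b])
    for a, b in combinations(sorted(trees), 2)
}

for pair, d in distances.items():
    print(pair, d)
```

Strategies whose trees share all clades get a distance of zero, so groups of strategies with similar patterns of inference would appear as tight clusters in such a table.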
|
Resource Type |
|
DOI |
|
Date Available |
|
Date Issued |
|
Citation |
- DeGiorgio, M., Syring, J., Eckert, A. J., Liston, A., Cronn, R., Neale, D. B., & Rosenberg, N. A. (2014). An empirical evaluation of two-stage species tree inference strategies using a multilocus dataset from North American pines. BMC Evolutionary Biology, 14, 67. doi:10.1186/1471-2148-14-67
|
Journal Title |
|
Journal Volume |
|
Rights Statement |
|
Funding Statement (additional comments about funding) |
- This work was supported by NSF grants DBI-1103639, DBI-0638502, DBI-1146722, DEB-0317103, and by the Murdock College Science Research Program.
|
Publisher |
|
Peer Reviewed |
|
Language |
|
Replaces |
|
Additional Information |
- description.provenance : Approved for entry into archive by Erin Clark (erin.clark@oregonstate.edu) on 2014-06-16T21:09:39Z (GMT). No. of bitstreams: 3
license_rdf: 1370 bytes, checksum: cd1af5ab51bcc7a5280cf305303530e9 (MD5)
ListonAaronBotanyPlantPathologyEmpiricalEvaluationTwo-Stage.pdf: 2298594 bytes, checksum: 9e8a789bc387cb2a6d9563fc015a32a5 (MD5)
ListonAaronBotanyPlantPathologyEmpiricalEvaluationTwo-Stage_AdditionalFiles.zip: 972628 bytes, checksum: 1c6fa963522832bc39f9a9698517fc04 (MD5)
- description.provenance : Made available in DSpace on 2014-06-16T21:09:39Z (GMT). No. of bitstreams: 3
Previous issue date: 2014-03-29
- description.provenance : Submitted by Erin Clark (erin.clark@oregonstate.edu) on 2014-06-16T21:09:18Z
No. of bitstreams: 3
|