Graduate Thesis Or Dissertation
 

Characterizing the Accuracy of Phylogenetic Analyses that Leverage 16S rRNA Sequencing Data

Public Deposited

https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/2801pn85f

Descriptions

Attribute Name | Values
Creator
Abstract
  • Investigations of 16S rRNA gene sequences are a hallmark of modern microbiology. These sequences provide culture-independent insight into the abundance and distribution of microbiota and serve as a principal resource through which microbial community diversity is measured. Consequently, researchers rely on 16S gene sequences to test hypotheses rooted in ecology, evolution, and disease. Analyses of 16S genes carry potential sources of error that are often overlooked or given too little consideration when developing studies and interpreting data. Prior research demonstrates that methodological error in 16S gene studies may arise from choices in sample preservation and storage temperature, DNA extraction method, PCR, and sequencing platform. Further variation can be introduced during the informatic processing applied after DNA sequencing. Collectively, these errors limit the power of inferences derived from 16S rRNA gene sequences. It is therefore imperative to understand how study methodology impacts nucleotide sequence data in order to interpret results from 16S genes accurately. I summarize these methodological sources of error from the literature and, where applicable, distill best practices for conducting 16S rRNA studies. One widespread application of 16S rRNA sequences in microbiome studies is the calculation of phylogenetic measures, which can assess microbial community diversity or infer evolutionary patterns. The conclusions drawn from these phylogenetic metrics assume that the underlying phylogeny is reconstructed accurately; yet the accuracy of phylogenetic trees has been shown to depend on a myriad of conditions, some of which remain unresolved. Using simulated data, I describe how sequence length, region of the 16S gene, sequence diversity, and sample size affect the accuracy of 16S rRNA gene phylogenies.
Additionally, I show how incorporating full-length sequences selected from reference 16S rRNA sequence databases during phylogenetic reconstruction can improve the accuracy of 16S rRNA gene trees that are otherwise assembled from the short DNA sequences obtained by contemporary sequencing platforms. Collectively, I highlight through literature review the importance of experimental design at each step of the typical 16S rRNA gene sequencing workflow, and I demonstrate through simulation analyses how several of these methodological choices impact the accuracy of the resulting phylogenies.
License
Resource Type
Date Issued
Degree Level
Degree Name
Degree Field
Degree Grantor
Commencement Year
Advisor
Committee Member
Academic Affiliation
Rights Statement
Publisher
Peer Reviewed
Language
Embargo reason
  • Ongoing Research
Embargo date range
  • 2019-06-14 to 2021-07-15
Accessibility Feature
