Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy
- PMID: 17586664
- PMCID: PMC1950982
- DOI: 10.1128/AEM.00062-07
Abstract
The Ribosomal Database Project (RDP) Classifier, a naïve Bayesian classifier, can rapidly and accurately classify bacterial 16S rRNA sequences into the new higher-order taxonomy proposed in Bergey's Taxonomic Outline of the Prokaryotes (2nd ed., release 5.0, Springer-Verlag, New York, NY, 2004). It provides taxonomic assignments from domain to genus, with confidence estimates for each assignment. The majority of classifications (98%) were of high estimated confidence (≥95%) and high accuracy (98%). In addition to being tested with the corpus of 5,014 type strain sequences from Bergey's outline, the RDP Classifier was tested with a corpus of 23,095 rRNA sequences as assigned by the NCBI into their alternative higher-order taxonomy. The results from leave-one-out testing on both corpora show that the overall accuracies at all levels of confidence for near-full-length and 400-base segments were 89% or above down to the genus level, and the majority of the classification errors appear to be due to anomalies in the current taxonomies. For shorter rRNA segments, such as those that might be generated by pyrosequencing, the error rate varied greatly over the length of the 16S rRNA gene, with segments around the V2 and V4 variable regions giving the lowest error rates. The RDP Classifier is suitable both for the analysis of single rRNA sequences and for the analysis of libraries of thousands of sequences. A related tool, RDP Library Compare, was developed to facilitate microbial-community comparison based on 16S rRNA gene sequence libraries. It combines the RDP Classifier with a statistical test to flag taxa differentially represented between samples. The RDP Classifier and RDP Library Compare are available online at http://rdp.cme.msu.edu/.
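The abstract names the technique (a naïve Bayesian classifier with a confidence estimate per assignment) but does not spell out the mechanics. The following is a minimal sketch of how such a classifier could work on overlapping sequence "words". It is not the RDP implementation: the 8-base word size, the pseudo-count smoothing, the bootstrap over one-eighth of the query words, the 100 trials, and all function names (`words`, `train`, `log_score`, `classify`) are assumptions made for illustration.

```python
import math
import random
from collections import defaultdict

K = 8  # word length; an assumption of this sketch


def words(seq, k=K):
    """Return the set of overlapping k-base words in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def train(labelled, k=K):
    """Count word occurrences from a list of (genus, sequence) pairs."""
    genus_counts = defaultdict(int)                        # sequences per genus
    word_in_genus = defaultdict(lambda: defaultdict(int))  # per-genus word counts
    word_total = defaultdict(int)                          # corpus-wide word counts
    for genus, seq in labelled:
        genus_counts[genus] += 1
        for w in words(seq, k):
            word_in_genus[genus][w] += 1
            word_total[w] += 1
    return {
        "n_seqs": len(labelled),
        "genus_counts": dict(genus_counts),
        "word_in_genus": word_in_genus,
        "word_total": word_total,
    }


def log_score(model, genus, query_words):
    """Sum of log P(word | genus), treating words as independent (the naive assumption)."""
    n = model["n_seqs"]
    m = model["genus_counts"][genus]
    counts = model["word_in_genus"][genus]
    score = 0.0
    for w in query_words:
        # Word-specific prior with pseudo-counts, so unseen words never give log(0).
        prior = (model["word_total"].get(w, 0) + 0.5) / (n + 1)
        score += math.log((counts.get(w, 0) + prior) / (m + 1))
    return score


def classify(model, seq, trials=100, k=K, rng=None):
    """Return (best genus, bootstrap confidence in [0, 1])."""
    rng = rng or random.Random(0)
    query = sorted(words(seq, k))
    if not query:
        raise ValueError("sequence is shorter than the word length")
    genera = list(model["genus_counts"])
    # Assignment uses every word in the query.
    best = max(genera, key=lambda g: log_score(model, g, query))
    # Confidence: fraction of resampled word subsets that give the same genus.
    subset_size = max(1, len(query) // k)
    hits = 0
    for _ in range(trials):
        sample = [rng.choice(query) for _ in range(subset_size)]
        if max(genera, key=lambda g: log_score(model, g, sample)) == best:
            hits += 1
    return best, hits / trials
```

Under these assumptions, `classify(train(pairs), query)` returns a genus label and the fraction of bootstrap trials that agree with it, which plays the role of the confidence estimate the abstract describes. Extending the sketch from genus up to domain would mean aggregating votes along each genus's lineage, which is omitted here.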
Figures
![FIG. 1.](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/1950982/bin/zam0160780800001.gif)
![FIG. 2.](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/1950982/bin/zam0160780800002.gif)
![FIG. 3.](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/1950982/bin/zam0160780800003.gif)
Similar articles
- Construction & assessment of a unified curated reference database for improving the taxonomic classification of bacteria using 16S rRNA sequence data. Indian J Med Res. 2020 Jan;151(1):93-103. doi: 10.4103/ijmr.IJMR_220_18. Indian J Med Res. 2020. PMID: 32134020 Free PMC article.
- Construction of habitat-specific training sets to achieve species-level assignment in 16S rRNA gene datasets. Microbiome. 2020 May 15;8(1):65. doi: 10.1186/s40168-020-00841-w. Microbiome. 2020. PMID: 32414415 Free PMC article.
- Impact of training sets on classification of high-throughput bacterial 16s rRNA gene surveys. ISME J. 2012 Jan;6(1):94-103. doi: 10.1038/ismej.2011.82. Epub 2011 Jun 30. ISME J. 2012. PMID: 21716311 Free PMC article.
- Accurate taxonomy assignments from 16S rRNA sequences produced by highly parallel pyrosequencers. Nucleic Acids Res. 2008 Oct;36(18):e120. doi: 10.1093/nar/gkn491. Epub 2008 Aug 22. Nucleic Acids Res. 2008. PMID: 18723574 Free PMC article.
- History and impact of RDP: a legacy from Carl Woese to microbiology. RNA Biol. 2014;11(3):239-43. doi: 10.4161/rna.28306. Epub 2014 Feb 27. RNA Biol. 2014. PMID: 24607969 Free PMC article. Review.
Cited by
- Seasonal dynamics and environmental drivers of tissue and mucus microbiomes in the staghorn coral Acropora pulchra. PeerJ. 2024 May 30;12:e17421. doi: 10.7717/peerj.17421. eCollection 2024. PeerJ. 2024. PMID: 38827308 Free PMC article.
- Assembly processes underlying bacterial community differentiation among geographically close mangrove forests. mLife. 2023 Mar 23;2(1):73-88. doi: 10.1002/mlf2.12060. eCollection 2023 Mar. mLife. 2023. PMID: 38818341 Free PMC article.
- Presepsin as a biomarker of bacterial translocation and an indicator for the prescription of probiotics in cirrhosis. World J Hepatol. 2024 May 27;16(5):822-831. doi: 10.4254/wjh.v16.i5.822. World J Hepatol. 2024. PMID: 38818295 Free PMC article.
- Climate warming restructures seasonal dynamics of grassland soil microbial communities. mLife. 2022 Sep 15;1(3):245-256. doi: 10.1002/mlf2.12035. eCollection 2022 Sep. mLife. 2022. PMID: 38818216 Free PMC article.
- Assessing mechanisms for microbial taxa and community dynamics using process models. mLife. 2023 Sep 12;2(3):239-252. doi: 10.1002/mlf2.12076. eCollection 2023 Sep. mLife. 2023. PMID: 38817815 Free PMC article.