Abstract

Intestinal diseases, such as Crohn's disease (CD), ulcerative colitis (UC) and pseudomembranous colitis caused by Clostridium difficile infection (CDI), are among the most common diseases in humans and may lead to more serious pathologies, e.g. colorectal cancer (CRC). In recent years, next-generation sequencing has allowed the identification of correlations between intestinal bacteria and diseases, although the formulation of universal gut microbial biomarkers for such diseases is still in its infancy. In the current study, we selected and reanalyzed a total of 3048 public datasets obtained from 16S rRNA profiling of individuals affected by CD, UC, CDI and CRC. This meta-analysis revealed possible biases in the reconstruction of the gut microbiota composition due to the different primer pairs employed for PCR amplification of 16S rRNA gene fragments. Notably, this approach also identified common features of individuals affected by gut diseases (DS), including lower biodiversity compared to control subjects (CTRL). Moreover, potential universal intestinal disease microbial biomarkers were identified through cross-disease comparisons. In detail, CTRL showed high abundance of the genera Barnesiella, Ruminococcaceae UCG-005, Alistipes, Christensenellaceae R-7 group and an unclassified member of the Lachnospiraceae family, while DS exhibited high abundance of Lactobacillus, an unclassified member of the Erysipelotrichaceae family and Streptococcus.

INTRODUCTION

In the last 50 years, the incidence and prevalence of inflammatory bowel diseases (IBD), such as Crohn's disease (CD) and ulcerative colitis (UC), have increased worldwide (Cosnes et al. 2011; Molodecky et al. 2012; Ananthakrishnan 2015), especially in traditionally low-incidence regions such as Asia, South America, and southern and eastern Europe (Lovasz et al. 2013; Ng et al. 2013). Moreover, IBD represents one of the main risk factors for the development of colorectal cancer (CRC) (Kulaylat and Dayton 2010), which is a major cause of morbidity and mortality throughout the world (Haggar and Boushey 2009). Furthermore, populations living in industrialized countries have been affected in the last 15 years by an increased incidence of Clostridium difficile infections (CDI) (Reveles et al. 2014), which are among the most common hospital-acquired infections, are generally associated with antibiotic use and are responsible for pseudomembranous colitis (Antharam et al. 2013). The recent development of high-throughput sequencing technologies, such as Roche 454, Ion Torrent and Illumina, has allowed profiling of the bacterial population harbored by the intestinal tract, i.e. the gut microbiota, and characterization of the alterations in this microbiota, sometimes referred to as gut dysbiosis, that are associated with major intestinal diseases (Carding et al. 2015).

Although many studies have investigated possible correlations between gut microbiota and intestinal diseases, thereby revealing possible gut bacterial biomarkers (Table S1, Supporting Information), most such studies have focused on a single pathology such as CD (Perez-Brocal et al. 2015; Eun et al. 2016), UC (Duranti et al. 2016; Mar et al. 2016), CRC (Kostic et al. 2012; Geng et al. 2013; Wu et al. 2013; Zackular et al. 2014; Burns et al. 2015) or CDI (Yatsunenko et al. 2012; Antharam et al. 2013; Rojo et al. 2015; Gu et al. 2016; Khanna et al. 2016; Milani et al. 2016). Nonetheless, the accuracy of the majority of published microbiota analyses depends heavily on the applied methodology (Turroni et al. 2012; Milani et al. 2013). One of the most critical steps for accurate 16S rRNA-based microbiota profiling is the selection of the primer pairs used for amplification, which may lead to under-representation of, or selection against, single bacterial species or even complete microbial groups (Klindworth et al. 2013).

Here, we evaluate the accuracy of different, currently used 16S rRNA gene-targeting PCR primers and assess their impact on the profiling of the gut microbiota. Furthermore, we performed a meta-analysis of case-control studies focusing on 16S rRNA profiling of the gut microbiota of individuals affected by CD, UC, CRC and CDI. We selected a total of 3048 datasets, 1252 corresponding to control subjects (CTRL) and 1796 corresponding to individuals with intestinal diseases, retrieved from 24 public studies (Table S2, Supporting Information). In detail, we collected 359, 1457, 512 and 720 samples belonging to CRC (Kostic et al. 2012; Geng et al. 2013; Weir et al. 2013; Wu et al. 2013; Zackular et al. 2014; Burns et al. 2015), CD (Gevers et al. 2014; Perez-Brocal et al. 2015; Eun et al. 2016), CDI (Yatsunenko et al. 2012; Antharam et al. 2013; Rojo et al. 2015; Gu et al. 2016; Khanna et al. 2016; Milani et al. 2016) and UC studies (Gevers et al. 2014; Duranti et al. 2016; Mar et al. 2016; Shah et al. 2016), respectively.

MATERIALS AND METHODS

Selection of databases

All datasets included in this meta-analysis were collected from publicly available and published comparative human gut microbiota studies in the context of CD, UC, CRC and CDI. For each intestinal disease, we collected 16S rRNA profiling datasets from a minimum of five studies. Illumina sequencing technology was preferred in order to ensure high data coverage and quality. Nevertheless, if Illumina datasets were not available, we included data produced by means of 454 sequencing. Moreover, selected datasets had to represent both control and diseased subjects, and had to be obtained from fecal samples or biopsies collected from the adult human large intestine (average age of 22 ± 18).

Evaluation of primer pairs efficiency

The performance of the primer pairs employed in the studies included in our meta-analysis (Table S3, Supporting Information) was evaluated with the web tool TestPrime 1.0 (Klindworth et al. 2013), which performs an in silico PCR using the SILVA database as template and reports the percentage of amplified sequences for each bacterial genus.
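As a rough illustration of what an in silico PCR check involves, the sketch below tests whether a degenerate primer pair anneals to a template sequence and reports the percentage of templates predicted to be amplified. It is not the TestPrime implementation: the function names, the default of zero mismatches and the toy sequences are assumptions made for this example, and real tools handle many details (orientation of database sequences, mismatch position, amplicon length) that are ignored here.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a (possibly degenerate) DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_site(primer, template, max_mismatches=0, start=0):
    """Return the first position >= start where the primer anneals, or -1."""
    for pos in range(start, len(template) - len(primer) + 1):
        window = template[pos:pos + len(primer)]
        mismatches = sum(base not in IUPAC[code]
                         for code, base in zip(primer, window))
        if mismatches <= max_mismatches:
            return pos
    return -1


def amplifies(forward, reverse, template, max_mismatches=0):
    """True if the forward primer binds upstream of a reverse primer site."""
    fwd_pos = find_site(forward, template, max_mismatches)
    if fwd_pos == -1:
        return False
    # The reverse primer anneals to the opposite strand, so its reverse
    # complement is searched for on the given strand, downstream of the
    # forward primer.
    rev_site = reverse_complement(reverse)
    return find_site(rev_site, template, max_mismatches,
                     start=fwd_pos + len(forward)) != -1


def coverage(forward, reverse, templates, max_mismatches=0):
    """Percentage of template sequences predicted to be amplified."""
    hits = sum(amplifies(forward, reverse, t, max_mismatches)
               for t in templates)
    return 100.0 * hits / len(templates)


# Toy templates (hypothetical, far shorter than real 16S rRNA genes).
toy_templates = ["ACGTAAAATCAA", "ACATAAAATCAA", "ACCTAAAATCAA"]
print(coverage("ACRT", "TTGA", toy_templates))
```

Computing such a percentage separately for the sequences of each genus in a reference database would give a genus-level efficiency of the kind discussed in this study.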

16S rRNA-based microbiota analysis

To avoid biases caused by different bioinformatic analysis pipelines, the sequence read pools of each study were filtered and analyzed using the same custom script based on the QIIME software suite (Caporaso et al. 2010). Quality control retained sequences with a length between 140 and 400 bp and a mean sequence quality score >20, while sequences with homopolymers >7 bp or mismatched primers were discarded. 16S rRNA operational taxonomic units (OTUs) were defined at ≥97% sequence homology using UCLUST (Edgar 2010), and OTUs with fewer than 10 sequences were removed. All reads were classified to the lowest possible taxonomic rank using QIIME (Caporaso et al. 2010) and a reference dataset from the SILVA database v.123 (Quast et al. 2013). In order to assess bacterial complexity, alpha diversity was evaluated based on the Chao1 index and represented by rarefaction curves generated using 10 subsamplings of the whole datasets. Furthermore, the Bray–Curtis dissimilarity index was used to estimate the beta-diversity between CTRL and individuals affected by intestinal diseases. Dissimilarities were visualized through a 3D principal coordinate analysis (PCoA) representation.
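The thresholds and indices named above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the custom QIIME-based script used in the study: the function names and input layout are assumptions, primer matching and rarefaction are left out, and the Chao1 function uses the bias-corrected form of the estimator, which may differ from the variant applied by the authors.

```python
from itertools import groupby


def passes_quality_control(seq, quals, min_len=140, max_len=400,
                           min_mean_q=20, max_homopolymer=7):
    """Apply the length, mean-quality and homopolymer thresholds from the text."""
    if not min_len <= len(seq) <= max_len:
        return False
    if sum(quals) / len(quals) <= min_mean_q:
        return False
    # Longest run of identical consecutive bases in the read.
    longest_run = max(len(list(run)) for _, run in groupby(seq))
    return longest_run <= max_homopolymer


def chao1(otu_counts):
    """Bias-corrected Chao1 richness estimate for one sample."""
    observed = sum(1 for c in otu_counts if c > 0)
    singletons = sum(1 for c in otu_counts if c == 1)
    doubletons = sum(1 for c in otu_counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two equal-length abundance vectors."""
    numerator = sum(abs(a - b) for a, b in zip(sample_a, sample_b))
    denominator = sum(a + b for a, b in zip(sample_a, sample_b))
    return numerator / denominator
```

A Bray–Curtis value of 0 means two samples have identical abundance profiles, while 1 means they share no taxa; a matrix of such pairwise values is the input to the PCoA.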

QIIME and SPSS software (www.ibm.com/software/it/analytics/spss/) were used to perform statistical analyses. PERMANOVA was performed using 1000 permutations to estimate P-values for differences among populations in PCoA analyses. Furthermore, differential abundance of bacterial genera and differences in alpha-diversity were tested by ANOVA. Moreover, covariance analysis between primer pairs and bacterial relative abundance was performed using the Pearson correlation coefficient.
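To make the permutation test concrete, the following minimal sketch computes a PERMANOVA pseudo-F statistic from a square distance matrix and estimates a P-value by shuffling the group labels. It is a simplified stand-in for the QIIME routine, not a reproduction of it; the function names, the fixed random seed and the one-factor design are assumptions of this example.

```python
import random
from itertools import combinations


def pseudo_f(dist, labels):
    """PERMANOVA pseudo-F from a square distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares from all pairwise distances.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares, accumulated group by group.
    ss_within = 0.0
    for g in groups:
        members = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(dist[i][j] ** 2
                         for i, j in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permanova(dist, labels, permutations=1000, seed=0):
    """Return (pseudo-F, permutation P-value) for the observed grouping."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    at_least_as_extreme = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            at_least_as_extreme += 1
    # The observed labelling is counted as one of the permutations.
    return observed, (at_least_as_extreme + 1) / (permutations + 1)
```

With this convention the smallest attainable P-value is 1/(permutations + 1), so the number of permutations sets the resolution of the test.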

RESULTS AND DISCUSSION

Homogeneity of the samples

16S rRNA-based microbiota profiling is a technique that relies on next-generation sequencing data for a cost-effective analysis of the bacterial community present in a given environmental sample. Due to its accuracy and ability to profile non-cultivable taxa, 16S rRNA-based profiling rapidly became the most widely exploited approach for gut microbiota characterization. Nevertheless, the absence of a gold standard protocol led to extensive methodological variation, with consequent output biases that might prevent reliable and meaningful comparisons between datasets derived from different studies (Milani et al. 2013). In detail, possible biases may be due to study design, sample collection, transport and storage of the samples, DNA extraction and other variables related to sequencing and bioinformatics analyses (Milani et al. 2013). Among the main reasons for variable data outputs is the species-specific efficiency and accuracy of the various sets of PCR primers employed to amplify part of the 16S rRNA genes that represent a given sample community (Milani et al. 2013). In order to evaluate the accuracy and efficacy of the 12 primer pairs that are used in the selected datasets, and that also represent the PCR primers currently most frequently used in 16S microbial profiling, we tested the primer pairs through in silico PCR. Notably, this assessment revealed rather variable amplification performances that are expected to cause genus-specific biases (Table S4, Supporting Information).

In silico evaluation of the PCR primers accuracy

Primer pairs Probio_Uni/Probio_Rev, 357F/926R, 338F/806R, 530F/926R, V5F/V6R, 341F/534R and 515F/806R showed an in silico efficacy of >90% in amplifying the targeted 16S rRNA gene sequences. In contrast, primer pairs 27F/338R, 8F/357R, 8F/518R, 27F/534R and 8F/530R exhibited a predicted capacity of <32% to amplify their specifically targeted 16S rRNA sequences. We then focused on the evaluation of genus-specific amplification performances for intestinal taxa that had been determined to be present at a relative abundance of >2% in at least one sample included in our meta-analysis (Table S5, Supporting Information). The analysis of the 252 selected intestinal genera confirmed the efficiency observed for all bacterial 16S rRNA sequences. In detail, the bacterial sequences belonging to 248 genera were amplified by all primers examined, and the primer pairs used in studies based on 454 sequencing showed the lowest efficiency (<32%), with the exception of 341F/534R (Muyzer, de Waal and Uitterlinden 1993; Juck et al. 2000) (efficiency at genus level of 95.54%), 357F/926R (Liu et al. 1997) (efficiency at genus level of 95.38%) and 515F/806R (Caporaso et al. 2011; Walters et al. 2011) (efficiency at genus level of 95.24%). Moreover, the evaluation of the primer pair-mediated amplification efficiency for each of the bacterial sequences belonging to the 252 selected taxa showed that the 27F/338R (Hongoh, Ohkuma and Kudo 2003; Fierer et al. 2008), 8F/530R (Frank et al. 2007; Perez-Brocal et al. 2013), 8F/357R, 27F/534R (Ben-Dov et al. 2006) and 8F/518R (Muyzer, de Waal and Uitterlinden 1993; Frank et al. 2007) primer pairs achieved an amplification efficacy of >70% only for the genera Ethanoligenens, Fretibacterium and Lachnospiraceae UCG-008. Notably, Probio_Uni/Probio_Rev (Milani et al. 2013) showed the highest in silico predicted PCR performance among all evaluated PCR primer pairs.
In fact, the Probio_Uni/Probio_Rev primer pair was predicted to amplify the 16S rRNA gene sequences of 75.40% of the 252 selected genera with an efficiency of >95%, followed by primer pairs 341F/534R (75.00%) and 357F/926R (71.03%). Furthermore, 530F/926R (Liu et al. 1997; Dowd et al. 2008), 515F/806R (Caporaso et al. 2011; Walters et al. 2011), V5F/V6R (Cai et al. 2013) and 338F/806R (el Fantroussi et al. 1999; Walters et al. 2011) displayed an efficiency >95% in less than 70% of the assessed 252 genera. In order to evaluate the correlation between a given primer pair and the corresponding predicted relative abundance at genus level, we performed a covariance analysis using the Pearson correlation coefficient, based on the 3048 datasets and primer pair efficiency. This analysis indicated that 50 genera displayed a positive correlation with primer pair-mediated amplification efficiency (P-value <0.05), indicating that the choice of primer pair contributes to bias in the determination of gut microbiota composition. Notably, when focusing on taxa with a relative abundance of >0.1% in at least one dataset (Fig. 1), the primer pairs appear to have an impact on assessing the presence and abundance of certain taxa that are considered key gut commensal bacteria, such as Bifidobacterium, Coprococcus 3 and genera belonging to Ruminococcaceae and Eubacteriaceae. These marked differences in amplification performance obtained for the 12 tested primer pairs therefore highlight the existence of biases in the reconstruction of the gut microbiota composition as reported by many published studies (Perez-Brocal et al. 2015; Rojo et al. 2015). This finding unfortunately prevents a reliable cross-study meta-analysis of all datasets corresponding to case and CTRL samples produced by different research projects. For this reason, each case-control study, processed with its own laboratory protocol, was analyzed separately for each intestinal disease, i.e. CD, UC, CRC and CDI. Subsequently, the study-specific results were evaluated together to define a global trend (increase or decrease) for each bacterial taxon in control versus disease condition.
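The covariance analysis described above rests on the Pearson correlation coefficient. A minimal sketch of that calculation is given here, pairing, for one genus, the in silico efficiency of the primer pair used by each dataset with the relative abundance observed in that dataset. The input values are hypothetical and the helper names are assumptions of this example; the P-value would be obtained from the t statistic with n - 2 degrees of freedom using a statistics package.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for testing r != 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Hypothetical values for one genus: in silico efficiency (%) of the primer
# pair used by each dataset, and the genus relative abundance (%) observed.
efficiency = [95.5, 95.2, 31.0, 28.4, 96.1, 30.2]
abundance = [4.1, 3.8, 0.9, 1.2, 4.6, 0.7]
r = pearson_r(efficiency, abundance)
print(f"r = {r:.3f}, t = {t_statistic(r, len(efficiency)):.2f}")
```

A significantly positive r for a genus suggests that its apparent abundance tracks how well the chosen primers amplify it, which is the kind of bias the analysis is designed to expose.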

Figure 1. Covariance analysis based on the selected public datasets and primer pair efficiency. A heat map illustrating relative abundance of genera with a significant positive correlation to primer pair efficiency was shown. Only the genera with a relative abundance >0.1% in at least one dataset were reported.

Meta- and cross-analysis of the gut microbiota in intestinal diseases

Quality filtering of CD, UC, CRC and CDI samples produced an average of 49 651, 66 127, 62 242 and 376 768 reads, respectively (Table S2, Supporting Information). This level of DNA sequencing depth is considered appropriate for a thorough analysis of the gut microbiota (Hamady and Knight 2009).

Analysis of microbiota complexity through alpha-diversity cross-study meta-averages, i.e. averages of all CTRL and all affected subjects for each intestinal disease analyzed, showed higher complexity in control samples compared to CD (P-value < 0.01) (Fig. 2a), CRC (P-value < 0.05) (Fig. 4a) and CDI (P-value < 0.01) (Fig. 5a) samples. In contrast, the meta-analyzed studies of UC samples provided very different alpha-diversity curves and, as expected, evaluation of the control and UC cross-study meta-averages showed a P-value > 0.05. Such data may therefore indicate that biases in taxonomic reconstruction induced by the use of different analytical protocols, such as the selection of primer pairs, significantly impact the observed biodiversity (Hamady and Knight 2009), thus precluding cross-study meta-analysis of alpha diversity (Fig. 3a).

Figure 2. Exploration of the diversity and bacterial composition of CD and CTRL samples. Panel a shows the rarefaction curves calculated through the Chao1 index for each study and for the CD and CTRL averages. Panel b reports the principal coordinate analysis (PCoA) of the case-control samples belonging to CD studies. Panel c displays the bacterial composition at phylum level based on the cross-study analysis of the CD and CTRL groups. Panel d reports the bacterial genera that present an average abundance variation of >0.5%.

Figure 3. Evaluation of the microbiota composition of UC and CTRL samples. Panel a indicates the alpha-diversity curves calculated through the Chao1 index of each study and of UC and CTRL average. Panel b shows the PCoA of the UC and CTRL groups. Panel c reports a bar plot depicting the bacterial composition at phylum level based on cross-study of the UC and CTRL groups. Panel d represents the average abundance variation >0.5% at the genus level.

Moreover, cross-study meta-PERMANOVA, i.e. PERMANOVA obtained for all CTRL samples and all affected subjects for each intestinal disease of the meta-analyzed studies, based on the Bray–Curtis dissimilarity index, showed a P-value <0.001 for all comparisons, indicating a taxonomical difference among the samples from control and diseased subjects (Figs 2b, 3b, 4b and 5b).

Figure 4. Examination of the complexity and bacterial differences between CRC and CTRL samples. Panel a displays the rarefaction curves of each study and of CRC and CTRL average. Panel b reports the beta-diversity of the collected case-control belonging to CRC studies. Panel c shows the bacterial composition at phylum level based on cross-study of the CRC and CTRL groups. Panel d reports the microbiota composition at genus level. The panel reports only the taxa that present an average abundance difference >0.5%.

Figure 5. Exploration of the microbiota composition of CDI and CTRL samples. Panel a displays the complexity of each studied sample and of the CDI and CTRL averages, calculated through the Chao1 index and represented with rarefaction curves. Panel b reports the principal coordinate analysis (PCoA) of the case-control samples belonging to CDI studies. Panel c displays the bacterial composition at phylum level based on the cross-study analysis of the CDI and CTRL groups. Panel d reports the taxonomic composition at genus level, showing only bacteria with an average abundance variation >0.5%.

Cross-study meta-ANOVA of the bacterial profile at phylum level, i.e. based on averages of all CTRL and all affected subjects for each intestinal disease analyzed, showed a predominant abundance of Bacteroidetes in control samples for most of the diseases analyzed, i.e. an average of 44.42% (P-value < 0.01), 45.43% (P-value < 0.01) and 34.01% (P-value < 0.01) in the comparisons with CD, UC and CRC, respectively (Figs 2c, 3c and 4c). Conversely, when compared to control samples, the gut microbiota of diseased samples appears to exhibit a higher abundance of the Proteobacteria phylum, as in the case of samples from individuals suffering from CD (average 19.11%, P-value < 0.01), CRC (average 15.11%, P-value < 0.01) and CDI (average 22.49%, P-value < 0.01) (Figs 2c, 4c and 5c).

Comparison at genus level between the gut microbiota composition of control individuals and that of subjects affected by each intestinal disease analyzed showed a higher abundance of genera belonging to the Bacteroidetes phylum in CTRL samples (Figs 2d, 3d, 4d and 5d). In contrast, genera belonging to Proteobacteria were particularly less abundant in control samples as compared to samples corresponding to each of the investigated diseases (Figs 2d, 3d, 4d and 5d).

Interestingly, comparison between the gut microbiota composition of CD and CTRL samples showed a higher abundance of the genera Parabacteroides (increase of 67.10%, P-value < 0.05), Faecalibacterium (increase of 18.19%, P-value < 0.05), Prevotella 9 (increase of 185.24%, P-value < 0.05) and Bacteroides (increase of 45.68%, P-value < 0.05) in CTRL samples (Fig. 2d). In contrast, the genera Escherichia-Shigella (decrease of 39.08%, P-value < 0.05) and Haemophilus (decrease of 39.08%, P-value < 0.05) were less abundant in control samples as compared to samples obtained from individuals with CD (Fig. 2d). Notably, CRC samples possess a higher abundance of bacteria that have been associated with the development of intestinal diseases, such as Campylobacter (increase of 950.04%, P-value < 0.05) (Warren et al. 2013; Akutko and Matusiewicz 2017), or that are known to be involved in the transition from eubiosis, i.e. an optimal balance of gut microbiota composition (Iebba et al. 2016), to dysbiosis, such as Gemella (increase of 118.05%, P-value < 0.05) (Chen et al. 2016) (Fig. 4d). Moreover, 16S microbial profiles of CTRL samples displayed a higher abundance of members of the genus Faecalibacterium (increase of 77.91%, P-value < 0.05), which is considered a bacterial genus with a beneficial effect on the human gut (Ventura et al. 2014) and which could have a role in preventing CRC (Wei et al. 2016). The gut microbiota profiles of CDI samples also possess a higher abundance of opportunistic pathogens belonging to the phylum Proteobacteria and a lower abundance of taxa that are associated with health-promoting effects, such as Bifidobacterium and Faecalibacterium (Milani et al. 2014; Ventura et al. 2014; Milani et al. 2015) (Fig. 5d).

Moreover, the evaluation of the taxonomic trend (Tables S7, S9, S11 and S13, Supporting Information) and of the differences in gut microbiota composition at genus level across the meta-analyzed studies allowed us to identify genera that may represent suitable bacterial biomarkers of each analyzed disease. Interestingly, the relative abundance of 16, 5, 7 and 3 taxa increases, while that of 2, 4, 4 and 3 taxa decreases, in CD, UC, CRC and CDI subjects, respectively, when compared to CTRL individuals in all meta-analyzed studies (Tables S7, S9, S11 and S13, Supporting Information). A summary of the taxa that may constitute specific disease microbial biomarkers is reported in Table 1.

Table 1. Summary of the microbial gut biomarkers identified in the study.

Intestinal diseases | Samples | Genera
Crohn's disease | CTRL | Barnesiella; Ruminococcus 2
Crohn's disease | CD | Actinomyces; Eggerthella; Blautia; Peptoclostridium; Flavonifractor; Erysipelatoclostridium; Lactobacillus; Streptococcus; U. m. of Proteobacteria phylum
Ulcerative colitis | CTRL | Barnesiella; Odoribacter; Alistipes; Faecalibacterium
Ulcerative colitis | UC | Streptococcus; Veillonella; U. m. of Enterobacteriaceae family; Haemophilus
Colorectal cancer | CTRL | U. m. of Lachnospiraceae family; Faecalibacterium; Ruminococcaceae UCG-005; Subdoligranulum
Colorectal cancer | CRC | Alloprevotella; Gemella; Parvimonas; Streptococcus; Leptotrichia; Campylobacter
Clostridium difficile infection | CTRL | Christensenellaceae R-7 group; U. m. of Lachnospiraceae family; Ruminococcaceae UCG-003
Clostridium difficile infection | CDI | Erysipelatoclostridium; Enterococcus; Lactobacillus
Total diseases | Total CTRL | Barnesiella; Ruminococcaceae UCG-005; Alistipes; Christensenellaceae R-7 group; U. m. of Lachnospiraceae family
Total diseases | Total diseases | Lactobacillus; Streptococcus; U. m. of Enterobacteriaceae family

U. m., unclassified member.

Identification of universal biomarkers

In order to evaluate the existence of universal intestinal disease biomarkers, we performed a meta-analysis of all datasets corresponding to studies regarding CD, UC, CRC and CDI. Cross-analysis of the alpha diversity showed a higher biodiversity in CTRL samples with respect to subjects affected by an intestinal disease (DS) (P-value < 0.05) (Fig. 6a). These data confirm previous observations that intestinal dysbiosis is linked to loss of microbiota diversity (Sha et al. 2013; Mosca, Leclerc and Hugot 2016). Moreover, the beta-diversity cross-analysis indicated a clear division between CTRL and samples affected by intestinal diseases (meta-PERMANOVA P-value < 0.05) (Fig. 6b). Therefore, we focused on the genus level to identify the differences in gut microbiota composition between these two groups. In detail, these analyses revealed a total of 261 genera with a significantly different abundance (P-value < 0.05) (Table S14, Supporting Information), of which 20 showed an average abundance variation of >0.5% (Fig. 6c).

Figure 6. Investigation of the microbiota composition of all subjects affected by an intestinal disease (DS) and control samples. Panel a shows alpha-diversity curves calculated through Chao1 index of each study and of DS and CTRL average. Panel b reports the PCoA of the collected case-control belonging to DS studies. Panel c indicates the bacterial composition at phylum level based on cross-study of the DS and CTRL groups. Panel d shows the bacterial genera that present an average abundance difference >0.5%.

Interestingly, when focusing on the genera with a significant P-value and a taxonomic trend with a prevalence of >80% (Table S15, Supporting Information), it was possible to identify five and three taxa characteristic of CTRL and DS subjects, respectively. In detail, CTRL showed high relative abundance of the genera Barnesiella (in 90% of the studies and P-value < 0.05), Ruminococcaceae UCG-005 (in 85% of the studies and P-value < 0.05), Alistipes (in 80% of the studies and P-value < 0.05), Christensenellaceae R-7 group (in 80% of the studies and P-value < 0.05) and an unclassified member of the Lachnospiraceae family (in 80% of the studies and P-value < 0.05), while DS displayed high abundance of the taxa Lactobacillus (in 90% of the studies and P-value < 0.05), an unclassified member of the Erysipelotrichaceae family (in 80% of the studies and P-value < 0.05) and Streptococcus (in 80% of the studies and P-value < 0.05).
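One way to read the "taxonomic trend prevalence" used above is as the percentage of studies in which a genus is, on average, more abundant in one group than in the other. The sketch below computes that percentage under this assumed definition; the data layout, the function name and the abundance values are hypothetical and are not taken from the supplementary tables.

```python
def ctrl_enriched_prevalence(studies, genus):
    """Percentage of studies where the genus is, on average, higher in CTRL than in DS."""
    def mean(values):
        return sum(values) / len(values)

    # Only studies reporting the genus in both groups are informative.
    informative = [s for s in studies
                   if genus in s["CTRL"] and genus in s["DS"]]
    higher_in_ctrl = sum(mean(s["CTRL"][genus]) > mean(s["DS"][genus])
                         for s in informative)
    return 100.0 * higher_in_ctrl / len(informative)


# Hypothetical per-study relative abundances (%) of one genus.
example_studies = [
    {"CTRL": {"Barnesiella": [2.0, 3.0]}, "DS": {"Barnesiella": [1.0, 1.5]}},
    {"CTRL": {"Barnesiella": [1.8]}, "DS": {"Barnesiella": [0.4, 0.6]}},
    {"CTRL": {"Barnesiella": [2.5, 2.1]}, "DS": {"Barnesiella": [2.0]}},
    {"CTRL": {"Barnesiella": [0.9]}, "DS": {"Barnesiella": [0.2]}},
    {"CTRL": {"Barnesiella": [0.5]}, "DS": {"Barnesiella": [1.0]}},
]
print(ctrl_enriched_prevalence(example_studies, "Barnesiella"))
```

Combining such a prevalence with the significance of the pooled comparison mirrors the two criteria applied in the text, i.e. a consistent direction across studies and a significant P-value.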

In previous studies, the genus Barnesiella was identified only in populations living in developed countries (Mancabelli et al. 2017) and was correlated with beneficial effects on the human gut (Kulagina et al. 2012; Ubeda et al. 2013). Moreover, Ruminococcaceae UCG-005, Alistipes and an unclassified member of the Lachnospiraceae family have been reported to be butyrate-producing bacteria (Flint et al. 2012; Chen et al. 2017) that may protect healthy subjects from chronic intestinal inflammation (Lepage et al. 2011). In contrast, the higher relative abundance of the genus Streptococcus in DS confirms its previously reported correlation with a range of gastrointestinal diseases (Murray and Roberts 1978; Burnett-Hartman, Newcomb and Potter 2008) and renders it a valuable candidate as a universal biomarker of intestinal dysbiosis. Furthermore, bacteria belonging to the Erysipelotrichaceae family have been correlated with inflammation (Kaakoush 2015) and immunomodulation (Palm et al. 2014), but their functional correlation with intestinal diseases is far from being fully elucidated.

Notably, the observed higher relative abundance of the non-pathogenic taxon Lactobacillus in DS may reflect lower niche competition caused by simplification of the dysbiotic gut microbiota (Walter 2008).

CONCLUSIONS

A substantial number of studies based on 16S rRNA gene profiling have reported on the correlation between human gut diseases and microbiota composition. Nevertheless, one of the main sources of bias in the reconstruction of the gut microbiota composition through 16S rRNA profiling is the selection of reliable and universal primer pairs. In silico PCR and covariance analysis of the 12 primer pairs used in 24 selected public gut metagenomic studies confirmed their impact on biased amplification of the targeted section of the 16S rRNA gene. To overcome this limitation, we performed a cross-study meta-analysis of 3048 public metagenomic datasets, corresponding to 1252 control (CTRL) and 1796 patient subjects, in order to identify possible bacterial biomarkers for major intestinal diseases such as CD, UC, CRC and CDI. Furthermore, we analyzed all datasets together in order to identify possible universal gut disease microbial biomarkers. In detail, this cross-study analysis showed that the genera Barnesiella, Ruminococcaceae UCG-005, Alistipes and Christensenellaceae R-7 group, together with an unclassified member of the Lachnospiraceae family, correlated with a healthy state of subjects. In contrast, subjects with an intestinal disease displayed a higher abundance of genera reported to cause intestinal inflammation, such as an unclassified member of the Erysipelotrichaceae family and Streptococcus. The identification of novel universal biomarkers as indicators of human gut diseases may contribute to rapid diagnosis, help predict the course and prognosis of the disease, and guide therapeutic decisions, thereby improving patient care.

SUPPLEMENTARY DATA

Supplementary data are available at FEMSEC online.

Acknowledgements

Part of this research was conducted using the High Performance Computing (HPC) facility of the University of Parma.

FUNDING

This work was funded by the EU Joint Programming Initiative—A Healthy Diet for a Healthy Life (JPI HDHL, http://www.healthydietforhealthylife.eu/) to MV and DvS (Grant no. 15/JP/HDHL/3280), and the MIUR to MV. We thank GenProbio srl for financial support of the Laboratory of Probiogenomics. LM is supported by Fondazione Cariparma, Parma, Italy. DvS is a member of The APC Microbiome Institute funded by Science Foundation Ireland (SFI), through the Irish Government's National Development Plan (Grant no. SFI/12/RC/2273).

Conflict of interest. None declared.

REFERENCES

Akutko K, Matusiewicz K. Campylobacter concisus as the etiologic agent of gastrointestinal diseases. Adv Clin Exp Med 2017;26:149–54.

Ananthakrishnan AN. Epidemiology and risk factors for IBD. Nat Rev Gastroenterol Hepatol 2015;12:205–17.

Antharam VC, Li EC, Ishmael A et al. Intestinal dysbiosis and depletion of butyrogenic bacteria in Clostridium difficile infection and nosocomial diarrhea. J Clin Microbiol 2013;51:2884–92.

Ben-Dov E, Shapiro OH, Siboni N et al. Advantage of using inosine at the 3' termini of 16S rRNA gene universal primers for the study of microbial diversity. Appl Environ Microbiol 2006;72:6902–6.

Burnett-Hartman AN, Newcomb PA, Potter JD. Infectious agents and colorectal cancer: a review of Helicobacter pylori, Streptococcus bovis, JC virus, and human papillomavirus. Cancer Epidem Biomar Prev 2008;17:2970–9.

Burns MB, Lynch J, Starr TK et al. Virulence genes are a signature of the microbiome in the colorectal tumor microenvironment. Genome Med 2015;7:55.

Cai L, Ye L, Tong AH et al. Biased diversity metrics revealed by bacterial 16S pyrotags derived from different primer sets. PLoS One 2013;8:e53649.

Caporaso JG, Lauber CL, Walters WA et al. Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. P Natl Acad Sci USA 2011;108 Suppl 1:4516–22.

Caporaso JG, Kuczynski J, Stombaugh J et al. QIIME allows analysis of high-throughput community sequencing data. Nat Methods 2010;7:335–6.

Carding S, Verbeke K, Vipond DT et al. Dysbiosis of the gut microbiota in disease. Microb Ecol Health Dis 2015;26:26191.

Chen T, Long W, Zhang C et al. Fiber-utilizing capacity varies in Prevotella- versus Bacteroides-dominated gut microbiota. Sci Rep 2017;7:2594.

Chen Y, Ji F, Guo J et al. Dysbiosis of small intestinal microbiota in liver cirrhosis and its association with etiology. Sci Rep 2016;6:34055.

Cosnes J, Gower-Rousseau C, Seksik P et al. Epidemiology and natural history of inflammatory bowel diseases. Gastroenterology 2011;140:1785–94.

Dowd SE, Callaway TR, Wolcott RD et al. Evaluation of the bacterial diversity in the feces of cattle using 16S rDNA bacterial tag-encoded FLX amplicon pyrosequencing (bTEFAP). BMC Microbiol 2008;8:125.

Duranti S, Gaiani F, Mancabelli L et al. Elucidating the gut microbiome of ulcerative colitis: bifidobacteria as novel microbial biomarkers. FEMS Microbiol Ecol 2016;92, DOI: 10.1093/femsec/fiw191.

Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 2010;26:2460–1.

el Fantroussi S, Verschuere L, Verstraete W et al. Effect of phenylurea herbicides on soil microbial communities estimated by analysis of 16S rRNA gene fingerprints and community-level physiological profiles. Appl Environ Microbiol 1999;65:982–8.

Eun CS, Kwak MJ, Han DS et al. Does the intestinal microbial community of Korean Crohn's disease patients differ from that of western patients? BMC Gastroenterol 2016;16:28.

Fierer N, Hamady M, Lauber CL et al. The influence of sex, handedness, and washing on the diversity of hand surface bacteria. P Natl Acad Sci USA 2008;105:17994–9.

Flint HJ, Scott KP, Duncan SH et al. Microbial degradation of complex carbohydrates in the gut. Gut Microbes 2012;3:289–306.

Frank DN, St Amand AL, Feldman RA et al. Molecular-phylogenetic characterization of microbial community imbalances in human inflammatory bowel diseases. P Natl Acad Sci USA 2007;104:13780–5.

Geng J, Fan H, Tang X et al. Diversified pattern of the human colorectal cancer microbiome. Gut Pathog 2013;5:2.

Gevers D, Kugathasan S, Denson LA et al. The treatment-naive microbiome in new-onset Crohn's disease. Cell Host Microbe 2014;15:382–92.

Gu S, Chen Y, Zhang X et al. Identification of key taxa that favor intestinal colonization of Clostridium difficile in an adult Chinese population. Microbes Infect 2016;18:30–8.

Haggar FA, Boushey RP. Colorectal cancer epidemiology: incidence, mortality, survival, and risk factors. Clin Colon Rectal Surg 2009;22:191–7.

Hamady M, Knight R. Microbial community profiling for human microbiome projects: tools, techniques, and challenges. Genome Res 2009;19:1141–52.

Hongoh Y, Ohkuma M, Kudo T. Molecular analysis of bacterial microbiota in the gut of the termite Reticulitermes speratus (Isoptera; Rhinotermitidae). FEMS Microbiol Ecol 2003;44:231–42.

Iebba V, Totino V, Gagliardi A et al. Eubiosis and dysbiosis: the two sides of the microbiota. New Microbiol 2016;39:1–12.

Juck D, Charles T, Whyte LG et al. Polyphasic microbial community analysis of petroleum hydrocarbon-contaminated soils from two northern Canadian communities. FEMS Microbiol Ecol 2000;33:241–9.

Kaakoush NO. Insights into the role of Erysipelotrichaceae in the human host. Front Cell Infect Microbiol 2015;5:84.

Khanna S, Montassier E, Schmidt B et al. Gut microbiome predictors of treatment response and recurrence in primary Clostridium difficile infection. Aliment Pharmacol Ther 2016;44:715–27.

Klindworth A, Pruesse E, Schweer T et al. Evaluation of general 16S ribosomal RNA gene PCR primers for classical and next-generation sequencing-based diversity studies. Nucleic Acids Res 2013;41:e1.

Kostic AD, Gevers D, Pedamallu CS et al. Genomic analysis identifies association of Fusobacterium with colorectal carcinoma. Genome Res 2012;22:292–8.

Kulagina EV, Efimov BA, Maximov PY et al. Species composition of Bacteroidales order bacteria in the feces of healthy people of various ages. Biosci Biotechnol Biochem 2012;76:169–71.

Kulaylat MN, Dayton MT. Ulcerative colitis and cancer. J Surg Oncol 2010;101:706–12.

Lepage P, Hasler R, Spehlmann ME et al. Twin study indicates loss of interaction between microbiota and mucosa of patients with ulcerative colitis. Gastroenterology 2011;141:227–36.

Liu WT, Marsh TL, Cheng H et al. Characterization of microbial diversity by determining terminal restriction fragment length polymorphisms of genes encoding 16S rRNA. Appl Environ Microbiol 1997;63:4516–22.

Lovasz BD, Golovics PA, Vegh Z et al. New trends in inflammatory bowel disease epidemiology and disease course in Eastern Europe. Dig Liver Dis 2013;45:269–76.

Mancabelli L, Milani C, Lugli GA et al. Meta-analysis of the human gut microbiome from urbanized and pre-agricultural populations. Environ Microbiol 2017;19:1379–90.

Mar JS, LaMere BJ, Lin DL et al. Disease severity and immune activity relate to distinct interkingdom gut microbiome states in ethnically distinct ulcerative colitis patients. MBio 2016;7:1–11.

Milani C, Hevia A, Foroni E et al. Assessing the fecal microbiota: an optimized ion torrent 16S rRNA gene-based analysis protocol. PLoS One 2013;8:e68739.

Milani C, Lugli GA, Duranti S et al. Genomic encyclopedia of type strains of the genus Bifidobacterium. Appl Environ Microbiol 2014;80:6290–302.

Milani C, Ticinesi A, Gerritsen J et al. Gut microbiota composition and Clostridium difficile infection in hospitalized elderly individuals: a metagenomic study. Sci Rep 2016;6:25945.

Milani C, Lugli GA, Duranti S et al. Bifidobacteria exhibit social behavior through carbohydrate resource sharing in the gut. Sci Rep 2015;5:15782.

Molodecky NA, Soon IS, Rabi DM et al. Increasing incidence and prevalence of the inflammatory bowel diseases with time, based on systematic review. Gastroenterology 2012;142:46–54 e42; quiz e30.

Mosca A, Leclerc M, Hugot JP. Gut microbiota diversity and human diseases: should we reintroduce key predators in our ecosystem? Front Microbiol 2016;7:455.

Murray HW, Roberts RB. Streptococcus bovis bacteremia and underlying gastrointestinal disease. Arch Intern Med 1978;138:1097–9.

Muyzer G, de Waal EC, Uitterlinden AG. Profiling of complex microbial populations by denaturing gradient gel electrophoresis analysis of polymerase chain reaction-amplified genes coding for 16S rRNA. Appl Environ Microbiol 1993;59:695–700.

Ng SC, Tang W, Ching JY et al. Incidence and phenotype of inflammatory bowel disease based on results from the Asia-pacific Crohn's and colitis epidemiology study. Gastroenterology 2013;145:158–65 e152.

Palm NW, de Zoete MR, Cullen TW et al. Immunoglobulin A coating identifies colitogenic bacteria in inflammatory bowel disease. Cell 2014;158:1000–10.

Perez-Brocal V, Garcia-Lopez R, Nos P et al. Metagenomic analysis of Crohn's disease patients identifies changes in the virome and microbiome related to disease status and therapy, and detects potential interactions and biomarkers. Inflamm Bowel Dis 2015;21:2515–32.

Perez-Brocal V, Garcia-Lopez R, Vazquez-Castellanos JF et al. Study of the viral and microbial communities associated with Crohn's disease: a metagenomic approach. Clin Transl Gastroenterol 2013;4:e36.

Quast C, Pruesse E, Yilmaz P et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res 2013;41:D590–6.

Reveles KR, Lee GC, Boyd NK et al. The rise in Clostridium difficile infection incidence among hospitalized adults in the United States: 2001–2010. Am J Infect Control 2014;42:1028–32.

Rojo D, Gosalbes MJ, Ferrari R et al. Clostridium difficile heterogeneously impacts intestinal community architecture but drives stable metabolome responses. ISME J 2015;9:2206–20.

Sha S, Xu B, Wang X et al. The biodiversity and composition of the dominant fecal microbiota in patients with inflammatory bowel disease. Diagn Microbiol Infect Dis 2013;75:245–51.

Shah R, Cope JL, Nagy-Szakal D et al. Composition and function of the pediatric colonic mucosal microbiome in untreated patients with ulcerative colitis. Gut Microbes 2016;7:384–96.

Turroni F, Peano C, Pass DA et al. Diversity of bifidobacteria within the infant gut microbiota. PLoS One 2012;7:e36957.

Ubeda C, Bucci V, Caballero S et al. Intestinal microbiota containing Barnesiella species cures vancomycin-resistant Enterococcus faecium colonization. Infect Immun 2013;81:965–73.

Ventura M, Turroni F, Lugli GA et al. Bifidobacteria and humans: our special friends, from ecological to genomics perspectives. J Sci Food Agric 2014;94:163–8.

Walter J. Ecological role of lactobacilli in the gastrointestinal tract: implications for fundamental and biomedical research. Appl Environ Microbiol 2008;74:4985–96.

Walters WA, Caporaso JG, Lauber CL et al. PrimerProspector: de novo design and taxonomic analysis of barcoded polymerase chain reaction primers. Bioinformatics 2011;27:1159–61.

Warren RL, Freeman DJ, Pleasance S et al. Co-occurrence of anaerobic bacteria in colorectal carcinomas. Microbiome 2013;1:16.

Wei Z, Cao S, Liu S et al. Could gut microbiota serve as prognostic biomarker associated with colorectal cancer patients' survival? A pilot study on relevant mechanism. Oncotarget 2016;7:46158–72.

Weir TL, Manter DK, Sheflin AM et al. Stool microbiome and metabolome differences between colorectal cancer patients and healthy adults. PLoS One 2013;8:e70803.

Wu N, Yang X, Zhang R et al. Dysbiosis signature of fecal microbiota in colorectal cancer patients. Microb Ecol 2013;66:462–70.

Yatsunenko T, Rey FE, Manary MJ et al. Human gut microbiome viewed across age and geography. Nature 2012;486:222–7.

Zackular JP, Rogers MA, Ruffin MT et al. The human gut microbiome as a screening tool for colorectal cancer. Cancer Prev Res (Phila) 2014;7:1112–21.