What is Full-length 16S Amplicon Sequencing?
Sequence variation in the 16S ribosomal RNA (rRNA) gene is widely used to characterize the taxonomic diversity present in microbial communities. Short-read platforms that sequence partial regions of the 16S rRNA gene are the most common choice because they reduce the cost of next-generation sequencing (NGS), but the reads are often too short to discriminate between closely related sequences, so misclassification at the species level remains a challenge. Full-length 16S amplicon sequencing covers all variable regions of the gene and can therefore reduce the misidentification that arises when individual variable regions differ in how well they separate taxa[1].
Figure 1 Bacterial 16S rRNA gene sequence composition and primer selection.
Application of Full-length 16S Amplicon Sequencing
- Medical field: the relationship between human microbiome and human health/disease, etc.
- Animal: rumen and animal health/nutrient digestion, etc.
- Agronomic field: microbial interactions with plants, etc.
Full-length 16S Sequencing Process
CD Genomics Data Analysis Pipeline
Figure 2 Data Analysis Pipeline.
Bioinformatics Analysis Content
Data processing and statistics | |
Feature table construction | |
Species annotation and taxonomic analysis | Taxonomy distribution histogram of all samples |
Species abundance heatmap | |
Classification tree | |
Metastats | |
Alpha diversity analysis | Statistical data of alpha diversity |
Rarefaction curve | |
Chao1 curve | |
Shannon curve | |
Rank abundance | |
Beta diversity analysis | Distance matrices from different algorithms (Jaccard, Bray-Curtis, weighted UniFrac and unweighted UniFrac) |
PCA analysis | |
PCoA analysis | |
LEfSe | |
※ Anosim/Adonis analysis | |
UPGMA analysis | |
PICRUST2 |
With experienced teams of scientists, researchers, and technicians, we provide fast turnaround and high-quality data reports at competitive prices for customers worldwide. Customers can contact our staff directly, and we will respond promptly. If you are interested in our services, please contact us or submit an online inquiry for more detailed information.
Table 1 The software table
Software | Version | Analysis_module | Website |
---|---|---|---|
Trimmomatic | 0.33 | Data processing | https://github.com/usadellab/Trimmomatic |
Cutadapt | 1.9.1 | Data processing | http://cutadapt.readthedocs.org/ |
USEARCH | 10.0.240_i86 | Data processing | http://drive5.com/usearch |
FASTX Toolkit | 0.0.14 | Data processing | https://anaconda.org/bioconda/fastx_toolkit |
Flash | 1.2.11 | Data processing | http://ccb.jhu.edu/software/FLASH/index.shtml |
Fastp | 0.23.1 | Data processing | https://github.com/OpenGene/fastp |
QIIME2 | 2020.6.0 | -- | https://qiime2.org |
DADA2 | 1.20.0 | Denoise | https://benjjneb.github.io/dada2/ |
KronaTools | 2.6 | Krona analysis | https://github.com/marbl/Krona/releases |
Mothur | 1.34.4 | Alpha diversity | http://www.mothur.org |
Minimap2 | 2.14 | Taxonomy annotation | https://github.com/lh3/minimap2 |
LEfSe | 1.1.1 | Difference analysis | https://github.com/SegataLab/lefse/tree/master/lefse |
VSEARCH | 2.8.1 | Sequence clustering | https://github.com/torognes/vsearch |
Picrust2 | 2.3.0 | Function prediction | https://huttenhower.sph.harvard.edu/picrust |
Bugbase | 0.1.0 | Function prediction | https://github.com/knights-lab/BugBase |
FAPROTAX | 1.2.6 | Function prediction | http://www.loucalab.com/archive/FAPROTAX/lib/php/ |
Tax4Fun | 0.3.1 | Function prediction | http://tax4fun.gobics.de/ |
Funguild | 1.0 | Function prediction | http://www.stbates.org/funguild_db.php |
What Are the Advantages of Our Services?
Expert Data Analysis:
Advanced statistical tools and expertise in microbiome data analysis ensure robust and meaningful results, minimizing artifacts and maximizing insights.
Dedicated Bioinformatician:
A dedicated bioinformatician curates data, selects appropriate statistical methods, and provides biological interpretation, supported by an experienced team.
Interactive Data Analysis Reports:
Searchable, interactive visual reports with peer-reviewed analysis methods and results, ideal for research publication.
Post-Report Follow-Up:
A review call with the dedicated bioinformatician to discuss results and answer any questions about the report.
Dedicated Project Manager:
A single point of contact ensures smooth and efficient project management from data transfer to report delivery.
Large Capacity Computing:
Access to large capacity computing and secure data storage facilities for raw and analyzed data, as well as the data analysis report.
Example Data Analysis Report
To demonstrate the quality and detail of a CD Genomics report for Full-length 16S Amplicon data analysis, we have a sample report available. Contact us to request our Full-length 16S Amplicon data report. You can also refer to a client-published article, "The effects of atrazine on the microbiome of the eastern oyster: Crassostrea virginica" which includes some of the data we provided (DNA Sequence Accession number: PRJNA575277).
How It Works
What Does Analysis of Full-length 16S Amplicon Sequences Show?
Data statistics of the quality control
The number of sequences retained at each processing stage was counted to evaluate data quality, together with sequence length and other parameters. The results for each sample are shown in the following table. Sample ID is the sample name; Raw CCS is the number of CCS reads identified for the sample; Clean CCS is the number of sequences remaining after primers are identified and removed; Effective CCS is the number of sequences used for subsequent analysis after length filtering and chimera removal; AvgLen (bp) is the average sequence length of the sample; Effective (%) is Effective CCS as a percentage of Raw CCS.
Table 2 Sample sequencing data processing results statistics
Sample ID | Raw CCS | Clean CCS | Effective CCS | AvgLen(bp) | Effective (%) |
---|---|---|---|---|---|
KSFBL | 14,692 | 13,755 | 13,224 | 1,455 | 90.01 |
KSVBL | 12,836 | 12,791 | 12,760 | 1,460 | 99.41 |
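The Effective (%) column can be reproduced from the Raw CCS and Effective CCS counts. The following is a minimal sketch; the function name `effective_percent` and the dictionary layout are illustrative assumptions, not part of the CD Genomics pipeline.

```python
def effective_percent(raw_ccs: int, effective_ccs: int) -> float:
    """Effective CCS as a percentage of Raw CCS, rounded to two decimals."""
    if raw_ccs <= 0:
        raise ValueError("raw_ccs must be positive")
    return round(100.0 * effective_ccs / raw_ccs, 2)


# Counts copied from Table 2: (Raw CCS, Clean CCS, Effective CCS)
table2 = {
    "KSFBL": (14692, 13755, 13224),
    "KSVBL": (12836, 12791, 12760),
}

for sample_id, (raw, clean, effective) in table2.items():
    print(sample_id, effective_percent(raw, effective))
```

Running this on the two samples gives 90.01 and 99.41, matching the last column of Table 2.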
Species Annotation and Taxonomic Analysis
In the next sections we explore the taxonomic composition of the samples and compare samples against the metadata. The first step is to assign taxonomy to the sequences in our QIIME 2 artifact using a pre-trained Naive Bayes classifier and the q2-feature-classifier plugin. This classifier was trained on the Silva 138 99% OTUs. We apply the classifier to the sample sequences and generate a visualization of the resulting mapping from sequence to taxonomy.
Figure 3 Taxonomy distribution of all samples at the phylum level. Other classification levels can be found in the taxonomy folder.
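A phylum-level distribution like the one in the figure is obtained by collapsing per-feature counts by their assigned phylum and converting to proportions. The sketch below illustrates that step only. It assumes lineage strings in the rank-prefixed style `d__...; p__...`, and the feature IDs, counts and function names are hypothetical.

```python
from collections import defaultdict


def phylum_of(taxonomy: str) -> str:
    """Return the phylum label from a semicolon-separated lineage, or 'Unassigned'."""
    for rank in taxonomy.split(";"):
        rank = rank.strip()
        if rank.startswith("p__") and len(rank) > 3:
            return rank[3:]
    return "Unassigned"


def phylum_relative_abundance(counts, taxonomy):
    """Collapse feature counts to phylum level and convert them to proportions."""
    totals = defaultdict(int)
    for feature_id, count in counts.items():
        totals[phylum_of(taxonomy.get(feature_id, ""))] += count
    depth = sum(totals.values())
    if depth == 0:
        return {}
    return {phylum: n / depth for phylum, n in totals.items()}


# Hypothetical feature counts and lineages for a single sample
counts = {"asv1": 60, "asv2": 30, "asv3": 10}
taxonomy = {
    "asv1": "d__Bacteria; p__Firmicutes; c__Bacilli",
    "asv2": "d__Bacteria; p__Proteobacteria",
    "asv3": "d__Bacteria",  # no phylum assigned
}
profile = phylum_relative_abundance(counts, taxonomy)
print(profile)
```

Features without a phylum-level assignment are grouped under "Unassigned" so that the proportions for each sample still sum to 1.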
Classification Tree
The classification tree is a bifurcating tree that represents a hierarchical clustering of features. The clustering uses Ward's hierarchical clustering method, based on the degree of proportionality between features.
Figure 4 Phylogenetic tree. The legend in the upper right corner is the species name at the phylum level, and the inner circle is the phylogenetic tree. The same phylum in the inner circle shows the same color. The outer circles indicate the relative abundance proportion of the species in different samples/groups.
Alpha Diversity Analysis
Microbial diversity can be assessed within a community (alpha diversity) or between samples or groups of samples (beta diversity). Four different metrics were calculated to assess alpha diversity: Chao1 and ACE estimate the number of species (richness) in a community, while Shannon and Simpson account for both the richness and the evenness of a community. Larger Chao1, ACE and Shannon values, and a smaller Simpson value, indicate greater species diversity.
Table 3 Alpha Diversity Analysis
Sample ID | ACE | Chao1 | Simpson | Shannon |
---|---|---|---|---|
KSFBL | 52.238 | 52.0 | 0.5304 | 2.2672 |
KSVBL | 91.4318 | 101.5 | 0.5584 | 2.3816 |
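To make the metrics concrete, here is a minimal sketch of three of them computed from a vector of per-feature read counts. The report does not state which variants were used, so several choices are assumptions: the natural logarithm for Shannon, the dominance form of Simpson (sum of squared proportions, where smaller means more diverse, consistent with the text above), and the bias-corrected form of Chao1. ACE is omitted for brevity, and the example counts are hypothetical.

```python
import math
from collections import Counter


def shannon(counts):
    """Shannon index H = -sum(p * ln p), using the natural log (an assumption)."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson_dominance(counts):
    """Simpson index as dominance D = sum(p^2); a smaller D means higher diversity."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = sum(1 for c in counts if c > 0)
    freq = Counter(counts)
    singletons, doubletons = freq[1], freq[2]
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical communities with the same depth and the same number of features
even = [5, 5, 5, 5]
uneven = [17, 1, 1, 1]

for name, community in (("even", even), ("uneven", uneven)):
    print(name, shannon(community), simpson_dominance(community), chao1(community))
```

The even community has the higher Shannon value and the lower Simpson value, which is the relationship described in the paragraph above. The uneven community has three singletons, so its Chao1 estimate exceeds the four features actually observed.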
Beta Diversity Analysis
Principal coordinates analysis (PCoA) is an ordination technique similar to PCA. It starts from a similarity or dissimilarity (distance) matrix, extracts the main structure of the data through a series of eigenvalues and eigenvectors, and assigns each item a location in a low-dimensional space.
Figure 5 PCoA analysis based on weighted UniFrac. Each point represents a sample, colored by group, plotted with one principal coordinate on the X-axis and another on the Y-axis. The percentage on each axis indicates the proportion of the variation among samples explained by that coordinate.
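PCoA takes a pairwise distance matrix as its input. As an illustration of that input, the sketch below computes Bray-Curtis and Jaccard distances between samples from their count vectors. UniFrac is not shown because it also requires a phylogenetic tree. The sample names and counts are hypothetical, and the eigen-decomposition step of PCoA itself is left to a numerical library.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    denom = sum(a + b for a, b in zip(u, v))
    if denom == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / denom


def jaccard(u, v):
    """Jaccard distance on presence/absence: 1 - |shared| / |union|."""
    present_u = {i for i, a in enumerate(u) if a > 0}
    present_v = {i for i, b in enumerate(v) if b > 0}
    union = present_u | present_v
    if not union:
        return 0.0
    return 1 - len(present_u & present_v) / len(union)


def distance_matrix(samples, metric):
    """Pairwise distances between all samples as a nested dict."""
    names = list(samples)
    return {a: {b: metric(samples[a], samples[b]) for b in names} for a in names}


# Hypothetical counts for three samples over the same three features
samples = {"S1": [10, 0, 5], "S2": [5, 5, 5], "S3": [0, 10, 0]}
bc = distance_matrix(samples, bray_curtis)
jc = distance_matrix(samples, jaccard)
print(bc["S1"]["S2"], bc["S1"]["S3"], jc["S1"]["S2"])
```

S1 and S3 share no features, so both metrics give them the maximum distance of 1. Bray-Curtis also reflects abundance differences between S1 and S2, whereas Jaccard only sees which features are present.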
Title: Gut microbiota predicts severity and reveals novel metabolic signatures in acute pancreatitis
Publication: Gut
Main Methods: full-length 16S rRNA gene sequencing, metagenomic sequencing
Abstract: The study examines whether the microbiome can predict acute pancreatitis (AP) severity early in the disease. Buccal and rectal swabs from 424 patients were sequenced, revealing significant differences in the intestinal microbiome related to severity, mortality, and hospital stay. A predictive classifier using 16 species achieved an AUROC of 85%, outperforming existing severity scores. Functional profiling indicated increased short-chain fatty acid production in severe AP cases. The findings suggest that the orointestinal microbiome could serve as a valuable predictor of AP severity and a target for new diagnostic and therapeutic strategies.
Research Workflow:
Figure 2 Study protocol and study population.
Research Results:
RAC is associated with early alterations of rectal microbiome
The study explores the association between the orointestinal microbiome and the revised Atlanta classification (RAC) for acute pancreatitis (AP) severity. Despite no significant differences in Alpha- and Beta-diversity for buccal samples, rectal samples showed distinct microbial signatures for RAC III compared to RAC I and II, identified through Bray-Curtis distance. Differential abundance analyses supported these findings. Although the three RAC subgroups differed in seven potential confounding factors, five impacted microbial composition and were included in stratified PERMANOVA analysis. This highlights the role of the microbiome in predicting AP severity.
Mortality is associated with early alterations of rectal microbiome
The study examines the association of the orointestinal microbiome with secondary endpoints in acute pancreatitis (AP) patients, focusing on mortality and microbial diversity. Among the 424 patients, 10 died within 30 days of AP diagnosis or during hospitalization. Deceased patients exhibited significantly lower observed species in rectal samples (p=0.041), but not in buccal samples. Alpha-diversity metrics did not show significant differences. However, the rectal microbiome significantly differed between survivors and deceased patients in Bray-Curtis distances (p=0.006) and other Beta-diversity indices. Deceased patients were older and had lower BMI, influencing microbial composition. Stratified PERMANOVA, accounting for age and BMI, confirmed these differences, suggesting the rectal microbiome's potential in predicting mortality in AP patients.
Figure 3 Association of rectal microbiome with primary and secondary endpoints. (A) PCoA (B) Differential abundances. (C) Bray-Curtis distances and (D) differential abundances. (E) The β-diversity distances. (F) β-diversity were calculated by PERMANOVA. Length of hospital stay was rank-transformed for PERMANOVA tests.
Post hoc definition of severe versus non-severe acute pancreatitis
The study defines severe acute pancreatitis (AP) as having persistent organ failure (>48 hours) and/or pancreatic collections requiring drainage. Among 424 patients, 30 were classified as severe AP, showing significantly higher mortality compared to non-severe cases. Severe AP patients also had more frequent systemic inflammatory response syndrome (SIRS), higher BISAP and HAPS scores, longer hospital stays, and higher rates of organ failure, ICU admissions, necrotic AP, and infected collections. Normalized microbial data from severe AP patients revealed significant differences compared to non-severe cases, suggesting the microbiome's role in predicting AP severity and associated complications.
Figure 4 Association of rectal microbiome data with severity. LEfSe, linear discriminant analysis effect size; PCoA, principal coordinate analysis
Disease severity is associated with microbial shift in rectal microbiome
The study analyzed the Alpha- and Beta-diversity of the microbiome in severe versus non-severe acute pancreatitis (AP) patients. Alpha-diversity indices showed no significant differences in buccal or rectal samples. However, beta-diversity (Bray-Curtis distance) was significantly different in rectal samples (p=0.008) but not buccal samples. Differential abundance analyses identified several species differing between severity groups. Confounding variables were tested, revealing severity remained significant in distance-based redundancy analysis (db-RDA) for rectal samples (p=0.022) even after accounting for ten confounding variables. Stratified PERMANOVA confirmed significant differences (p=0.013), though severity accounted for moderate variance compared to other factors like the sample's country of origin.
Conclusions:
The orointestinal microbiome predicts clinical hallmark features of AP, and SCFAs may be used for future diagnostic and therapeutic concepts.
1. What are the advantages of Full-length 16S amplicon sequencing?
The full 16S gene provides better taxonomic resolution, and circular consensus sequencing (CCS) combined with sophisticated denoising algorithms can remove PCR and sequencing errors.
2. How long is the Full-length 16S rRNA sequence?
The 16S rRNA gene sequence is about 1,550 bp long and is composed of both variable and conserved regions.
3. What is Full-length 16S rRNA sequencing for?
Full-length 16S rRNA sequencing is a culture-free method to identify and compare bacterial diversity from complex microbiomes or environments that are difficult to study. It is commonly used to identify bacteria present within a given sample down to the species level.
4. Why is Full-length 16S rRNA a good target for sequencing?
The 16S rRNA gene is ubiquitous in bacteria and archaea, so it can be used to identify a wide diversity of microbes within a single sample and a single workflow. Through 16S rRNA sequencing, one can identify the taxa present in a sample.
5. What is the difference between 16S and metagenomics analysis?
16S rRNA sequencing primarily investigates the species composition, evolutionary relationships between species, and diversity within a community. In contrast, shotgun metagenomic sequencing goes beyond 16S analysis to delve into gene and functional aspects.
6. How were the Full-length 16S Amplicon sequencing data analyzed?
The process begins with raw data undergoing paired-end merging, quality checks, filtering, and chimera removal to form ASV clusters. It then proceeds to alpha diversity (e.g., rarefaction, Chao1, Shannon curves), species annotation (e.g., KRONA, phylogenetic trees), beta diversity (e.g., UniFrac distances, PCA), and diversity statistics (e.g., NMDS, DCA). Advanced analyses include Metastats, LEfSe, and RDA/CCA.
Reference
- Johnson, J.S., et al., Evaluation of 16S rRNA gene sequencing for species and strain-level microbiome analysis. Nature Communications, 2019. 10(1).