What is Full-length 16S Amplicon Sequencing?

Sequence variation in the 16S ribosomal RNA (rRNA) gene is widely used to characterize the taxonomic diversity present in microbial communities. Short-read platforms that sequence partial regions of the 16S rRNA gene are most commonly used because they reduce the cost of next-generation sequencing (NGS), but the reads are often too short to discriminate between closely related sequences, so misclassification at the species level remains a challenge. Because sequence similarity differs across the 16S variable regions, full-length 16S amplicon sequencing can reduce this misidentification and improve identification accuracy^[1].

Figure 1 Bacterial 16S rRNA gene sequence composition and primer selection.

Application of Full-length 16S Amplicon Sequencing

Medical field: the relationship between the human microbiome and human health/disease, etc.
Animal field: the rumen microbiome and animal health/nutrient digestion, etc.
Agronomic field: microbial interactions with plants, etc.

Full-length 16S Sequencing Process

Figure 1 Full-length 16S Sequencing Process.

Sample Submission Guidelines

CD Genomics Data Analysis Pipeline

Figure 2 Data Analysis Pipeline.

Bioinformatics Analysis Content

Data processing and statistics
Feature table construction
Species annotation and taxonomic analysis	Taxonomy distribution histogram of all samples
	Species abundance heatmap
	Classification tree
	Metastats
Alpha diversity analysis	Statistical data of alpha diversity
	Rarefaction curve
	Chao1 curve
	Shannon curve
	Rank abundance
Beta diversity analysis	Distance matrices from different algorithms (Jaccard, Bray-Curtis, weighted UniFrac and unweighted UniFrac)
	PCA analysis
	PCoA analysis
	LEfSe
	※ Anosim/Adonis analysis
	UPGMA analysis
	PICRUSt2
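
As a rough illustration of two of the distance measures listed under beta diversity, the sketch below computes Bray-Curtis and Jaccard distances between two samples. The feature counts and function names are hypothetical and are not part of the pipeline itself; UniFrac is left out because it also requires a phylogenetic tree.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def jaccard(a, b):
    """Jaccard distance based on presence/absence of each feature."""
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, x in enumerate(b) if x > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1.0 - len(present_a & present_b) / len(union)


# Hypothetical counts for three features in two samples.
sample_1 = [10, 0, 5]
sample_2 = [5, 5, 5]
print(round(bray_curtis(sample_1, sample_2), 4))
print(round(jaccard(sample_1, sample_2), 4))
```

Bray-Curtis uses abundances, while Jaccard only asks whether a feature is present, so the two can rank sample pairs differently.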

With experienced teams of scientists, researchers, and technicians, we provide fast turnaround and high-quality data reports at competitive prices for customers worldwide. Customers can contact our staff directly, and we will respond promptly. If you are interested in our services, please contact us or submit an online inquiry for more detailed information.

Table 1 The software table

Software	Version	Analysis_module	Website
Trimmomatic	0.33	Data process	https://github.com/usadellab/Trimmomatic
Cutadapt	1.9.1	Data process	http://cutadapt.readthedocs.org/
USEARCH	10.0.240_i86	Data process	http://drive5.com/usearch
FASTX Toolkit	0.0.14	Data process	https://anaconda.org/bioconda/fastx_toolkit
Flash	1.2.11	Data process	http://ccb.jhu.edu/software/FLASH/index.shtml
Fastp	0.23.1	Data process	https://github.com/OpenGene/fastp
QIIME2	2020.6.0	--	https://qiime2.org
DADA2	1.20.0	Denoise	https://benjjneb.github.io/dada2/
KronaTools	2.6	Krona analysis	https://github.com/marbl/Krona/releases
Mothur	1.34.4	Alpha diversity	http://www.mothur.org
Minimap2	2.14	Taxonomy annotation	https://github.com/lh3/minimap2
LEfSe	1.1.1	Difference analysis	https://github.com/SegataLab/lefse/tree/master/lefse
VSEARCH	2.8.1	Sequence clustering	https://github.com/torognes/vsearch
PICRUSt2	2.3.0	Function prediction	https://huttenhower.sph.harvard.edu/picrust
BugBase	0.1.0	Function prediction	https://github.com/knights-lab/BugBase
FAPROTAX	1.2.6	Function prediction	http://www.loucalab.com/archive/FAPROTAX/lib/php/
Tax4Fun	0.3.1	Function prediction	http://tax4fun.gobics.de/
FUNGuild	1.0	Function prediction	http://www.stbates.org/funguild_db.php

What Are the Advantages of Our Services?

Expert Data Analysis:

Advanced statistical tools and expertise in microbiome data analysis ensure robust and meaningful results, minimizing artifacts and maximizing insights.

Dedicated Bioinformatician:

A dedicated bioinformatician curates data, selects appropriate statistical methods, and provides biological interpretation, supported by an experienced team.

Interactive Data Analysis Reports:

Searchable, interactive visual reports with peer-reviewed analysis methods and results, ideal for research publication.

Post-Report Follow-Up:

A review call with the dedicated bioinformatician to discuss results and answer any questions about the report.

Dedicated Project Manager:

A single point of contact ensures smooth and efficient project management from data transfer to report delivery.

Large Capacity Computing:

Access to large capacity computing and secure data storage facilities for raw and analyzed data, as well as the data analysis report.

Example Data Analysis Report

To demonstrate the quality and detail of a CD Genomics report for Full-length 16S Amplicon data analysis, we have a sample report available. Contact us to request our Full-length 16S Amplicon data report. You can also refer to a client-published article, "The effects of atrazine on the microbiome of the eastern oyster: Crassostrea virginica" which includes some of the data we provided (DNA Sequence Accession number: PRJNA575277).

How It Works

What Does Analysis of Full-length 16S Amplicon Sequences Show?

Data statistics of the quality control

The number of sequences at each processing stage was counted to evaluate data quality, mainly using the sequence number and sequence length at each stage. The results for each sample are shown in the following table. Sample ID is the sample name; Raw CCS is the number of CCS reads identified for the sample; Clean CCS is the number of sequences remaining after primers are identified and removed; Effective CCS is the number of sequences used for subsequent analysis after length filtering and chimera removal; AvgLen (bp) is the average sequence length of the sample; Effective (%) is Effective CCS as a percentage of Raw CCS.

Table 2 Sample sequencing data processing results statistics

Sample ID	Raw CCS	Clean CCS	Effective CCS	AvgLen(bp)	Effective (%)
KSFBL	14,692	13,755	13,224	1,455	90.01
KSVBL	12,836	12,791	12,760	1,460	99.41
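
The Effective (%) column can be reproduced from the Raw CCS and Effective CCS counts. The short sketch below recomputes it for the two samples in Table 2; the function name is illustrative only.

```python
def effective_percent(raw_ccs, effective_ccs):
    """Effective CCS as a percentage of Raw CCS, rounded to two decimals."""
    return round(100.0 * effective_ccs / raw_ccs, 2)


# (Raw CCS, Effective CCS) taken from Table 2.
samples = {"KSFBL": (14692, 13224), "KSVBL": (12836, 12760)}

for sample_id, (raw, effective) in samples.items():
    print(sample_id, effective_percent(raw, effective))
```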

Species Annotation and Taxonomic Analysis

In the next sections we will begin to explore the taxonomic composition of the samples and compare samples to the metadata. The first step in this process is to assign taxonomy to the sequences in our QIIME 2 artifact using a pre-trained Naive Bayes classifier and the plugin. This classifier was trained on the Silva 138 99% OTUs. We will apply this classifier to sample sequences and generate a visualization of the resulting mapping from sequence to taxonomy.

Figure 3 The taxonomy distribution of all samples at the phylum level. Other classification levels can be found in the taxonomy folder.

Classification Tree

The classification tree is a bifurcating tree that represents a hierarchical clustering of features. The clustering uses Ward's hierarchical clustering method, based on the degree of proportionality between features.

Figure 4 Phylogenetic tree. The legend in the upper right corner is the species name at the phylum level, and the inner circle is the phylogenetic tree. The same phylum in the inner circle shows the same color. The outer circles indicate the relative abundance proportion of the species in different samples/groups.

Alpha Diversity Analysis

Microbial diversity can be assessed within a community (alpha diversity) or between samples (beta diversity). Four metrics were calculated to assess alpha diversity: Chao1 and ACE estimate the number of species in a community (richness), while Shannon and Simpson account for both the richness and the evenness of a community. Larger Chao1, ACE and Shannon values and a smaller Simpson value indicate greater species diversity.

Table 3 Alpha Diversity Analysis

Sample ID	ACE	Chao1	Simpson	Shannon
KSFBL	52.238	52.0	0.5304	2.2672
KSVBL	91.4318	101.5	0.5584	2.3816
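
To make these indices concrete, here is a minimal sketch that computes Chao1, Shannon and Simpson from a vector of per-species read counts. It rests on several assumptions: the bias-corrected form of Chao1, the natural logarithm for Shannon, and the Simpson dominance form (sum of squared proportions, where smaller means more diverse). The pipeline's software may use different definitions, so the output is not expected to match Table 3, and ACE is omitted for brevity. The counts are made up.

```python
import math
from collections import Counter


def chao1(counts):
    """Bias-corrected Chao1 richness estimate."""
    observed = [c for c in counts if c > 0]
    freq = Counter(observed)
    f1, f2 = freq[1], freq[2]  # numbers of singletons and doubletons
    return len(observed) + f1 * (f1 - 1) / (2 * (f2 + 1))


def shannon(counts):
    """Shannon index using the natural logarithm (an assumption)."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Simpson dominance index: sum of squared relative abundances."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)


# Hypothetical read counts per species for one sample.
counts = [10, 5, 3, 1, 1, 2, 2]
print(round(chao1(counts), 4))
print(round(shannon(counts), 4))
print(round(simpson(counts), 4))
```

A perfectly even community of four species gives a Shannon value of ln 4 and a Simpson value of 0.25; as one species comes to dominate, Shannon falls and Simpson rises toward 1.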

Beta Diversity Analysis

Principal coordinates analysis (PCoA) is an ordination technique similar to PCA that extracts the main structure of multi-dimensional data through a series of eigenvalues and eigenvectors. It starts from a similarity or dissimilarity (distance) matrix and assigns each sample a location in a low-dimensional space.

Figure 5 PCoA analysis based on weighted UniFrac. Each point represents a sample, plotted by one principal coordinate on the X-axis and another on the Y-axis, and colored by group. The percentage on each axis indicates the proportion of the variation among samples explained by that axis.
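
The way PCoA turns a distance matrix into plot coordinates can be sketched as classical multidimensional scaling. The example below assumes NumPy is installed and uses a small made-up distance matrix rather than weighted UniFrac distances; the function name `pcoa` is illustrative and does not refer to the pipeline's implementation.

```python
import numpy as np


def pcoa(dist, n_axes=2):
    """Classical PCoA: returns sample coordinates and % variation per axis."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Gower double-centering of the squared distance matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering
    # Eigendecomposition of the symmetric matrix, sorted largest first.
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Treat negative eigenvalues (non-Euclidean distances, rounding) as zero.
    positive = np.clip(eigvals, 0.0, None)
    coords = eigvecs[:, :n_axes] * np.sqrt(positive[:n_axes])
    explained = 100.0 * positive[:n_axes] / positive.sum()
    return coords, explained


# Hypothetical distances between three samples.
distances = [[0.0, 3.0, 4.0],
             [3.0, 0.0, 1.0],
             [4.0, 1.0, 0.0]]
coords, explained = pcoa(distances)
print(coords)
print(explained)
```

The values in `explained` correspond to the percentages printed on the axes of a PCoA plot. The sign of each axis is arbitrary, so plots from different tools may be mirrored.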

Title: Gut microbiota predicts severity and reveals novel metabolic signatures in acute pancreatitis

Publication: Gut

Main Methods: full-length 16S rRNA gene sequencing, metagenomic sequencing

Abstract: The study examines whether the microbiome can predict acute pancreatitis (AP) severity early in the disease. Buccal and rectal swabs from 424 patients were sequenced, revealing significant differences in the intestinal microbiome related to severity, mortality, and hospital stay. A predictive classifier using 16 species achieved an AUROC of 85%, outperforming existing severity scores. Functional profiling indicated increased short-chain fatty acid production in severe AP cases. The findings suggest that the orointestinal microbiome could serve as a valuable predictor of AP severity and a target for new diagnostic and therapeutic strategies.

Research Workflow:

Figure 2 Study protocol and study population.

Research Results:

RAC is associated with early alterations of rectal microbiome

The study explores the association between the orointestinal microbiome and the revised Atlanta classification (RAC) for acute pancreatitis (AP) severity. Despite no significant differences in Alpha- and Beta-diversity for buccal samples, rectal samples showed distinct microbial signatures for RAC III compared to RAC I and II, identified through Bray-Curtis distance. Differential abundance analyses supported these findings. Although the three RAC subgroups differed in seven potential confounding factors, five impacted microbial composition and were included in stratified PERMANOVA analysis. This highlights the role of the microbiome in predicting AP severity.

Mortality is associated with early alterations of rectal microbiome

The study examines the association of the orointestinal microbiome with secondary endpoints in acute pancreatitis (AP) patients, focusing on mortality and microbial diversity. Among the 424 patients, 10 died within 30 days of AP diagnosis or during hospitalization. Deceased patients exhibited significantly lower observed species in rectal samples (p=0.041), but not in buccal samples. Alpha-diversity metrics did not show significant differences. However, the rectal microbiome significantly differed between survivors and deceased patients in Bray-Curtis distances (p=0.006) and other Beta-diversity indices. Deceased patients were older and had lower BMI, influencing microbial composition. Stratified PERMANOVA, accounting for age and BMI, confirmed these differences, suggesting the rectal microbiome's potential in predicting mortality in AP patients.

Figure 3 Association of rectal microbiome with primary and secondary endpoints. (A) PCoA (B) Differential abundances. (C) Bray-Curtis distances and (D) differential abundances. (E) The β-diversity distances. (F) β-diversity were calculated by PERMANOVA. Length of hospital stay was rank-transformed for PERMANOVA tests.

Post hoc definition of severe versus non-severe acute pancreatitis

The study defines severe acute pancreatitis (AP) as having persistent organ failure (>48 hours) and/or pancreatic collections requiring drainage. Among 424 patients, 30 were classified as severe AP, showing significantly higher mortality compared to non-severe cases. Severe AP patients also had more frequent systemic inflammatory response syndrome (SIRS), higher BISAP and HAPS scores, longer hospital stays, and higher rates of organ failure, ICU admissions, necrotic AP, and infected collections. Normalized microbial data from severe AP patients revealed significant differences compared to non-severe cases, suggesting the microbiome's role in predicting AP severity and associated complications.

Figure 4 Association of rectal microbiome data with severity. LEfSe, linear discriminant analysis effect size; PCoA, principal coordinate analysis.

Disease severity is associated with microbial shift in rectal microbiome

The study analyzed the Alpha- and Beta-diversity of the microbiome in severe versus non-severe acute pancreatitis (AP) patients. Alpha-diversity indices showed no significant differences in buccal or rectal samples. However, beta-diversity (Bray-Curtis distance) was significantly different in rectal samples (p=0.008) but not buccal samples. Differential abundance analyses identified several species differing between severity groups. Confounding variables were tested, revealing severity remained significant in distance-based redundancy analysis (db-RDA) for rectal samples (p=0.022) even after accounting for ten confounding variables. Stratified PERMANOVA confirmed significant differences (p=0.013), though severity accounted for moderate variance compared to other factors like the sample's country of origin.

Conclusions:

The orointestinal microbiome predicts clinical hallmark features of AP, and SCFAs may be used for future diagnostic and therapeutic concepts.

1. What are the advantages of Full-length 16S amplicon sequencing?

The full-length 16S gene provides better taxonomic resolution, and circular consensus sequencing (CCS) combined with sophisticated denoising algorithms can remove PCR and sequencing errors.

2. How long is the Full-length 16S rRNA sequence?

The 16S rRNA gene sequence is about 1,550 bp long and is composed of both variable and conserved regions.

3. What is Full-length 16S rRNA sequencing for?

Full-length 16S rRNA sequencing is a culture-free method to identify and compare bacterial diversity from complex microbiomes or environments that are difficult to study. It is commonly used to identify bacteria present within a given sample down to the species level.

4. Why is Full-length 16S rRNA a good target for sequencing?

The 16S rRNA gene is ubiquitous in bacteria and archaea, so it can be used to identify a wide diversity of microbes within a single sample and a single workflow. Through 16S rRNA sequencing, one can identify the taxa present in a sample.

5. What is the difference between 16S and metagenomics analysis?

16S rRNA sequencing primarily investigates the species composition, evolutionary relationships between species, and diversity within a community. In contrast, shotgun metagenomic sequencing goes beyond 16S analysis to delve into gene and functional aspects.

6. How were the Full-length 16S Amplicon sequencing data analyzed?

The process begins with raw data undergoing paired-end merging, quality checks, filtering, and chimera removal to form ASV clusters. It then proceeds to alpha diversity (e.g., rarefaction, Chao1, Shannon curves), species annotation (e.g., KRONA, phylogenetic trees), beta diversity (e.g., UniFrac distances, PCA), and diversity statistics (e.g., NMDS, DCA). Advanced analyses include Metastats, LEfSe, and RDA/CCA.

Reference

Johnson, J.S., et al., Evaluation of 16S rRNA gene sequencing for species and strain-level microbiome analysis. Nature Communications, 2019. 10(1).

Full-length 16S Amplicon Sequencing Data Analysis