Long-Read Metagenomics Data Analysis

Metagenomics takes the entire microbial community in a specific habitat as its research object. Without isolating or culturing individual organisms, DNA is extracted directly from environmental samples and sequenced at high throughput. Metagenomic sequencing removes the dependence on pure-culture isolation, broadens the range of microbial resources that can be exploited, and is an effective way to study environmental microbial communities.

Nanopore sequencing from Oxford Nanopore Technologies (ONT) and single-molecule real-time (SMRT) long-read sequencing from Pacific Biosciences (PacBio) are changing the way metagenomes are analyzed. With these technologies, we can recover more genomic information, classify species more accurately, identify more species, and characterize microbial communities more comprehensively. This reconstructs the microbial community in the environment as faithfully as possible, so that its importance to the whole ecosystem can be explored in depth (Bharti & Grimm, 2021).

Application Fields

Medical field: the relationship between human microbiome and human health/disease, etc.

Animal field: the relationship between the rumen microbiome and animal health/nutrient digestion, etc.

Agronomic field: microbial interactions with plants, etc.

Environmental field: haze treatment, sewage treatment, etc.

CD Genomics Data Analysis Pipeline

Figure: long-read metagenomics data analysis pipeline

Bioinformatics Analysis Content

  • Data Quality Control
  • Assembly
  • MetaPhlAn Species Annotation
  • Taxonomy Distribution Histogram of All Samples
  • Functional Database Annotation
    Kyoto Encyclopedia of Genes and Genomes (KEGG); Version: 2018.01;
    Evolutionary genealogy of genes: Non-supervised Orthologous Groups (eggNOG); Version: 4.5;
    Non-Redundant Protein Sequence Database (NR);
    UniProt Knowledge base (UniProt);
    Virulence Factors Database (VFDB);
    Transporter Classification Database (TCDB);
    Pathogen Host Interactions Database (PHI);
    Carbohydrate-Active enZYmes Database (CAZy);
    The Comprehensive Antibiotic Resistance Database (CARD);
  • Alpha Diversity Analysis
    Statistical Data of Alpha Diversity
    Rarefaction curve
    Chao1 curve
    Shannon curve
    Rank Abundance
  • Beta Diversity Analysis
    Distance matrices from different algorithms (Jaccard, Bray-Curtis, weighted UniFrac, and unweighted UniFrac)
    PCA Analysis
    PCoA Analysis
    ※ LEfSe
    ※ Anosim/Adonis Analysis
    UPGMA Analysis
  • ※ Binning
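
Several of the alpha and beta diversity measures listed above can be sketched in a few lines of Python. This is a minimal illustration and not the pipeline's actual implementation: the function names and the toy count vectors are hypothetical, Shannon is computed with the natural logarithm, Chao1 uses the common bias-corrected form, and UniFrac is left out because it needs a phylogenetic tree.

```python
import math
from typing import Sequence


def shannon(counts: Sequence[int]) -> float:
    """Shannon index H' = -sum(p_i * ln(p_i)) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts: Sequence[int]) -> float:
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of taxa observed exactly once and exactly twice.
    """
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def bray_curtis(a: Sequence[float], b: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / (sum(a) + sum(b))."""
    denominator = sum(a) + sum(b)
    if denominator == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


def jaccard(a: Sequence[float], b: Sequence[float]) -> float:
    """Jaccard distance on presence/absence: 1 - |A and B| / |A or B|."""
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, x in enumerate(b) if x > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1 - len(present_a & present_b) / len(union)


# Toy per-taxon read counts for two samples (hypothetical values,
# same taxon order in both vectors).
sample_a = [10, 5, 1, 1, 2, 0]
sample_b = [0, 5, 3, 0, 2, 6]

if __name__ == "__main__":
    print("Shannon A:", shannon(sample_a))
    print("Chao1 A:", chao1(sample_a))
    print("Bray-Curtis A vs B:", bray_curtis(sample_a, sample_b))
    print("Jaccard A vs B:", jaccard(sample_a, sample_b))
```

Computing the pairwise distances for all samples gives the distance matrix that ordination (PCoA) and clustering (UPGMA) steps take as input.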

How It Works

Table 1. Partial list of software and databases

Software or database | Uses | Link
QIIME 2 | Species annotation and taxonomic analysis |
MetaPhlAn2 | Taxonomy profiling |
HUMAnN2 | Functional profiling |
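
As an illustration of how a taxonomy profile might be consumed downstream, the sketch below extracts species-level relative abundances from a MetaPhlAn2-style table. It assumes the two-column, tab-separated layout (clade lineage, relative abundance in percent) with `#` comment lines; newer MetaPhlAn versions use additional columns, so the column index would need adjusting. The function name and the example profile values are hypothetical.

```python
from typing import Dict

# Hypothetical example profile in the assumed two-column MetaPhlAn2 layout.
EXAMPLE_PROFILE = (
    "#SampleID\tMetaphlan2_Analysis\n"
    "k__Bacteria\t100.0\n"
    "k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Enterobacteriales|"
    "f__Enterobacteriaceae|g__Escherichia|s__Escherichia_coli\t60.0\n"
    "k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales|"
    "f__Bacteroidaceae|g__Bacteroides|s__Bacteroides_fragilis\t40.0\n"
)


def species_abundances(profile_text: str) -> Dict[str, float]:
    """Return {species name: relative abundance} for species-level rows only."""
    abundances: Dict[str, float] = {}
    for line in profile_text.splitlines():
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        # The deepest rank is the last element of the pipe-separated lineage.
        deepest_rank = fields[0].split("|")[-1]
        # Keep rows whose deepest rank is a species (prefix "s__");
        # strain-level rows end in "t__..." and are skipped.
        if deepest_rank.startswith("s__"):
            abundances[deepest_rank[len("s__"):]] = float(fields[1])
    return abundances


if __name__ == "__main__":
    for species, abundance in species_abundances(EXAMPLE_PROFILE).items():
        print(species, abundance)
```

A dictionary like this, built per sample, is one simple starting point for a taxonomy distribution histogram across samples.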


  1. Bharti, R., & Grimm, D. G. (2021). Current challenges and best-practice protocols for microbiome analysis. Brief Bioinform, 22(1), 178-193. doi:10.1093/bib/bbz155
