Long-read Sequencing Data Analysis

The advent of long-read sequencing technology has broadened the applications of genomics and transcriptomics by overcoming many of the limitations of short read lengths. However, even the most advanced short-read data processing pipelines cannot be fully adapted to the needs of long-read sequencing data analysis. CD Genomics combines sequencing data analysis with multi-omics expertise: we have developed a standardized long-read sequencing bioinformatics pipeline, and we offer advanced data mining and integration services for your genomic, transcriptomic, and epigenomic data to support discoveries that were not possible before.

What Do We Analyze?


Current long-read sequencing platforms (PacBio SMRT and Oxford Nanopore sequencing) have overcome earlier limitations in accuracy and throughput and typically produce reads exceeding 10 kb. These reads enable improved de novo assembly, identification of structural variants and transcript isoforms, and phasing of whole human genomes to identify co-inherited alleles, haplotype information, and de novo mutations. At the same time, they place greater demands on the tools and pipelines used to analyze long-read data.

In particular, the two mainstream long-read technologies, PacBio SMRT and Oxford Nanopore Technologies (ONT) sequencing, produce data that differ from those of next-generation (short-read) sequencing. What short-read and long-read sequencing have in common is that both generate large amounts of data, are computationally intensive, and involve complex processing. Beyond the standard analysis workflow, different research purposes and projects often require their own combinations of data processing steps and tools, which further increases the difficulty and computational cost of data analysis.

Overview of long-read sequencing data analysis tools and pipelines (Amarasinghe S L et al., 2020)

Long-Read Seq Data Analysis Services

PacBio SMRT and Nanopore sequencing are based on fundamentally different principles, so different data analysis tools have been developed for each.

CD Genomics has developed custom solutions for long-read sequencing data analysis. We provide bioinformatics analysis services that combine short-read and long-read sequencing data to suit different needs. We can design and run complex bioinformatics pipelines that help you extract informative features from massive datasets, advance your project, and plan your next experimental step.

We provide primary, secondary, and tertiary sequencing data analysis, from the evaluation of raw sequencing data through alignment to in-depth analysis and biological interpretation.

Our long-read sequencing data analysis services include, but are not limited to:

Primary Analysis
  • Basecalling
  • Quality control (FastQC / PycoQC / MinIONQC)
  • Read filtering / trimming / adapter removal

Secondary Analysis
  • Genome assembly
  • Consensus sequences and error correction
  • Variant calling

Tertiary Analysis
  • Structural variant analysis
  • SNP/CNV analysis
  • Gene regulation analysis
  • Alternative splicing analysis
  • Base modification analysis
  • Gene Ontology enrichment analysis
  • Pathway enrichment analysis
  • Protein-protein interaction network
  • Co-expression network analysis
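
To make the primary analysis stage more concrete, the following is a minimal Python sketch of read filtering and a basic read-length summary on FASTQ input. It is an illustration only, not the CD Genomics pipeline: the function names (parse_fastq, mean_phred, filter_reads, n50) and the default thresholds (1,000 bp minimum length, mean quality 7) are assumptions chosen for the example, and it assumes uncompressed four-line FASTQ records with Phred+33 quality encoding.

```python
import math


def parse_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate stray blank lines between records
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().strip()
        yield header.strip()[1:], seq, qual


def mean_phred(qual, offset=33):
    """Mean read quality, averaged over error probabilities (not raw Q scores)."""
    if not qual:
        return 0.0
    probs = [10 ** (-(ord(c) - offset) / 10) for c in qual]
    return -10 * math.log10(sum(probs) / len(probs))


def filter_reads(records, min_length=1000, min_quality=7.0):
    """Keep reads that meet both thresholds (hypothetical example defaults)."""
    for name, seq, qual in records:
        if len(seq) >= min_length and mean_phred(qual) >= min_quality:
            yield name, seq, qual


def n50(lengths):
    """Length L such that reads of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0
```

A typical use would be to open a FASTQ file, pass the handle through parse_fastq and filter_reads, and then call n50 on the lengths of the retained reads. Averaging over error probabilities rather than over raw Q scores is a design choice here: it prevents a few high-quality bases from masking many low-quality ones.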

Advantages of CD Genomics

  • Fast and accurate bioinformatics analysis covering both short-read and long-read sequencing data
  • Experience in processing highly complex multi-omics datasets
  • Beyond standard pipelines, the ability to develop custom pipelines and propose analysis approaches suited to your purposes and needs
  • Continuous, real-time communication for prompt troubleshooting and problem solving
  • Advanced analysis and visualization of individual results on demand
  • Strict confidentiality procedures to ensure the security and privacy of your data and results

Explore Our Bioinformatics Services

Genomics Data Analysis

Transcriptomics Data Analysis

Microbial Genomics and Bioinformatics

Epigenetics Data Analysis

Comparative Genomics Analysis

NGS Data Analysis

How It Works

CD Genomics provides accurate and cost-effective bioinformatics analysis services. We have a proven analytical pipeline for database mining and analysis, and we can handle many file types, including FASTQ, BED, SAM, BAM, and VCF.

Even if you don't have data yet, we can help you plan your research, provide expert experimental advice, and schedule your data generation.

If you have any questions about our bioinformatics services, such as how we support your research and prepare your reports, please contact us for more detailed information.


  1. Amarasinghe S L, Su S, Dong X, et al. Opportunities and challenges in long-read sequencing data analysis. Genome Biology, 2020, 21(1): 1-16.
* For Research Use Only. Not for use in diagnostic procedures.