As a bioinformatics data analysis provider, CD Genomics has extensive experience in Nanopore cDNA & Iso-seq Full-length Transcriptome Analysis. Our data analysis platform delivers high-quality results with a short turnaround time.

Introduction

Next-generation sequencing (NGS) reads are much shorter than mature messenger RNAs (mRNAs), so they often do not provide enough information to accurately infer the structure and relative abundance of complete transcripts.

Nanopore sequencing from Oxford Nanopore Technologies (ONT) and single-molecule real-time (SMRT) long-read isoform sequencing (Iso-Seq) from Pacific Biosciences (PacBio) are revolutionizing the way transcriptomes are analyzed. These technologies can sequence transcripts at full length, which allows accurate identification of isoforms and analysis of alternative splicing, fusion genes, allele-specific expression, and other regulatory events (Logsdon, Vollger, & Eichler, 2020).

Fig.1 Overview of long-read sequencing technologies (Logsdon et al., 2020).

Application Field

Agronomy
Complete genome/transcriptome information: complex gene structure, transcript structure and function analysis.
Differential/dynamic transcriptome: transcriptome changes under stress treatment or across time points, animal and plant domestication mechanisms.
Comparative transcriptome: relatedness between closely related species, mRNA sequence differences.
Medical
Elucidation of gene fusions: role of fusion genes in cancer pathogenesis.
Complex gene transcript studies: complex splice forms of genes involved in disease or cancer.
Quantitative transcriptome: gene/transcript level quantification, mining condition-specific genes/transcripts.

CD Genomics Data Analysis Pipeline

Fig.2 Nanopore cDNA & Iso-seq Full-length Transcriptome bioinformatics workflow.

Bioinformatics Analysis Content

Data quality control
Transcriptome mapping and transcript quantification
Full-Length Transcript Identification
Identification of complex alternative splicing (AS)
Fusion transcripts
Alternative polyadenylation (APA) events
New Gene Analysis
Long non-coding RNA (lncRNA) discovery
Gene Function Annotation
SSR Analysis
Transcription Factor Analysis

How It Works

CD Genomics is a high-tech company specializing in multiomic data analysis. We provide services such as project design, data analysis, and database construction. With a focus on developing breakthrough products and services, we are a pioneer in the biotechnology industry, serving researchers and partners worldwide.

Table 1 Partial software and database list

Software or database	Version	Use	Link
NanoFilt	2.8.0	TGS data filtering	https://github.com/wdecoster/nanofilt
minimap2	2.17	Mapping	https://github.com/lh3/minimap2
samtools	1.11	Sorting	https://github.com/samtools/samtools
seqkit	0.12.0	FASTA/Q tool	https://github.com/shenwei356/seqkit
StringTie	2.1.4	Reconstruct transcripts	http://ccb.jhu.edu/software/stringtie/
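
The tools in Table 1 could be chained roughly as follows. This is a minimal sketch, not the actual CD Genomics pipeline: the function name build_pipeline_commands, the file names, the output directory, and the filtering thresholds (-q 7, -l 500) are assumptions made for illustration. The function only builds the command strings so that they can be reviewed before anything is run.

```python
def build_pipeline_commands(reads_fastq, genome_fasta, annotation_gtf,
                            out_dir="results", threads=4, platform="ont"):
    """Return one shell command per step, without executing anything.

    All file names and thresholds are illustrative assumptions.
    """
    filtered = f"{out_dir}/filtered.fastq"
    sam = f"{out_dir}/aligned.sam"
    bam = f"{out_dir}/aligned.sorted.bam"
    gtf = f"{out_dir}/transcripts.gtf"

    # minimap2 presets for spliced long-read alignment:
    # "splice" for Nanopore cDNA, "splice:hq" for PacBio Iso-Seq/CCS reads.
    preset = "splice:hq" if platform == "pacbio" else "splice"

    return [
        # NanoFilt reads FASTQ on stdin and writes passing reads to stdout.
        f"NanoFilt -q 7 -l 500 < {reads_fastq} > {filtered}",
        # Summary statistics (read count, length distribution) after filtering.
        f"seqkit stats {filtered}",
        # Splice-aware alignment to the reference genome, SAM output.
        f"minimap2 -ax {preset} -t {threads} {genome_fasta} {filtered} > {sam}",
        # Coordinate-sort and index the alignments.
        f"samtools sort -@ {threads} -o {bam} {sam}",
        f"samtools index {bam}",
        # Reference-guided transcript reconstruction in long-read mode (-L).
        f"stringtie -L -p {threads} -G {annotation_gtf} -o {gtf} {bam}",
    ]


if __name__ == "__main__":
    for command in build_pipeline_commands("reads.fastq", "genome.fa", "genes.gtf"):
        print(command)
```

Printing the commands first makes it easy to adjust thresholds and presets for a given dataset before passing them to a shell or a workflow manager.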

1. Does full-length transcriptome sequencing require fragmentation and assembly?

Full-length transcriptome sequencing is based on the PacBio or Nanopore third-generation sequencing platforms, which directly capture complete transcripts, including the 5' and 3' UTRs and the poly(A) tail, without fragmentation or assembly. For species with a reference genome, this allows accurate analysis of structural information such as alternative splicing and fusion genes. For species without a reference genome, it overcomes the short, incomplete transcripts that result from short-read assembly.

2. How accurate is Nanopore cDNA & Iso-seq Full-length Transcriptome sequencing?

Base errors in third-generation sequencing (TGS) are not systematically biased. On the PacBio platform, full-length transcriptomes are sequenced in circular consensus sequencing (CCS) mode, and the resulting reads are self-corrected by the CCS algorithm. In addition, second-generation (short-read) data is added to the analysis as an auxiliary correction step to further improve base accuracy. On the Nanopore platform, correcting reads against the reference genome also effectively improves base accuracy.
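
The idea behind consensus self-correction can be illustrated with a toy example. This is a simplified sketch and not the actual CCS algorithm: it assumes that the repeated passes over one molecule are already aligned and of equal length, and it takes a plain per-position majority vote. The function name majority_consensus and the sequences are made up for illustration.

```python
from collections import Counter


def majority_consensus(passes):
    """Per-position majority vote over pre-aligned, equal-length passes.

    Toy illustration only: real consensus calling must also handle
    insertions, deletions, and per-base quality values.
    """
    if not passes:
        raise ValueError("at least one pass is required")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("this toy example requires equal-length passes")

    consensus = []
    for column in zip(*passes):
        # most_common(1) returns the most frequent base in this column;
        # ties resolve to the base that appears first in the column.
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


if __name__ == "__main__":
    # Three passes over the same molecule, each with one error at a
    # different position (hypothetical data).
    passes = ["ACCTACGTAC", "ACGTACGAAC", "TCGTACGTAC"]
    print(majority_consensus(passes))
```

Because the errors fall at different positions in each pass, every column still has a correct majority, which is why unbiased errors can be averaged out by repeated sequencing of the same molecule.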