As a bioinformatics data analysis provider, CD Genomics has extensive experience in Nanopore cDNA and Iso-Seq full-length transcriptome analysis. Our data analysis platform delivers high-quality results with a short turnaround time.
Next-generation sequencing (NGS) reads are much shorter than mature messenger RNAs (mRNAs) and often do not provide enough information to accurately infer the structure and relative abundance of complete transcripts.
Nanopore sequencing from Oxford Nanopore Technologies (ONT) and single-molecule real-time (SMRT) long-read isoform sequencing (Iso-Seq) from Pacific Biosciences (PacBio) are revolutionizing the way transcriptomes are analyzed. They can directly read full-length transcripts, accurately identify isoforms, and characterize alternative splicing, fusion genes, allele-specific expression, and other regulatory events (Logsdon, Vollger, & Eichler, 2020).
Fig.1 Overview of long-read sequencing technologies (Logsdon et al., 2020).
Complete genome/transcriptome information: complex gene structure, transcript structure and function analysis.
Differential/dynamic transcriptome: transcriptome changes under stress treatment or across time points, animal and plant domestication mechanisms.
Comparative transcriptomics: relatedness between closely related species, mRNA sequence differences.
Elucidation of fusion events: role of fusion genes in cancer pathogenesis.
Complex gene transcript studies: complex splice forms of genes involved in disease or cancer.
Quantitative transcriptome: gene/transcript level quantification, mining of condition-specific genes/transcripts.
CD Genomics Data Analysis Pipeline
Fig.2 Nanopore cDNA & Iso-seq Full-length Transcriptome bioinformatics workflow.
Bioinformatics Analysis Content
- Data quality control
- Transcriptome mapping and transcript quantification
- Full-Length Transcript Identification
- Identification of complex alternative splicing (AS)
- Fusion transcripts
- Alternative polyadenylation (APA) events
- New Gene Analysis
- Long non-coding RNA (lncRNA) discovery
- Gene Function Annotation
- SSR Analysis
- Transcription Factor Analysis
How It Works
CD Genomics is a high-tech company specializing in multiomic data analysis. We provide services such as project design, data analysis, and database construction. With a focus on developing breakthrough products and services, we are a pioneer in the biotechnology industry, serving researchers and partners worldwide.
Table 1 Partial software and database list
| Software or database | Version | Use | Link |
| --- | --- | --- | --- |
| NanoFilt | 2.8.0 | TGS data filtering | https://github.com/wdecoster/nanofilt |
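As a rough illustration of what a long-read filtering step does, the sketch below keeps FASTQ records that meet a minimum length and a minimum mean quality. It is not NanoFilt's implementation: the function names (`read_fastq`, `mean_quality`, `filter_reads`) and the default thresholds are hypothetical, and averaging quality in error-probability space is an assumption made for this example.

```python
import io
import math
from typing import Iterable, Iterator, TextIO, Tuple

Record = Tuple[str, str, str]  # (header, sequence, quality string)


def read_fastq(handle: TextIO) -> Iterator[Record]:
    """Yield records from a plain four-line-per-read FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def mean_quality(qual: str, offset: int = 33) -> float:
    """Mean Phred score, averaged over per-base error probabilities."""
    if not qual:
        return 0.0
    probs = [10 ** (-(ord(c) - offset) / 10) for c in qual]
    return -10 * math.log10(sum(probs) / len(probs))


def filter_reads(
    records: Iterable[Record],
    min_len: int = 500,   # hypothetical threshold
    min_q: float = 7.0,   # hypothetical threshold
) -> Iterator[Record]:
    """Keep reads that are long enough and of sufficient mean quality."""
    for header, seq, qual in records:
        if len(seq) >= min_len and mean_quality(qual) >= min_q:
            yield header, seq, qual


if __name__ == "__main__":
    # Two toy reads: one high quality (Q40), one low quality (Q2).
    demo = "@good\nACGTAC\n+\nIIIIII\n@bad\nACGTAC\n+\n######\n"
    for header, seq, _ in filter_reads(read_fastq(io.StringIO(demo)), min_len=5):
        print(header, len(seq))
```

Averaging error probabilities rather than raw Phred scores keeps a few very poor bases from being masked by many good ones. Real data would be read from a (possibly gzip-compressed) file rather than an in-memory string.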
1. Does full-length transcriptome sequencing require fragmentation and assembly?
Full-length transcriptome sequencing is based on the PacBio or Nanopore third-generation sequencing platform, which can directly read complete transcripts, including the 5' and 3' UTRs and the poly(A) tail, without fragmentation or assembly. For species with a reference genome, this allows accurate analysis of structural information such as alternative splicing and fusion genes. For species without a reference genome, it overcomes the short assembled sequences and incomplete transcript information that come with short-read assembly.
2. How accurate is Nanopore cDNA & Iso-Seq full-length transcriptome sequencing?
Base errors in third-generation sequencing (TGS) data are not systematically biased. On the PacBio platform, full-length transcriptomes are sequenced in circular consensus sequencing (CCS) mode, and the resulting reads are self-corrected by the CCS algorithm. In addition, correction with second-generation (short-read) data is included in the analysis workflow to further improve base accuracy. On the Nanopore platform, correcting sequences against the reference genome also effectively improves base accuracy.
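To see why unbiased errors lend themselves to self-correction, consider a deliberately simplified model. This model is an assumption for illustration and not the actual CCS algorithm. Each pass over a molecule miscalls a given base independently with probability `p`, and the consensus is counted as wrong whenever more than half of the passes are wrong. The function name `majority_error` is hypothetical.

```python
from math import comb


def majority_error(p: float, passes: int) -> float:
    """Probability that a strict majority of independent passes miscall a base.

    Simplified model: errors are independent across passes, each with
    probability p. An odd number of passes is required to avoid ties.
    """
    if passes < 1 or passes % 2 == 0:
        raise ValueError("passes must be a positive odd integer")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    k_min = passes // 2 + 1  # smallest number of wrong passes forming a majority
    return sum(
        comb(passes, k) * p**k * (1 - p) ** (passes - k)
        for k in range(k_min, passes + 1)
    )


if __name__ == "__main__":
    # Illustrative per-pass error rate; not a measured platform value.
    for n in (1, 3, 5, 9):
        print(n, majority_error(0.10, n))
```

Under this model the consensus error falls quickly as the number of passes grows. The estimate is pessimistic, since it treats any majority of wrong calls as a consensus error even when those calls disagree with one another.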