Genome-wide Assembly and Annotation Service

As one of the most competitive service providers in sequencing and data analysis, CD Genomics uses bioinformatics to help you explore genetic information and decipher the essence of life. Our skills and experience in data analysis allow us to meet customers' individual data analysis needs.

Drawing on hundreds of successfully delivered high-quality genome assemblies, CD Genomics applies its experience to maximize assembly quality. We offer a complete workflow, from raw data to publishable reports.

If you want to build a complete de novo assembly, we can also meet your needs. We have extensive experience in data processing for next-generation sequencing (NGS) and long-read sequencing systems (PacBio SMRT and Nanopore sequencing).

What Do We Analyze?

We Can Help Our Clients With

Studying virulence and bacterial and fungal pathogenesis.

Annotating the genome to construct a genetic map.

Revealing the different pathways at work in an organism.

Gaining insight into the process of evolution.

Verifying observations from experiments.

Understanding structural differences and complex rearrangements such as insertions, inversions, translocations, and copy number variants.

Assessing levels of alternative splicing and gene expression.

CD Genomics Genome-wide Assembly and Annotation Pipeline

What We Offer

Gene Prediction: Identifying the regions of genomic DNA that encode genes
Non-redundant Gene Catalog: Supporting a more thorough understanding of the functions of microorganisms
KEGG: Functional annotation for understanding the high-level functions and utilities of the biological system
CAZy: Functional annotation with sequence-based family classification, linking sequence to the specificity and 3D structure of carbohydrate-active enzymes
Taxonomic Annotation: Comparison of predicted genes against existing, previously annotated sequences

How It Works

CD Genomics is a high-tech company specializing in multi-omics data analysis. We provide services such as project design, data analysis, and database construction. With a focus on developing breakthrough products and services, we serve researchers and partners worldwide as a pioneer in the biotechnology industry.

Introduction of Genome-wide Assembly and Annotation

Recent advances in next-generation sequencing (NGS) have significantly reduced the cost of generating short reads for the genomes of new species. Thanks to recent developments in the bioinformatics tools used for sequencing data, even small-scale labs can now perform analyses such as preprocessing, de novo assembly, gene prediction, and functional analysis.

While sequencing itself has become simpler, genomic analysis has become more difficult and complex. Several factors are responsible. First, NGS methods generate short reads; when these reads are used for de novo assembly, the accuracy of the assembled sequence typically drops to that of a draft genome. Second, newly sequenced genomes have no existing gene models to serve as a reference, so it is hard to verify the accuracy of the annotation. Third, different research groups annotate the same genome using different analysis tools and annotation methods, so all the results must be combined to produce a reliable consensus annotation. Fourth, small-scale genomic analysis is often carried out by scientists with little expertise in bioinformatics and computational biology. Although small-scale genomic analysis is now within reach of non-experts, it remains a difficult task.

Application of Genome-wide Assembly and Annotation

Genome-wide assembly and annotation can be applied to both short-read and long-read sequencing data. Short reads are high-quality, cost-effective, and offer deep sequencing coverage, but they tend to show coverage bias in regions of high AT or GC content. Most of these high AT/GC regions are repeats and low-complexity regions. In such regions, short read lengths and biased coverage result in fragmented genome assemblies, which nevertheless offer a preliminary yet valuable overview of an organism's genetic composition.
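To illustrate how regions of extreme AT or GC content might be located before assembly, here is a minimal Python sketch that scans a sequence in fixed windows. The function names, the window size, and the 25%/75% thresholds are illustrative assumptions, not part of any CD Genomics pipeline; suitable cut-offs depend on the organism and the sequencing platform.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G/C bases among unambiguous A/C/G/T bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Sequences with no unambiguous bases (e.g. all N) are reported as 0.0.
    return gc / total if total else 0.0


def windowed_gc(seq: str, window: int = 100, step: int = 100):
    """Yield (start, gc_fraction) for each full window along the sequence."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_content(seq[start:start + window])


def flag_extreme_windows(seq: str, window: int = 100, step: int = 100,
                         low: float = 0.25, high: float = 0.75):
    """Return start positions of windows whose GC fraction is outside [low, high].

    The default thresholds are hypothetical values chosen for illustration.
    """
    return [start for start, gc in windowed_gc(seq, window, step)
            if gc < low or gc > high]


if __name__ == "__main__":
    # Toy sequence: an AT-only block, a balanced block, and a GC-only block.
    toy = "AT" * 50 + "ACGT" * 25 + "GC" * 50
    for start, gc in windowed_gc(toy):
        print(f"window at {start}: GC = {gc:.2f}")
    print("flagged windows:", flag_extreme_windows(toy))
```

Windows flagged this way are candidates for the coverage bias described above and may deserve closer inspection in the assembled contigs.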

Long reads average more than 10 kb in length, but their quality is lower because of random errors. Long-read sequencing requires high-molecular-weight starting DNA, which sometimes demands expertise in sample extraction. Generally, compared with short-read assemblies, long-read assemblies have better contiguity, larger N50 values, and higher genome coverage. However, long-read assemblies require polishing with short reads to correct random base-calling errors. Long-read assembly uses the OLC (Overlap Layout Consensus) assembly method.
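The N50 value mentioned above is a standard contiguity statistic: the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length. The following Python sketch shows one way to compute it; the function name and the example contig lengths are made up for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (0 for empty input)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the total is the N50 contig.
        if running >= half_total:
            return length
    return 0


if __name__ == "__main__":
    # Hypothetical contig lengths (in kb) for two assemblies of the same total size.
    fragmented = [10] * 10
    contiguous = [80, 20]
    print("fragmented N50:", n50(fragmented))
    print("contiguous N50:", n50(contiguous))
```

Both toy assemblies have the same total length, but the one made of fewer, longer contigs yields the larger N50, which is why the statistic is commonly used to compare short-read and long-read assemblies.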

Figure 1. Comparison between short-read assembly and long-read assembly. (Lee, 2014)


  1. Del Angel VD, Hjerde E, Sterck L, et al. Ten steps to get started in Genome Assembly and Annotation. F1000Research. 2018;7.
  2. Lee H, Gurtowski J, Yoo S, et al. Error correction and assembly complexity of single molecule sequencing reads. bioRxiv. 2014.
* For Research Use Only. Not for use in diagnostic procedures.