Bioinformatics Basics: The Comparison Between RRBS Data Analysis and WGBS Data Analysis

Introduction to Bisulfite Sequencing

Because promoter and gene body methylation affect gene expression, NGS-based DNA methylation assessment examines genomic DNA to determine whether individual cytosines or whole regions of the genome are methylated. NGS can be used to assess DNA methylation in a variety of ways. The most straightforward is to incorporate the bisulfite reaction into the sequencing workflow and perform Whole-Genome Bisulfite Sequencing (WGBS). Even so, adequate read depth is required to evaluate methylation status reliably.

Alternatively, DNA methylation detection can be restricted to a subset of the genome, reducing the scale of the experiment and, as a result, the cost. Reduced Representation Bisulfite Sequencing (RRBS) is a well-known method for doing so. RRBS is a technique for studying DNA methylation at single-nucleotide resolution on a genome-wide level. It is a variant of whole-genome bisulfite sequencing that uses restriction enzyme digestion and DNA size selection to narrow the scope of the study to a subset of the genome where a large proportion of DNA methylation occurs. By concentrating on this part of the genome, RRBS generates genome-wide DNA methylation data at a lower cost than WGBS.
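The digestion and size-selection idea behind RRBS can be illustrated with a small in silico sketch. The text above does not name the enzyme or the size window, so the choices here are assumptions: MspI (which cuts C^CGG) is used because it is the enzyme commonly associated with RRBS, and the 40 to 220 bp window is only an illustrative default. The function name `mspi_fragments` is hypothetical.

```python
import re


def mspi_fragments(sequence, min_len=40, max_len=220):
    """Return size-selected fragments flanked by MspI sites on both ends.

    Assumptions (not from the article): MspI as the enzyme and a
    40-220 bp size window. Both are illustrative defaults.
    """
    seq = sequence.upper()
    # MspI cuts between the two cytosines of CCGG, i.e. one base after
    # the start of each site. The lookahead also finds overlapping sites.
    cuts = [m.start() + 1 for m in re.finditer(r"(?=CCGG)", seq)]
    # Only fragments between two consecutive cut sites are kept, so the
    # two ends of the input sequence are discarded.
    fragments = [seq[a:b] for a, b in zip(cuts, cuts[1:])]
    return [f for f in fragments if min_len <= len(f) <= max_len]


if __name__ == "__main__":
    toy = "AAAA" + "CCGG" + "T" * 50 + "CCGG" + "A" * 300
    for frag in mspi_fragments(toy):
        print(len(frag), frag[:10] + "...")
```

Every retained fragment starts with CGG and ends with C, which is why each sequenced fragment end carries a CpG site.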

Figure 1. Flowchart of data analysis for WGBS and RRBS. (Liang, 2014)

Comparison between RRBS Data Analysis and WGBS Data Analysis

A direct comparison of RRBS and WGBS was conducted to evaluate the value of both methods in research. The RRBS dataset had a higher mapping efficiency than the WGBS dataset, but a lower overall methylation rate. Differences in mapping efficiency were expected: RRBS libraries are designed to cover a larger proportion of promoters and genes, whereas the unbiased nature of WGBS means that many more reads come from poorly assembled non-coding DNA, which can contain long stretches of repeats.

WGBS data analysis involves three primary steps, followed by downstream analysis. First, sequencing reads must be preprocessed. Second, reads are mapped to a reference genome in a way that tolerates the differences between reads and reference sequence caused by bisulfite conversion. This can be accomplished by aligning to a three-base genome in which all cytosines have been replaced by thymines. Third, DNA methylation levels are quantified across the genome using the reads mapped to each cytosine. Finally, more in-depth analysis tailored to the biological question at hand is required, which usually entails identifying regions of differential DNA methylation between samples or between areas of the genome.
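The three-base mapping idea can be sketched in a few lines. This is a deliberately simplified illustration, not how a production bisulfite aligner works: it assumes a forward-strand read, exact matching with no mismatches or indels, and a tiny reference held in memory. The function names `c_to_t`, `find_read`, and `call_methylation` are hypothetical.

```python
def c_to_t(seq):
    """Convert a sequence to three-base space by replacing every C with T."""
    return seq.upper().replace("C", "T")


def find_read(read, reference):
    """Locate a read by exact match in three-base space (-1 if not found)."""
    return c_to_t(reference).find(c_to_t(read))


def call_methylation(read, reference):
    """Return {reference position: True/False} for each covered cytosine.

    True means the cytosine survived conversion (methylated);
    False means it reads as T (unmethylated).
    """
    reference = reference.upper()
    read = read.upper()
    pos = find_read(read, reference)
    if pos < 0:
        return None
    calls = {}
    # Compare the ORIGINAL read against the ORIGINAL reference at the
    # position found in three-base space.
    for offset, (ref_base, read_base) in enumerate(zip(reference[pos:], read)):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[pos + offset] = True
        elif read_base == "T":
            calls[pos + offset] = False
    return calls


if __name__ == "__main__":
    reference = "GATTACGCATCGGA"
    read = "ACGTATTG"  # toy read: C at position 5 kept, C at 7 and 10 converted
    print(call_methylation(read, reference))
```

Mapping in three-base space removes the C/T mismatches introduced by conversion, and the methylation state is then recovered by comparing the unconverted read and reference at the mapped position.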

RRBS data analysis, on the other hand, has four primary steps. The first is data pre-processing, which includes data quality control, statistics, adapter removal, and removal of low-quality reads. The next is mapping to the reference genome, which includes annotation assessment, evaluation of the mapping rate and sequencing depth, and read distribution statistics. The third is methylation site calling, which entails methylation site assessment, methylation level calculation, methylation assessment, and classification of differentially methylated regions (DMRs). Finally, a differential methylation analysis is carried out.
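The methylation level calculation and a naive per-site comparison between two samples might look like the sketch below. It assumes the level at a cytosine is the fraction of reads supporting methylation, and it flags sites using simple depth and difference thresholds. The threshold values, the input layout, and the function names are assumptions made for illustration; a real differential analysis would normally use a statistical test and group neighbouring sites into DMRs.

```python
def methylation_level(methylated, unmethylated):
    """Fraction of reads at a cytosine that support methylation."""
    total = methylated + unmethylated
    return methylated / total if total else None


def differential_sites(sample_a, sample_b, min_depth=10, min_diff=0.25):
    """Flag sites whose methylation level differs between two samples.

    Each sample maps position -> (methylated_reads, unmethylated_reads).
    min_depth and min_diff are illustrative thresholds, not recommendations.
    """
    hits = {}
    # Only positions covered in both samples can be compared.
    for pos in sorted(sample_a.keys() & sample_b.keys()):
        (meth_a, unmeth_a), (meth_b, unmeth_b) = sample_a[pos], sample_b[pos]
        if meth_a + unmeth_a < min_depth or meth_b + unmeth_b < min_depth:
            continue  # too few reads for a trustworthy estimate
        diff = (methylation_level(meth_a, unmeth_a)
                - methylation_level(meth_b, unmeth_b))
        if abs(diff) >= min_diff:
            hits[pos] = diff
    return hits


if __name__ == "__main__":
    sample_a = {100: (18, 2), 200: (5, 5), 300: (3, 1)}
    sample_b = {100: (4, 16), 200: (6, 4), 300: (0, 4)}
    print(differential_sites(sample_a, sample_b))
```

In this toy input, only position 100 passes both thresholds: position 200 differs too little, and position 300 is covered too thinly.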

