A widely used technique for studying chromatin accessibility is the assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-Seq). The method relies on the Tn5 transposase, which readily enters open chromatin, where it cuts the DNA and inserts sequencing adapters, whereas tightly packed chromatin is largely protected from cutting. The fragments tagged by Tn5 in the open regions are then captured and sequenced.
Figure 1. The basic principle of ATAC-Seq (Buenrostro 2015)
The highly active Tn5 transposase inserts sequencing adapters into the open regions of chromatin, and high-throughput sequencing is then used to read out those regions for the sample under the given conditions. Bioinformatic analysis of the data can be used to infer which chromatin regions are accessible, to catalog active transcriptional regulatory sequences across the entire genome, and to evaluate transcription factor binding, nucleosome positioning, and the distribution of transcriptional regulatory elements. Compared with earlier accessibility assays such as DNase-Seq, ATAC-Seq requires fewer cells and is easier to perform, while still detecting the open state of chromatin across the entire genome.
ATAC-Seq analysis is used to investigate several chromatin-accessibility signatures. Nucleosome mapping is the most common application, but the assay can also be used to map transcription factor binding sites, adapted to map DNA methylation sites, or combined with other sequencing techniques. High-resolution enhancer mapping has a variety of uses, including studying the evolutionary divergence of enhancer usage during development and building a lineage-specific enhancer map of blood cell differentiation. ATAC-Seq has also been used to map the genome-wide chromatin accessibility landscape in human cancers and to reveal an overall reduction in chromatin accessibility in macular degeneration.
Because the transposase acts preferentially in regions of the genome where the DNA was accessible during the experiment, those regions yield considerably more sequencing reads and form peaks in the ATAC-Seq signal that can be detected with peak-calling tools. Using additional information, such as the distance to a transcription start site (TSS) or data from other studies, these regions can be further classified into the various types of regulatory elements: promoters, enhancers, insulators, and so on. Within the enriched regions there are also sub-regions where the ATAC-Seq signal is depleted. These small sub-regions, usually only a few base pairs long, are the "footprints" of DNA-binding proteins. The bound proteins protect the DNA from transposase cleavage, which depletes the signal at those positions.
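As a rough illustration of how a peak might be classified by its distance to a TSS, the following minimal Python sketch labels each peak as promoter-proximal or distal. The function names, the toy coordinates, and the 1,000 bp window are all assumptions made for illustration; real annotation tools use gene models and more categories.

```python
from bisect import bisect_left

def nearest_tss_distance(peak_start, peak_end, tss_positions):
    """Distance (bp) from the peak midpoint to the nearest TSS.

    tss_positions must be a non-empty, sorted list of coordinates
    on the same chromosome as the peak.
    """
    mid = (peak_start + peak_end) // 2
    i = bisect_left(tss_positions, mid)
    # Only the TSS just left and just right of the midpoint can be nearest.
    candidates = tss_positions[max(i - 1, 0):i + 1]
    return min(abs(mid - t) for t in candidates)

def classify_peak(peak_start, peak_end, tss_positions, promoter_window=1000):
    """Label a peak by proximity to a TSS (the window size is an assumed cutoff)."""
    distance = nearest_tss_distance(peak_start, peak_end, tss_positions)
    return "promoter-proximal" if distance <= promoter_window else "distal"

# Toy coordinates, invented for the example.
tss = [5_000, 20_000, 48_000]
peaks = [(4_800, 5_300), (30_000, 30_400), (47_000, 47_500)]
labels = [classify_peak(start, end, tss) for start, end in peaks]
print(labels)
```

Distal peaks found this way are only candidates for enhancers or insulators; separating those classes needs the additional data mentioned above.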
The bioinformatic analysis of ATAC-Seq data sets is divided into four parts: (1) data pre-processing, (2) mapping to the reference genome, (3) peak calling, and (4) differential peak analysis.
Data pre-processing: This step includes data quality control, data statistics, adapter removal, and removal of low-quality reads.
Mapping to reference genome: This step reports the alignment rate, the unique alignment rate, the sequencing depth, and read distribution statistics.
Peak calling: This is the most essential part of the analysis. It covers the identification of nucleosome and transcription factor binding sites, the positions of peaks on the chromosomes, the peak length distribution, and the distribution of peak gene annotations.
Differential peak analysis: This final step covers differential peak detection, gene annotation of the differential peaks, and GO enrichment and KEGG pathway enrichment analysis of the associated genes.
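The quantities produced by the mapping, peak calling, and differential analysis steps can be sketched with a toy Python example. This is not how production pipelines work: real analyses rely on dedicated aligners, peak callers, and statistical models. The function names, the fixed coverage threshold, the pseudocount, and all numbers below are assumptions made for illustration.

```python
import math

def alignment_stats(total_reads, mapped_reads, uniquely_mapped_reads):
    """Alignment rate and unique alignment rate as fractions of all reads."""
    return {
        "alignment_rate": mapped_reads / total_reads,
        "unique_alignment_rate": uniquely_mapped_reads / total_reads,
    }

def call_peaks(coverage, threshold, min_length=1):
    """Return half-open (start, end) intervals where coverage >= threshold.

    A fixed threshold stands in for the statistical model that a real
    peak caller would use.
    """
    peaks = []
    start = None
    for pos, depth in enumerate(coverage):
        if depth >= threshold and start is None:
            start = pos  # a new enriched run begins
        elif depth < threshold and start is not None:
            if pos - start >= min_length:
                peaks.append((start, pos))
            start = None  # the run has ended
    # Close a run that extends to the end of the region.
    if start is not None and len(coverage) - start >= min_length:
        peaks.append((start, len(coverage)))
    return peaks

def log2_fold_change(count_a, count_b, pseudocount=1.0):
    """log2 ratio of read counts in a peak, condition B over condition A."""
    return math.log2((count_b + pseudocount) / (count_a + pseudocount))

# Toy numbers, invented for the example.
stats = alignment_stats(total_reads=1_000_000, mapped_reads=950_000,
                        uniquely_mapped_reads=800_000)
coverage = [1, 1, 8, 9, 10, 2, 1, 7, 7, 1]
peaks = call_peaks(coverage, threshold=5, min_length=2)
lfc = log2_fold_change(count_a=15, count_b=63)
print(stats, peaks, lfc)
```

A positive fold change here would indicate a peak that is more accessible in condition B. A real differential analysis would also use replicates and a significance test before passing the associated genes to GO or KEGG enrichment.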