Multi-omics integration analysis combines data from genomics, epigenomics, transcriptomics, proteomics, and metabolomics to elucidate the molecular mechanisms underlying complex physiological regulation. Because a single-omics dataset captures only one layer of biological activity, integrating data from multiple layers provides a more complete understanding of how life processes are regulated.
Epigenomics is a scientific discipline that employs sequencing technologies to investigate epigenetic phenomena at the genomic and transcriptional levels. The scope of epigenomic research encompasses three-dimensional chromatin architecture, chromatin accessibility, histone modifications, transcription factor binding, DNA methylation, and RNA methylation. Utilizing advanced epigenomic techniques, precise genome-wide epigenetic maps can be generated for plants under various environmental conditions, stress treatments, across different tissues and organs, and during different developmental stages. This methodological approach provides robust support for comprehensively elucidating the characteristics and variations of plant epigenetic landscapes under diverse conditions.
Figure 1 Chromatin structure (Stefanie Rosa et al., 2013)
ATAC-seq constitutes a sequencing methodology designed to detect regions of chromatin accessibility across the entire genome. This technique is instrumental in delineating chromatin accessibility landscapes, identifying potential enhancer sequences, and locating critical transcription factors. Additionally, it facilitates the comparison of chromatin accessibility regions between transgenic and wild-type specimens, thereby elucidating the functional attributes of target genes.
Chromatin Immunoprecipitation followed by Sequencing (ChIP-seq) is a comprehensive genomic technique designed to identify DNA segments that interact with transcription factors or histones. This methodology facilitates the identification of downstream genes regulated by transcription factors and genes associated with histone modifications. Additionally, it enables the detection of specific histone modification markers, such as H3K4me1, H3K4me3, and H3K27ac, on enhancers or promoters. Moreover, ChIP-seq allows for the comparative analysis of transcription factor binding sites and histone modification sites between transgenic and wild-type organisms, thereby elucidating the functional roles of target genes.
Cleavage Under Targets and Tagmentation (CUT&Tag): This sequencing method identifies DNA segments that interact with transcription factors or histones across the entire genome. It serves as an alternative to ChIP-seq and addresses similar questions.
DNA Affinity Purification and High-throughput Sequencing (DAP-seq): This in vitro sequencing analysis technique detects transcription factor binding sites across the entire genome and is employed to identify downstream genes regulated by transcription factors.
Whole-genome Bisulfite Sequencing (WGBS/BS-seq): This sequencing analysis technique detects DNA methylation sites across the entire genome. It is utilized to identify genes associated with methylation modifications and to compare differences in DNA methylation modification sites between transgenic and wild-type materials to study the functionality of target genes.
Epigenomics is often analyzed jointly with transcriptomics to obtain target genes or to identify downstream genes regulated by transcription factors. Here, transcriptomics mainly refers to RNA-seq, which measures mRNA expression levels. Integrating epigenomics with whole-transcriptome or non-coding RNA (e.g., miRNA, lncRNA, circRNA) data follows a similar rationale.
Figure 2 Strategies for integrated analysis of epigenomics and transcriptomics
The identification of common genes relies primarily on gene IDs, involving the intersection of genes associated with epigenomic data and differentially expressed genes (DEGs) obtained from transcriptomic analysis. The visualization of this process commonly employs Venn diagrams. As an illustration, in a particular experiment, the total number of genes associated with differentially accessible regions (DARs) obtained from ATAC-seq differential analysis was 9875, while the number of DEGs revealed by RNA-seq analysis was 141. Subsequently, gene IDs were obtained based on gene annotation information to construct a Venn diagram. Comparative analysis revealed 36 common genes between the two datasets (Figure 3).
Figure 3 presents the Venn diagram depicting the overlap between ATAC-seq and RNA-seq data (Di et al., 2023).
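The intersection step behind such a Venn diagram can be sketched in a few lines of Python. This is a minimal illustration, assuming each analysis has already been reduced to a list of gene IDs from the same annotation. The gene IDs and the helper name `shared_genes` are hypothetical and are not taken from Di et al.

```python
def shared_genes(dar_genes, deg_genes):
    """Return gene IDs present in both lists, sorted for stable output."""
    return sorted(set(dar_genes) & set(deg_genes))


# Hypothetical gene IDs, for illustration only.
dar_genes = ["GeneA", "GeneB", "GeneC", "GeneD"]  # genes annotated to DARs (ATAC-seq)
deg_genes = ["GeneC", "GeneD", "GeneE"]           # DEGs (RNA-seq)

common = shared_genes(dar_genes, deg_genes)
only_atac = sorted(set(dar_genes) - set(deg_genes))
only_rna = sorted(set(deg_genes) - set(dar_genes))

# These three counts are the numbers drawn in a two-set Venn diagram.
print(len(only_atac), len(common), len(only_rna))
```

Matching on gene IDs only works when both datasets use the same genome annotation version, so IDs may need to be converted before intersecting.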
However, the Venn diagram exhibits certain limitations in presenting content. When both omics include differential comparison groups, to elucidate the scenario where genes are commonly upregulated or downregulated in both omics, intersection operations need to be performed separately on the upregulated or downregulated gene sets in each omics. It is noteworthy, however, that some genes exhibit diametrically opposite expression patterns in the two omics or show significant differences only in one omics. In light of this circumstance, the quadrant diagram (Figure 4) would be more appropriate.
Within Quadrant 1, a significant increase in chromatin accessibility of genes is observed, coupled with an upregulation trend in their expression levels. This phenomenon suggests that transcription factors may activate the expression of these genes in this region. Conversely, genes in Quadrant 9 demonstrate characteristics of decreased chromatin accessibility and downregulation in expression levels, implying that biological processes such as replication or transcription of these genes may be inactive under specific cellular types or conditions.
In Quadrants 3 and 7, chromatin accessibility and gene expression levels show opposite trends. For the genes in Quadrant 3, other transcription factors or regulatory mechanisms may activate their expression. For the genes in Quadrant 7, transcriptional repressors may inhibit their expression, or additional cofactors may be required to activate it.
Furthermore, genes in Quadrants 2 and 8 show significant changes in expression, but no apparent change in chromatin accessibility. Genes in Quadrants 4 and 6 show the reverse pattern: chromatin accessibility changes significantly, while expression does not. In Quadrant 5, neither chromatin accessibility nor expression shows a significant difference.
It is noteworthy that genes in Quadrants 2, 5, and 8 (depicted in gray) are not represented in the current presentation. Subsequently, functional enrichment analysis was conducted on genes in Quadrants 1, 3, 7, and 9, and genes with specific functions were marked using four different colors (red, yellow, green, blue) to facilitate further research and analysis.
Figure 4 illustrates the quadrant plot resulting from the integrated analysis of ATAC-seq and RNA-seq data (Wang et al., 2022). The horizontal axis represents the log2 fold change in chromatin accessibility of genes, while the vertical axis represents the log2 fold change in gene expression levels. Differential fold change (FC) values greater than 1 are depicted.
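The quadrant assignment described above can be sketched as a small Python function. This is an illustration under stated assumptions, not the method of Wang et al.: the cutoff of |log2FC| > 1 is assumed, and the numbering is inferred from the text (Quadrants 1, 5, and 9 are as described; the direction assigned to the remaining quadrants is an assumption). The gene names and values are hypothetical.

```python
def nine_quadrant(acc_log2fc, expr_log2fc, cutoff=1.0):
    """Assign a gene to one of nine quadrants from two log2 fold changes.

    Assumed numbering: 1/2/3 = expression up, 4/5/6 = expression unchanged,
    7/8/9 = expression down. Within each row, accessibility is up,
    unchanged, then down.
    """
    def state(value):
        if value > cutoff:
            return 0  # increased
        if value < -cutoff:
            return 2  # decreased
        return 1      # below the fold-change cutoff

    row = state(expr_log2fc)  # expression: 0 = up, 1 = unchanged, 2 = down
    col = state(acc_log2fc)   # accessibility: 0 = up, 1 = unchanged, 2 = down
    return row * 3 + col + 1


# Hypothetical (accessibility log2FC, expression log2FC) pairs.
genes = {
    "GeneA": (2.1, 1.8),    # both increased
    "GeneB": (-1.6, 2.4),   # accessibility down, expression up
    "GeneC": (1.3, -2.0),   # accessibility up, expression down
    "GeneD": (0.2, 0.1),    # no change in either
}
quadrants = {name: nine_quadrant(acc, expr) for name, (acc, expr) in genes.items()}
print(quadrants)
```

In practice a significance filter (for example, an adjusted p-value threshold) would usually be applied alongside the fold-change cutoff; it is omitted here to keep the sketch short.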
In the comprehensive gene analysis derived from Venn or quadrant diagrams, rigorous methodologies such as Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment were utilized. These methods facilitated the identification of gene sets with significant functional relevance to the research focus, thereby permitting the determination of candidate target genes.
Furthermore, to elucidate the functions of transcription factors within the shared or quadrant-specific gene sets, annotation using databases such as the Plant Transcription Factor Database (PlantTFDB) was performed. This process enabled the identification of transcription factors closely associated with the research objectives, thereby providing critical insights for subsequent investigations.
The functional enrichment analysis results were presented using clear and intuitive bar charts for GO and KEGG, thereby enhancing the visualization of enrichment outcomes. Additionally, the analysis included the depiction of metabolic pathways, regulatory pathways, gene expression levels, and epigenomic differential analysis results. These comprehensive presentations aimed to provide a thorough understanding of gene functions and regulatory mechanisms, as illustrated in Figure 5.
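GO and KEGG enrichment is commonly computed as an over-representation test. The sketch below shows a one-sided hypergeometric test in plain Python. The text does not state which statistical test was used, so this is an assumption, and the function name and counts are hypothetical.

```python
from math import comb


def enrichment_pvalue(n_background, n_term, n_selected, n_overlap):
    """One-sided hypergeometric p-value, P(X >= n_overlap).

    n_background: annotated genes in the background set
    n_term:       background genes annotated to the GO term or pathway
    n_selected:   genes in the candidate list (e.g., shared genes)
    n_overlap:    candidate genes annotated to the term
    """
    total = comb(n_background, n_selected)
    upper = min(n_term, n_selected)
    tail = sum(
        comb(n_term, i) * comb(n_background - n_term, n_selected - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 36 candidate genes, 5 of them in a 200-gene term,
# against a background of 20,000 annotated genes.
p = enrichment_pvalue(20000, 200, 36, 5)
print(f"{p:.3g}")
```

Because many terms are tested at once, the resulting p-values would normally be corrected for multiple testing before terms are reported as enriched.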
Figure 5 depicts the dependency of crop yield on the potential to acquire assimilates (sink) and the capacity to supply assimilates (source). It illustrates the differentially methylated regions and differentially expressed genes (DEGs) associated with metabolic pathways related to the "sink-source" regulation (Wang et al., 2023).
Upon identifying the target genes, genomic visualization software was utilized to present the transcriptional levels and epigenomic peak analysis results. This approach allows for the direct observation of chromatin accessibility, histone modification sites, transcription factor binding sites, and gene expression levels at the target loci (Figure 6).
Figure 6 presents the visualization of the binding sites of SlJMJ6 and H3K27me3, along with the expression levels, for the predicted downstream regulatory genes RIN and ACS4, at the gene loci (adapted from Li et al., 2020).
2.4 Gene Regulatory Network
Gene regulatory networks or co-expression networks can be constructed using databases such as STRING (https://string-db.org/) and software tools such as Cytoscape. These networks are based on epigenome-associated genes and differentially expressed genes (DEGs) from transcriptome analyses (Figure 7). Additionally, these gene expression regulatory networks or co-expression networks can be presented in conjunction with the results of functional enrichment analysis (Figure 8).
Figure 7 illustrates the gene co-expression network (adapted from Ren et al., 2021). Two distinct regulatory networks, CBF4-CRF2 and RAV1-ERFs, were constructed using six transcription factors obtained from ATAC-seq data and differentially expressed genes (DEGs) obtained from RNA-seq data. Transcription factors are depicted as triangles, while downstream genes regulated by predicted transcription factors are represented as circles. DEGs are depicted in purple (upregulated) and orange (downregulated).
Figure 8 presents the shared Gene Ontology (GO) enrichment analysis results in the CBF4-CRF2 and RAV1-ERFs co-expression networks (Ren et al., 2021).
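To load a predicted regulatory network into Cytoscape, the transcription factor and target pairs can be written in Cytoscape's simple interaction format (SIF), where each line holds a source node, an interaction type, and a target node. The sketch below is illustrative: the target gene names and the interaction label `regulates` are hypothetical, and only the transcription factor names come from the text.

```python
def to_sif(edges, interaction="regulates"):
    """Format (source, target) pairs as tab-separated SIF lines."""
    return "\n".join(f"{src}\t{interaction}\t{dst}" for src, dst in edges)


# Hypothetical TF-target pairs; the target names are placeholders.
edges = [
    ("CBF4", "TargetGene1"),
    ("CBF4", "TargetGene2"),
    ("RAV1", "TargetGene3"),
]

sif_text = to_sif(edges)
print(sif_text)
```

The resulting text can be saved with a `.sif` extension and imported as a network; node attributes such as up- or downregulation would be supplied separately as a table.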
The aforementioned methods and strategies are commonly employed in integrated epigenomic and transcriptomic analyses. Researchers can utilize these approaches to identify potential candidate genes or predict downstream genes regulated by transcription factors. The subsequent section will elucidate their practical applications through specific case studies.
Fruit ripening is governed by a complex regulatory network. Reversible histone methylation and demethylation modulate chromatin structure and gene expression. However, the role of histone demethylases in regulating the fruit ripening process remains poorly understood.
In August 2020, a study titled "Histone demethylase SlJMJ6 promotes fruit ripening by removing H3K27 methylation of ripening-related genes in tomato" was published in New Phytologist. This research identified that Solanum lycopersicum SlJMJ6 encodes a histone lysine demethylase that specifically facilitates the demethylation of H3K27me3, thereby activating the expression of ripening-related genes and promoting tomato fruit ripening. The authors also uncovered novel connections between histone demethylation and DNA demethylation in regulating the ripening process, marking the first report on the involvement of a histone lysine demethylase in fruit ripening regulation.
By integrating RNA sequencing (RNA-seq) and ChIP-seq, the authors identified 32 downstream regulatory genes of SlJMJ6. In tomatoes overexpressing SlJMJ6, these genes exhibited increased transcription, coupled with decreased H3K27me3 modification at their loci. These genes are primarily involved in transcriptional regulation, ethylene biosynthesis, cell wall degradation, and hormone signal transduction (Figure 9). Further validation using reverse transcription quantitative PCR (RT-qPCR) and chromatin immunoprecipitation quantitative PCR (ChIP-qPCR) confirmed that 11 ripening-related genes, including RIN, ACS4, ACO1, PL, and DML2, are directly regulated by SlJMJ6 through H3K27me3 demethylation (Figure 10).
Figure 9 illustrates the integrated analysis of ChIP-seq and RNA-seq data (adapted from Li et al., 2020). Panel (a) depicts a Venn diagram showing the associated genes with SlJMJ6 binding sites (i.e., downstream regulatory genes of SlJMJ6), upregulated genes in SlJMJ6-overexpressing tomato fruits, and genes associated with H3K27me3 demethylation, indicating 32 common genes among them. Panel (b) presents the Gene Ontology (GO) enrichment analysis results for these 32 common genes.
Figure 10 depicts the association between H3K27me3 demethylation and the upregulation of ripening-related genes mediated by SlJMJ6 (adapted from Li et al., 2020). (i) Genome visualization of the ripening-related gene RIN, showing the expression levels in SlJMJ6-overexpressing and wild-type plants, the H3K27me3 modification levels at the gene locus, and the binding sites of SlJMJ6. (ii) RT-qPCR analysis of RIN expression levels in wild-type and overexpressing plants. (iii) ChIP-qPCR analysis of SlJMJ6 binding at the RIN locus. (iv) ChIP-qPCR analysis of H3K27me3 modification at the RIN locus.
This case exemplifies the utility of integrated analysis, demonstrating how ChIP-seq combined with RNA-seq not only identified the downstream regulatory genes of SlJMJ6 but also elucidated the mechanism by which SlJMJ6 promotes tomato fruit ripening through H3K27me3 demethylation. This integrated approach reveals that SlJMJ6 activates the expression of ripening-related genes by demethylating H3K27me3, thereby facilitating the ripening process. Therefore, integrated analyses are not only effective in identifying genes but also valuable in investigating gene functions.
In August 2021, a research paper titled "Characterization of Chromatin Accessibility and Gene Expression upon Cold Stress Reveals that the RAV1 Transcription Factor Functions in Cold Response in Vitis Amurensis" was published in Plant & Cell Physiology. This study employed ATAC-seq and RNA-seq methodologies to identify cold-responsive transcription factors in Vitis amurensis, a grapevine species known for its high cold tolerance. Among the nine identified transcription factors, including CBF4, RAV1, and ERF104, VaRAV1 overexpression in grape cells was found to enhance cold tolerance. This work provides novel insights into plant responses to cold stress and demonstrates the utility of ATAC-seq and RNA-seq in rapidly identifying cold-responsive transcription factors in grapevines. The VaRAV1 transcription factor likely plays a crucial role in plant adaptation to cold environments.
Initially, chromatin accessibility maps of grapes were constructed using ATAC-seq, with Peaks or Transposase Hypersensitive Sites (THSs) representing enriched chromatin accessibility regions. Following a 2-hour cold treatment, 1376 positively enriched and 189 negatively enriched THSs were identified in grape samples. Motif analysis of these THSs, coupled with searches in plant transcription factor databases, revealed motifs corresponding to the CBF family, MYB family, and AP2/ERF superfamily transcription factors, suggesting their potential involvement in grape cold response. Integrating RNA-seq data, 31 transcription factors were identified with altered chromatin accessibility and expression levels post-cold treatment, with nine showing significant expression changes. Cold treatment notably decreased the expression of CRF2 and ESE3, while the expression levels of other genes were upregulated under cold conditions.
Figure 11 illustrates the response of candidate transcription factors to cold treatment (adapted from Ren et al., 2021). (A) Heatmap of TF gene expression after cold treatment. Transcription factors obtained from motif analysis of significantly different THSs after cold treatment were further filtered based on their expression levels after cold treatment. The transcription factor expression data from RNA-seq were normalized using Z-score normalization. (B) Nine transcription factors most likely to be involved in early cold response.
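The Z-score normalization mentioned in the caption rescales each gene's expression across samples to a mean of 0 and unit standard deviation, so that genes with very different absolute expression levels can share one heatmap color scale. A minimal sketch follows, assuming per-gene (row-wise) scaling and the population standard deviation; the caption does not specify either choice, and the expression values are hypothetical.

```python
from statistics import mean, pstdev


def zscore_row(values):
    """Z-score one gene's expression values across samples."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD; an assumption for this sketch
    if sigma == 0:
        # A gene with constant expression carries no contrast to display.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical expression values (e.g., control vs. cold-treated samples).
expression = {
    "TF_A": [1.0, 2.0, 3.0],
    "TF_B": [50.0, 50.0, 50.0],
}
scaled = {gene: zscore_row(vals) for gene, vals in expression.items()}
print(scaled)
```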
To obtain potential target genes of these transcription factors, weighted gene co-expression network analysis (WGCNA) was performed on the RNA-seq results, grouping highly correlated genes into the same modules. Six of the nine transcription factors were assigned to two representative co-expression networks: the CBF4-CRF2-dependent and RAV1-ERFs-dependent regulatory networks (Figure 12). In the first network, CBF4 and CRF2 were identified as core genes. In the second network, RAV1, ERF1A, ERF104, and ERF4 served as core genes, suggesting that these transcription factors may participate in the grapevine cold response through at least two distinct signaling pathways.
Figure 12 displays the co-expression network of transcription factors and their target genes, along with the GO enrichment analysis results (adapted from Ren et al., 2021). (A) Common GO enrichment results in the CBF4-CRF2 and RAV1-ERFs co-expression network. (B) Specific GO enrichment results in the CBF4-CRF2 and RAV1-ERFs co-expression network. Transcription factors are represented by triangles, while downstream genes predicted to be regulated by transcription factors are represented by circles.
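The core idea behind WGCNA, correlating expression profiles and raising the absolute correlation to a soft-threshold power, can be sketched as follows. This is a simplified illustration and not the full WGCNA procedure: it omits the topological overlap and module-detection steps. The power `beta=6` and the expression values are assumptions; only the name RAV1 comes from the text, and its profile here is invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def soft_adjacency(expr, beta=6):
    """Unsigned adjacency |cor|**beta for every pair of genes."""
    genes = list(expr)
    adjacency = {}
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            adjacency[(g1, g2)] = abs(pearson(expr[g1], expr[g2])) ** beta
    return adjacency


# Hypothetical expression profiles across four samples.
expr = {
    "RAV1": [1.0, 2.0, 3.0, 4.0],
    "GeneA": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with RAV1
    "GeneB": [4.0, 1.0, 3.0, 2.0],  # weakly anti-correlated with RAV1
}
adj = soft_adjacency(expr)
print(adj)
```

Raising correlations to a power suppresses weak associations while keeping strong ones close to 1, which is why tightly co-expressed genes end up grouped with the transcription factor.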
Given the well-established role of CBFs in cold response across various plant species, including grapes, the authors selected RAV1, ERF1A, and CRF2 as candidate genes to validate their function in grape cold tolerance, with VaRAV1 showing enhanced cold tolerance upon overexpression in grape cells.
This study did not directly investigate the overlap between ATAC-seq and RNA-seq genes. Instead, a comprehensive annotation was performed on transcription factors exhibiting significant chromatin accessibility changes in ATAC-seq results. Subsequently, based on RNA-seq data, transcription factors with significantly altered expression levels were selectively screened and functionally validated. This analytical approach, integrating ATAC-seq and RNA-seq data, serves as a valuable paradigm for identifying target genes.
Crop yield depends on the potential to acquire assimilates (sink) and the capacity to supply them (source). Optimizing the sink-source relationship therefore has significant implications for regulating crop yield. Cucumber is a typical species that transports raffinose family oligosaccharides (RFOs), a group of sugars. DNA methylation is a common epigenetic modification in plants, yet its role in regulating the sink-source relationship remains unverified in RFO-transporting species.
In December 2023, the journal Plants published a research paper titled "The Sink-Source Relationship in Cucumber (Cucumis sativus L.) Is Modulated by DNA Methylation." Based on BS-seq and RNA-seq, the study offers new inferences about the potential role of DNA methylation in the sink-source relationship, which will be important for further exploring the molecular mechanisms of yield enhancement in RFO-transporting plants.
The study used BS-seq and RNA-seq to compare two types of leaf samples at the 12th node: nonfruiting-node leaves (NFNL) and fruiting-node leaves (FNL). BS-seq results revealed a significant enrichment of differentially methylated genes involved in photosynthesis and carbohydrate metabolism. The combined analysis of BS-seq and RNA-seq indicated that many differentially expressed genes (DEGs) associated with differentially methylated regions participate in auxin, ethylene, and jasmonic acid metabolism, sucrose metabolism, and the RFO synthesis pathway, all of which are linked to regulation of the sink-source relationship.
Figure 13 illustrates the differentially methylated regions and Differentially Expressed Genes (DEGs) associated with the source-sink regulation-related metabolic pathways (adapted from Wang et al., 2023).
Although the article did not validate the function of the target genes, the integrated analysis it employed provides valuable clues for identifying them. Gene function validation can then be pursued on the basis of these clues.
References: