Why Perform Multi-omics Data Mining?

Multi-omics data mining is vital for elucidating the intricate networks that govern biological processes and disease mechanisms. By integrating diverse omics data, such as genomics, transcriptomics, proteomics, and metabolomics, researchers can uncover molecular interactions and regulatory pathways that single-omics approaches may overlook. Through multi-omics integration, scientists can explore complex biological functions, identify potential biomarkers, and investigate gene-protein-metabolite networks. The growing importance of multi-omics mining is evident in fields like cancer research, metabolic disorders, and microbiome studies.

What Can We Offer?

We offer a comprehensive multi-omics data mining service that includes integrated analysis of genomic, transcriptomic, proteomic, metabolomic, and lipidomic data. We specialize in normalizing and correlating data from these diverse biomolecular levels to uncover relationships and regulatory mechanisms. Our services also encompass advanced bioinformatics analysis, pathway enrichment studies, and the identification of key biomarkers, all aimed at facilitating deeper insights into biological processes and enhancing experimental applications. Our offerings are enhanced by artificial intelligence and machine learning capabilities, allowing for predictive modeling and trend identification. Additionally, we offer tailored solutions and educational resources to empower researchers in utilizing multi-omics effectively.

Workflow for Multi-omics Data Mining

Multi-omics Data Download

As multi-omics data grows in complexity, the need for advanced computational methods to integrate and analyze heterogeneous datasets has surged. To address this, numerous databases have been developed, offering researchers the tools to systematically manage and interpret multi-omics datasets. Below is a list of key databases that support these analyses. We can download the required data from these databases for data mining, and data from other databases can also be accessed.

Database	Data types	Link address
The Cancer Genome Atlas (TCGA)	RNA-Seq, DNA-Seq, miRNA-Seq, SNV, CNV, DNA methylation, clinical data	https://www.cancer.gov/ccg/
Cancer Cell Line Encyclopedia (CCLE)	Gene expression, copy number, and sequencing data	https://sites.broadinstitute.org/ccle
PeptideAtlas repository	Proteomics	https://peptideatlas.org/repository/
MetaboLights	Metabolomics datasets	https://www.ebi.ac.uk/metabolights/
National Center for Biotechnology Information (NCBI)	Genomics and transcriptomics	https://www.ncbi.nlm.nih.gov

Multi-omics Analysis Content

Omics combination	Objective	Analysis content
Genome + Transcriptome	Elucidate gene regulation and expression patterns	Genomic variant analysis, differential expression analysis, multi-omics correlation, pathway enrichment
Proteome + Transcriptome	Uncover the relationship between gene expression and protein synthesis	Differential protein expression analysis, differentially expressed gene (DEG) analysis, integration and correlation analysis
Metabolome + Transcriptome	Explore the interactions between metabolic processes and gene expression	Metabolite profiling, transcriptomic analysis, statistical correlation analysis, pathway enrichment analysis, network analysis
Metabolome + Proteome	Highlight critical metabolic pathways, proteins, and metabolites	GO functional analysis, metabolic pathway enrichment, molecular interaction analysis
Metabolome + Microbiome	Uncover the complex interactions between microorganisms and host metabolism	Differential microbes, differential metabolites, CCA analysis, correlation analysis
Transcriptome + Proteome + Metabolome	Provide insight into biological processes by linking gene expression to protein function and metabolic outcomes	Normalization, comparative analysis, correlation analysis
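
Pathway enrichment appears in several of the combinations above. One common way to run it is an over-representation test based on the hypergeometric distribution. The sketch below is only an illustration under stated assumptions: the gene identifiers, pathway names, and the helper functions `enrichment_p_value` and `enrich` are hypothetical, and no multiple-testing correction is applied, which a real analysis would normally add.

```python
from math import comb


def enrichment_p_value(n_background, n_pathway, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    X is the number of pathway genes seen when n_selected genes are drawn
    without replacement from n_background genes, n_pathway of which belong
    to the pathway.
    """
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, min(n_pathway, n_selected) + 1)
    )
    return tail / total


def enrich(selected, pathways, background):
    """Rank pathways by over-representation among the selected genes."""
    background = set(background)
    selected = set(selected) & background
    rows = []
    for name, members in pathways.items():
        members = set(members) & background
        overlap = selected & members
        p = enrichment_p_value(
            len(background), len(members), len(selected), len(overlap)
        )
        rows.append((name, len(overlap), p))
    return sorted(rows, key=lambda row: row[2])


# Hypothetical toy data: 20 background genes and two 5-gene pathways.
background = [f"G{i}" for i in range(1, 21)]
pathways = {
    "pathway_A": ["G1", "G2", "G3", "G4", "G5"],
    "pathway_B": ["G6", "G7", "G8", "G9", "G10"],
}
selected = ["G1", "G2", "G3", "G4", "G11"]  # e.g. differentially expressed genes

results = enrich(selected, pathways, background)
for name, overlap, p in results:
    print(f"{name}: overlap={overlap}, p={p:.4g}")
```

In this toy case four of the five selected genes fall in `pathway_A`, so it ranks first, while `pathway_B` has no overlap and receives a p-value of 1.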

What Are the Advantages of Our Service?

Extensive Expertise in Multi-omics Data Mining

CD Genomics has over 20 years of rich experience in multi-omics data mining, successfully completing more than 1,00 related projects. This expertise allows us to leverage cutting-edge technologies and methodologies to provide high-quality data analysis, helping clients solve complex biological problems. We are committed to rigorous research and detailed analysis, ensuring that each project effectively supports our clients' research objectives.

Multidisciplinary Expert Team

Our team consists of over 1,00 experts specializing in genomics, proteomics, metabolomics, and more. This interdisciplinary background enables us to analyze multi-omics data from various scientific perspectives, providing comprehensive insights and innovative solutions. Our experts possess extensive research experience and can tailor data analysis plans to meet specific research needs.

Robust Data Integration Capabilities

CD Genomics excels at integrating data from various omics levels. We have a powerful data processing platform capable of efficiently handling and analyzing multi-omics datasets, revealing complex relationships between molecules. This integration capability enhances the depth and breadth of research, improving our clients' understanding of biological processes and aiding in the discovery of new biomarkers and therapeutic targets.

Global Collaboration Network

We have established extensive partnerships with leading research institutions and medical centers worldwide, promoting innovation and development in multi-omics research. Through these collaborations, CD Genomics gains access to the latest technologies and resources, providing better services to our clients.

Transparent Data Delivery

All delivered data and reports are provided through a secure online platform, allowing clients to access and download the required information at any time. We also include detailed explanations of the data processing and analysis methods used, giving clients clarity on each step and scientific rationale, enhancing the transparency and traceability of the results.

Choosing our company's Multi-omics Data Mining Service greatly enhances the depth and precision of research by providing comprehensive, integrated insights across multiple biological layers, accelerating breakthroughs in complex biological systems and driving innovative discoveries.

What Does Multi-omics Data Mining Reveal?

Transcriptomic and Proteomic Profiles Unveil Metabolic Reprogramming in Endometrial Cancer

The study identified metabolic reprogramming in endometrial cancer (EC) through principal component analysis, revealing distinct metabolic profiles between normal and tumor tissues. Gene set enrichment analysis highlighted consistent activation of glycolysis/gluconeogenesis and folate biosynthesis pathways across multiple datasets.

Figure 1. Metabolic reprogramming of EC was uncovered through transcriptomic and proteomic profiling. (A, B) PCA plots of metabolic genes in the TCGA (A) and GSE17025 (B) cohorts. (C) Venn diagram showing shared upregulated metabolic pathways among TCGA, GSE106191, and GSE17025 (left) and GSEA plots highlighting glycolysis/gluconeogenesis and folate biosynthesis in the TCGA cohort (right). (D) PCA of metabolic genes at the protein level in the CPTAC cohort. (E) Bar plots of GSEA for differentially expressed metabolic genes in CPTAC, comparing tumor and normal samples. (Liu, 2024)

Genomic and Transcriptomic Profiling Reveals Architectures and Subtypes of Bladder Cancer

The authors analyzed 26 significantly mutated genes (SMGs) and 34 somatic copy number alterations (SCNAs) from bladder cancer databases to infer the genomic architecture, finding that SMGs had higher median cancer cell fractions than SCNAs. Bayesian non-negative matrix factorization analysis identified two molecular subgroups (Cluster A and B) with distinct clinical outcomes, with Cluster A showing poorer survival. Multivariate analysis confirmed the genomic architecture as an independent predictor of patient outcomes. Cluster A had a higher SCNA burden and frequent whole-genome doubling, while Cluster B was enriched with Asian patients. Additionally, transcriptomic data via ssGSEA were used to compare the relative expression levels of immune-related genes between the two clusters. In Dataset 1, Cluster B patients showed significantly higher infiltration levels of Th17 cells and CD8+ T cells.

Figure 2. Bladder cancer prognostic subtypes defined by distinct genomic architectures and transcriptomic immune scores. (A) Genomic architecture analysis of 505 bladder cancer samples identified two subtypes, Cluster A and Cluster B, using NMF based on the CCFs of 26 SMGs and frequent SCNAs. (B) Kaplan–Meier survival analysis showed poorer survival for Cluster A. (C) Multivariate Cox regression revealed clonal subtype, race, WGD, MSig cluster, age, gender, and TNM stage as independent survival predictors. (D) SCNA score differences between subtypes were analyzed at genome-wide, chromosome, and focal levels. (E) ssGSEA showed immune cell infiltration differences between the two subtypes, with significance tested via the Wilcoxon rank-sum test. (Zhang, 2023)
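
The survival comparison between the two clusters relies on Kaplan–Meier estimation. As a rough illustration of how such a curve is computed, the sketch below implements the product-limit estimator in plain Python. The follow-up times and event flags are invented toy values, not data from the cited study, and the function name `kaplan_meier` is our own. A real analysis would also add a log-rank test or Cox regression, which this sketch does not attempt.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        n_at_risk -= removed
    return curve


# Hypothetical toy cohorts (time in months; 1 = event, 0 = censored).
curve_a = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
curve_b = kaplan_meier([4, 6, 7, 9, 10], [0, 1, 0, 1, 0])

for label, curve in (("Cluster A", curve_a), ("Cluster B", curve_b)):
    print(label, [(t, round(s, 3)) for t, s in curve])
```

Patients censored at the same time as an event are kept in the risk set for that time point, which is the usual convention.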

Microbiome and Metabolome Associations Analysis Uncovers Relationships Between Colonic Microbiota and Serum Metabolites in Mice

Heatmap analysis revealed associations between colonic microbiota and serum metabolites in genetically prone (3xTg) mice and normal diet (ND) controls. Bifidobacterium and S24.7 were negatively correlated with lactate, malate, and amino acids, while positively correlated with low cholesterol, low-density lipoproteins, and fatty acids.

Figure 3. Connections between the colonic microbiome and serum metabolome. The heatmap illustrates correlations between metabolite profiles and bacterial abundance in ND and 3xTg groups. Colors from red to blue indicate positive to negative associations. Pearson's correlations were used after normality was checked with the Shapiro-Wilk test (*p < 0.05). (Sanguinetti, 2018)

Title: Untangling Determinants of Gut Microbiota and Tumor Immunologic Status Through a Multi-Omics Approach in Colorectal Cancer

Publication: Pharmacological Research

Main Methods: Microbiome, metabolome, transcriptomics, data integration

Abstract: This study explored the intricate relationships between gut microbiota and tumor immune status in colorectal cancer (CRC) using multi-omics data, including microbiome, metabolome, and transcriptomics. Integrating these datasets, the researchers identified distinct microbiota compositions in CRC patients and developed a machine-learning model with 28 biomarkers to detect CRC with high accuracy. The gut microbiota from CRC patients was found to suppress immune response, highlighting potential therapeutic targets. Fecal microbiota transplantation (FMT) from healthy donors restored immune function and enhanced immunotherapy outcomes in CRC.

Research Results:

Analysis of Gut Microbial Diversity in CRC Patients

The authors first collected 16S rRNA sequencing data from CRC patients and healthy controls from PubMed, revealing significant differences in bacterial composition between the two groups. At the phylum level, CRC patients were predominantly characterized by Bacteroidetes, Fusobacteria, and Firmicutes. At the genus level, a total of 59 differential bacteria were identified between CRC patients and healthy controls, with 28 genera exhibiting higher abundance and 34 genera exhibiting lower abundance in CRC patients compared to the control group.

Figure 4. Alterations of gut microbial diversity and composition in CRC patients. (A-B) α-diversity indices (Shannon, Inv Simpson) for healthy controls and CRC patients. (C-D) β-diversity via PCoA with Bray-Curtis (C) and Jaccard (D) metrics. (E) Circos diagram linking samples to main phyla. (F) Manhattan plot of phylum-level differential genera, with significance at P < 0.05 and FDR < 0.20 (edgeR). (Li, 2023)
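
The diversity measures named in the figure follow standard definitions. The sketch below shows how they can be computed from genus-level count vectors. The counts are made-up toy values and the function names are our own. The PCoA ordination step, which would take a full sample-by-sample distance matrix as input, is left out.

```python
from math import log


def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


def inverse_simpson(counts):
    """Inverse Simpson index 1 / sum(p_i ** 2)."""
    total = sum(counts)
    return 1 / sum((c / total) ** 2 for c in counts)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


def jaccard_distance(a, b):
    """Jaccard distance based on presence/absence of each taxon."""
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, x in enumerate(b) if x > 0}
    return 1 - len(present_a & present_b) / len(present_a | present_b)


# Hypothetical genus counts for two samples (same taxon order in both).
sample_1 = [10, 0, 5, 5]
sample_2 = [0, 10, 5, 5]
even_community = [25, 25, 25, 25]

print("Shannon (even community):", round(shannon(even_community), 4))
print("Inverse Simpson (even community):", inverse_simpson(even_community))
print("Bray-Curtis:", bray_curtis(sample_1, sample_2))
print("Jaccard distance:", jaccard_distance(sample_1, sample_2))
```

For a perfectly even community of four taxa, the Shannon index equals ln 4 and the inverse Simpson index equals 4, which gives a quick sanity check on the implementation.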

Machine Learning for Feature Selection and CRC Detection

This study employed LASSO and SVM-RFE algorithms to classify bacteria at the genus level, identifying 32 and 49 significant bacteria, respectively. By intersecting the two sets, 28 bacteria were retained. The area under the curve (AUC) and decision curve analysis (DCA) results confirmed that these bacteria have good clinical applicability in distinguishing CRC patients from healthy controls.

Figure 5. Two algorithms were utilized for selecting features and creating a diagnostic model to differentiate CRC from healthy controls. (A) LASSO and (B) SVM-RFE algorithms applied to the discovery cohort. (C) Venn diagram highlighting key genera chosen by LASSO or SVM-RFE algorithms. (D) ROC curves for genera selected by both LASSO and SVM-RFE. (E) Decision curve analysis for these genera. (Li, 2023)
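
Two steps from this analysis are easy to illustrate without fitting the models themselves: intersecting the feature sets returned by the two selectors, and scoring a classifier with the area under the ROC curve. The sketch below assumes the LASSO and SVM-RFE selections already exist. The genus identifiers, labels, and scores are hypothetical placeholders, and `roc_auc` is our own helper, not the study's code.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample (ties count as half a win).

    This equals the Mann-Whitney U statistic divided by n_pos * n_neg.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical genus identifiers returned by two separate feature selectors.
lasso_selected = {"genus_1", "genus_2", "genus_3", "genus_5"}
svm_rfe_selected = {"genus_2", "genus_3", "genus_4", "genus_5", "genus_6"}
shared = sorted(lasso_selected & svm_rfe_selected)
print("Retained by both selectors:", shared)

# Hypothetical model outputs: 1 = CRC, 0 = healthy control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print("AUC:", round(auc, 3))
```

The pairwise formulation is quadratic in the number of samples, which is fine for a small cohort. A rank-based implementation would scale better for large ones.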

Genus-Level Gut Microbe Analysis Using WGCNA

To better describe the interactions between gut microbes at the genus level, the authors constructed a WGCNA gut interaction network, identifying 24 independently functioning modules. Correlating these modules with clinicopathological features such as age, gender, and AJCC stage revealed significant associations with tumor status, MSI progression, and AJCC stage.

Figure 6. WGCNA analysis of gut microbes at the genus level. (A) Gut microbes were grouped into 24 modules, represented by tree branches. The figure's bottom displays module colors and the number of bases. (B) Module-trait correlations: each cell shows a module-clinical trait correlation coefficient and its p-value. Color intensity reflects correlation strength; red signifies positive and green negative correlations. (Li, 2023)

Microbiome and Metabolome Correlation Analysis

The authors explored whether changes in the gut microbiome impact functionality by analyzing microbiome and metabolome data. PLS-DA showed distinct clustering of microbial functional profiles between CRC patients and healthy controls. A total of 92 differentially expressed metabolites were identified and mapped to 360 genes using the HMDB database. An additional 4814 differentially expressed genes between tumor and normal tissues were identified from the TCGA-COAD and READ cohorts, with 32 overlapping targets. These targets mainly participated in alanine, aspartate, and amino acid biosynthesis. Among CRC patients, NOS1 had the highest mutation frequency, followed by BCHE, CHAT, and HDX. Metabolites may be vital mediators between the gut microbiome and host tumor immunity in CRC.

Figure 7. Alterations in the gut metabolome of CRC patients compared to healthy controls. (A) PLS-DA plot showing distinct metabolite profiles between CRC patients and healthy controls. (B) Volcano plot of metabolites differing between these groups. (C) Venn diagram of overlapping genes from TCGA and HMDB. (D) Pathway enrichment via Metascape. (E) Waterfall plot of 32 gene mutations in 399 TCGA-COAD patients. (Li, 2023)

Correlation Analysis of Gut Microbiota with Various Immune Indicators

The authors then used the MCP-counter algorithm to estimate the immune cell infiltration level for each sample and examined the relationship between immune infiltration and gut microbial abundance through Spearman correlation analysis. The results indicated that, compared to the controls, bacteria associated with immune cell infiltration were more abundant in CRC patients. Additionally, there was a negative correlation between gut microbiota and numerous immune modulators.

Figure 8. The gut microbiome influenced immune responses and enhanced tumor immunogenicity. Heatmap depicting the correlations between gut microbes and infiltration levels of tumor-associated immune cells in CRC patients using CIBERSORT and MCP-counter algorithms. (Li, 2023)
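
The microbe-immune heatmap is built from pairwise Spearman correlations, which are Pearson correlations computed on ranks. The sketch below shows that calculation in plain Python. The genus names, cell-type names, and values are hypothetical placeholders, and the immune scores stand in for whatever an estimator such as MCP-counter would produce. Significance testing is not included.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-sample values (five samples, same order in every list).
genus_abundance = {
    "genus_A": [5, 12, 30, 45, 80],
    "genus_B": [1, 2, 3, 4, 5],
}
immune_scores = {
    "cell_type_1": [0.9, 0.7, 0.6, 0.3, 0.1],
    "cell_type_2": [2, 1, 4, 3, 5],
}

# Correlation matrix: one row per genus, one column per immune cell type.
matrix = {
    genus: {cell: spearman(abund, score) for cell, score in immune_scores.items()}
    for genus, abund in genus_abundance.items()
}
for genus, row in matrix.items():
    print(genus, {cell: round(rho, 2) for cell, rho in row.items()})
```

Because only ranks are used, the result is unaffected by monotone transformations such as log-scaling the abundances.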

Conclusion

This study revealed the substantial impact of gut microbiota on tumor immunity in CRC. Altering gut microbiota through healthy donor fecal microbial transplantation might be a viable approach to boost immunotherapy response, offering potential therapeutic options for CRC patients. The comprehensive analysis highlights the effectiveness of combining multi-omics data to explore the intricate interactions between gut microbiota and tumor immunity, establishing a strong foundation for discovering new biomarkers and therapeutic strategies in CRC.

1. Why is Multi-omics Data Mining Important for Research?

Multi-omics data mining is crucial for studying the networks that regulate biological processes and disease mechanisms. By examining multiple biological layers, researchers gain deeper insights into complex diseases, enabling biomarker discovery and new therapeutic target identification, particularly in cancer, metabolic disorders, and microbiome studies.

2. What Services Does CD Genomics Offer in Multi-omics Data Mining?

CD Genomics offers a wide range of multi-omics data mining services, including integrated analyses across different omics layers, data normalization, bioinformatics analysis, pathway enrichment, and AI-enhanced predictive modeling. Customized solutions are also provided to meet specific research requirements.

3. What Are the Main Applications of Multi-omics Data Mining?

Multi-omics data mining finds applications in fields such as cancer research, where it explores genetic mutations and therapeutic targets, metabolic diseases for understanding gene-protein-metabolite interactions, and microbiome studies to investigate how microbial communities influence host health and immune responses.

4. What Types of Omics Data Are Integrated in Multi-omics Analysis?

Multi-omics analysis integrates data from several biological layers, including genomics (DNA), transcriptomics (RNA), proteomics (proteins), metabolomics (metabolites), and lipidomics (lipids). Each layer provides unique information, contributing to a more comprehensive understanding of biological functions and processes.

5. What Databases Can Be Accessed for Multi-omics Data Mining?

We can access data from major public multi-omics databases, including TCGA for cancer data, CCLE for gene expression, PeptideAtlas for proteomics, MetaboLights for metabolomics, and NCBI for genomics and transcriptomics, enabling thorough data mining and analysis.

References

Liu, X., et al. Metabolism pathway-based subtyping in endometrial cancer: An integrated study by multi-omics analysis and machine learning algorithms. Molecular therapy. Nucleic acids. 2024, 35(2), 102155.
Zhang, B., et al. Integrated analysis of racial disparities in genomic architecture identifies a trans-ancestry prognostic subtype in bladder cancer. Molecular oncology. 2023, 17(4), 564–581.
Sanguinetti, E., et al. Microbiome-metabolome signatures in mice genetically prone to develop dementia, fed a normal or fatty diet. Scientific reports. 2018, 8(1), 4907.
Li, J. J., et al. Untangling determinants of gut microbiota and tumor immunologic status through a multi-omics approach in colorectal cancer. Pharmacological research. 2023, 188, 106633.