Gene set enrichment analysis (GSEA) is a computational method for interpreting gene expression data in terms of gene sets, that is, groups of genes that share a common biological function, chromosomal location, or regulation. Rather than focusing on the individual genes that show the largest expression changes, GSEA assesses the collective behavior of a predefined set of genes. This gives a broader view of changes in biological state and addresses the limitations of single-gene approaches, improving our understanding of system-level biological function and disease progression beyond traditional gene-by-gene analysis. We provide professional gene set enrichment analysis services to help our clients efficiently and accurately extract expression patterns from massive amounts of gene expression data, identify potential gene functions, and reveal their roles in life processes.
Fig. 1. A GSEA overview illustrating the method. (Subramanian A, et al., 2005)
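As a rough illustration of the starting point of the method, the sketch below ranks genes by how strongly their expression differs between two phenotypes. It uses the signal-to-noise ratio, one commonly used ranking metric; the gene names, expression values, and the helper name `signal_to_noise` are hypothetical and chosen only for this example.

```python
from statistics import mean, stdev


def signal_to_noise(group_a, group_b):
    """Signal-to-noise ratio: (mean_a - mean_b) / (sd_a + sd_b).

    Positive values mean higher expression in group_a,
    negative values mean higher expression in group_b.
    """
    return (mean(group_a) - mean(group_b)) / (stdev(group_a) + stdev(group_b))


# Hypothetical expression values: gene -> (treated samples, untreated samples)
expression = {
    "GENE_A": ([9.0, 10.0, 11.0], [4.0, 5.0, 6.0]),  # higher in treated
    "GENE_B": ([5.0, 6.0, 7.0], [5.0, 6.0, 7.0]),    # no difference
    "GENE_C": ([2.0, 3.0, 4.0], [7.0, 8.0, 9.0]),    # lower in treated
}

# Score every gene, then sort from most upregulated to most downregulated
scores = {gene: signal_to_noise(a, b) for gene, (a, b) in expression.items()}
ranked_genes = sorted(scores, key=scores.get, reverse=True)

for gene in ranked_genes:
    print(f"{gene}: {scores[gene]:+.2f}")
```

The ranked list, rather than a short list of top-scoring genes, is what a gene set is then evaluated against.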
When a gene set shows a positive correlation in GSEA, its genes are expressed at relatively higher levels in one phenotype (e.g., the treated group) than in the other (e.g., the untreated group). In other words, these genes are upregulated. For a pathway, this means that most of its genes are expressed at higher levels in the phenotype of interest (e.g., a specific disease state).
For example, a group of genes that are positively correlated in cancer studies may be responsible for promoting cell proliferation, suggesting that expression of this pathway is upregulated in cancer cells compared to normal cells. This insight is invaluable for developing targeted therapies designed to disrupt the upregulated pathway and halt disease progression.
In contrast, a negative correlation in GSEA indicates that the genes are expressed at lower levels in one phenotype than in the other, that is, they are downregulated. Essentially, these genes are less active or repressed under the given condition. If certain genes are negatively correlated with a disease state, their expression is downregulated in that state.
For example, in a study investigating an autoimmune disease, genes associated with the immune response may be negatively correlated. In this case, the autoimmune disease may be characterized by downregulation of immune-related pathways, indicating suppression of the immune response. Therefore, therapeutic strategies could aim to reactivate these pathways to restore normal immune function.
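The sign of the correlation can be made concrete with a simplified enrichment score. The sketch below is a minimal, unweighted running-sum statistic, assuming a gene list already ranked from most upregulated to most downregulated in the phenotype of interest. It omits the weighting, normalization, and permutation testing of a full GSEA implementation, and all gene and set names are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (simplified sketch).

    Walk down the ranked list, stepping up for genes in the set and
    down for genes outside it, and return the largest deviation from zero.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        if gene in members:
            running += 1.0 / n_hit
        else:
            running -= 1.0 / n_miss
        # Keep the signed value that lies furthest from zero
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking: g1 is most upregulated, g10 most downregulated
ranked = [f"g{i}" for i in range(1, 11)]

up_set = ["g1", "g2", "g3"]      # concentrated at the top of the list
down_set = ["g8", "g9", "g10"]   # concentrated at the bottom of the list

es_up = enrichment_score(ranked, up_set)
es_down = enrichment_score(ranked, down_set)
print(f"up_set ES: {es_up:+.2f}, down_set ES: {es_down:+.2f}")
```

Under these assumptions, a set whose genes cluster at the top of the ranking receives a score close to +1 (positive correlation), and a set whose genes cluster at the bottom receives a score close to -1 (negative correlation).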
Positive and negative correlations in GSEA provide valuable insights into the biological processes underlying different phenotypic states. This perspective allows researchers to better understand how genes interact within biological pathways, emphasizing the impact of their collective behavior rather than focusing solely on individual genes. As bioinformatics continues to evolve, tools like GSEA will remain central to our exploration of the genetic basis of health and disease.
Reference: