Genomic methods that borrow information from multiple data sources, or "omic" assays, offer advantages in interpretability, statistical efficiency, and opportunities to understand causal molecular pathways in disease regulation. Transcriptome-wide association studies (TWAS) aggregate genetic data into functionally relevant testing units that correspond to genes and their expression in a trait-relevant tissue. This gene-based approach combines the effects of many regulatory variants into a single testing unit, which increases study power and makes trait-associated genomic loci more interpretable. Traditional TWAS methods, however, focus on local (cis) genetic regulation of transcription. They disregard the substantial portion of heritable expression that is attributable to distal genetic variants, which may reflect complex mechanisms of gene regulation.
Figure 1. The basic analysis workflow of a transcriptome-wide association study (Gusev, 2018)
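The gene-level test described above can be illustrated with a minimal sketch. It assumes that expression weights for a gene's local variants have already been estimated in a reference panel, and it uses simulated genotypes and traits. The function name `twas_gene_test`, the weights, and the effect sizes are hypothetical and chosen only for illustration; published TWAS tools use more elaborate models and often work from summary statistics.

```python
import numpy as np

def twas_gene_test(genotypes, weights, trait):
    """Toy gene-level test: predict expression, then correlate it with the trait."""
    # Genetically predicted expression: weighted sum of variant dosages.
    predicted = genotypes @ weights
    # Pearson correlation between predicted expression and the trait.
    r = np.corrcoef(predicted, trait)[0, 1]
    n = trait.shape[0]
    # t-statistic for the correlation; close to a z-score when n is large.
    z = r * np.sqrt((n - 2) / (1.0 - r ** 2))
    return predicted, float(z)

# Simulated data (assumption: five independent variants, allele frequency 0.3).
rng = np.random.default_rng(0)
n_individuals, n_variants = 2000, 5
genotypes = rng.binomial(2, 0.3, size=(n_individuals, n_variants)).astype(float)

# Hypothetical expression weights for one gene (not taken from any real panel).
weights = np.array([0.4, -0.2, 0.0, 0.3, 0.1])

# One trait that depends on the gene's expression, and one that does not.
causal_trait = 1.0 * (genotypes @ weights) + rng.normal(size=n_individuals)
null_trait = rng.normal(size=n_individuals)

predicted, z_causal = twas_gene_test(genotypes, weights, causal_trait)
_, z_null = twas_gene_test(genotypes, weights, null_trait)

print(f"z when expression affects the trait: {z_causal:.2f}")
print(f"z when there is no effect:           {z_null:.2f}")
```

Because the effects of several variants are collapsed into one predicted-expression value, the trait is tested once per gene rather than once per variant, which is the source of the power gain discussed above.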
Recent research in transcriptional regulation suggests that distal genetic variants account for up to 70% of gene expression heritability. These findings support the omnigenic model of Boyle et al., which proposes that regulatory networks are so interconnected that most genetic variants in the genome, whether local or distal, have indirect effects on gene expression levels. In fact, this research found a significant enrichment of genetic signals near genes in trait-relevant pathways for biologically simple traits. Even so, most of the heritability of these phenotypes, which are commonly regarded as simpler than complex diseases such as cancer, is driven by thousands of variants dispersed across most of the genome rather than by variants in core genes.
(1) TWAS loci often contain multiple associated genes: Genome-wide association studies (GWAS) are well known for identifying blocks of associated variants in linkage disequilibrium (LD) rather than single variant–trait associations. Likewise, TWAS frequently finds multiple hit genes per locus.
(2) False hits may occur when the expression of different genes is correlated across individuals.
(3) False hits may also be caused by correlated predicted expression: TWAS tests for association with genetically predicted expression rather than total expression. Total expression has a genetic component, with contributions from common cis eQTLs, rare cis eQTLs, and trans eQTLs, as well as environmental and technical components. Predicted expression accounts for only a small portion of total expression.
(4) Shared GWAS variants may result in false positives: More broadly, pairs of gene models may share variants (or at least variants in LD with one another) even when their predicted expression correlation is low, because other variants that differ between the models may "dilute" the correlation.
(5) Bias from expression panels in tissues unrelated to the trait: Tissues with large expression panels (such as whole blood or lymphoblastoid cell lines) are frequently used to maximize power, even when they are mechanistically unrelated to the trait.
(6) TWAS enhances the prioritization of causal genes.
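The caveat in item (4) can be illustrated with a toy simulation. It assumes five independent variants (no LD), two hypothetical gene models that share only the first variant, and a trait that is affected by that shared variant directly rather than through either gene. All weights and effect sizes are invented for illustration.

```python
import numpy as np

def assoc_z(x, y):
    """Approximate z-score for the correlation between x and y."""
    r = np.corrcoef(x, y)[0, 1]
    return float(r * np.sqrt((len(y) - 2) / (1.0 - r ** 2)))

rng = np.random.default_rng(1)
n = 5000
# Simulated dosages for five independent variants (allele frequency 0.3).
g = rng.binomial(2, 0.3, size=(n, 5)).astype(float)

# Hypothetical expression models: variant 0 is shared, the others are distinct.
w_a = np.array([0.3, 0.6, 0.6, 0.0, 0.0])
w_b = np.array([0.3, 0.0, 0.0, 0.6, 0.6])
pred_a = g @ w_a
pred_b = g @ w_b

# Assumption: the shared variant acts on the trait directly, not via gene A or B.
trait = 1.0 * g[:, 0] + rng.normal(size=n)

pred_corr = float(np.corrcoef(pred_a, pred_b)[0, 1])
z_a = assoc_z(pred_a, trait)
z_b = assoc_z(pred_b, trait)

print(f"correlation of predicted expression: {pred_corr:.2f}")
print(f"z for gene A: {z_a:.2f}")
print(f"z for gene B: {z_b:.2f}")
```

In this setting the distinct variants dilute the correlation between the two predicted expression values, yet both genes are strongly associated with the trait through the single shared variant, so a low predicted expression correlation does not rule out a shared, non-causal signal.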
References