A genome-wide association study (GWAS) tests markers across the complete genomes of many individual organisms to find genetic variations associated with traits such as disease susceptibility or resistance. Once novel genetic associations are identified, better strategies for plant breeding, animal breeding, and human disease treatment and clinical care can be developed.
Figure 1. Genome-wide association study (GWAS) workflow. (Reed, 2015)
Generally, a GWAS needs sample sizes in the thousands to have sufficient statistical power to detect modest associations, such as odds ratios of 1.5, while evaluating huge numbers of SNPs. A genome-wide association study features two groups of participants: plants, animals, or humans with the disease being studied (cases) and similar organisms without the disease (controls). Researchers extract DNA from each organism, place it on tiny chips, and scan the chips on automated laboratory machines. Each organism's genome is then screened for strategically selected markers of genetic variation, which can be single nucleotide polymorphisms (SNPs) or simple sequence repeats (SSRs). Extracted DNA can also be screened for variation using PCR-based methods.
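To illustrate why sample sizes in the thousands are typical, the sketch below estimates how many cases (with an equal number of controls) an allele-based comparison would need. It is a rough approximation and not a method taken from this article: it uses the standard normal-approximation formula for comparing two proportions, and the function name `cases_needed`, the control allele frequency of 0.2, the significance level of 5e-8, and the 80% power target are all assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def cases_needed(control_freq, odds_ratio, alpha=5e-8, power=0.8):
    """Approximate number of cases (with as many controls) needed to detect
    an allelic odds ratio, using a two-proportion normal approximation.

    All default values are illustrative assumptions.
    """
    # Convert the control allele frequency and odds ratio to a case frequency.
    case_odds = odds_ratio * control_freq / (1 - control_freq)
    case_freq = case_odds / (1 + case_odds)

    # Critical values for a two-sided test at `alpha` and the desired power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)

    # Number of alleles per group required by the approximation.
    variance = case_freq * (1 - case_freq) + control_freq * (1 - control_freq)
    alleles = (z_alpha + z_beta) ** 2 * variance / (case_freq - control_freq) ** 2

    # Each diploid individual contributes two alleles.
    return math.ceil(alleles / 2)


n_or_15 = cases_needed(control_freq=0.2, odds_ratio=1.5)
n_or_12 = cases_needed(control_freq=0.2, odds_ratio=1.2)
print(f"Cases needed for OR 1.5: {n_or_15}")
print(f"Cases needed for OR 1.2: {n_or_12}")
```

Under these assumptions, the required number of cases is already above a thousand for an odds ratio of 1.5 and grows quickly as the odds ratio approaches 1. Rarer alleles and weaker effects push the requirement higher still.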
A genetic variation is said to be "associated" with the disease if it is significantly more frequent in individuals with the disease than in individuals without it. Associated variations can serve as powerful pointers to the region of the organism's genome where the disease or insect susceptibility or resistance resides. The difference in the frequency of a screened SNP between cases and controls is assessed with a p-value, which indicates whether the difference is statistically significant. Results are usually displayed in a Manhattan plot, with the negative logarithm of each p-value plotted against its genomic position. It is important to note, however, that the associated variants themselves may not directly cause the disease, as they may just be "tagging along" with the actual causal variants. For this reason, researchers often need to take additional steps, such as sequencing DNA base pairs in that genome locus, to identify the specific genetic change involved in the disease or insect susceptibility or resistance.
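The case-control comparison described above can be sketched as a Pearson chi-square test on a 2x2 table of allele counts for a single SNP. This is a minimal example rather than a full GWAS pipeline: the allele counts are invented for illustration, the function name `allelic_test` is hypothetical, and the threshold of 5e-8 is the conventional genome-wide significance level for human studies, assumed here and not stated in this article.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # conventional threshold (assumption, see lead-in)


def allelic_test(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on allele counts.

    Returns the chi-square statistic, its p-value, and the allelic odds ratio.
    """
    row_case = case_alt + case_ref
    row_ctrl = ctrl_alt + ctrl_ref
    col_alt = case_alt + ctrl_alt
    col_ref = case_ref + ctrl_ref
    total = row_case + row_ctrl
    if 0 in (row_case, row_ctrl, col_alt, col_ref):
        raise ValueError("every row and column of the table must be non-empty")

    # Shortcut formula for the chi-square statistic of a 2x2 table.
    chi2 = (total * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
            / (row_case * row_ctrl * col_alt * col_ref))

    # For 1 degree of freedom the upper tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))

    # Odds of carrying the alternate allele in cases relative to controls.
    if case_ref * ctrl_alt:
        odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    else:
        odds_ratio = math.inf
    return chi2, p_value, odds_ratio


# Invented counts: 1000 cases and 1000 controls, two alleles per individual.
chi2, p_value, odds_ratio = allelic_test(
    case_alt=600, case_ref=1400, ctrl_alt=500, ctrl_ref=1500
)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}, OR = {odds_ratio:.2f}")
print("genome-wide significant:", p_value < GENOME_WIDE_ALPHA)
print(f"Manhattan plot height (-log10 p): {-math.log10(p_value):.2f}")
```

With these made-up counts, the SNP would look significant in a single test but would not pass the much stricter genome-wide threshold. That gap is one reason large sample sizes matter when many SNPs are tested at once.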
GWAS facilitates the identification of novel SNV-trait associations and the discovery of novel biological mechanisms, and it can be applied to low-frequency and rare variants and to structural variation. GWAS data are often publicly accessible, and the method itself is relatively straightforward. Conversely, GWAS on its own cannot always separate true signals from spurious ones, detect epistasis, pinpoint causal variants, or predict disease progression. Because of these limitations, a focus on post-GWAS experiments such as functional studies and gene network analysis has been proposed. GWAS can also be expensive because of the huge sample sizes required; hence, multistage study designs are used, in which a subset of the population is genotyped in the discovery stage and the most strongly associated SNPs are then genotyped with less expensive platforms.
Criticism of GWAS's limitations has resulted in skepticism among funding agencies and organizations. However, genome-wide association studies have answered many questions about human diseases. In 2005, studies found that a common form of blindness is associated with variation in the gene that produces a protein involved in regulating inflammation. GWAS have also identified genetic variations that contribute to the risk of type 2 diabetes, heart disorders, obesity, and prostate cancer, as well as variations that influence drug response. These results support genetic testing, which can give insights into such variations.
The bioinformatics analysis department of CD Genomics provides novel solutions for data-driven innovation aimed at discovering the hidden potential in biological data, tapping new insights related to life science research, and predicting new prospects.