Phenotype Databases

Genotype, also called genetic type, refers broadly to the complete genetic makeup of an organism; more specifically, it is the combination of alleles at one or a few gene loci. Phenotype refers to the observable expression of one trait, a few traits, or all traits of an organism. Under appropriate environmental conditions, genotype is the internal cause of an organism's phenotype; the phenotype is the result of the combined effect of genotype and environment. Association analysis of genotype and phenotype can provide a genetic data basis for research on diseases, especially hereditary diseases. Here, CD Genomics lists some common phenotype databases.
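As a rough illustration of what a genotype-phenotype association analysis can look like at a single locus, the sketch below computes a Pearson chi-square statistic on a 2x2 table of allele counts in cases and controls. The function names and the counts are hypothetical and chosen only for illustration; real studies typically add quality control, covariates, and multiple-testing correction.

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 allele-count table."""
    observed = [[case_alt, case_ref], [control_alt, control_ref]]
    total = case_alt + case_ref + control_alt + control_ref
    row_sums = [sum(row) for row in observed]
    col_sums = [case_alt + control_alt, case_ref + control_ref]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of allele and case/control status.
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical allele counts, not taken from any real study.
chi2 = allelic_chi_square(case_alt=60, case_ref=40, control_alt=40, control_ref=60)
print(f"chi-square = {chi2:.3f}, p = {p_value_1df(chi2):.4g}")
```

A small p-value here only suggests that allele frequency differs between the two groups; it does not by itself establish a causal link between the locus and the phenotype.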

1. dbGaP

− The Database of Genotypes and Phenotypes (dbGaP) is an NIH-sponsored repository charged with archiving, curating, and distributing information produced by studies that investigate the interaction of genotype and phenotype. Information in dbGaP is organized hierarchically and includes the accessioned objects, phenotypes (as variables and datasets), various molecular assay data (SNP and expression array data, sequence data, and epigenomic marks), analyses, and documents.
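To make the hierarchical organization more concrete, the following sketch models a study that contains phenotype datasets, each of which contains variables. It is a simplified, hypothetical data model and not dbGaP's actual schema or API; the class names, field names, and accession strings are placeholders invented for this example.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Variable:
    accession: str          # placeholder identifier, not a real accession
    name: str
    description: str = ""


@dataclass
class Dataset:
    accession: str
    name: str
    variables: List[Variable] = field(default_factory=list)


@dataclass
class Study:
    accession: str
    title: str
    datasets: List[Dataset] = field(default_factory=list)

    def iter_variables(self) -> Iterator[Variable]:
        """Walk the hierarchy: study -> datasets -> variables."""
        for dataset in self.datasets:
            yield from dataset.variables

    def find_variables(self, keyword: str) -> List[Variable]:
        """Return variables whose name or description mentions the keyword."""
        kw = keyword.lower()
        return [
            v for v in self.iter_variables()
            if kw in v.name.lower() or kw in v.description.lower()
        ]


# Hypothetical content for illustration only.
study = Study(
    accession="STUDY_0001",
    title="Example cohort",
    datasets=[
        Dataset("DATASET_0001", "Baseline exam", [
            Variable("VAR_0001", "systolic_bp", "Systolic blood pressure (mmHg)"),
            Variable("VAR_0002", "height_cm", "Standing height"),
        ]),
    ],
)
print([v.accession for v in study.find_variables("pressure")])
```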

2. HPO

− The Human Phenotype Ontology provides a standardized vocabulary of phenotypic abnormalities encountered in human disease. Each term in the HPO describes a phenotypic abnormality. HPO currently contains over 13,000 terms and over 156,000 annotations to hereditary diseases. The HPO project and others have developed software for phenotype-driven differential diagnostics, genomic diagnostics, and translational research. The HPO is a flagship product of the Monarch Initiative, an NIH-supported international consortium dedicated to semantic integration of biomedical and model organism data with the goal of improving biomedical research. The HPO, as a part of the Monarch Initiative, is a central component of one of the 13 driver projects in the Global Alliance for Genomics and Health (GA4GH) strategic roadmap.
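The idea behind phenotype-driven differential diagnostics can be sketched as ranking candidate diseases by how well their annotated phenotype terms match the terms observed in a patient. The example below uses plain set overlap (Jaccard similarity) as a stand-in; actual tools may use more elaborate, ontology-aware similarity measures. The disease names and term labels are hypothetical placeholders, not real HPO annotations.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of phenotype terms."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_diseases(patient_terms, disease_annotations):
    """Return (disease, score) pairs sorted by descending similarity, then by name."""
    scores = [
        (name, jaccard(patient_terms, terms))
        for name, terms in disease_annotations.items()
    ]
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Hypothetical disease-to-phenotype annotations for illustration only.
annotations = {
    "DISEASE_A": {"seizure", "microcephaly", "hypotonia"},
    "DISEASE_B": {"short stature", "scoliosis"},
    "DISEASE_C": {"seizure", "ataxia"},
}
patient = {"seizure", "hypotonia"}

for disease, score in rank_diseases(patient, annotations):
    print(f"{disease}: {score:.2f}")
```

Using a standardized vocabulary is what makes this kind of comparison possible: two clinicians describing the same abnormality select the same term, so the sets can be compared directly.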

3. GWAS Central

− GWAS Central (previously the Human Genome Variation database of Genotype-to-Phenotype information) is a database of summary-level findings from genetic association studies, both large and small. GWAS Central is built upon a basal layer of Markers that comprises all known SNPs and other variants from public databases such as dbSNP and the DBGV. GWAS Central contains 70,566,447 associations between 3,251,694 unique SNPs and 1,451 unique MeSH disease / phenotype descriptions.
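Summary-level association data of this kind can be thought of as records linking a marker to a phenotype description with a p-value. The sketch below groups markers by phenotype after applying a significance cutoff. The record layout, marker IDs, and phenotype labels are hypothetical and do not reflect GWAS Central's export format; the default threshold of 5e-8 is a commonly used genome-wide significance convention, assumed here for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Association(NamedTuple):
    marker: str       # e.g. a SNP identifier (placeholder values below)
    phenotype: str    # e.g. a disease / phenotype description
    p_value: float


def significant_by_phenotype(
    associations: Iterable[Association], threshold: float = 5e-8
) -> Dict[str, List[str]]:
    """Group markers by phenotype, keeping only associations at or below the threshold."""
    grouped = defaultdict(list)
    for assoc in associations:
        if assoc.p_value <= threshold:
            grouped[assoc.phenotype].append(assoc.marker)
    return dict(grouped)


# Hypothetical summary-level records for illustration only.
records = [
    Association("MARKER_1", "Phenotype X", 3e-9),
    Association("MARKER_2", "Phenotype X", 0.02),
    Association("MARKER_3", "Phenotype Y", 1e-10),
    Association("MARKER_1", "Phenotype Y", 4e-8),
]
print(significant_by_phenotype(records))
```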

4. EGA

− The European Genome-phenome Archive is a service for permanent archiving and sharing of all types of personally identifiable genetic and phenotypic data resulting from biomedical research projects.

5. MonarchInit

− The Monarch Initiative is a collaborative, open science effort that aims to semantically integrate genotype-phenotype data from many species and sources in order to support precision medicine, disease modeling, and mechanistic exploration. It provides an integrated knowledge graph, analytic tools, and web services that enable diverse users to explore relationships between phenotypes and genotypes across species.

6. PHI-base

− The Pathogen-Host Interactions database (PHI-base) links mutant genes to their phenotypes. The mission of PHI-base is to provide expertly curated molecular and biological information on genes proven to affect the outcome of pathogen-host interactions. Information is also given on the target sites of some anti-infective chemistries.

7. GenomeRNAi

− RNA interference (RNAi) is a powerful method for systematically studying loss-of-function phenotypes on a large scale with a wide variety of biological assays, and it constitutes a rich source for the assignment of gene function. RNAi phenotype data from human and Drosophila, extracted from the literature, are available in the GenomeRNAi database. It also provides RNAi reagent information, along with an assessment of their efficiency and specificity.


8. CMPO

− The Cellular Microscopy Phenotype Ontology (CMPO) provides a species-neutral controlled vocabulary for describing phenotypic qualities relating to the whole cell, cellular components, cellular processes, and cell populations. Terms from CMPO are being used to annotate phenotype descriptions from high-content screening databases and cellular image repositories. Annotating data with ontologies adds value to the data, which can then be exploited computationally to support smarter querying and to facilitate data integration.


* For Research Use Only. Not for use in diagnostic procedures.