Basic research in the life sciences, including basic medicine and pharmacy, often takes small molecules as research targets. Before an experiment begins, it is often necessary to mine and analyze the existing data on those molecules. For example, we might want to find the proteins or other targets that interact with our small molecule. Here we list several commonly used interaction databases for studying the interactions of small molecules and proteins.
− IntAct Molecular Interaction Database is an open-source, open data molecular interaction database populated by data either curated from the literature or from direct data depositions. Two levels of curation are now available within the database, with both IMEx-level annotation and less detailed MIMIx-compatible entries currently supported. So far, the database includes 21,639 publications, 1,063,382 interactions and 117,448 interactors.
− The Biological General Repository for Interaction Datasets (BioGRID) is an open access database dedicated to the curation and archival storage of protein, genetic and chemical interactions for all major model organism species and humans. So far, BioGRID contains records for 1,598,688 biological interactions manually annotated from 55,809 publications for 71 species, as classified by an updated set of controlled vocabularies for experimental detection methods. BioGRID also houses records for >700,000 post-translational modification sites. BioGRID now captures chemical interaction data, including chemical-protein interactions for human drug targets drawn from the DrugBank database and manually curated bioactive compounds reported in the literature. A new dedicated aspect of BioGRID annotates genome-wide CRISPR/Cas9-based screens that report gene-phenotype and gene-gene relationships. An extension of the BioGRID resource called the Open Repository for CRISPR Screens (ORCS) database currently contains over 500 genome-wide screens carried out in human or mouse cell lines. All data in BioGRID are freely available without restriction, can be downloaded directly in standard formats, and can be readily incorporated into existing applications via BioGRID's web service platforms. BioGRID data are also freely distributed through partner model organism databases and meta-databases.
− ChEMBL is a manually curated database of bioactive molecules with drug-like properties. It brings together chemical, bioactivity and genomic data to aid the translation of genomic information into effective new drugs.
− STITCH is an aggregated database of interactions connecting over 300,000 chemicals and 2.6 million proteins from 1133 organisms to facilitate the study of interactions between proteins and chemicals. The database can be accessed interactively through a web interface, displaying interactions in an integrated network view. It is also available for computational studies through downloadable files and an API.
− DrugBank is a web-enabled database containing comprehensive molecular information about drugs (mechanisms, interactions, and targets). So far, the database contains 13,645 drug entries, including 2,645 approved small molecule drugs, 1,402 approved biologics (proteins, peptides, vaccines, and allergenics), 131 nutraceuticals and over 6,380 experimental (discovery-phase) drugs.
− The Database of Interacting Proteins (DIP) documents experimentally determined protein-protein interactions. It provides the scientific community with an integrated set of tools for browsing and extracting information on protein interaction networks.
− The STRING database provides information on direct and indirect interactions between proteins, together with a functional annotation analysis system. So far, the database covers 5,090 organisms, 24.6 million proteins and over 2,000 million interactions.
− The Human Protein Reference Database (HPRD) is a centralized platform that visually depicts and integrates information on domain architecture, post-translational modifications, interaction networks and disease association for each protein in the human proteome. All the information in HPRD has been manually extracted from the literature by professional biologists who read, interpret and analyze the published data. HPRD was built using an object-oriented database in Zope, an open-source web application server that provides versatility in query functions and allows data to be displayed dynamically.
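
Several of the databases above can also be queried programmatically. As a rough illustration, the following Python sketch builds a request for the interaction partners of a few proteins from STRING and parses a tab-separated reply. The base URL, the `interaction_partners` method and the `identifiers`, `species` and `limit` parameters are assumptions taken from STRING's public REST API documentation and should be checked against the current version; the function names are hypothetical helpers introduced only for this example.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: base URL as given in STRING's public REST API documentation.
STRING_API = "https://string-db.org/api"


def build_partners_url(identifiers, species=9606, limit=10):
    """Build a URL asking STRING for interaction partners in TSV format."""
    params = {
        # STRING expects multiple identifiers separated by a carriage return.
        "identifiers": "\r".join(identifiers),
        # NCBI taxonomy ID of the organism (9606 = human).
        "species": species,
        # Maximum number of partners returned per query protein.
        "limit": limit,
    }
    return f"{STRING_API}/tsv/interaction_partners?{urlencode(params)}"


def parse_tsv(text):
    """Parse tab-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def fetch_partners(identifiers, species=9606, limit=10):
    """Download and parse interaction partners (requires network access)."""
    url = build_partners_url(identifiers, species, limit)
    with urlopen(url) as response:
        return parse_tsv(response.read().decode("utf-8"))
```

Calling `fetch_partners(["TP53"])` would then return one dictionary per reported partner, keyed by whatever column names the service sends back. Keeping URL construction and parsing in separate functions makes it possible to check both without contacting the server.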