Summary of The Most Complete Non-Coding RNA Database

Article Overviews

01 Long Non-Coding RNA (lncRNA) Databases
02 MicroRNA Database
03 Circular RNA (circRNA) Databases

Long Non-Coding RNA (lncRNA) Databases

Long non-coding RNAs (lncRNAs) are RNA molecules longer than 200 nucleotides. Extensive research has shown that lncRNAs play crucial roles in diverse biological processes, including dosage compensation, epigenetic regulation, cell cycle regulation, and cellular differentiation. As a result, lncRNAs have become a major focus of genetic research. The databases below are dedicated to lncRNAs.

LncRNADisease

http://www.cuilab.cn/lncrnadisease

Figure 1. Overview of the LncRNADisease database

The LncRNADisease database curates experimentally supported associations between lncRNAs and diseases. It also integrates tools for predicting novel lncRNA-disease associations, with the aim of providing comprehensive functional annotation of human lncRNAs. In addition, LncRNADisease curates lncRNA interactions at various levels, including interactions with proteins, RNAs, miRNAs, and DNA, and currently offers predicted disease associations for 1564 human lncRNAs.

lncRNASNP

http://bioinfo.life.hust.edu.cn/lncRNASNP2

Figure 2. Overview of the lncRNASNP2 database:
(A) Cataloging Variants (SNPs, TCGA Cancer Mutations, and CosmicNCVs) within lncRNAs and their Influence on lncRNA Structure or Function.
(B) Illustration of miRNA Target Augmentation Caused by a Variant.
(C) Demonstration of lncRNA Expression Patterns in Cancers Presented in Histograms.

lncRNASNP is a comprehensive database of single nucleotide polymorphisms (SNPs) within long non-coding RNAs (lncRNAs) of both humans and mice. It covers SNP loci within lncRNAs and their impact on lncRNA structure, mutations within lncRNAs, and the binding of lncRNAs to miRNAs. It also analyzes how SNP loci affect the interaction between lncRNAs and miRNAs. The current version, lncRNASNP2, includes 141,353 human lncRNAs and 10,205,295 SNPs.

Lnc2Cancer

http://www.bio-bigdata.net/lnc2cancer

Figure 3. Interface of Lnc2Cancer 3.0. The database covers the regulatory mechanisms, biological functions, and clinical implications of lncRNAs and circRNAs in cancer. It also provides a suite of tools for exploring, visualizing, and analyzing lncRNAs at the RNA-seq and scRNA-seq levels.

Lnc2Cancer is a manually curated database for which the authors systematically collected over 6500 publications from PubMed describing associations between lncRNAs and cancer. Through comprehensive annotation, it refines the links between lncRNAs and cancer, allowing users to assess the experimental support for each association. It also scores lncRNA or circRNA associations with human cancers and offers access to high-throughput experimental data on the lncRNA landscape in cancer.

RNALocate

http://www.rna-society.org/rnalocate

Figure 4. Interface of RNALocate v2.0

RNALocate serves as an efficient repository facilitating the processing, exploration, and analysis of RNA subcellular localization. The current version of RNALocate encompasses over 190,000 entries related to RNA subcellular localization, providing both experimental and predictive evidence. This compilation spans more than 105,000 RNAs across 65 species, including Homo sapiens, Mus musculus, and Saccharomyces cerevisiae. It encompasses over 21,800 RNAs of various types (e.g., mRNA, miRNA, lncRNA) and 42 subcellular localizations (primarily nucleus, cytoplasm, endoplasmic reticulum, and ribosomes).

LNCipedia

https://lncipedia.org/

Figure 5. Generation of LNCipedia, a multistep procedure involving import, naming, analysis, and visualization of lncRNA genes. Import scripts for FASTA, BED, and GFF files process lncRNA transcripts and identify redundant entries. Naming follows the construction of lncRNA transcript clusters and requires information about the nearest protein-coding gene on the same DNA strand. Each lncRNA transcript is then analyzed with multiple algorithms, and the results are integrated into the database. A web interface written in Perl supports lncRNA visualization and database queries.

LNCipedia is a public database for storing the sequences and annotations of long non-coding RNAs (lncRNAs). It merges information from various human lncRNA databases, substantially reducing the fragmentation of lncRNA resources. The integrated sources include LncRNAdb, Broad Institute, Ensembl, GENCODE, and RefSeq, among others, and each entry is assigned a unified identifier. The database also includes the genomic positions, lengths, and structures of lncRNA transcripts, miRNA binding information, and cross-references to lncRNA records in other databases. Users can submit, search, and download lncRNA-related information from the database, which has now been upgraded to version 5.3.
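
The naming step described for LNCipedia depends on finding the nearest protein-coding gene on the same DNA strand. The sketch below illustrates that lookup only. It is not LNCipedia's own code: the `Feature` class, the function names, the 0-based half-open coordinates, and the toy records are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Feature:
    """A genomic interval (hypothetical structure, 0-based half-open)."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def gap(a: Feature, b: Feature) -> int:
    """Number of bases between two intervals; 0 if they overlap or touch."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def nearest_same_strand_gene(lnc: Feature, genes: Iterable[Feature]) -> Optional[Feature]:
    """Return the closest gene on the same chromosome and strand, or None."""
    candidates = [g for g in genes if g.chrom == lnc.chrom and g.strand == lnc.strand]
    if not candidates:
        return None
    return min(candidates, key=lambda g: gap(lnc, g))


# Toy records, invented for this example.
genes = [
    Feature("GENE_A", "chr1", 5000, 6000, "+"),
    Feature("GENE_B", "chr1", 2100, 2500, "-"),  # closer, but on the other strand
    Feature("GENE_C", "chr1", 10000, 11000, "+"),
]
lnc = Feature("lnc_example", "chr1", 1000, 2000, "+")
nearest = nearest_same_strand_gene(lnc, genes)
print(nearest.name if nearest else "no gene on this strand")
```

For large annotation sets, an interval index would be preferable to this linear scan, but the scan keeps the same-strand rule easy to see.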

AnnoLnc

http://annolnc.gao-lab.org/

Figure 6. The web interface of AnnoLnc, which guides users from sequence submission to annotation results. (a) The "Home" page, where users submit lncRNA sequences to start on-the-fly analysis. (b) The "Overview" page, which shows the run status and key details of the processed lncRNAs. (c) An example annotation result page with comprehensive annotations and a sidebar for quick navigation.

AnnoLnc serves as an integrated platform for systematically annotating novel human lncRNAs. Leveraging over 700 data resources and diverse toolchains, its comprehensive annotation encompasses genome positioning, secondary structure, expression patterns, transcriptional regulation, miRNA interactions, protein interactions, genetic correlations, and evolutionary insights.

MicroRNA Database

MicroRNAs (miRNAs) are endogenous small RNAs, approximately 20-24 nucleotides in length, which play diverse regulatory roles within cells. Each miRNA can target the expression of multiple genes, while several miRNAs can also regulate the expression of the same gene. It is estimated that miRNAs regulate one-third of human genes.
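
The many-to-many relationship described above (one miRNA targeting several genes, and several miRNAs regulating the same gene) can be represented as a pair of lookup tables. The following is a minimal sketch; the miRNA and gene names are placeholders invented for illustration and do not describe real interactions.

```python
from collections import defaultdict


def build_indexes(pairs):
    """Build miRNA -> genes and gene -> miRNAs lookups from (miRNA, gene) pairs."""
    targets_of = defaultdict(set)
    regulators_of = defaultdict(set)
    for mirna, gene in pairs:
        targets_of[mirna].add(gene)
        regulators_of[gene].add(mirna)
    return targets_of, regulators_of


# Placeholder interaction list (hypothetical names).
pairs = [
    ("miR-X", "GENE1"),
    ("miR-X", "GENE2"),
    ("miR-Y", "GENE2"),
]
targets_of, regulators_of = build_indexes(pairs)
print(sorted(targets_of["miR-X"]))     # genes targeted by one miRNA
print(sorted(regulators_of["GENE2"]))  # miRNAs regulating one gene
```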

Presented below are databases dedicated to miRNAs:

YM500v2

http://ngs.ym.edu.tw/ym500v2/index.php

Figure 7. An example of a 'novel miRNA', showing the alignment of reads obtained from smRNA-seq (A) and CLIP-seq (B).

YM500v2 is an integrated database designed for the quantification of miRNAs in individual small RNA sequencing datasets, serving as a resource for both miRNA identification and novel miRNA prediction. Developed as an extension of YM500, YM500v2 incorporates novel algorithms related to miRNAs and encompasses over 8,000 small RNA sequencing datasets associated with cancer. This database facilitates microRNA research by providing expression profiles and associated analyses of miRNAs. YM500v2 enables various analyses, including target gene prediction and inter-group differential expression studies.

starBase

http://starbase.sysu.edu.cn/

Figure 8. System overview of the starBase core framework. All results produced by starBase are stored in MySQL relational databases and displayed through a visual browser and web interface.

starBase uses gene expression data from 10,882 RNA-seq and 10,546 miRNA-seq datasets across 32 cancer types, enabling researchers to conduct comprehensive pan-cancer analyses of RNA-RNA and RBP-RNA interactions. The platform, also known as ENCORI, additionally supports survival and differential expression analysis of miRNAs, lncRNAs, pseudogenes, and mRNAs. It presents results from multiple miRNA target prediction programs and incorporates diverse functional information on miRNAs and their expression patterns in tumors.

miRWalk

http://mirwalk.umm.uni-heidelberg.de/

Figure 9. Overview of the query output, showing the results of querying multiple target genes. Users can refine the output by setting various filter options. The output table includes links to additional databases, such as miRBase for miRNA IDs, Ensembl for Ensembl transcript IDs, and NCBI for gene symbols.

miRWalk is a comprehensive miRNA target gene database covering various species, including human, mouse, rat, dog, and cow. It catalogs miRNA binding sites along the full-length sequences of genes and integrates this information with predictions from twelve existing miRNA target prediction programs, which makes the database more comprehensive.
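
Combining predictions from several target prediction programs is often done by counting how many programs support each miRNA-gene pair. The sketch below shows that idea in general terms. It is not miRWalk's integration logic: the tool names, the pairs, and the `min_support` threshold are assumptions chosen for illustration.

```python
from collections import Counter


def consensus(predictions, min_support=2):
    """Keep (miRNA, gene) pairs predicted by at least `min_support` programs.

    `predictions` maps a program name to an iterable of (miRNA, gene) pairs.
    Returns a dict mapping each retained pair to its number of supporting programs.
    """
    support = Counter()
    for pairs in predictions.values():
        for pair in set(pairs):  # count each program at most once per pair
            support[pair] += 1
    return {pair: n for pair, n in support.items() if n >= min_support}


# Hypothetical outputs from three unnamed prediction programs.
predictions = {
    "tool_A": [("miR-X", "GENE1"), ("miR-X", "GENE2")],
    "tool_B": [("miR-X", "GENE1"), ("miR-Y", "GENE3")],
    "tool_C": [("miR-X", "GENE1"), ("miR-X", "GENE2"), ("miR-X", "GENE2")],
}
print(consensus(predictions, min_support=2))
```

Raising `min_support` trades sensitivity for specificity, and the appropriate threshold depends on the programs being combined.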

DIANA Tools

http://diana.imis.athena-innovation.gr/DianaTools/index.php

Figure 10. Interface of DIANA-TarBase v7.0

DIANA Tools is a comprehensive resource for miRNA- and lncRNA-related research. It aims to provide algorithms, databases, and software for interpreting and archiving data in a systematic framework, including analyses of expression regulation derived from deep sequencing data. It supports analyses of miRNA-target gene interactions, miRNA-signaling pathway associations, and miRNA-lncRNA correlations. It also offers automated data analysis and can predict target genes directly from sequence, which is useful for novel miRNAs. In addition, users can access miRNA-related publications and explore information on miRNA-associated promoters, regulatory factors, and transcription factors.

SomamiR

http://compbio.uthsc.edu/SomamiR/home.php

Figure 11. Overview of data integration in SomamiR 2.0.

SomamiR is a specialized database of somatic mutations in microRNAs (miRNAs) and their target sites in cancer cells. It integrates several types of data for investigating the impact of somatic and germline mutations on miRNA function in cancer. The database also provides information on genes in tumor-related pathways that harbor somatic mutations in miRNA target sequences.

miRNEST

http://rhesus.amu.edu.pl/mirnest/copy/

Figure 12. The miRNEST pipeline used for large-scale miRNA discovery from sRNA deep-sequencing data.

miRNEST stands as a comprehensive repository amalgamating microRNA data across animals, plants, and viruses, constituting an integrated resource for microRNAs. At its core, the database encompasses microRNA predictions derived from expressed sequence tag (EST) data spanning 225 animal and 202 plant species. It includes validated microRNA sequences, small RNA sequencing data, expression profiles, diversity metrics, target information, and links to external microRNA resources.

TargetScan

http://www.targetscan.org/vert_71/

Figure 13. miRNA target prediction with the TargetScan database

TargetScan is a software tool for predicting miRNA binding sites, and it performs well in mammalian species. Because mammalian miRNAs exert post-transcriptional regulation by binding to the 3'UTR of target transcripts, the 3'UTR of each transcript must be delineated before target genes can be predicted. The TargetScan database uses a sequencing technique called 3P-seq to determine the 3'UTR regions of transcripts, and it combines these results with existing 3'UTR annotations in NCBI to provide comprehensive 3'UTR sequences.
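
To make the idea of a binding site in a 3'UTR concrete, the sketch below scans a UTR for stretches complementary to a miRNA's seed region (nucleotides 2-8), a widely used simplification of target recognition. It is not TargetScan's algorithm, which the text above does not describe; the sequences are invented, and the function names are hypothetical.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna: str) -> str:
    """Return the mRNA motif (5'->3') that pairs with miRNA nucleotides 2-8."""
    seed = mirna.upper().replace("T", "U")[1:8]
    return seed.translate(COMPLEMENT)[::-1]  # reverse complement of the seed


def find_seed_sites(mirna: str, utr: str) -> list:
    """Return 0-based start positions of every seed-complementary site in the UTR."""
    site = seed_site(mirna)
    utr = utr.upper().replace("T", "U")
    hits = []
    pos = utr.find(site)
    while pos != -1:
        hits.append(pos)
        pos = utr.find(site, pos + 1)  # allow overlapping matches
    return hits


# Invented sequences, for illustration only.
mirna = "UACGGAUCCGAUUAGCAUGGAA"
utr = "AAAGAUCCGUAAACCCGAUCCGUUU"
print(seed_site(mirna))             # GAUCCGU
print(find_seed_sites(mirna, utr))  # [3, 16]
```

Real predictors add further criteria such as site type, conservation, and sequence context, so a seed match alone should be read as a candidate site rather than a prediction.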

miRcode

http://www.mircode.org/

Figure 14. miRcode workflow for mapping conserved putative microRNA target sites in lncRNAs

miRcode, built on the comprehensive GENCODE gene annotation, provides a map of human microRNA target predictions across the entire transcriptome, covering all GENCODE-annotated transcripts, including 10,419 annotated lncRNAs. Transcript annotations derive from GENCODE v11 and are grouped into distinct classes. miRcode also covers coding genes, including atypical target regions such as the 5'UTR and CDS. Compared with TargetScan, miRcode mainly extends the search to ncRNAs and non-3'UTR regions.

Circular RNA (circRNA) Databases

circRNA databases are repositories dedicated to circular RNAs (circRNAs), a distinct class of non-coding RNA molecules, some of which are expressed in vivo, and a recent frontier in RNA research. Unlike conventional linear RNAs, which have 5' and 3' ends, circRNAs form closed loops. This makes them resistant to RNA exonucleases and therefore more stable. circRNA research surged around 2010 with the advent of RNA-seq technology and the subsequent development of specialized computational pipelines.

Presented below are databases focused on circRNA:

circBase

http://www.circbase.org/

Figure 15. Overview of circBase. (A) circBase table browser. (B) circBase results page. (C) Reads mapped to a head-to-tail splice junction. (D) circBase single record page. (E) UCSC genome browser on doRiNA, a database for post-transcriptional regulatory elements.

circBase is a comprehensive database dedicated to circular RNA (circRNA) that aggregates information from various species. Using the find_circ software, circRNAs are predicted from rRNA-depleted RNA-seq libraries. Users can explore circRNAs individually or as lists, and the entire circRNA dataset is available for download and deployment on local servers. circBase also supports sequence alignment with BLAT, similar to the functionality provided by the UCSC Genome Browser.
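
circRNA detection relies on reads that span a head-to-tail (back-splice) junction, where the order of the two aligned read segments is reversed relative to linear splicing. The sketch below checks that ordering for a single split read. It is a simplified illustration, not the find_circ implementation: the `Segment` class, the coordinate convention, and the example alignments are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Alignment of one part of a split read (hypothetical, 0-based half-open)."""
    chrom: str
    start: int
    end: int
    strand: str  # strand of the transcript, "+" or "-"


def is_head_to_tail(first: Segment, second: Segment) -> bool:
    """Return True if the two parts of a read are in back-splice order.

    `first` is the 5' part of the read in transcript orientation and `second`
    is the 3' part. In linear splicing the 5' part lies upstream of the 3' part
    along the transcript; at a back-splice junction the order is reversed.
    """
    if first.chrom != second.chrom or first.strand != second.strand:
        return False
    if first.strand == "+":
        return second.end <= first.start
    return first.end <= second.start


# Invented alignments for illustration.
back_spliced = is_head_to_tail(
    Segment("chr1", 5000, 5050, "+"), Segment("chr1", 1000, 1050, "+")
)
linear = is_head_to_tail(
    Segment("chr1", 1000, 1050, "+"), Segment("chr1", 5000, 5050, "+")
)
print(back_spliced, linear)  # True False
```

Production pipelines additionally check splice-site signals, mapping quality, and junction uniqueness before reporting a circRNA.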

CIRCpedia v2

http://www.picb.ac.cn/rnomics/circpedia

Figure 16. Homepage of CIRCpedia v2. A. Summary of annotated circRNAs in CIRCpedia v2. B. Datasets used in CIRCpedia v2.

CIRCpedia v2 represents an updated comprehensive database featuring circRNA annotations derived from over 180 RNA-seq datasets across six different species, including humans, mice, fruit flies, and zebrafish. It encompasses the identification of 262,782 circular RNAs. Users can explore circRNAs within CIRCpedia based on species, cell lines, gene names, or genomic positions. The database provides detailed information on circular RNA IDs, their source genes, corresponding linear transcripts, expression levels, exon start and end positions, cell lines, conservation status, and more. Furthermore, expression levels of circular RNAs across different tissues or cell lines are visualized using heatmaps or scatter plots.

circRNADisease

http://cgga.org.cn:9091/circRNADisease/

Figure 17. Graphical abstract of circRNADisease v2.0, an improved and reliable database of experimentally validated associations between circular RNAs (circRNAs) and diverse diseases.

circRNADisease is an online database of experimentally validated associations between circRNAs and diseases. It was built by systematically reviewing over 800 publications and catalogs 330 circRNAs and 48 diseases. Each entry includes detailed information on the circRNA-disease association, such as the circRNA and disease names, the circRNA expression pattern, a brief functional description of the circRNA, and other annotations. The annotated entries are primarily from humans.

References:

Miao YR, Liu W, Zhang Q, Guo AY. lncRNASNP2: an updated database of functional SNPs and mutations in human and mouse lncRNAs. Nucleic Acids Res. 2018
Gao Y, Shang S, Guo S, Li X, Zhou H, Liu H, Sun Y, Wang J, Wang P, Zhi H, Li X, Ning S, Zhang Y. Lnc2Cancer 3.0: an updated resource for experimentally supported lncRNA/circRNA cancer associations and web tools based on RNA-seq and scRNA-seq data. Nucleic Acids Res. 2021
Cui T, Dou Y, Tan P, Ni Z, Liu T, Wang D, Huang Y, Cai K, Zhao X, Xu D, Lin H, Wang D. RNALocate v2.0: an updated resource for RNA subcellular localization with increased coverage and annotation. Nucleic Acids Res. 2022
Volders PJ, Helsens K, Wang X, Menten B, Martens L, Gevaert K, Vandesompele J, Mestdagh P. LNCipedia: a database for annotated human lncRNA transcript sequences and structures. Nucleic Acids Res. 2013
Hou M, Tang X, Tian F, et al. AnnoLnc: a web server for systematically annotating novel human lncRNAs. BMC Genomics. 2016
Cheng WC, Chung IF, Tsai CF, Huang TS, Chen CY, Wang SC, Chang TY, Sun HJ, Chao JY, Cheng CC, Wu CW, Wang HW. YM500v2: a small RNA sequencing (smRNA-seq) database for human cancer miRNome research. Nucleic Acids Res. 2015
Yang JH, Li JH, Shao P, Zhou H, Chen YQ, Qu LH. starBase: a database for exploring microRNA-mRNA interaction maps from Argonaute CLIP-Seq and Degradome-Seq data. Nucleic Acids Res. 2011
Sticht C, De La Torre C, Parveen A, Gretz N. miRWalk: An online resource for prediction of microRNA binding sites. PLoS One. 2018
Bhattacharya A, Cui Y. SomamiR 2.0: a database of cancer somatic mutations altering microRNA-ceRNA interactions. Nucleic Acids Res. 2016
Szczesniak MW, Makalowska I. miRNEST 2.0: a database of plant and animal microRNAs. Nucleic Acids Res. 2014
Fu X, Dong D. Bioinformatic Analysis of MicroRNA Sequencing. Transcriptome Data Analysis (pp. 109-125). 2018
Jeggari A, Marks DS, Larsson E. miRcode: a map of putative microRNA target sites in the long non-coding transcriptome. Bioinformatics. 2012
Glažar P, Papavasileiou P, Rajewsky N. circBase: a database for circular RNAs. RNA. 2014
Dong R, Ma XK, Li GW, Yang L. CIRCpedia v2: An Updated Database for Comprehensive Circular RNA Annotation and Expression Comparison. Genomics Proteomics Bioinformatics. 2018