Data Visualization in Bioinformatics

Information visualization (IV) techniques are computational procedures that select, transform, and represent data in a visual form that people can interact with in order to explore and understand it. IV techniques serve two main purposes. First, they let people view a large amount of data at once, which would otherwise be impossible. Second, by revealing patterns and trends, they help extract useful knowledge from large data sets.

Use of Information Visualization in Bioinformatics

Genome and sequence annotation

The sheer size and complexity of an organism's genome make it difficult to keep a global view of the entire genome while also inspecting details of interest. To meet this need, highly dynamic IV techniques with a variety of interactive functions have been developed. Most genomes are treated as linear data, with additional information such as exons, introns, genes, cytogenetic bands, and other annotations placed along the DNA sequence. Genomes are typically mapped directly to a linear representation along one dimension of visual space, with the various kinds of annotation data drawn as parallel tracks along the DNA sequence. Different glyphs can then be used to give visual cues about the properties of the features on each track.
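
As a rough illustration of the track layout described above, the following Python sketch maps genomic coordinates linearly onto one screen axis and gives each annotation type its own parallel row. The function names, the feature coordinates, and the text-based rendering are all invented for illustration and do not reflect any particular genome browser.

```python
def to_column(pos, region_start, region_end, width):
    """Linearly map a genomic position to a column index in [0, width - 1]."""
    span = region_end - region_start
    col = (pos - region_start) * (width - 1) // span
    return max(0, min(width - 1, col))


def render_tracks(tracks, region_start, region_end, width=60):
    """Render each annotation track as one text row along the same axis.

    tracks: dict mapping track name -> list of (start, end, glyph) features.
    """
    rows = []
    for name, features in tracks.items():
        row = ["-"] * width  # background line stands for the DNA sequence
        for start, end, glyph in features:
            first = to_column(start, region_start, region_end, width)
            last = to_column(end, region_start, region_end, width)
            for col in range(first, last + 1):
                row[col] = glyph  # the glyph encodes the feature type
        rows.append(f"{name:<6}|{''.join(row)}|")
    return rows


# Hypothetical annotations for a 1,000 bp region (all values are made up).
example_tracks = {
    "gene": [(100, 900, "=")],
    "exon": [(100, 300, "#"), (600, 900, "#")],
    "band": [(0, 500, "a"), (501, 1000, "b")],
}

for line in render_tracks(example_tracks, 0, 1000):
    print(line)
```

Because every track shares the same coordinate mapping, features that overlap on the genome line up vertically, which is the property that lets a reader compare annotations at a glance.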

Sequence analysis

Sequence analysis is another area of bioinformatics where IV techniques have been widely used. A simple way to visualize a sequence alignment is to place the sequences side by side as lines of text with the consensus segments highlighted. A more sophisticated approach adds a plot of a similarity measure along the aligned sequences.
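
Both ideas can be sketched in a few lines of Python. The example below assumes the sequences are already aligned and of equal length. It derives a consensus, marks mismatches in lowercase, and prints a crude per-column similarity score as one digit per position. The helper names, the toy sequences, and the choice of "fraction sharing the most common residue" as the similarity measure are assumptions made for illustration.

```python
from collections import Counter


def column_profile(alignment):
    """Return (consensus, conservation) for equal-length aligned sequences.

    conservation[i] is the fraction of sequences that share the most
    common character (residue or gap) in column i.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    consensus, conservation = [], []
    for column in zip(*alignment):
        residue, count = Counter(column).most_common(1)[0]
        consensus.append(residue)
        conservation.append(count / len(column))
    return "".join(consensus), conservation


def highlight(seq, consensus):
    """Uppercase positions that match the consensus, lowercase the rest."""
    return "".join(
        c.upper() if c == k else c.lower() for c, k in zip(seq, consensus)
    )


# Toy alignment, invented for illustration.
toy_alignment = ["ACGTTGCA", "ACGATGCA", "ACGTTGAA", "ACG-TGCA"]
consensus, conservation = column_profile(toy_alignment)

for seq in toy_alignment:
    print(highlight(seq, consensus))
print(consensus)
# One digit per column: 9 means fully conserved, lower means more variable.
print("".join(str(min(9, int(score * 10))) for score in conservation))
```

A graphical tool would replace the digit row with a line or bar plot drawn under the alignment, but the underlying per-column calculation would look much the same.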

Ontology, taxonomy, and phylogeny

Ontologies, taxonomies, and phylogenies all have a hierarchical structure, so their visualization methods are similar. Taxonomies and phylogenies are commonly organized as trees, in which items are arranged in a hierarchy and each item has only one parent, as in a virus phylogenetic tree. However, the Gene Ontology (GO), the most widely used ontology in bioinformatics, is organized as a Directed Acyclic Graph (DAG), in which nodes represent concepts, relationships between nodes form a directed hierarchy, and a node can have multiple parents. If duplicated nodes are permitted, a DAG can be reshaped into a tree. The most basic representation of a tree displays child nodes in wedge-like formations beneath their parents.
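
The DAG-to-tree reshaping mentioned above can be sketched as a simple recursive expansion in which a node with several parents is copied once under each of them. The function names and the miniature ontology below are hypothetical and contain no real GO terms.

```python
def dag_to_tree(dag, node):
    """Expand a DAG into a nested tree, copying nodes that have several parents.

    dag: dict mapping each node to a list of its children.
    Returns (node, [subtrees]); a shared child appears once per parent.
    """
    return (node, [dag_to_tree(dag, child) for child in dag.get(node, [])])


def count_nodes(tree):
    """Count the nodes in the expanded tree, duplicates included."""
    _, children = tree
    return 1 + sum(count_nodes(child) for child in children)


def render(tree, depth=0):
    """Return the tree as indented text lines, children beneath their parent."""
    node, children = tree
    lines = ["  " * depth + node]
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical miniature ontology: "shared_term" has two parents.
toy_dag = {
    "root": ["process", "function"],
    "process": ["shared_term"],
    "function": ["shared_term"],
    "shared_term": ["leaf"],
}

tree = dag_to_tree(toy_dag, "root")
print("\n".join(render(tree)))
print(count_nodes(tree), "nodes in the tree versus 5 in the DAG")
```

The duplication is the price of the simpler layout: every copy of a shared node drags its whole subgraph along, so the expanded tree can grow much larger than the original DAG.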

Expression profile

High-throughput measurement methods such as DNA microarrays and 2D gel electrophoresis make it possible to take a snapshot of the expression profile of nearly the entire genome. IV techniques have become indispensable for evaluating the high-dimensional data these methods produce, and they have been widely used to support the clustering of microarray expression profiles. A common technique is to visualize the outcome of hierarchical clustering as a colored mosaic or a dendrogram.
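
To make the link between clustering and the dendrogram concrete, here is a minimal sketch of agglomerative clustering using only the Python standard library. It assumes Euclidean distance and average linkage, which are just one common choice among several, and the gene names and expression values are invented. The nested tuple it returns has the same structure a dendrogram draws.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def leaves(tree):
    """List the gene names under a nested-tuple cluster."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def average_linkage(profiles):
    """Repeatedly merge the two closest clusters; return the merge tree."""
    clusters = list(profiles)  # start with one cluster per gene name

    def distance(c1, c2):
        pairs = [(a, b) for a in leaves(c1) for b in leaves(c2)]
        total = sum(euclidean(profiles[a], profiles[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: distance(*p))
        clusters = [c for c in clusters if c != c1 and c != c2]
        clusters.append((c1, c2))
    return clusters[0]


def to_newick(tree):
    """Serialize the merge tree in a Newick-style bracket notation."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(to_newick(child) for child in tree) + ")"


# Made-up expression profiles: three conditions per gene.
toy_profiles = {
    "geneA": [1.0, 2.0, 3.0],
    "geneB": [1.1, 2.1, 2.9],
    "geneC": [5.0, 0.5, 0.2],
    "geneD": [4.8, 0.4, 0.1],
}

dendrogram = average_linkage(toy_profiles)
print(to_newick(dendrogram) + ";")
```

In practice a numerical library would do the clustering and a plotting library would draw the tree next to a color-coded matrix of the expression values. This sketch only shows where the tree structure comes from.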

Another common application of IV in expression profile analysis is mapping gene expression profiles to annotated gene functions, so that functional profiles can be derived from expression profiles. Most of these tools map gene expression to the widely accepted annotation, Gene Ontology (GO), and rely on GO's hierarchical structure. IV techniques have also been used to map gene expression data to gene regulation pathways and chromosome locations, showing expression profiles in the context of both.
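
One plausible way to derive a functional profile from expression data is sketched below. Each gene's value is propagated from its annotated terms up through the ontology hierarchy and then averaged per term. The term names, gene names, values, and the use of a plain mean are all assumptions for illustration. The source does not specify how any particular tool aggregates expression.

```python
def ancestors(term, parents):
    """Return the term plus every term reachable through parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def functional_profile(expression, annotations, parents):
    """Average expression per ontology term, counting each gene once per term."""
    genes_by_term = {}
    for gene, terms in annotations.items():
        if gene not in expression:
            continue  # skip annotated genes that were not measured
        covered = set()
        for term in terms:
            covered |= ancestors(term, parents)
        for term in covered:
            genes_by_term.setdefault(term, set()).add(gene)
    return {
        term: sum(expression[g] for g in genes) / len(genes)
        for term, genes in genes_by_term.items()
    }


# Hypothetical ontology fragment: "shared" has two parents, as a DAG allows.
toy_parents = {
    "child_a": ["root"],
    "child_b": ["root"],
    "shared": ["child_a", "child_b"],
}
# Hypothetical annotations and made-up expression values.
toy_annotations = {"g1": ["shared"], "g2": ["child_a"], "g3": ["child_b"]}
toy_expression = {"g1": 2.0, "g2": 4.0, "g3": -1.0}

profile = functional_profile(toy_expression, toy_annotations, toy_parents)
for term in sorted(profile):
    print(f"{term}: {profile[term]:.2f}")
```

The resulting per-term values are what a viewer would then encode visually, for example by coloring nodes of the ontology graph.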

Molecular pathway

The intricacy of molecular pathways, such as metabolic pathways, gene regulation pathways, and signal transduction networks, makes IV techniques necessary for displaying pathway data. Although various kinds of pathways have been visualized, the techniques are quite similar: a network structure in 2-D visual space in which nodes represent molecules and edges represent the relations between them.
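
The node-and-edge model described above can be written down as a small data structure and handed to a layout tool. The sketch below emits text in the Graphviz DOT language, with a different node shape per molecule type. The function name, the shape assignments, and the edge labels are illustrative choices. The example uses the first step of glycolysis, in which hexokinase converts glucose to glucose-6-phosphate.

```python
def pathway_to_dot(nodes, edges):
    """Serialize a pathway as Graphviz DOT text.

    nodes: dict mapping molecule name -> molecule type.
    edges: list of (source, target, relation) tuples.
    """
    # Shape per molecule type is an arbitrary choice for this sketch.
    shapes = {"metabolite": "ellipse", "enzyme": "box", "gene": "diamond"}
    lines = ["digraph pathway {"]
    for name, kind in nodes.items():
        lines.append(f'  "{name}" [shape={shapes.get(kind, "ellipse")}];')
    for source, target, relation in edges:
        lines.append(f'  "{source}" -> "{target}" [label="{relation}"];')
    lines.append("}")
    return "\n".join(lines)


toy_nodes = {
    "glucose": "metabolite",
    "hexokinase": "enzyme",
    "glucose-6-phosphate": "metabolite",
}
toy_edges = [
    ("glucose", "hexokinase", "substrate of"),
    ("hexokinase", "glucose-6-phosphate", "produces"),
]

print(pathway_to_dot(toy_nodes, toy_edges))
```

Keeping the pathway as plain nodes and edges separates the biological data from the drawing step, so the same structure can be passed to whichever layout algorithm suits the network.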

About CD Genomics Bioinformatics Analysis

The bioinformatics analysis department of CD Genomics provides novel solutions for data-driven innovation aimed at discovering the hidden potential in biological data, tapping new insights related to life science research, and predicting new prospects.

References

Saraiya P, North C, Duca K. An insight-based methodology for evaluating bioinformatics visualizations. IEEE Transactions on Visualization and Computer Graphics. 2005, 11(4).
Tao Y, Liu Y, Friedman C, Lussier YA. Information visualization techniques in bioinformatics during the postgenomic era. Drug Discovery Today: BIOSILICO. 2004, 2(6).
