RNA Sequencing Derived Co-expression Network Analysis Accelerates Functional Genomics Research

Differential gene expression (DGE) analysis is complemented by co-expression network analysis. A gene co-expression network is represented as an undirected graph in which each node is a gene and two nodes are linked if their co-expression is significant. Because co-expressed genes are often functionally related, governed by the same set of transcription factors, or active in the same pathway, building co-expression networks can help to identify biological modules that are tightly associated with a specific biological process.
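The graph definition above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: it assumes Pearson correlation as the co-expression measure and uses a fixed absolute-correlation cutoff as a stand-in for a proper significance test. The function names, the `threshold` value, and the toy expression values are all hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def build_coexpression_network(expression, threshold=0.9):
    """Return the undirected edge set of a co-expression network.

    expression: dict mapping gene name -> list of expression values,
                one value per sample (same sample order for every gene).
    threshold:  assumed cutoff on |r|; a real analysis would use a
                significance test or a data-driven cutoff instead.
    """
    edges = set()
    for gene_1, gene_2 in combinations(sorted(expression), 2):
        r = pearson(expression[gene_1], expression[gene_2])
        if abs(r) >= threshold:
            edges.add((gene_1, gene_2))  # stored once, so the edge is undirected
    return edges


# Toy data (hypothetical): geneA and geneB rise together, geneC does not.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
}
network = build_coexpression_network(toy_expression)
print(network)  # {('geneA', 'geneB')}
```

Each gene appears as a node only through the edges it takes part in, so a gene with no significant partner (geneC here) is absent from the edge set.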

Co-expression networks have been studied extensively since the microarray era, and with the advent of next-generation sequencing (NGS) technology they have also been examined using RNA-seq. Studies comparing RNA-seq-derived co-expression networks with microarray-derived networks indicated that correlations from RNA-seq data are much higher, owing to the greater sensitivity and dynamic range of RNA-seq. Although both types of network have scale-free properties, their hub-like genes show little overlap. The low correlation between microarray and RNA-seq measurements, particularly for transcripts of very high or very low abundance, can explain this discrepancy.

Protocol and Bioinformatics

The quality of RNA-seq-derived co-expression networks is influenced by sample size and read depth. Larger sample sizes and deeper sequencing can improve the functional connectivity of the network. At least 20 samples, each with more than 10 million reads, are recommended as the minimum experimental design for achieving performance comparable to microarrays. Meta-analysis across multiple data sets is a good way to compensate for the relatively poor performance of individual co-expression networks: even the largest network built from a single experiment can be improved significantly by aggregating data from multiple experiments. Obtaining "gold standard" co-expression networks, however, requires thousands of samples from a variety of conditions.
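As a rough sketch of these two ideas, the Python below first checks a data set against the minimum design stated above (20 samples, more than 10 million reads each) and then aggregates several per-experiment networks. The aggregation scheme shown, rank-standardizing each network and averaging the ranks, is one plausible approach assumed here for illustration; the text does not specify a method. All function names and the toy correlation values are hypothetical.

```python
def meets_minimum_design(reads_per_sample, min_samples=20, min_reads=10_000_000):
    """Check the minimum criteria quoted in the text: at least 20 samples,
    each with more than 10 million reads."""
    return (len(reads_per_sample) >= min_samples
            and all(reads > min_reads for reads in reads_per_sample))


def rank_standardize(network):
    """Convert edge weights to ranks scaled into (0, 1]; ties share the average rank.

    network: dict mapping (gene_1, gene_2) -> correlation within one experiment.
    """
    ordered = sorted(network, key=network.get)
    n = len(ordered)
    ranks = {}
    i = 0
    while i < n:
        j = i
        # Extend j over the run of edges tied with edge i.
        while j + 1 < n and network[ordered[j + 1]] == network[ordered[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[ordered[k]] = average_rank / n
        i = j + 1
    return ranks


def aggregate_networks(networks):
    """Average rank-standardized edge weights over the edges shared by all experiments."""
    standardized = [rank_standardize(net) for net in networks]
    shared = set.intersection(*(set(s) for s in standardized))
    return {pair: sum(s[pair] for s in standardized) / len(standardized)
            for pair in shared}


# Two toy experiments (hypothetical correlations between genes A, B and C).
experiment_1 = {("A", "B"): 0.9, ("A", "C"): 0.1, ("B", "C"): 0.5}
experiment_2 = {("A", "B"): 0.7, ("A", "C"): 0.3, ("B", "C"): -0.2}
aggregated = aggregate_networks([experiment_1, experiment_2])

# A-B is the strongest edge in both experiments, so it keeps the top aggregate score.
print(max(aggregated, key=aggregated.get))  # ('A', 'B')
```

Working on ranks rather than raw correlations keeps an experiment with systematically higher correlations from dominating the average.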

A large meta-analysis demonstrated the high quality of the resulting co-expression network, offering biologists and clinicians a powerful functional genomics tool. In addition to building gene co-expression networks under defined conditions, discovering co-expression modules in one condition and then testing whether these modules show different co-expression in other conditions can help in understanding regulatory change under disease conditions. Gene set co-expression analysis was proposed to assess differential co-expression of known pathways by examining changes in co-expression across all of the gene pairs in a pathway. Theoretical analysis showed that a small, highly co-expressed subnetwork can be a good indicator of disease onset or other biological processes. This finding was confirmed with real data, indicating that a small group of genes clustered within a highly correlated subnetwork can provide a strong warning signal just before disease onset. Dynamic network biomarkers are currently developed from population data. It might be informative to use self-correlation or synchronization to construct a co-expression network from time-series data from the same subject, so that it can be used to anticipate disease onset for diagnosis and personalized medicine.
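The idea of scoring a pathway by the change in co-expression across all of its gene pairs can be sketched as follows. This is a simplified illustration rather than the published gene set co-expression analysis procedure: the score used here (root mean squared difference of pairwise Pearson correlations between two conditions) and the label-permutation test are assumptions made for this example, and all names and toy values are hypothetical.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def pairwise_correlations(samples, genes):
    """samples: list of dicts (gene -> expression). One correlation per gene pair."""
    return [pearson([s[g1] for s in samples], [s[g2] for s in samples])
            for g1, g2 in combinations(genes, 2)]


def coexpression_change(cond_a, cond_b, genes):
    """Root mean squared difference of pairwise correlations between two conditions."""
    r_a = pairwise_correlations(cond_a, genes)
    r_b = pairwise_correlations(cond_b, genes)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(r_a, r_b)) / len(r_a))


def permutation_p_value(cond_a, cond_b, genes, n_perm=200, seed=0):
    """Shuffle condition labels to estimate how unusual the observed score is."""
    rng = random.Random(seed)
    observed = coexpression_change(cond_a, cond_b, genes)
    pooled = cond_a + cond_b
    hits = 0
    for _ in range(n_perm):
        shuffled = pooled[:]
        rng.shuffle(shuffled)
        perm_a, perm_b = shuffled[:len(cond_a)], shuffled[len(cond_a):]
        if coexpression_change(perm_a, perm_b, genes) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy pathway (hypothetical): g1 and g2 move together in healthy samples
# and in opposite directions in disease samples.
pathway = ["g1", "g2", "g3"]
healthy = [{"g1": i, "g2": 2 * i + 1, "g3": (i * 7) % 5} for i in range(1, 7)]
disease = [{"g1": i, "g2": 10 - i, "g3": (i * 7) % 5} for i in range(1, 7)]

score, p_value = permutation_p_value(healthy, disease, pathway)
print(f"co-expression change = {score:.3f}, permutation p = {p_value:.3f}")
```

A score of zero means every gene pair has the same correlation in both conditions; larger scores indicate that the pathway's internal co-expression structure has shifted.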

As more RNA-seq data become publicly available, new algorithms are needed to capture both the global and local characteristics of co-expression networks, particularly the dynamic changes associated with biological processes. RNA-seq co-expression methodology still requires a lot of development. So far, only a few published statistical studies have examined metrics of expression profile similarity for RNA-seq data. Si et al. devised model-based statistical methods for RNA-seq data, built on either Poisson or negative binomial (NB) models, that cluster genes by assessing differential expression correlations across treatments with the mean expression level as a reference, rather than clustering the RNA-seq measurements directly. There is little overlap between co-expression networks created from different expression measurements, such as raw counts, RPKM, or variance-stabilizing transformation. As a result, new metrics for establishing co-expression networks are urgently needed.
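A small example shows why the choice of expression measurement can change which genes appear co-expressed. The sketch below computes RPKM (reads per kilobase of transcript per million mapped reads) with its standard formula and compares the Pearson correlation of two genes on raw counts and on RPKM. The gene lengths, counts, and library sizes are hypothetical toy values chosen so that sequencing depth varies across samples.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rpkm(count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical toy data: four samples sequenced at increasing depth.
library_sizes = [10_000_000, 20_000_000, 30_000_000, 40_000_000]
gene_lengths = {"geneA": 1000, "geneB": 2000}
raw_counts = {
    "geneA": [100, 210, 290, 420],
    "geneB": [55, 95, 160, 190],
}

# Normalize every count by gene length and by the library size of its sample.
rpkm_values = {
    gene: [rpkm(c, gene_lengths[gene], size)
           for c, size in zip(counts, library_sizes)]
    for gene, counts in raw_counts.items()
}

raw_r = pearson(raw_counts["geneA"], raw_counts["geneB"])
rpkm_r = pearson(rpkm_values["geneA"], rpkm_values["geneB"])
print(f"raw counts r = {raw_r:.2f}, RPKM r = {rpkm_r:.2f}")
```

In this toy case the raw counts of both genes mostly track sequencing depth, so they correlate strongly, while the depth-normalized RPKM values do not. An edge present in a raw-count network can therefore disappear in an RPKM network, which is consistent with the limited overlap described above.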


Copyright © CD Genomics. All rights reserved.