Comprehensive Overview of Transcriptome

The transcriptome represents all the transcripts (e.g., mRNA, tRNA, rRNA, siRNA, lncRNA, piRNA, and miRNA) in the cell or tissue under study, and it is an essential basis for understanding how genes are expressed in cells and how that expression is controlled. By combining NGS technology with bioinformatics tools, we can understand how information passes from gene to RNA, which helps us determine the activity and functions of genes in specific physiological responses, rapidly changing environments, or complex disease conditions.

Transcriptome and Genome: Different Perspectives, Common Goals

Although transcriptomes and genomes reveal different kinds of information, both aim to show how organisms control life activities through tens of thousands of genes and how they interact with the environment to regulate those activities. The main differences between them are:

  • Object of study: The main target of genomic research is DNA. Genome studies aim to accurately map all the genetic information contained in an organism (all of its genetic material), including the structure of each gene in the genome, gene-gene interactions, and how genes are used. Transcriptome research, by contrast, studies RNA. It is designed to discover the transcriptional profiles and the diverse regulatory and modification states of genes across the cell as a whole, revealing how genes are transcribed into RNA and thereby function to regulate the organism's life activities.
  • Objective: The primary goal of genomics is to examine the entire DNA sequence of an organism, to determine which genes are present on the chromosomes, where they are located, and what functions they have, and to draw an accurate genetic map. Transcriptomics, in contrast, aims to reveal which genes are expressed or which regulatory fragments function during life activities or in response to changes in the external environment, the types and amounts of the resulting RNAs, and the events that occur on these RNAs, such as splicing and mutations.
  • Temporal and spatial variability: The genome is relatively stable. Apart from occasional mutations in individual genes, it remains essentially unchanged throughout the life of an organism. The transcriptome, however, is the collection of RNAs synthesized from DNA templates, and gene expression in the same cell or tissue is not exactly the same in different growth periods or growth environments. Thus, the transcriptome changes at specific stages of growth and development, when the environment changes, between tissues and cells, and even when the same cell is in a different state of stimulation.

Transcriptome vs Proteome

The proteome is the complete set of proteins, and proteomics is its holistic study. Its core goal is to systematically study the type, quantity, and location of proteins and, more importantly, the efficiency of translation into proteins, protein interactions, the modifications proteins carry, and their functions. Proteomic studies aim to analyze the overall composition and activity of proteins in an organism, tissue, or cell, so as to reveal the basic laws and essence of life activities. The transcriptome and proteome are upstream and downstream of each other, but there are some differences between them:

  • Objectives: The proteome reflects the level of the actual proteins synthesized using the organism's mRNA as a template, whereas the transcriptome covers more than protein-coding messages. Not all transcripts are translated into proteins: some non-coding RNAs, such as rRNAs, long non-coding RNAs, and miRNAs, play equally important roles in cellular life activities without being translated.
  • Regulation: In theory, transcriptome and proteome data obtained from cells, tissues, or organs under the same growth conditions and in the same state should be highly correlated. However, numerous studies have shown that transcriptome and proteome expression trends often correlate poorly, and can even correlate negatively under rapid environmental change. The reason is that there is no one-to-one relationship between transcript and protein levels: the path from RNA to protein involves very complicated regulatory networks, including transcriptional, post-transcriptional, translational, and post-translational mechanisms, and translational and post-translational regulation in particular have a significant impact on processes such as protein synthesis. So although proteins are translated from mRNA, transcript abundance does not by itself reveal protein abundance. Like RNA, proteins show spatio-temporal specificity, with significant differences between organs (e.g., heart, lungs, and kidneys) and between growth stages (e.g., insect larvae vs. adults, frog larvae vs. adults).
  • Dynamics: The transcriptome changes significantly and almost immediately with the passage of time and with environmental stimuli, so it can respond quickly to such stimuli. RNA is synthesized and degraded rapidly, whereas protein turnover is slower, so the proteome tends to be more stable than the transcriptome over short periods and cannot instantly reflect an organism's response to a stimulus.
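
The weak agreement between transcript and protein levels described above is usually quantified with a rank correlation. The following is a minimal sketch, not taken from any of the cited studies: the gene abundances are invented for illustration, and Spearman's coefficient is only one reasonable choice of metric.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical abundances for five genes measured in the same sample.
mrna = [120.0, 15.0, 300.0, 48.0, 75.0]      # e.g. normalized transcript counts
protein = [8.0, 9.5, 20.0, 2.0, 30.0]        # e.g. protein intensities

print(f"Spearman rho = {spearman(mrna, protein):.2f}")  # prints: Spearman rho = 0.30
```

A value near 1 would mean transcript levels predict protein levels well; the low value for this made-up data set mirrors the kind of weak relationship the text describes.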

Relationships of the genome, transcriptome, proteome, and metabolome (Spandan Chaudhary et al., 2021)

Transcriptome Characteristics

  • Temporal specificity: The transcriptome of a single cell changes significantly across growth periods and as the organism responds to complex, dynamic environmental signals. The gene expression of a single cell follows a unique temporal sequence, so sampling at different time points helps us obtain transcript information describing the different periods of that cell.
  • Tissue specificity: Different tissues of an organism have unique transcriptome characteristics: certain RNAs are highly expressed in some tissues while being weakly expressed or barely detectable in others.
  • Spatial specificity: Genes are expressed differently in different cellular structures, especially in multicellular organisms, where the state of gene expression differs between individual cells and between regions within a single cell.
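
Tissue specificity can be summarized as a single number per gene. One commonly used measure, which this article does not itself name and is shown here only as an illustration, is the tau index: 0 for a gene expressed equally everywhere, 1 for a gene expressed in a single tissue. The expression values below are hypothetical.

```python
def tau_index(expression):
    """Tissue-specificity index tau for one gene.

    expression: expression levels of the gene, one value per tissue.
    Returns 0.0 for perfectly uniform expression and 1.0 when the gene
    is expressed in exactly one tissue.
    """
    if len(expression) < 2:
        raise ValueError("need expression values for at least two tissues")
    peak = max(expression)
    if peak <= 0:
        raise ValueError("gene is not expressed in any tissue")
    # Sum how far each tissue falls below the peak, scaled to [0, 1].
    return sum(1 - x / peak for x in expression) / (len(expression) - 1)


# Hypothetical expression across four tissues (arbitrary units).
housekeeping_gene = [50.0, 48.0, 52.0, 51.0]
tissue_specific_gene = [0.0, 0.0, 200.0, 0.0]

print(f"housekeeping tau    = {tau_index(housekeeping_gene):.3f}")
print(f"tissue-specific tau = {tau_index(tissue_specific_gene):.3f}")
```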

Transcriptome Function

  • Research on the body's regulatory mechanisms: Biological aging, for example, is an unstoppable process, but organisms have evolved self-repair mechanisms that slow aging and increase cellular activity. Scientists have found, however, that these repair mechanisms also inevitably fail with age, leading to aging of the organism and organ damage. To explore the mechanisms of aging and repair further, the aging transcriptomes of worms, humans, rats, mice, yeast, and flies were used to analyze differentially expressed genes (DEGs) during aging and the regulation that occurs. It was found that, in addition to exogenous factors that mediate aging, internal factors such as the mTOR pathway, DNA damage, and innate immune responses also contribute to the onset of aging. Simultaneous analysis of multiple transcriptomes also suggests that aging is promoted alongside adaptations that favor reproduction; for example, life-prolonging drugs can impair the reproductive health of the organism.
  • Unraveling the stress network: Preslaughter operations for meat-producing animals (e.g., transportation, water and food deprivation, and confinement with other animals) can adversely affect meat quality, and previous studies have shown that this effect may be related to altered activity and levels of associated proteins. To explore this further, the authors compared the transcriptomes of two different skeletal muscles of Norman dairy cows before and after stress. Analysis of the data identified DEGs in response to stress, and changes in eight transcription factors (TFs) were common to the two muscles. Pathway analysis of those DEGs revealed that they all mediated the core stress response and that 25 cis-transcriptional modules were overexpressed as a result of the DEGs, nine of which were common to both data sets. The authors speculated that preslaughter operations affect the stress response and the transcription factor regulatory network in muscle, changing the stress and physiological responses of muscle tissue and ultimately harming meat quality.
  • Disease research: Cancer is characterized by the uncontrolled and unlimited proliferation of cells and their ability to invade surrounding healthy tissues. It is also a major disease that has plagued mankind for a long time and remains difficult to treat. To investigate what genetic changes occur during cancer development, the authors used 13,400 RNA-seq samples from 18 different cancers and 19 normal tissues to examine whether different cancers share the same transcriptomic profile. Various functional genes, RNA splicing, and lncRNA expression were analyzed, and feedforward neural networks were constructed across all tissues. The expression of these genes was found to differ significantly between cancerous and healthy tissues. Differences in protein-coding gene expression, RNA splicing, and lncRNA expression were common across cancerous tissues; these differences can help us distinguish healthy from cancerous tissue and also suggest that changes in RNA may be important in cancer development.
  • Agricultural research: Sweet corn has been widely loved since its introduction, but flooding is an important environmental factor limiting its production. To understand and improve flooding tolerance in sweet corn, the authors analyzed the transcriptome and the physiological and biochemical responses of two sweet corn lines, D81 and D120. Traits such as root length, number of adventitious roots, root surface area, and antioxidant system capacity were higher in the flood-tolerant line D120 than in D81. Transcriptome analysis of the 4 h and 8 h treatments identified about 2,490 and 2,350 DEGs, respectively, which are involved in ROS scavenging, adventitious root formation, and photosynthesis and may be responsible for the flooding tolerance of D120. Integrating the transcriptome data with differential chromosome fragment data, the authors found that ZmERF055, a gene located on sweet corn chromosome 9, directly mediated the flooding tolerance response and could be a potential target for improving flooding tolerance in sweet corn.
  • Drug development: Prediction of drug-target interactions (DTIs) can help us develop and study new drugs, but traditional methods are time-consuming and inefficient. The authors extracted 400,000 transcriptome records for more than 4,000 genes from the L1000 database of the LINCS project to develop a new algorithmic framework for DTI prediction. DTI data for hundreds of drugs and targets were extracted from the DrugBank, CTD, DGIdb, and STITCH platforms to validate the accuracy of the model. The data show that the authors' transcriptome-based DTI model has 98% applicability and can continue to integrate new drug and gene transcriptome data.
  • Predicting phenotypes: The path from genotype to phenotype is a complex network: within a species, different genotypes can produce the same phenotype, and the same genotype can produce different phenotypes. Predicting phenotypic variation helps us identify the best subset of genes. The authors used a transcriptome-based prediction method to characterize 57 traits in the below-ground and above-ground parts of rice, using transcriptional data from roots and leaves. To improve predictive ability, they did not use all genes but selected an appropriate subset by (1) selecting transcripts based on tissue specificity, (2) incorporating GO analysis, and (3) performing co-expression network analysis of the selected genes. For example, using root transcript data gave better predictions of root phenotypes such as root diameter, demonstrating that selecting the appropriate transcriptome data makes it more likely that traits controlled by multiple genes can be predicted well.

Transcriptome Analysis Methods

  • Microarray chip: The RNA array is a technique for detecting the expression status of RNA by hybridization, implemented mainly with microarrays. The principle is to immobilize known gene sequences as templates (i.e., pre-designed probes) on a solid support. RNA is then extracted from the experimental and control groups, reverse-transcribed into cDNA labeled with different fluorescent dyes, and hybridized to the probes, after which unbound cDNA is washed away. In other words, a grid of probes (short DNA sequences) captures specific RNA sequences from the samples. The presence or absence of fluorescence, or a difference in color, on the microarray reflects the expression status of the gene of interest and can identify DEGs between two sets of samples. Although microarray technology is still widely used, it can only detect transcripts of known genes and is limited in detecting novel transcripts; its accuracy depends on probe design, prior knowledge of the sequence, and the hybridization affinity of the sequence.
  • RNA-Seq: RNA-Seq is currently a vital tool for analyzing DEGs at the whole-transcriptome level and for studying differential mRNA splicing. The technology relies on advanced NGS, which allows direct sequencing of cDNA generated by reverse transcription of extracted total RNA. It does not require prior knowledge of the gene sequence, so it can detect new transcripts, unknown splice forms, and variants.
  • Quantitative PCR (qPCR): qPCR is a fast, real-time way to measure the expression level of a specific RNA in a sample. It is often used to validate data from RNA-Seq or microarray analyses because it allows targeted quantification of the expression of individual genes. However, it is not suitable for large-scale, genome-wide expression analysis.
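
As an illustration of how qPCR readings are typically turned into an expression change, the sketch below uses the widely used 2^(-ΔΔCt) relative quantification approach. The article does not specify a quantification method, so this choice, the assumption of roughly 100% amplification efficiency, and the Ct values are all illustrative.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene (treated vs. control) by 2^(-ddCt).

    Each argument is a cycle-threshold (Ct) value. The reference gene is
    assumed to be stably expressed, and amplification efficiency is
    assumed to be close to 100% (a doubling per cycle).
    """
    delta_treated = ct_target_treated - ct_ref_treated    # dCt, treated sample
    delta_control = ct_target_control - ct_ref_control    # dCt, control sample
    delta_delta = delta_treated - delta_control           # ddCt
    return 2 ** (-delta_delta)


# Hypothetical Ct values: the target crosses the threshold 3 cycles
# earlier in the treated sample, relative to the reference gene.
fold = relative_expression(
    ct_target_treated=22.0, ct_ref_treated=18.0,
    ct_target_control=25.0, ct_ref_control=18.0,
)
print(f"fold change = {fold:.1f}")  # prints: fold change = 8.0
```

A lower Ct means more starting template, which is why a ΔΔCt of -3 corresponds to an eightfold increase.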

If you want more technical details from Eun-Jae Lee, please refer to the "Transcriptome Analysis Methods".

Transcriptome Sequencing

Transcriptome sequencing refers to the direct sequencing of cDNA using NGS (second-generation sequencing) technology, which can thoroughly and rapidly capture almost all the transcript information of a particular organ or tissue of a species in a given state. RNA-seq has now become the basis of almost all kinds of biological research and is widely used in the study of disease mechanisms, the improvement of crop resistance, physiological regulation, and other areas.

To delve deeper into this transformative technology, we can explore two key perspectives: What is Transcriptome Sequencing (RNA-Seq), and how does Whole Transcriptome Sequencing or mRNA seq fit into this expanding field?

The RNA-seq process includes: (1) RNA extraction: total RNA is extracted from the tissue or cells to be tested, with a control group set up, and its concentration and purity are measured; (2) Construction of sequencing libraries: mRNA is purified with oligo(dT) beads and fragmented, cDNA is synthesized by reverse transcription, its ends are repaired and an A-tail is added to the 3' end, adapters (specific junction sequences) are ligated to both ends of the cDNA, and the library is amplified with a high-fidelity polymerase; (3) DNA cluster amplification: the constructed library is denatured with sodium hydroxide to produce single-stranded DNA fragments, which are fixed on the chip via specific primers and amplified there, and the resulting DNA amplicons are linearized to become single-stranded; (4) NGS: the constructed cDNA library is sequenced, with each transcript sequenced multiple times to obtain sufficient coverage and accuracy, and the resulting data are analyzed.
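
The analysis in step (4) usually begins by normalizing read counts so that genes and samples can be compared. The sketch below shows one common normalization, transcripts per million (TPM). It is an assumption for illustration, since the article does not say which normalization is used, and the gene names, counts, and lengths are hypothetical.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to transcripts per million (TPM).

    counts: dict mapping gene name to number of mapped reads.
    lengths_bp: dict mapping gene name to transcript length in base pairs.
    """
    # Step 1: reads per kilobase, which corrects for transcript length.
    rates = {gene: counts[gene] / (lengths_bp[gene] / 1000) for gene in counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("no reads mapped to any gene")
    # Step 2: scale so that the values of one sample sum to one million.
    return {gene: rate / total * 1_000_000 for gene, rate in rates.items()}


# Hypothetical sample: both genes have the same read count, but geneB is
# twice as long, so per molecule it is only half as abundant.
counts = {"geneA": 100, "geneB": 100}
lengths_bp = {"geneA": 1000, "geneB": 2000}

for gene, value in tpm(counts, lengths_bp).items():
    print(f"{gene}: {value:.1f} TPM")
```

Dividing by length before scaling is the design choice that matters here: longer transcripts yield more fragments, so raw counts alone would overstate their abundance.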

Simplified protocol of an RNA-seq experiment and sources of bias, from "Bias in RNA-seq Library Preparation: Current Challenges and Solutions" (Shi H et al., 2021)

RNA-seq has the following advantages: (1) It directly captures almost all transcription-level data for a sample at a given period or in a given state, including RNA structure and structural variation, rather than only differences in gene expression level; (2) It needs neither reference genome information nor pre-designed probes, so it can detect both known genes and new transcripts, giving a broader detection range; (3) It has wide sequencing coverage and can detect low-abundance transcripts, and as sequencing depth increases the dynamic range of detection widens further; (4) By combining DEG analysis, GO analysis, and KEGG pathway analysis, we can comprehensively understand the roles of DEGs in different physiological and signaling processes and then explore their potential functions; (5) RNA-seq data can be analyzed jointly with other omics data (e.g., genomics, proteomics, metabolomics) to reveal complex biological phenomena.
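
The GO and KEGG analyses in point (4) generally ask whether a functional category contains more DEGs than chance would predict. A minimal sketch of that idea, assuming a one-sided hypergeometric test (a common choice, though dedicated enrichment tools add multiple-testing correction and other refinements), with made-up gene counts:

```python
from math import comb


def enrichment_p_value(total_genes, genes_in_term, deg_count, deg_in_term):
    """One-sided hypergeometric p-value for term enrichment among DEGs.

    Probability of seeing at least `deg_in_term` annotated genes when
    `deg_count` genes are drawn at random from `total_genes`, of which
    `genes_in_term` carry the annotation.
    """
    upper = min(genes_in_term, deg_count)
    denominator = comb(total_genes, deg_count)
    tail = sum(
        comb(genes_in_term, k) * comb(total_genes - genes_in_term, deg_count - k)
        for k in range(deg_in_term, upper + 1)
    )
    return tail / denominator


# Hypothetical numbers: 2,000 genes in the background, 40 annotated with
# a stress-response term, 100 DEGs, 10 of which carry that annotation.
p = enrichment_p_value(total_genes=2000, genes_in_term=40,
                       deg_count=100, deg_in_term=10)
print(f"p = {p:.2e}")
```

With only 2 annotated genes expected by chance among 100 DEGs, observing 10 gives a very small p-value, which is what flags the term as enriched.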

RNA-seq provides a comprehensive approach to studying the transcriptome by detailing these critical processes, while whole transcriptome sequencing further extends this analysis, offering insights into the broad dynamics of gene expression, as introduced in the subsequent article, "Whole Transcriptome Sequencing: Brief Introduction, Workflow, Advantages and Applications".

References:

  1. Myers AJ. "The age of the "ome": genome, transcriptome and proteome data set collection and analysis." Brain Res Bull. 2012;88(4):294-301. https://doi.org/10.1016/j.brainresbull.2011.11.015
  2. Moffitt JR, Lundberg E, Heyn H. "The emerging landscape of spatial profiling technologies." Nat Rev Genet. 2022;23(12):741-759. https://doi.org/10.1038/s41576-022-00515-3
  3. Perez-Gomez A, Buxbaum JN, Petrascheck M. "The aging transcriptome: read between the lines." Curr Opin Neurobiol. 2020;63:170-175. https://doi.org/10.1016/j.conb.2020.05.001
  4. Cassar-Malek I, Pomiès L, de la Foye A, Tournayre J, Boby C, Hocquette JF. "Transcriptome profiling reveals stress-responsive gene networks in cattle muscles." PeerJ. 2022;10:e13150. https://doi.org/10.7717/peerj.13150
  5. Jha A, Quesnel-Vallières M, Wang D, Thomas-Tikhonenko A, Lynch KW, Barash Y. "Identifying common transcriptome signatures of cancer by interpreting deep learning models." Genome Biol. 2022;23(1):117. https://doi.org/10.1186/s13059-022-02681
  6. Feng F, Wang Q, Jiang K, Lei D, Huang S, Wu H, Yue G, Wang B. "Transcriptome analysis reveals ZmERF055 contributes to waterlogging tolerance in sweetcorn." Plant Physiol Biochem. 2023;204:108087. https://doi.org/10.1016/j.plaphy.2023.108087
  7. Xie L, He S, Song X, Bo X, Zhang Z. "Deep learning-based transcriptome data classification for drug-target interaction prediction." BMC Genomics. 2018;19(Suppl 7):667. https://doi.org/10.1186/s12864-018-5031-0
  8. Tanaka R, Kawai T, Kawakatsu T, Tanaka N, Shenton M, Yabe S, Uga Y. "Transcriptome-based prediction for polygenic traits in rice using different gene subsets." BMC Genomics. 2024;25(1):915. https://doi.org/10.1186/s12864-024-10803-3
* For Research Use Only. Not for use in diagnostic procedures.


Copyright © CD Genomics. All rights reserved.