Whole Transcriptome Sequencing or mRNA seq

The distinction between whole transcriptome sequencing and mRNA sequencing (mRNA-seq) warrants careful consideration within the scientific community. In recent years, persistent advancements in risk stratification and chemotherapy protocols have significantly enhanced the long-term survival rates of pediatric leukemia patients. Notably, data from the CCCG-ALL 2015 collaborative group indicate an impressive overall survival rate (OS) of 90.5% for children afflicted with acute lymphoblastic leukemia (ALL). Despite these advancements, the ultimate objective for researchers and clinicians remains the complete cure of pediatric patients. To achieve this, there is an ongoing quest to develop more effective diagnostic techniques that can comprehensively decipher abnormal indicators and facilitate the formulation of individualized therapeutic strategies.

With the ongoing evolution of omics research and technological innovation, numerous novel diagnostic methodologies have emerged. Among these, transcriptome sequencing has demonstrated outstanding efficacy in detecting fusion genes. This cutting-edge technology has transitioned from being a purely research tool to a pivotal element in clinical applications, thereby bolstering precise classification and targeted treatment approaches for hematological malignancies. Furthermore, transcriptome sequencing has the potential to uncover additional fusion gene markers of clinical relevance, thus propelling further advancements in the diagnosis and treatment of hematological tumors.

Whole transcriptome sequencing and mRNA-seq each possess distinct advantages, necessitating a judicious balance based on specific research objectives and clinical requirements.

RNA

Ribonucleic acid (RNA) serves as a critical carrier of genetic information. It is synthesized as a single-stranded molecule using DNA as a template, following the principles of complementary base pairing. The primary function of RNA is to facilitate the precise expression of genetic information at the protein level, acting as an indispensable conduit for translating genetic instructions into phenotypic characteristics. Based on its abundance and functional attributes, RNA can be categorized into various types.

Transcriptomics

In a broad sense, the whole transcriptome encompasses the complete set of transcript information produced by specific tissues or cells under particular physiological conditions. This extensive collection includes not only messenger RNA (mRNA) but also a variety of non-coding RNAs, such as transfer RNA (tRNA), ribosomal RNA (rRNA), small nuclear RNA (snRNA), microRNA (miRNA), small interfering RNA (siRNA), and long non-coding RNA (lncRNA). Collectively, these elements provide an intricate depiction of gene expression complexity.

Conversely, in a more restrictive definition, the term "whole transcriptome" specifically refers to the entirety of mRNA. This narrower focus targets the totality of protein-coding information, underscoring the quantitative aspects of gene expression related to protein synthesis.

Figure 1. Overview of RNA types in the whole transcriptome, including mRNA and non-coding RNA.

Transcriptome Sequencing

Transcriptome sequencing, also known as RNA-Seq, utilizes next-generation high-throughput sequencing technologies to reverse transcribe RNA from tissues or cells into cDNA libraries, followed by comprehensive and rapid sequencing. This sophisticated process accurately captures nearly all transcripts present in a specific organ or tissue of a species at a given state. It facilitates a holistic exploration of gene functions and structures, thereby elucidating the molecular mechanisms underpinning specific biological processes and the onset and progression of various diseases.

To learn more, please refer to "What is Transcriptome Sequencing (RNA-Seq)?".

Advantages of Transcriptome Sequencing

Transcriptome sequencing serves several purposes at once. It localizes transcript positions, identifies single nucleotide polymorphisms (SNPs) and splice variants, uncovers novel transcripts, and supports in-depth quantitative analysis of gene expression. In clinical settings, its distinct advantage is the ability to capture gene fusion events and RNA-level mutations comprehensively in a single assay, with particular emphasis on identifying potential fusion genes. Established techniques such as PCR and FISH can also detect fusions, but each method has its own characteristics and focus, so the optimal clinical strategy should be selected case by case.

Table 1: Comparative Analysis of Related Technologies

Technology | Detection Level | Advantages | Disadvantages
FISH | DNA | Strong specificity | Low throughput, detects one gene at a time; unable to classify fusion genes
PCR | RNA | High sensitivity and specificity | Detects only known fusion variants; cannot discover novel fusion genes; low throughput
DNA Targeted Panel | DNA | High sequencing depth, detects low-frequency variants; high throughput | Detects only known gene variants; cannot discover unknown gene variants
Whole Transcriptome Sequencing | RNA | Broad coverage, high throughput; no need for reference genome information, no probe design required; detects known genes and discovers new transcripts; analyzes differential gene expression | -

Workflow of Transcriptome Sequencing

Figure 2. Transcriptome sequencing workflow showing RNA enrichment and gene analysis.

Figure 2 illustrates the workflow of transcriptome sequencing. The fundamental distinction between mRNA-Seq and whole transcriptome sequencing lies in how RNA is enriched and which RNA types are retained, and this leads to marked differences in the information each method yields. In a 2018 article published in Scientific Reports, Shanrong Zhao and colleagues compared two library preparation methods, poly(A) selection and rRNA removal, using RNA from normal human blood and colon tissue. They found that rRNA removal substantially increases the number of expressed genes detected, particularly in blood, where it detected more than twice as many genes as poly(A) selection (approximately 41,000 versus 18,000).
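As a quick check of the figures quoted above, the fold difference in detected genes can be computed directly. This is only a minimal sketch: the two counts are the approximate values cited in the text, and the variable names are illustrative.

```python
# Approximate numbers of genes detected in blood, as quoted in the text.
genes_rrna_depletion = 41_000
genes_poly_a = 18_000

# Ratio and absolute difference between the two library preparation methods.
fold_difference = genes_rrna_depletion / genes_poly_a
additional_genes = genes_rrna_depletion - genes_poly_a

print(f"Fold difference: {fold_difference:.2f}x")
print(f"Additional genes detected: {additional_genes:,}")
```

With these rounded inputs the ratio is about 2.28, consistent with the statement that rRNA removal detects more than twice as many genes in blood.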

Figure 3. Comparison of gene expression in normal human blood and colon tissue by RNA sequencing.

Table 2: Comparison of mRNA-Seq and Whole Transcriptome Sequencing Technologies

Feature | mRNA-Seq (Poly A) | Whole Transcriptome (rRNA Depletion)
Principle | Utilizes the poly(A) tail of mature mRNA, enriching mRNA through hybridization to oligo(dT) probe beads followed by elution. | Hybridizes DNA probes to rRNA, then removes the rRNA using RNase H or beads to enrich the remaining RNA (mRNA and non-coding RNA).
Sample Requirements | Relies on intact poly(A) tails, demanding high-quality samples with RIN > 8.0. | Based on molecular hybridization and rRNA-specific removal, allowing effective analysis of degraded samples.
Sequence Bias | Unable to capture sequences far from the poly(A) tail, potentially presenting a 3' end bias. | Ribosomal depletion reduces 3' end bias significantly.
Data Requirements | 6-8 GB | 10-15 GB
Coverage Characteristics | Provides superior coverage of exonic regions with an equivalent amount of data. | Covers more genes and extracts additional transcriptome features.
Application Characteristics | Suitable for detecting and quantifying gene expression in high-quality clinical samples. | Allows comprehensive analysis across various sample qualities, serving both clinical and research purposes.
Content Detected | Primarily focuses on mRNA, with limited information on lncRNA and pseudogenes. | Encompasses mRNA and also richly detects lncRNA, pseudogenes, small RNAs, and more.
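The selection criteria in Table 2 can be sketched as a small decision helper. This is an illustrative simplification, not a validated clinical rule: the `Sample` class, the `suggest_library_prep` function, and its parameter names are hypothetical, and the only threshold taken from the table is RIN > 8.0 for poly(A) selection.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    rin: float            # RNA integrity number of the sample
    need_noncoding: bool  # True if lncRNA or other non-poly(A) RNA is of interest


def suggest_library_prep(sample: Sample, rin_threshold: float = 8.0) -> str:
    """Suggest a library preparation method following the criteria in Table 2."""
    # Non-coding RNA is only richly detected after rRNA depletion.
    if sample.need_noncoding:
        return "rRNA depletion"
    # Poly(A) selection relies on intact poly(A) tails, hence high-quality RNA.
    if sample.rin > rin_threshold:
        return "poly(A) selection"
    # Degraded samples are better served by rRNA depletion.
    return "rRNA depletion"


print(suggest_library_prep(Sample(rin=9.1, need_noncoding=False)))
print(suggest_library_prep(Sample(rin=5.5, need_noncoding=False)))
print(suggest_library_prep(Sample(rin=9.1, need_noncoding=True)))
```

In practice the choice would also weigh sequencing budget (the data requirements differ between the two methods) and the downstream analyses planned.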

In summary, the choice of method can be tailored to specific clinical demands. However, whole transcriptome sequencing, which detects both mRNA and non-coding RNA and therefore offers more extensive transcriptomic insight, has progressively become the prevailing approach in both clinical and research settings.

Recommended Reading: Whole Transcriptome Sequencing: Brief Introduction, Workflow, Advantages and Applications and mRNA Sequencing: Introduction, Workflow, and Data Analysis.

Key Aspects of Transcriptome Sequencing Technology

Fusion Genes and RNA-Level Gene Mutations

Figure 4. Formation of fusion genes and gene mutations detected through transcriptome sequencing.

Alternative Splicing and Novel Transcript Prediction

Figure 5. Alternative splicing and the resulting novel transcript formation in gene expression.

Differential Gene Expression Analysis

Figure 6. Heatmap of differential gene expression analysis, highlighting significant variations in gene activity.

Application of Transcriptome Sequencing Technology

Comprehensive Identification of MLL-Associated Fusions

Approximately 15% to 20% of pediatric acute myeloid leukemia (AML) patients exhibit chromosomal translocations within the 11q23 region, alongside about 7% of pediatric acute lymphoblastic leukemia (ALL) patients presenting with MLL gene rearrangements. In the context of AML and ALL, MLL gene rearrangements are consistently regarded as pivotal indicators of poor prognosis. Furthermore, both the types of fusion genes and their frequencies differ markedly across various leukemic conditions.

To date, more than 200 MLL-associated fusion genes have been identified, each exhibiting distinct breakpoints. Traditional PCR techniques are limited to detecting only a small subset—typically no more than a dozen—of these common fusion genes, leading to incomplete coverage. Fortunately, transcriptome sequencing technology offers a promising solution to this limitation by enabling a more comprehensive elucidation of the diversity of MLL-associated fusion genes.
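The coverage gap described above can be illustrated with a short sketch that separates MLL fusion calls into those a fixed PCR panel would cover and those it would miss. Everything here is an assumption for illustration: the panel contents, the example fusion calls, and the function names are hypothetical, and real fusion callers report results in their own formats. The sketch treats MLL and KMT2A as the same gene, since KMT2A is the current official symbol for MLL.

```python
# MLL is also known by its current official symbol, KMT2A.
MLL_ALIASES = {"MLL", "KMT2A"}

# Hypothetical PCR panel of common MLL partner genes; a real panel would differ.
PCR_PANEL_PARTNERS = {"AFF1", "MLLT3", "MLLT1", "MLLT10", "ELL"}


def mll_fusions(calls):
    """Keep only fusion calls in which one of the two genes is MLL/KMT2A."""
    return [call for call in calls if MLL_ALIASES & set(call)]


def split_by_panel(calls, panel=PCR_PANEL_PARTNERS):
    """Split MLL fusions into those covered by the panel and those it misses."""
    on_panel, off_panel = [], []
    for five_prime, three_prime in mll_fusions(calls):
        # The partner is whichever gene in the pair is not MLL/KMT2A.
        partner = three_prime if five_prime in MLL_ALIASES else five_prime
        target = on_panel if partner in panel else off_panel
        target.append((five_prime, three_prime))
    return on_panel, off_panel


# Hypothetical fusion calls from an RNA-seq run, as (5' gene, 3' gene) pairs.
example_calls = [
    ("KMT2A", "AFF1"),
    ("KMT2A", "USP2"),
    ("ETV6", "RUNX1"),
    ("KMT2A", "MLLT3"),
]

covered, missed = split_by_panel(example_calls)
print("Covered by panel:", covered)
print("Missed by panel:", missed)
```

Because sequencing reports every fusion it observes, partners outside the panel still appear in the results, whereas a primer-based assay can only detect the pairs it was designed for.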

Figure 7. Known MLL fusion gene partners associated with different hematological diseases.

Figure 8. Distribution of MLL gene breakpoints, indicating key regions of genetic rearrangement in leukemia patients.

Clustering Analysis of ALL Subtypes

While gene rearrangement techniques have facilitated the identification of several novel subtypes of B-cell acute lymphoblastic leukemia (B-ALL), a significant number of patients still exhibit no discernible genetic abnormalities. In 2019, Mullighan and colleagues undertook a comprehensive analysis using transcriptome sequencing, whole-genome sequencing, and whole-exome sequencing technologies. They examined data from 1,988 pediatric and adult patients to reclassify B-ALL into 23 distinct subtypes. Each of these subtypes is characterized by features such as chromosomal rearrangements, sequence mutations, or heterogeneous genomic alterations.

Figure 9. Gene expression profiles and subtype distribution among 1,988 cases, categorized by subtype and age.

In 2018, Junko Takita and colleagues used RNA sequencing to analyze the genetic characteristics of 121 pediatric patients with T-cell acute lymphoblastic leukemia (T-ALL). The study identified a subtype characterized by SPI1 rearrangements, defined by two fusion genes: STMN1-SPI1 and TCF7-SPI1. Patients harboring these fusions constituted 3.9% of the pediatric T-ALL population and generally had a poor prognosis.

Figure 10. Discovery of new T-ALL subtypes through RNA sequencing, including SPI1 rearrangements.

Assisting in Prognostic Evaluation of Pediatric Hematological Malignancies

Fusion genes hold significant clinical relevance in both acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL).

1. In pediatric ALL, the TCF3-PBX1 fusion, Ph+ ALL (Philadelphia chromosome-positive ALL), and Ph-like ALL are considered intermediate-risk factors. Conversely, MEF2D rearrangements and the TCF3-HLF fusion are associated with a high-risk prognosis. The National Comprehensive Cancer Network (NCCN) guidelines provide prognostic assessments for the different fusion types, reflecting their important role in determining patient outcomes.

Figure 11. Prognostic grouping of pediatric ALL based on the 2020 NCCN guidelines.

2. At the 2018 American Society of Hematology (ASH) meeting, Kathryn G. Roberts presented a comprehensive review of novel subtypes of acute lymphoblastic leukemia (ALL) and provided reference guidelines for prognostic evaluation (Table 3).

Table 3 outlines key characteristics of novel ALL subtypes.

3. In 2019, the Children's Hospital of Philadelphia, in collaboration with St. Jude Children's Research Hospital, conducted a retrospective study on pediatric acute lymphoblastic leukemia (ALL) patients treated between 1998 and 2014. This study focused on the genetic subtypes of both B-ALL and T-ALL and provided an analysis and evaluation of their prognostic implications (Figure 12 and Table 4).

Figure 12. Genetic subtypes in B-ALL and T-ALL patients, showing distinct genetic variations in leukemia.

Table 4 summarizes the prognostic evaluation of B-ALL and T-ALL patients, focusing on genetic markers.

Discovery of New Potentially Pathogenic Fusion Genes

In a comprehensive sequencing analysis by Chen and colleagues, whole-exome and transcriptome sequencing were applied to 61 adult and 69 pediatric cases of T-cell acute lymphoblastic leukemia (T-ALL). The study not only validated known major genetic abnormalities associated with T-ALL but also revealed 18 novel fusion genes and six gene mutations, all reported for the first time in that study. Among the newly identified fusion genes, ZBTB16-ABL1, TRA-SALL2, and NKX2-1 were closely correlated with disease relapse. ZBTB16-ABL1, in particular, was confirmed as a significant leukemia-driving factor and showed sensitivity to tyrosine kinase inhibitors, making it a potential new therapeutic target for T-ALL.

Figure 13. Newly discovered fusion genes identified by RNA-seq, providing insights into leukemia prognosis and treatment targets.

Conclusion

Whole transcriptome sequencing technology, through the removal of rRNA, not only comprehensively captures information on mRNA but also reveals a wealth of transcript-level insights, including lncRNA, pseudogenes, and small RNA. Compared to traditional methods such as PCR combined with FISH, whole transcriptome sequencing offers broader coverage and higher throughput. It enables precise detection of known genes while possessing the unique ability to uncover novel transcripts. This technology has successfully transitioned from research settings to clinical applications, particularly demonstrating significant clinical value and profound implications in the diagnosis, prognostic evaluation, and therapeutic guidance of hematological malignancies.

References:

  1. Shanrong Zhao, Ying Zhang, Ramya Gamini, Baohong Zhang, David von Schack. Evaluation of two main RNA-seq approaches for gene quantification in clinical RNA sequencing: polyA+ selection versus rRNA depletion. Scientific Reports 2018;8:4781
  2. R Marschalek, et al. The MLL recombinome of acute leukemias in 2017. Leukemia 2018;32:273-284
  3. PAX5-driven subtypes of B-progenitor acute lymphoblastic leukemia. Nat Genet 2019;51:296-307
  4. Masafumi Seki, Junko Takita. Recurrent SPI1 fusions in pediatric T-cell acute lymphoblastic leukemia: novel mutations with poor prognosis. The Japanese Journal of Clinical Hematology 59(4):439-447
  5. NCCN Guidelines Version 2.2020. Pediatric Acute Lymphoblastic Leukemia
  6. Kathryn G. Roberts. Genetics and prognosis of ALL in children vs adults. American Society of Hematology 2018;137-145
  7. Comparative features and outcomes between paediatric T-cell and B-cell acute lymphoblastic leukaemia. Lancet Oncol 2019;20:e142-54
  8. B lymphoblastic leukemia/lymphoma: new insights into genetics, molecular aberrations, subclassification and targeted therapy. Oncotarget 2017;8(39):66728-66741
  9. Identification of fusion genes and characterization of transcriptome features in T-cell acute lymphoblastic leukemia. PNAS 2018;115(2):373-378
* For Research Use Only. Not for use in diagnostic procedures.

