In cells, mRNA is not the only RNA whose expression and regulation matter: non-coding RNAs such as lncRNA, circRNA and miRNA also play important roles in many physiological processes. For example, most lncRNAs and circRNAs can regulate gene expression at the transcriptional and post-transcriptional levels, take part in a variety of biological processes, and are closely related to the occurrence of disease. The ceRNA hypothesis describes a mechanism of RNA-RNA interaction. ceRNA (competing endogenous RNA) does not refer to a new kind of RNA molecule but to a model of gene expression regulation. In this model, some non-coding RNAs (such as lncRNA), as well as mRNAs, carry miRNA binding sites known as miRNA response elements (MREs). RNA molecules that share the same MREs compete for binding to the same miRNAs and thereby regulate each other's expression levels. Because miRNAs silence genes by binding to their target transcripts, a ceRNA network can regulate the expression of target genes by competitively binding miRNAs.
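The competition described above can be illustrated with a minimal Python sketch that pairs up RNAs sharing miRNA binding sites. Everything in it is an assumption made for illustration: the miRNA and RNA names are placeholders rather than real interactions, and the `min_shared` cutoff is an arbitrary choice, not a threshold taken from the ceRNA literature.

```python
from itertools import combinations

# Hypothetical miRNA -> target map. Each target is assumed to carry an MRE
# for that miRNA. Names are illustrative placeholders only.
mirna_targets = {
    "miR-A": {"lncRNA_1", "mRNA_X", "circRNA_7"},
    "miR-B": {"lncRNA_1", "mRNA_X"},
    "miR-C": {"mRNA_Y", "circRNA_7"},
}


def shared_mirnas(mirna_targets):
    """Return {(rna1, rna2): set of miRNAs that can bind both RNAs}."""
    pairs = {}
    for mirna, targets in mirna_targets.items():
        # Sorting makes each pair appear in one fixed order.
        for a, b in combinations(sorted(targets), 2):
            pairs.setdefault((a, b), set()).add(mirna)
    return pairs


def cerna_candidates(mirna_targets, min_shared=2):
    """Keep RNA pairs that share at least `min_shared` miRNAs."""
    return {
        pair: mirnas
        for pair, mirnas in shared_mirnas(mirna_targets).items()
        if len(mirnas) >= min_shared
    }


candidates = cerna_candidates(mirna_targets)
for (rna1, rna2), mirnas in candidates.items():
    print(rna1, rna2, sorted(mirnas))
```

Sharing MREs only makes two RNAs candidates for competition; whether they actually regulate each other also depends on their abundance and on experimental validation.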
Total RNA-Seq is a technique that analyzes all RNA molecules in a sample by high-throughput sequencing. By capturing and sequencing both coding and non-coding RNA, it provides comprehensive information on gene expression, transcript structure, splicing variation, RNA modification, ceRNA networks and more. It is an indispensable technology in transcriptomics research and is widely used in gene function studies, disease mechanism analysis and biomarker discovery.
What is Total RNA-Seq
Unlike traditional methods that analyze only mRNA (such as mRNA-Seq), Total RNA-Seq captures not only protein-coding mRNA but also various non-coding RNAs (such as miRNA, lncRNA, rRNA, tRNA, etc.). Whole transcriptome sequencing can therefore discover large numbers of lncRNAs and circRNAs alongside mRNAs and analyze their expression levels and transcript structures. Association analysis of lncRNAs, circRNAs and mRNAs then makes it possible to study the regulatory networks in which lncRNAs and circRNAs take part. As a result, Total RNA-Seq can reveal the expression profile of all RNA in cells or tissues, making it suitable for more complex biological studies.
Technology and Principle of Total RNA-Seq
Total RNA-Seq can analyze four kinds of RNA (miRNA, lncRNA, mRNA and circRNA) in a single study by constructing two libraries: a small RNA library and a strand-specific, rRNA-depleted library. The workflow differs between the two libraries; each is described below:
Strand-Specific RNA-seq
- RNA extraction: physical or chemical methods are used to extract total RNA (mRNA, rRNA, tRNA, lncRNA, etc.) from cell or tissue samples. Accurate sequencing downstream depends on high-quality RNA, so keep samples cold throughout handling or use RNA protection reagents.
- rRNA removal: rRNA makes up most of the total RNA (about 90%) but does not encode protein. Depleting it (e.g., by capture with rRNA-specific probes or by nuclease degradation) lowers the share of rRNA reads in the data, raises the relative abundance of mRNA and other non-coding RNAs, and increases the effective sequencing depth for those molecules.
- RNA fragmentation: the extracted RNA is randomly sheared into fragments of 200-500 nt by enzymatic digestion, sonication or similar methods.
- Strand-specific reverse transcription: specific reverse transcription primers (e.g., tagged primers) are used so that transcripts from the sense and antisense strands can be distinguished. The strand-specific tag records which direction (sense or antisense) the RNA template came from when the reverse transcriptase synthesizes the first-strand cDNA.
- Second-strand cDNA synthesis: after first-strand synthesis is complete, the second cDNA strand is synthesized by a DNA polymerase. To maintain strand specificity, a specific chemical tag or adapter is introduced during second-strand synthesis (for example, dUTP incorporation in the widely used dUTP method) so that the two strands can be told apart. In this way the double-stranded cDNA retains the strand information of the original RNA.
- Library construction and amplification: adapters with specific sequences are ligated to the cDNA and the library is amplified by PCR with specific primers, so that the final library reflects the strand specificity of the transcripts.
- Sequencing: the constructed library is sequenced on a high-throughput platform (e.g., Illumina, PacBio, Oxford Nanopore) to obtain a large number of reads, and the strand of origin of each read can be identified from the adapter or tag information.
- Data analysis: sequencing data are analyzed with bioinformatics tools, including quality control, data cleaning, alignment to a reference genome, expression quantification, differential analysis and functional annotation.
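How the strand of origin is recovered during data analysis depends on the library protocol. The Python sketch below assumes the widely used dUTP convention, in which read 1 aligns antisense to the original RNA and read 2 aligns in the same sense. The function name and the `"ligation"` alternative (read 1 in the same sense as the RNA) are illustrative choices, not part of any particular pipeline.

```python
def transcript_strand(read_is_reverse, is_read1=True, protocol="dUTP"):
    """Infer the genomic strand of the original RNA from one aligned read.

    read_is_reverse: True if the read aligned to the reverse genomic strand
                     (SAM FLAG bit 0x10).
    is_read1:        True for read 1 or a single-end read, False for read 2.
    protocol:        "dUTP" (read 1 is antisense to the RNA) or
                     "ligation" (read 1 has the same sense as the RNA).
    """
    if protocol not in ("dUTP", "ligation"):
        raise ValueError(f"unknown protocol: {protocol}")
    read_strand = "-" if read_is_reverse else "+"
    opposite = {"+": "-", "-": "+"}
    # Read 2 always has the opposite orientation of read 1, so the read
    # matches the RNA strand either for read 1 of a ligation library or
    # for read 2 of a dUTP library.
    same_sense_as_rna = (protocol == "ligation") == is_read1
    return read_strand if same_sense_as_rna else opposite[read_strand]


def count_by_strand(reads, protocol="dUTP"):
    """Tally reads by inferred RNA strand.

    reads: iterable of (read_is_reverse, is_read1) tuples.
    """
    counts = {"+": 0, "-": 0}
    for read_is_reverse, is_read1 in reads:
        counts[transcript_strand(read_is_reverse, is_read1, protocol)] += 1
    return counts


# Toy alignments: three read-1 records and one read-2 record.
toy_reads = [(True, True), (True, True), (False, True), (False, False)]
print(count_by_strand(toy_reads))
```

Comparing these tallies against the annotated strand of each gene is what allows overlapping sense and antisense transcripts to be quantified separately.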
The technical process of small RNA library construction
- RNA extraction: RNA molecules longer than 200 nt, such as mRNA and rRNA, are removed with a small RNA purification kit or by precipitation so that small RNA molecules are retained.
- Small RNA enrichment: small RNA molecules shorter than 200 nt, including miRNA, siRNA and piRNA, are recovered.
- 3' and 5' end modification: a specific sequence tag (e.g., an A tail) is added to the 3' end and a specific adapter to the 5' end of each small RNA. These additions protect the small RNA from degradation and provide the recognition sites needed for subsequent reverse transcription and amplification.
- Reverse transcription to cDNA: small RNA molecules are single-stranded, so the reverse transcriptase starts from the adapter and A tail at the 3' end to synthesize the corresponding cDNA. Use primers that carry the adapter sequence in this step, so that the synthesized cDNA is joined to the adapter sequence for subsequent PCR amplification and library construction.
- PCR amplification: the cDNA is amplified by PCR to obtain enough library material. This step requires specific primers that match the adapter sequences and specifically amplify the small RNA library. Avoid over-amplification, which introduces bias, so that the library remains representative and uniform across the different small RNA species.
- Purification and size selection of the library: the amplified library is purified to remove unligated primers and other impurities left over from amplification (such as residual dNTPs and salts). In addition, if the fragment size of the library does not meet the requirements of the sequencing platform (for example, the Illumina platform usually requires fragments of 200-500 nt), a further size-adjustment step is required.
- High-throughput sequencing: purified and quality-controlled libraries are loaded onto a high-throughput sequencing platform (such as Illumina, BGISEQ or Ion Torrent). Small RNA libraries are generally sequenced in single-end or paired-end mode, and the choice depends on the experimental design and the required data depth.
Whole transcriptome shotgun sequencing (Jiang Z et al., 2015).
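Because small RNA inserts are usually shorter than the sequencing read, reads from the small RNA library run into the 3' adapter, and the adapter has to be removed before analysis. The following Python sketch shows the idea under stated assumptions: the adapter sequence is used only as an example value, the 18-30 nt length window and the exact-match rule are simplifications, and the function names are hypothetical. In practice a dedicated trimming tool that tolerates sequencing errors would normally be used.

```python
# Example 3' adapter sequence, used here purely for illustration.
ADAPTER_3P = "TGGAATTCTCGGGTGCCAAGG"


def trim_3p_adapter(read, adapter=ADAPTER_3P, min_overlap=8):
    """Cut the read at the first exact match of the adapter's first bases.

    Returns the read unchanged when no adapter start is found.
    """
    idx = read.find(adapter[:min_overlap])
    return read if idx == -1 else read[:idx]


def keep_small_rna(reads, min_len=18, max_len=30):
    """Trim adapters and keep inserts in the assumed small RNA size range.

    Reads without a detectable adapter are discarded, since their insert
    is likely longer than a typical small RNA.
    """
    kept = []
    for read in reads:
        insert = trim_3p_adapter(read)
        adapter_found = len(insert) < len(read)
        if adapter_found and min_len <= len(insert) <= max_len:
            kept.append(insert)
    return kept


toy_insert = "TAGCTTATCAGACTGATGTTGA"      # a 22-nt toy insert
toy_reads = [
    toy_insert + ADAPTER_3P,               # kept: 22-nt insert plus adapter
    "ACGT" + ADAPTER_3P,                   # dropped: insert too short
    "A" * 50,                              # dropped: no adapter found
]
print(keep_small_rna(toy_reads))
```

The retained inserts can then be collapsed into unique sequences and compared against known small RNA annotations.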
Applications of Total RNA-Seq
Total RNA-Seq, as a comprehensive RNA analysis technology, is widely used in the following fields:
DEG study:
Paroxysmal nocturnal hemoglobinuria (PNH) is mainly caused by PIGA mutations, but some features of PNH, such as the high rate of thrombosis and abnormal proliferation, cannot be explained by PIGA mutations alone, so other genes must be involved. To look for these genes, the researchers collected CD59+ and CD59- peripheral blood mononuclear cells from six patients with PNH and from a control group (six healthy volunteers matched to the patients for sex and age) for WES and Total RNA-Seq. Using gene expression in the healthy controls as the baseline, they selected genes that were up-regulated or down-regulated in the CD59+ and CD59- cells of patients with PNH for functional analysis. Some of the down-regulated genes were found to be involved in thrombosis, proliferation, and immune and neutrophil responses. The pathways involving the DEGs differed markedly from those in healthy individuals. Differential expression of these genes may cause abnormalities in platelet coagulation and hemostasis, as well as in leukocyte activation and immune responses, in patients with PNH, which could account for the high rate of thrombosis. In addition to PIGA, the researchers identified SELP, NRP1, vWF and FLT1 as genes that may be responsible for the abnormal proliferation and coagulopathy seen in PNH patients (Du Y et al., 2024).
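Selecting up-regulated and down-regulated genes against a control baseline can be sketched as a log2 fold change calculation. The Python example below is only an illustration: the gene names and expression values are invented toy numbers, the pseudocount and the threshold of 1.0 are assumed defaults, and a real DEG analysis would also apply a statistical test across replicates rather than rely on fold change alone. It does not reproduce the method of the study cited above.

```python
import math


def log2_fold_change(case_mean, control_mean, pseudocount=1.0):
    """log2 ratio of mean normalized expression, case versus control.

    The pseudocount avoids division by zero for unexpressed genes.
    """
    return math.log2((case_mean + pseudocount) / (control_mean + pseudocount))


def classify(expression, threshold=1.0):
    """Label each gene as "up", "down" or "unchanged".

    expression: {gene: (case_mean, control_mean)} of normalized values.
    threshold:  minimum absolute log2 fold change to call a change.
    """
    labels = {}
    for gene, (case_mean, control_mean) in expression.items():
        lfc = log2_fold_change(case_mean, control_mean)
        if lfc >= threshold:
            labels[gene] = "up"
        elif lfc <= -threshold:
            labels[gene] = "down"
        else:
            labels[gene] = "unchanged"
    return labels


# Invented toy values, not data from any study.
toy_expression = {
    "GENE_A": (79.0, 19.0),
    "GENE_B": (9.0, 39.0),
    "GENE_C": (50.0, 45.0),
}
print(classify(toy_expression))
```

The resulting "up" and "down" gene lists are the typical input for the functional enrichment analysis mentioned in the text.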
Non-coding RNA research:
Among breast cancer patients, 20%-25% are HER2-positive and can receive HER2-targeted therapy, but this therapy is not effective in advanced disease, so new treatments are needed. Interactions among non-coding RNAs in circRNA-miRNA-mRNA networks have been shown to play a regulatory role in breast cancer. To study the regulatory mechanism of ceRNA interaction in HER2-positive patients, the researchers obtained cancer tissues and matched normal tissues from 6 HER2-positive breast cancer patients and performed Total RNA-Seq, identifying 6960 DE mRNAs, 133 DE miRNAs and 1691 DE circRNAs and constructing a ceRNA network. Functional analysis showed that these RNAs are involved in cell division, the immune response and the calcium signaling pathway. Among the 40 ceRNA networks, the circDOCK1/miR-138-5p/GRB7 regulatory network was selected for further study. Pathological results from 102 patients showed that the circDOCK1/miR-138-5p/GRB7 axis promotes the metastasis and progression of cancer cells in HER2-positive breast cancer patients (Zhang Y et al., 2024).
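Building a ceRNA network from lists of differentially expressed RNAs can be sketched as joining predicted binding pairs and keeping only the axes whose expression changes fit the ceRNA model. The Python example below rests on one stated assumption: a circRNA and the mRNA it protects change in the same direction, while the shared miRNA changes in the opposite direction. All names and directions are placeholders, and the function is hypothetical rather than the pipeline used in the cited study.

```python
def cerna_axes(circ_mirna, mirna_mrna, direction):
    """Assemble circRNA/miRNA/mRNA axes consistent with the ceRNA model.

    circ_mirna: iterable of (circRNA, miRNA) predicted binding pairs.
    mirna_mrna: iterable of (miRNA, mRNA) predicted target pairs.
    direction:  {rna_name: "up" or "down"} for differentially expressed RNAs.
    """
    targets = {}
    for mirna, mrna in mirna_mrna:
        targets.setdefault(mirna, []).append(mrna)

    axes = []
    for circ, mirna in circ_mirna:
        for mrna in targets.get(mirna, []):
            d_circ = direction.get(circ)
            d_mirna = direction.get(mirna)
            d_mrna = direction.get(mrna)
            if None in (d_circ, d_mirna, d_mrna):
                continue  # at least one member is not differentially expressed
            # Assumed rule: circRNA and mRNA move together, miRNA moves
            # the other way.
            if d_circ == d_mrna and d_mirna != d_circ:
                axes.append((circ, mirna, mrna))
    return axes


# Placeholder predictions and directions for illustration only.
toy_circ_mirna = [("circ_1", "miR_a"), ("circ_2", "miR_a"), ("circ_1", "miR_b")]
toy_mirna_mrna = [("miR_a", "gene_X"), ("miR_b", "gene_Y")]
toy_direction = {
    "circ_1": "up", "circ_2": "down",
    "miR_a": "down", "miR_b": "down",
    "gene_X": "up",
}
print(cerna_axes(toy_circ_mirna, toy_mirna_mrna, toy_direction))
```

Axes that pass this filter are still only hypotheses, which is why a candidate axis needs follow-up validation in patient samples or functional experiments.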
Disease Mechanism Study:
Type 2 diabetes mellitus (T2D) has become a highly prevalent disease, especially in South Asian countries. Previous studies have shown that the "lean-fat phenotype" (high visceral fat despite a lean body weight) is an important cause of T2D in South Asians. To investigate whether transcription factors help regulate the mechanism by which the lean-fat phenotype leads to T2D, the researchers performed total RNA-seq on a subset of peripheral subcutaneous adipose tissue samples from patients with T2D and healthy individuals. From these data they identified 52 differentially expressed miRNAs and a small network of 110 proteins (transcription factors, protein kinases and intermediate linker proteins) that appears to be a convergence point in adipogenesis. A number of genes in this small network also show significantly altered expression patterns in other diabetes mellitus conditions. These findings provide new targets for T2D treatment and drug development (Saxena A et al., 2021).
Exploring Physiological Processes:
Broccoli is popular for its low calorie content and its rich nutrients, such as flavonoids, phenolic compounds and glucosinolates. Once harvested, however, it is cut off from the nutrients and energy supplied by the root system, which leads to rapid post-harvest yellowing, senescence and nutrient loss. Studies have shown that exogenously applied amino acids can affect broccoli ripening and senescence, and that LED lighting can mitigate the postharvest loss of nutrients and ascorbic acid in broccoli. To explore the relationship between red LED illumination and amino acid metabolism, the researchers performed total RNA-seq and amino acid metabolism analyses on three groups of broccoli samples (day 0, 3 days in darkness, and 3 days under red LED irradiation). They found approximately 20,000 DEGs, 3,400 lncRNAs and 135 miRNAs whose expression in irradiated broccoli differed significantly from that in the other two groups. In the experimental group, the content of 16 amino acids in broccoli remained almost unchanged. Red LED irradiation led to significant changes in various RNAs involved in the amino acid anabolism pathway, the aspartic acid synthesis pathway and the histidine biosynthesis pathway, indicating that red LED irradiation promotes amino acid synthesis and maintains the original nutrient content of broccoli after harvest (Yan Z et al., 2023).
For more RNA-seq applications, refer to "Overview of RNA Sequencing Applications".
mRNA Sequencing vs Total RNA Sequencing
Although both mRNA-seq and total RNA-Seq are used for RNA expression analysis, their main differences lie in the type of RNA captured and the scope of application:
Target RNA types:
mRNA-seq focuses on capturing and analyzing mRNA, while total RNA-Seq can analyze a variety of RNAs at once, including mRNAs, miRNAs and lncRNAs. Both approaches generally include a step that removes rRNA.
Application scope:
mRNA-seq is mainly used to analyze gene expression level, transcript structure and splicing variants, focusing on understanding the protein coding information of genes.
Total RNA-Seq can not only be used for mRNA analysis, but also for quantitative analysis of non-coding RNA, and therefore is widely used in the study of transcriptional regulation, non-coding RNA function and the interaction between genome and epigenetics.
- For example, total RNA-seq can be used to analyze ceRNA. Chicken is one of the most widespread sources of meat protein for humans. Under modern production the growth cycle of chickens keeps getting shorter, yet fat accumulation is not reduced, and the molecular mechanism behind this fat accumulation is not well understood. Previous studies have shown that non-coding RNAs and ceRNA networks play roles in fat formation and accumulation. The researchers chose a local Guangxi chicken breed as the study material for total RNA-Seq, identified hundreds of differentially expressed mRNAs, miRNAs, circRNAs and lncRNAs in the abdominal fat, liver, back skin and other tissues, and compared these differential RNAs between the high-fat group and the low-fat group. ceRNA network analysis showed that these genes are involved in pathways such as carboxylic acid metabolism, fatty acid metabolism and glycerolipid metabolism, which suggests that these pathways are collectively involved in chicken fat accumulation. The researchers also identified miRNAs such as gga-miR-101-2-5p, gga-miR-460b-5p and gga-miR-6595-5p, lncRNAs such as MSTRG.21310 and MSTRG.18043, circRNAs such as novel_circ_PTPRD, and genes regulated by the ceRNA network such as ELOVL5, FADS2 and PLIN2, which can serve as candidate targets for muscle modification (Xiao C et al., 2022).
- In eukaryotic cells, snRNAs (which form the core of the spliceosome and are responsible for intron removal) and snoRNAs (which are involved in nucleolus formation and rRNA modification) are important for the innate immune response. RNA modifications have been shown to affect the function of these ncRNAs and therefore cellular function. The researchers synthesized artificial analogs of natural ncRNAs, such as U25 snoRNA, from modified nucleotides, transfected them into cells and then performed total RNA-Seq to investigate how modifications, especially ψ and m⁵C, influence the ncRNAs and their toxicity to cells. The artificial ncRNA analogs were found to activate the human innate immune response, but the RNA modifications on them attenuated both this immunostimulation (reducing the nonspecific response) and the toxicity to cells. In addition, converting ψ to U affected the structural stability of the ncRNAs. These findings provide a theoretical basis for drugs that treat diseases caused by ncRNAs (Stepanov G et al., 2018).
Data analysis complexity: mRNA-seq data analysis is relatively simple because it focuses mainly on mRNA expression. Total RNA-Seq data analysis is more complex: it has to handle many types of RNA, especially non-coding RNA, so the analytical tools must be able to process and distinguish the different RNA types.
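One common way to distinguish RNA types during analysis is to read the biotype recorded for each gene in the annotation file. The Python sketch below assumes a GTF annotation in which the ninth column carries a `gene_biotype` or `gene_type` attribute; the toy lines, gene IDs and function names are made up for illustration.

```python
import re
from collections import Counter

# Matches either attribute spelling: gene_biotype "..." or gene_type "...".
BIOTYPE_RE = re.compile(r'gene_(?:bio)?type "([^"]+)"')


def gene_biotype(gtf_line):
    """Return the biotype stored in the attribute column, or None."""
    fields = gtf_line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return None
    match = BIOTYPE_RE.search(fields[8])
    return match.group(1) if match else None


def biotype_counts(gtf_lines):
    """Count annotated genes per biotype, skipping comments and non-gene rows."""
    counts = Counter()
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) == 9 and fields[2] == "gene":
            counts[gene_biotype(line) or "unknown"] += 1
    return counts


# Toy annotation lines with invented coordinates and IDs.
toy_gtf = [
    "#!genome-build toy",
    'chr1\ttoy\tgene\t100\t900\t.\t+\t.\tgene_id "G1"; gene_biotype "protein_coding";',
    'chr1\ttoy\tgene\t1200\t1900\t.\t-\t.\tgene_id "G2"; gene_biotype "lncRNA";',
    'chr1\ttoy\texon\t100\t300\t.\t+\t.\tgene_id "G1"; gene_biotype "protein_coding";',
    'chr2\ttoy\tgene\t50\t120\t.\t+\t.\tgene_id "G3"; gene_type "miRNA";',
]
print(dict(biotype_counts(toy_gtf)))
```

The same lookup can be used to split an expression matrix into mRNA, lncRNA and small RNA subsets before each is analyzed with methods suited to that RNA class. circRNAs are usually not covered by such annotations and are typically detected separately from back-splice junction reads.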
More RNA-seq techniques and their uses can be found in the following pages: "Overview of RNA Sequencing Techniques".
References:
- Du Y, Wang D, Hu Q, Lai Z, Yang C, He H, Wang S, Zhang H, Chen P, Li Z, Chen M, Han B. "Identifying Genes Associated With Proliferation, Immunity and Thrombosis in Paroxysmal Nocturnal Haemoglobinuria." J Cell Mol Med. 2024;28(23):e70295. doi: 10.1111/jcmm.70295
- Zhang Y, Yang M, Wang Y, Zhao J, Lee PY, Ma Y, Qu S. "Identification and Validation of circDOCK1/miR-138-5p/GRB7 Axis for Promoting Breast Cancer Progression." Breast Cancer (Dove Med Press). 2024;16:795-810. doi: 10.2147/BCTT.S495517
- Saxena A, Mathur N, Tiwari P, Mathur SK. "Whole transcriptome RNA-seq reveals key regulatory factors involved in type 2 diabetes pathology in peripheral fat of Asian Indians." Sci Rep. 2021;11(1):10632. doi: 10.1038/s41598-021-90148-z
- Yan Z, Xu D, Yue X, Yuan S, Shi J, Gao L, Wu C, Zuo J, Wang Q. "Whole-transcriptome RNA sequencing reveals changes in amino acid metabolism induced in harvested broccoli by red LED irradiation." Food Res Int. 2023;169:112820. doi: 10.1016/j.foodres.2023.112820
- Xiao C, Sun T, Yang Z, Zou L, Deng J, Yang X. "Whole-transcriptome RNA sequencing reveals the global molecular responses and circRNA/lncRNA-miRNA-mRNA ceRNA regulatory network in chicken fat deposition." Poult Sci. 2022;101(11):102121. doi: 10.1016/j.psj.2022.102121
- Stepanov G, Zhuravlev E, Shender V, Nushtaeva A, Balakhonova E, Mozhaeva E, Kasakin M, Koval V, Lomzov A, Pavlyukov M, Malyants I, Zhorov M, Kabilova T, Chernolovskaya E, Govorun V, Kuligina E, Semenov D, Richter V. "Nucleotide Modifications Decrease Innate Immune Response Induced by Synthetic Analogs of snRNAs and snoRNAs." Genes (Basel). 2018;9(11):531. doi: 10.3390/genes9110531