Among the many RNA modifications studied to date, 5-methylcytosine (m5C) is one of the most extensively researched and best understood. It is a prevalent modification that occurs across a wide range of RNA species.
m5C takes part in diverse cellular processes. In transfer RNAs (tRNAs), it acts as a regulatory element that influences translation. In ribosomal RNAs (rRNAs), it influences ribosome biogenesis. In messenger RNAs (mRNAs), it affects the structure and stability of the transcript, as well as the regulation of translation itself.
You can refer to our article Overview of Sequencing Methods for RNA m5C Profiling for more information.
Recognition of m5C modifications on mRNAs dates back many years, but their comprehensive study was hindered by the scarcity of appropriate tools and methodologies. Technological challenges impeded progress, as conventional techniques lacked the precision and sensitivity required to explore the intricacies of m5C modifications in mRNA molecules.
The resurgence of interest in m5C modifications on mRNAs can be attributed to the development of several cutting-edge research methods, including MeRIP-seq (Methylated RNA Immunoprecipitation Sequencing), miCLIP, Aza-IP, and RNA-BS-seq (RNA Bisulfite Sequencing). These innovative techniques have rekindled the fascination with RNA m5C modifications, placing them back at the forefront of scientific inquiry.
RNA-BS-seq: A Game-Changer for m5C Detection
Among these methods, RNA-BS-seq is particularly noteworthy for its ability to detect m5C modifications at single-base resolution. The technique relies on bisulfite treatment, which has long been used in DNA methylation studies. In RNA-BS-seq, RNA is first treated with bisulfite, which deaminates unmodified cytosines (C) to uracils (U) while leaving m5C-modified cytosines intact. The treated RNA is then reverse-transcribed into cDNA and amplified by PCR. In the resulting reads, converted cytosines appear as thymines (T), whereas m5C sites still read as cytosines, which provides a clear distinction between m5C and unmodified cytosines.
Principle of m5C detection by RNA bisulfite sequencing and combination with next-generation sequencing. (Motorin et al., 2010)
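The conversion principle can be illustrated with a minimal Python sketch. It assumes an idealized experiment with complete conversion and no sequencing errors; the function names and the toy sequence are hypothetical and are not part of any published pipeline.

```python
def bisulfite_convert(rna, methylated_positions):
    """Return the sequence as read after bisulfite treatment and RT-PCR.

    Assumes ideal, complete conversion. Positions are 0-based.
    """
    out = []
    for i, base in enumerate(rna.upper()):
        if base == "C" and i not in methylated_positions:
            out.append("T")   # unmodified C is deaminated to U, read as T
        elif base == "U":
            out.append("T")   # uridine is also read as T in cDNA
        else:
            out.append(base)  # m5C stays C; A and G are unchanged
    return "".join(out)


def call_retained_cytosines(reference, read):
    """Return 0-based positions where the reference has C and the read still has C."""
    ref = reference.upper().replace("U", "T")
    return [i for i, (r, b) in enumerate(zip(ref, read)) if r == "C" and b == "C"]


rna = "AUGCCGUACGC"                      # toy transcript, m5C at index 4
read = bisulfite_convert(rna, {4})
print(read)                              # ATGTCGTATGT
print(call_retained_cytosines(rna, read))  # [4]
```

Only the cytosine that was marked as methylated survives as C in the read, so comparing the read with the reference recovers the m5C position.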
Targeted mRNA Profiling
The RNA-BS-seq workflow described here is tailored to mRNA. This focus lets researchers concentrate on modifications within the mRNA transcriptome rather than across the entire spectrum of RNA species, which is particularly useful when studying how mRNA modifications affect gene expression and regulation.
Single-Base Resolution Precision
One of the most important attributes of RNA-BS-seq is its ability to detect m5C modifications at single-base resolution, which is essential for localizing m5C sites within individual mRNA molecules. It allows researchers not only to pinpoint which cytosines are methylated but also to determine the density and distribution of m5C modifications along the mRNA sequence. Single-base resolution is key to elucidating the potential functional roles of specific modifications within mRNA, including their effects on mRNA structure, stability, and translation efficiency.
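At single-base resolution, the methylation level of a site is commonly summarized as the fraction of covering reads that still show a cytosine, C / (C + T). The sketch below illustrates that arithmetic; the site keys, the read counts, and the `min_coverage` threshold are illustrative assumptions, not values from this workflow.

```python
def methylation_levels(counts, min_coverage=10):
    """Map each site to C / (C + T), skipping sites below min_coverage.

    counts: {site: (c_reads, t_reads)}; the threshold is an assumed default.
    """
    levels = {}
    for site, (c_reads, t_reads) in counts.items():
        coverage = c_reads + t_reads
        if coverage >= min_coverage:
            levels[site] = c_reads / coverage
    return levels


# Hypothetical per-site read counts for one transcript: (C reads, T reads).
counts = {
    ("tx1", 120): (30, 70),
    ("tx1", 245): (2, 98),
    ("tx1", 310): (3, 2),   # only 5 reads in total, so this site is skipped
}

for site, level in sorted(methylation_levels(counts).items()):
    print(site, f"{level:.2f}")
```

Listing the levels in transcript order gives a simple view of the density and distribution of candidate m5C sites along the sequence.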
Elevated Precision and Reliability
RNA-BS-seq is valued for its accuracy in identifying m5C modifications. By combining bisulfite conversion with next-generation sequencing, the technique helps reduce both false positives and false negatives, which supports the identification of reliable m5C sites. This combination makes RNA-BS-seq a robust tool for profiling m5C modifications throughout the transcriptome.
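One widely used way to limit false positives from incomplete bisulfite conversion is to test each site against the background non-conversion rate with a binomial model. The article does not state which filter is applied, so the sketch below is an assumption for illustration only; the rate of 0.01 and the threshold of 0.001 are hypothetical values.

```python
from math import comb


def binomial_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def is_candidate_site(c_reads, coverage, non_conversion_rate=0.01, alpha=0.001):
    """True if the retained C reads are unlikely to be conversion failures alone."""
    return binomial_sf(c_reads, coverage, non_conversion_rate) < alpha


print(is_candidate_site(8, 20))   # True: 8 of 20 reads retain C
print(is_candidate_site(1, 50))   # False: 1 of 50 fits the background rate
```

In practice the non-conversion rate would be estimated from the data, for example from an unmethylated spike-in control, rather than fixed in advance.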
I. Total RNA Sample Detection
II. Library Construction and Quality Control
III. Library Quality Control
IV. Sequencing
After passing quality control, the libraries are ready for sequencing. Multiple libraries can be pooled according to their effective concentration and the amount of sequencing data required. Sequencing is typically performed on a platform such as Illumina HiSeq using the PE150 (paired-end 150 bp) strategy. This approach, known as sequencing by synthesis, introduces fluorescently labeled nucleotides, DNA polymerase, and adapter primers into the flow cell, where the DNA fragments are amplified. The sequencer captures the fluorescence signals, and software converts them into base calls, yielding the sequence of each fragment.
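The pooling arithmetic can be sketched as follows. This is a minimal illustration that assumes each library should contribute molecules in proportion to its target data volume; the library names, concentrations, and the total of 40 fmol are hypothetical.

```python
def pooling_volumes(libraries, total_fmol):
    """Return the volume (µL) of each library to add to the pool.

    libraries: {name: {"conc_nM": effective concentration, "target_gb": data wanted}}
    1 nM equals 1 fmol/µL, so fmol divided by nM gives µL.
    """
    total_gb = sum(lib["target_gb"] for lib in libraries.values())
    volumes = {}
    for name, lib in libraries.items():
        share_fmol = total_fmol * lib["target_gb"] / total_gb
        volumes[name] = share_fmol / lib["conc_nM"]
    return volumes


libraries = {
    "lib_A": {"conc_nM": 4.0, "target_gb": 6.0},
    "lib_B": {"conc_nM": 2.0, "target_gb": 6.0},
}
print(pooling_volumes(libraries, total_fmol=40.0))
# {'lib_A': 5.0, 'lib_B': 10.0}
```

Both libraries contribute 20 fmol, so the more dilute library needs twice the volume.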
Data analysis is a critical component of m5C RNA methylation sequencing (RNA-BS-seq) and involves several key steps to ensure the accuracy and reliability of the results. Below, we describe the quality control of the sequencing data and the quality assessment of the mapping results.
RNA bisulfite sequencing bioinformatics analysis – CD Genomics
Quality Control of Sequencing Data
Data Quality Assessment
After the initial raw data filtering and preprocessing, data quality assessment is performed to summarize the characteristics of the sequencing data. This assessment typically includes the following checks:
Alignment Quality Assessment
The BS-RNA analysis process is divided into three main steps: preprocessing, alignment, and annotation. In the alignment step, various transformations are applied to the reference genome sequences, sequencing data, and gene annotation files.
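A transformation commonly used by bisulfite aligners is in silico C-to-T conversion of both reads and reference, so that converted reads can be matched regardless of methylation state. The sketch below illustrates that idea with exact string matching on a single strand; it is a simplified assumption about the general approach, not BS-RNA's actual implementation, and the function names are hypothetical.

```python
def c_to_t(seq):
    """Fully convert a sequence in silico: every C becomes T."""
    return seq.upper().replace("C", "T")


def align_and_call(read, reference):
    """Locate the read in C-to-T converted space, then call retained cytosines.

    Returns (offset, retained_positions) with 0-based reference coordinates,
    or None if the converted read does not occur in the converted reference.
    """
    offset = c_to_t(reference).find(c_to_t(read))
    if offset < 0:
        return None
    ref = reference.upper()
    retained = [
        offset + i
        for i, base in enumerate(read.upper())
        if base == "C" and ref[offset + i] == "C"
    ]
    return offset, retained


reference = "GGACCGTACGTT"
read = "ATCGTAT"   # bisulfite read of ACCGTAC with the second C retained
print(align_and_call(read, reference))   # (2, [4])
```

Matching is done on the converted sequences, but methylation is called by comparing the original read with the original reference at the matched offset.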
After the mapping step, BS-RNA provides several files for further analysis, including:
These files are essential for downstream analysis, annotation, and the identification of m5C modifications within the RNA sequence. Quality assessment of the mapping results helps confirm that reads were placed accurately, so that modifications are correctly identified and characterized.
Identification of DMCs
Differentially methylated cytosines (DMCs) are identified by comparing RNA methylation levels between experimental conditions or groups. This involves analyzing sequencing data to pinpoint sites where m5C levels differ significantly. Data from methods such as RNA bisulfite sequencing or methylated RNA immunoprecipitation followed by sequencing (MeRIP-seq) are commonly used for this purpose.
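The article does not name the statistical test used to call DMCs. As one possible illustration, the sketch below applies a two-sided Fisher's exact test to the C and T read counts of a single site in two conditions, using only the Python standard library; the counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical site: (C reads, T reads) in control and in treatment.
control = (5, 45)
treatment = (25, 25)
p = fisher_exact_two_sided(*control, *treatment)
print(f"p = {p:.2e}")
```

Because many sites are tested at once, the resulting p-values would normally be corrected for multiple testing before sites are reported as DMCs.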
Annotation of DMCs
Once DMCs are identified, each one is recorded with its chromosomal location, start and end positions, and other relevant details. To understand their functional implications, the DMCs are annotated by determining which gene elements (e.g., 5'UTR, CDS, intron, 3'UTR, ncRNA, tRNA) they overlap with. This step associates DMCs with specific genes and gene regions.
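Annotation by overlap can be sketched as a simple interval lookup. The feature table below is hypothetical (1-based, inclusive coordinates and an invented gene name); a real analysis would load features from a gene annotation file and use an indexed structure for speed.

```python
# Hypothetical gene elements with 1-based, inclusive coordinates.
features = [
    {"chrom": "chr1", "start": 100, "end": 199, "type": "5'UTR", "gene": "GENE_A"},
    {"chrom": "chr1", "start": 200, "end": 899, "type": "CDS", "gene": "GENE_A"},
    {"chrom": "chr1", "start": 900, "end": 1100, "type": "3'UTR", "gene": "GENE_A"},
]


def annotate(chrom, pos, features):
    """Return (gene, element type) for every feature that contains the site."""
    return [
        (f["gene"], f["type"])
        for f in features
        if f["chrom"] == chrom and f["start"] <= pos <= f["end"]
    ]


print(annotate("chr1", 950, features))   # [('GENE_A', "3'UTR")]
print(annotate("chr2", 950, features))   # []
```

A site that falls in no listed feature returns an empty list and would be reported as unannotated.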
Statistics of DMC-Modified Genes
After annotation, information about DMC-modified genes is extracted. This includes details about which genes are affected by DMCs, the specific DMCs involved, and their modification status (e.g., hypermethylated or hypomethylated).
Distribution of DMCs on Chromosomes
Understanding the distribution of DMCs across chromosomes can provide insights into whether certain chromosomes are more susceptible to methylation changes. This information is often graphically displayed to visualize the preferential distribution of DMCs on chromosomes.
Distribution of DMCs on Gene Elements
Similarly, DMCs are categorized by their location within different gene elements (e.g., 5'UTR, CDS, intron, 3'UTR). This analysis can reveal patterns in methylation changes within specific gene regions.
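Both distributions, per chromosome and per gene element, reduce to counting annotated DMCs by one attribute. A minimal sketch with hypothetical records:

```python
from collections import Counter

# Hypothetical annotated DMCs.
dmcs = [
    {"chrom": "chr1", "element": "CDS"},
    {"chrom": "chr1", "element": "3'UTR"},
    {"chrom": "chr2", "element": "CDS"},
    {"chrom": "chrX", "element": "5'UTR"},
]


def distribution(dmcs, key):
    """Return {value: (count, fraction)} for the chosen attribute."""
    counts = Counter(d[key] for d in dmcs)
    total = sum(counts.values())
    return {value: (n, n / total) for value, n in counts.items()}


print(distribution(dmcs, "chrom"))
print(distribution(dmcs, "element"))
```

Raw counts favor long chromosomes and long elements, so a comparison across them may also need normalization by length or by the number of cytosines covered.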
Functional Enrichment Analysis of DMC-Modified Genes
Functional enrichment analysis is a critical step in understanding the biological significance of DMCs. This analysis involves two main aspects: enrichment of Gene Ontology (GO) terms and enrichment of KEGG pathways.
Functional Enrichment Analysis Details
Fisher's exact test with Benjamini-Hochberg (BH) correction is commonly used for functional enrichment analysis. It determines whether certain GO terms or KEGG pathways are overrepresented in the list of DMC-modified genes.
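The test and correction can be sketched with the standard library alone. The one-sided Fisher's exact test for over-representation is equivalent to a hypergeometric tail probability, and the BH step converts raw p-values into adjusted ones. The gene counts and p-values below are hypothetical.

```python
from math import comb


def enrichment_pvalue(k, n, K, N):
    """One-sided Fisher's exact (hypergeometric) p-value, P(X >= k).

    N: background genes, K: background genes carrying the term,
    n: DMC-modified genes, k: DMC-modified genes carrying the term.
    """
    upper = min(n, K)
    return sum(comb(K, x) * comb(N - K, n - x) for x in range(k, upper + 1)) / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical term: 12 of 200 DMC-modified genes, 100 of 10000 background genes.
print(f"{enrichment_pvalue(12, 200, 100, 10000):.2e}")
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Each GO term or KEGG pathway gets one raw p-value from the enrichment test, and the BH adjustment is applied across all terms tested together.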
The results of enrichment analysis are typically presented in tables and figures that list all tested GO/KEGG entries, both significant and non-significant. This complete view lets researchers explore potential functional associations beyond the top-ranked terms.
Reference:
Motorin, Yuri, Frank Lyko, and Mark Helm. "5-methylcytosine in RNA: detection, enzymatic formation and biological functions." Nucleic Acids Research 38.5 (2010): 1415-1430.