MicroRNAs (miRNAs) are a class of naturally occurring, single-stranded non-coding RNAs (ncRNAs) approximately 18-25 nucleotides long. They regulate gene expression within cells at the post-transcriptional level and play vital roles in biological processes including cell proliferation, differentiation, apoptosis, and hematopoiesis. The expression levels of miRNAs are closely associated with the occurrence and progression of different types of tumors.
miRNA biogenesis and regulation follow a defined pathway. It is estimated that miRNA-encoding genes make up about 1-5% of mammalian genes, and over 60% of human protein-coding genes are regulated by miRNAs. Biogenesis begins with transcription by RNA polymerase II; miRNA genes are typically located in intronic regions and can have their own promoters. The resulting long transcript, the primary miRNA (pri-miRNA), is bound by a complex of Drosha and its cofactor protein DGCR8. Drosha contains two RNase domains that cleave the 3' and 5' strands of the pri-miRNA, releasing a hairpin-shaped precursor miRNA (pre-miRNA). The pre-miRNA is then transported from the nucleus to the cytoplasm by exportin 5 in complex with Ran-GTP. In the cytoplasm, the pre-miRNA is further processed by the Dicer nuclease together with the TAR RNA-binding protein (TRBP), which cuts off the terminal loop and generates a double-stranded RNA fragment. This duplex is loaded into the RNA-induced silencing complex (RISC). The AGO protein within RISC selects one strand of the duplex, forming the active RISC that carries out diverse regulatory functions.
miRNA regulation pattern. (Gebert et al., 2019)
Plants employ a different mechanism for miRNA-mediated regulation. Plant miRNAs direct the degradation of their target mRNAs, abolishing protein-coding function. Specifically, the mature miRNA forms a complex with an AGO protein, which binds the target mRNA and cleaves it near the middle of the miRNA binding site. The cleaved mRNA is degraded into fragments, so the protein it encodes is no longer produced.
let-7, one of the first miRNAs to be identified and named, is a prominent subject of miRNA research. Its role as a tumor suppressor has been extensively explored: it hampers tumor growth by down-regulating key factors such as MYC, HMGA2, BLIMP1, and members of the RAS family. Consequently, let-7 can decrease cancer invasiveness, chemoresistance, and radioresistance, although it occasionally displays proto-oncogene characteristics. The varied expression of let-7 across numerous cancer types makes it a potential marker for tumor screening.
Another notable player, miR-210, acts as an oncogenic miRNA and shows significantly elevated expression in various cancer cells, including pancreatic and breast cancer. Suppressing miR-210-3p has been shown to impede bone metastasis of prostate cancer cells in preclinical models. By targeting TNIP1 and SOCS1, two negative regulators of the NF-κB signaling pathway, miR-210-3p sustains NF-κB activation and thereby promotes epithelial-mesenchymal transition (EMT), invasion, and migration. miR-210 is also a key target of hypoxia-inducible factors, and its elevated levels have been linked to hypoxia signaling in vivo.
MiR-210. (Huang et al., 2010)
Endogenous RNAs that compete for miRNA binding are an essential part of non-coding RNA function in post-transcriptional regulation. These RNAs act as sponges that sequester miRNAs. An RNA capable of sequestering miRNAs in this way is referred to as a competing endogenous RNA (ceRNA). Besides lncRNAs and circRNAs, both mRNAs and pseudogenes can function as ceRNAs.
Recommended Reading: Overview of Competing Endogenous RNA (ceRNA).
miRNAs pair with their target regions in several modes. In plants, the two modes are near-complete complementarity, which leads to target degradation, and target mimicry. In animals, the dominant mode is base pairing with complementary sequences in the 3' UTR.
By analyzing miRNA targets and their pairing modes, it becomes possible to construct a ceRNA regulatory network. In this network, non-coding RNAs such as lncRNAs and circRNAs compete for miRNA binding, which alters the expression of miRNA-regulated target genes and, in turn, protein levels. miRNAs sit at the core of the ceRNA regulatory network. A ceRNA can bind multiple miRNAs, and in animals the sites on a ceRNA that bind miRNAs are known as miRNA recognition elements (MREs). In plants, target mimics must meet stricter criteria: base pairing with the miRNA has to be interrupted at the central positions. miR-399, a classic small RNA in plant target mimic studies, illustrates this. When miR-399 is perfectly complementary to the target region, the target is cleaved and degraded. When the pairing is imperfect and contains a bulge in the middle, the target is not cleaved and instead competes for miR-399 binding, sequestering it.
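As a rough illustration of the network idea above, the sketch below groups RNAs that share miRNAs into candidate ceRNA pairs. The interaction list, the RNA and miRNA names, and the `min_shared` threshold are all hypothetical; in practice the pairs would come from target prediction or experimental data, and real pipelines usually add expression-correlation filters that are not shown here.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (miRNA, target RNA) interactions; names are illustrative only.
interactions = [
    ("miR-A", "lncRNA1"),
    ("miR-A", "mRNA1"),
    ("miR-B", "lncRNA1"),
    ("miR-B", "mRNA1"),
    ("miR-B", "circRNA1"),
    ("miR-C", "circRNA1"),
]


def shared_mirna_pairs(pairs, min_shared=1):
    """Map each candidate ceRNA pair to the set of miRNAs both RNAs bind."""
    targets_by_mirna = defaultdict(set)
    for mirna, target in pairs:
        targets_by_mirna[mirna].add(target)

    shared = defaultdict(set)
    for mirna, targets in targets_by_mirna.items():
        # Every pair of RNAs bound by the same miRNA competes for it.
        for rna_a, rna_b in combinations(sorted(targets), 2):
            shared[(rna_a, rna_b)].add(mirna)

    # Keep only pairs that share at least `min_shared` miRNAs.
    return {pair: m for pair, m in shared.items() if len(m) >= min_shared}


network = shared_mirna_pairs(interactions, min_shared=2)
for (rna_a, rna_b), mirnas in network.items():
    print(rna_a, rna_b, sorted(mirnas))
```

Requiring more than one shared miRNA is a design choice that trades sensitivity for fewer spurious edges; the appropriate threshold depends on the data set.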
IsomiRs, which are isoforms of miRNAs resulting from various molecular processes, exhibit differences in length or sequence. These isomiRs are generated through selective cleavage by DROSHA and DICER, exonuclease-mediated shortening of miRNA ends, production of non-templated miRNA variants by nucleotidyltransferases, and miRNA editing.
There are several types of isomiRs. Canonical miRNAs are the mature sequences recorded in the miRNA database miRBase. 5' isomiRs and 3' isomiRs differ in length from the canonical sequence at the 5' end or the 3' end, respectively. A 5' isomiR shifts the miRNA seed sequence (positions 2-8). 5' and 3' isomiRs vary at both ends. Polymorphic isomiRs, such as those produced by A-to-I editing, have the same length as the canonical sequence but differ within the mature sequence. Lastly, hybrid isomiRs exhibit changes in both length and sequence. Because of these alterations, isomiRs and canonical miRNAs may regulate the same targets or different ones, thus expanding the scope of miRNA regulation.
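The categories above can be sketched as a small classifier. This is a simplified illustration, not a production isomiR caller: it assumes the read has already been aligned to the precursor without gaps, uses 0-based end-exclusive coordinates, and treats any mismatch to the precursor as a sequence change. The sequences, the function names `seed` and `classify_isomir`, and the coordinates are toy values chosen for the example.

```python
CANONICAL = "UGAGGUAGUAGGUUGUAUAGUU"        # toy 22-nt mature sequence
PRECURSOR = "GGAUC" + CANONICAL + "UGGCC"  # toy flanking bases on each side
CANON_START = 5
CANON_END = CANON_START + len(CANONICAL)


def seed(seq):
    """Seed region: nucleotides 2-8 of the miRNA (1-based)."""
    return seq[1:8]


def classify_isomir(precursor, canon_start, canon_end, read_start, read_seq):
    """Label a gap-free aligned read relative to the canonical miRNA."""
    read_end = read_start + len(read_seq)
    shift_5p = read_start != canon_start
    shift_3p = read_end != canon_end
    edited = read_seq != precursor[read_start:read_end]

    if edited:
        # Sequence change with or without a length change.
        return "hybrid isomiR" if (shift_5p or shift_3p) else "polymorphic isomiR"
    if shift_5p and shift_3p:
        return "5' and 3' isomiR"
    if shift_5p:
        return "5' isomiR"
    if shift_3p:
        return "3' isomiR"
    return "canonical"


# A read starting one base upstream is a 5' isomiR, and its seed is shifted.
five_prime_read = PRECURSOR[4:CANON_END]
print(classify_isomir(PRECURSOR, CANON_START, CANON_END, 4, five_prime_read))
print(seed(CANONICAL), seed(five_prime_read))

# An A-to-G change inside the read (how A-to-I editing appears in sequencing).
edited_read = CANONICAL[:9] + "G" + CANONICAL[10:]
print(classify_isomir(PRECURSOR, CANON_START, CANON_END, CANON_START, edited_read))
```

Note that a non-templated 3' addition would be reported as "hybrid isomiR" under this simplification, because the added base mismatches the precursor.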
Although stem-loop qRT-PCR is commonly employed for the detection of isomiRs in commercial settings, next-generation sequencing (NGS) remains the preferred method for isomiR detection. NGS is favored because the efficiency and specificity of the assay are not affected by the nature of the sequence being detected.
Constructing a miRNA-seq library differs from constructing libraries for mRNAs, lncRNAs, and circRNAs, chiefly because miRNAs are much shorter. The procedure leverages the sequence properties of miRNAs by ligating adapters directly to them. This is followed by reverse transcription, PCR amplification that adds the sequencing adapters, and gel-based purification of DNA fragments within a specific size range. The library is then loaded onto the sequencer.
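Because the miRNA insert is shorter than a typical read, each read usually runs into the 3' adapter, so downstream analysis mirrors the bench workflow: remove the adapter, then keep inserts in the expected size range. The sketch below shows that idea under stated assumptions. The `ADAPTER` value is an assumed example sequence and should be replaced with the one specified for the library kit in use; the function names, the exact-match rule, and the 18-25 nt window (taken from the miRNA length range given earlier) are illustrative choices, and dedicated trimming tools handle mismatches and quality scores that this sketch ignores.

```python
# Assumed 3' adapter sequence for illustration; use your kit's actual adapter.
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"


def trim_3p_adapter(read, adapter=ADAPTER, min_overlap=8):
    """Return the insert before the adapter, or None if no adapter is found.

    Only an exact match of the first `min_overlap` adapter bases is searched.
    """
    idx = read.find(adapter[:min_overlap])
    return read[:idx] if idx != -1 else None


def size_select(inserts, min_len=18, max_len=25):
    """In-silico counterpart of gel size selection for miRNA-sized inserts."""
    return [s for s in inserts if s is not None and min_len <= len(s) <= max_len]


reads = [
    "TGAGGTAGTAGGTTGTATAGTT" + ADAPTER + "AAAAAAA",  # 22-nt insert, kept
    "ACGTACGTACGT" + ADAPTER,                        # 12-nt insert, too short
    "A" * 50,                                        # no adapter found, dropped
]

inserts = size_select([trim_3p_adapter(r) for r in reads])
print(inserts)
```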
Because library construction for miRNAs differs from that for mRNAs, lncRNAs, and circRNAs, separate libraries must be constructed when both miRNA and mRNA detection is required. The separation is necessary because the size-selection step used when building mRNA, lncRNA, and circRNA libraries can inadvertently filter out small RNA fragments.
Please read our Guideline: Small RNA Library Preparation to explore how to construct a miRNA-seq library.
The naming rules for miRNAs dictate how they are identified and labeled. Because miRNAs are short, they can be sequenced in either SE50 or PE150 mode. Sequencing yields the actual miRNA sequences present in the sample, including the standard reference sequence from miRBase and its isomiRs.
The official miRBase website provides guidelines for naming identified and newly predicted miRNAs, as outlined in the blog post 'What's in a name'. According to these rules, a miRNA name begins with a 3-letter abbreviation of the Latin name of the species, followed by 'miR/MIR' and a number for plants, or 'miR/mir' and a number for animals. 'miR' denotes a mature miRNA sequence, while the precursor is written 'MIR' in plants and 'mir' in animals. Note that animal names place a hyphen between 'miR/mir' and the number (as in hsa-miR-21), whereas plant names do not (as in ath-miR156a).
In the past, the nomenclature used '*' to mark the complementary sequence of a miRNA in its hairpin precursor. This has been updated: the current convention uses the suffixes '-3p' and '-5p' to distinguish the two sequences instead of '*'. In miRNA labels, 'mir/MIR-p3' and 'mir/MIR-p5' indicate that a sequence lies on the 3' or 5' arm of the precursor 'mir/MIR' and aligns only with the precursor sequence, not with an annotated mature 'miR'. This distinguishes them from 'miR-3p/5p', which aligns with the annotated mature sequence.
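To make the naming pattern concrete, here is a minimal parsing sketch. The regular expression and the function name `parse_mirna_name` are illustrative assumptions rather than an official miRBase grammar; the pattern accepts names with or without a hyphen before the number and deliberately ignores cases such as let-7 family names, paralog suffixes (for example a trailing '-1'), and the 'p3'/'p5' labels.

```python
import re

# Illustrative pattern only; it does not cover every miRBase name.
NAME_RE = re.compile(
    r"^(?P<species>[a-z]{3,4})-"   # species abbreviation, e.g. hsa or ath
    r"(?P<kind>miR|mir|MIR)-?"     # miR = mature; mir/MIR = precursor
    r"(?P<number>\d+)"             # miRNA number
    r"(?P<variant>[a-z]?)"         # optional letter for closely related members
    r"(?:-(?P<arm>[35]p))?$"       # optional arm suffix: -5p or -3p
)


def parse_mirna_name(name):
    """Split a miRNA name into its parts, or return None if it does not match."""
    match = NAME_RE.match(name)
    if match is None:
        return None
    parts = match.groupdict()
    parts["level"] = "mature" if parts["kind"] == "miR" else "precursor"
    return parts


for name in ["hsa-miR-21-5p", "ath-MIR166a", "hsa-let-7a"]:
    print(name, parse_mirna_name(name))
```

Matching is case-sensitive on purpose, since the capitalization of 'miR', 'mir', and 'MIR' carries meaning in the naming scheme.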
For newly identified miRNAs, a dedicated nomenclature indicates the position on the precursor arm and the isoform. If two isoforms are identified within the same sequence, both are included in the name. miRNAs whose names start with the abbreviation of the species under study are known miRNAs and their isoforms, while those starting with the abbreviation of another species are newly identified in the studied species but conserved with a known miRNA in that other species. miRNAs that are not listed in miRBase are labeled 'PC' (Predicted Candidate), indicating that they are brand-new miRNAs.
In summary, miRNAs are named based on specific rules depending on the species. The naming convention differentiates between known miRNAs, their isoforms, newly identified miRNAs conserved in multiple species, and brand-new miRNAs without miRBase information.
Despite its small size of approximately 22 nucleotides, miRNA plays a significant role in gene expression by regulating numerous target genes. The study of miRNA function often involves predicting its binding sites with target genes. Commonly used miRNA research databases employ prediction techniques based on the binding of the miRNA's seed region to the mRNA. The seed region, spanning from the 2nd to the 8th nucleotide, is the most conserved segment in miRNA and typically exhibits complete complementarity to the target site on the mRNA's 3'-UTR.
Main types of miRNA seed sequences. (Riolo et al., 2020)
While each database applies its own specific rules, they generally adhere to the following fundamental principles: complementarity between miRNAs and their target sites, conservation of miRNA target sites across species, thermodynamic stability of the miRNA-mRNA duplex, and absence of complex secondary structure at the miRNA target site.
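The first of these principles, seed complementarity, can be sketched in a few lines. The example scans a 3' UTR for sites complementary to the miRNA seed and labels them with commonly used site-type names (6mer, 7mer-m8, 7mer-A1, 8mer). Those labels and their definitions are an assumption drawn from general convention rather than from this article, the sequences are toy examples, and the function names are illustrative. Real prediction tools also score conservation, duplex stability, and site accessibility, none of which is modeled here.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def revcomp(rna):
    """Reverse complement of an RNA string written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_sites(mirna, utr):
    """Return (index of 6-nt core match, site type) for seed matches in a UTR."""
    core = revcomp(mirna[1:7])                 # pairs with miRNA positions 2-7
    pos8_partner = revcomp(mirna[1:8])[0]      # pairs with miRNA position 8
    sites = []
    for i in range(len(utr) - 5):
        if utr[i:i + 6] != core:
            continue
        m8 = i >= 1 and utr[i - 1] == pos8_partner     # extra pair at position 8
        a1 = i + 6 < len(utr) and utr[i + 6] == "A"    # A opposite position 1
        if m8 and a1:
            site_type = "8mer"
        elif m8:
            site_type = "7mer-m8"
        elif a1:
            site_type = "7mer-A1"
        else:
            site_type = "6mer"
        sites.append((i, site_type))
    return sites


mirna = "UGAGGUAGUAGGUUGUAUAGUU"       # toy miRNA, 5' to 3'
utr = "AAGGCUACCUCAGGUUUACCUCGG"       # toy 3' UTR fragment, 5' to 3'
print(seed_sites(mirna, utr))
```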
Please read our article How to Predict miRNA Targets? for more information about miRNA databases.
References: