The field of spatial transcriptomics is moving at an incredible pace. New databases are popping up all the time, each offering unique insights to help researchers tackle the complexities of biological systems. If you're diving into this field, you're in luck: there are some fantastic resources available to make your work a bit easier. Let's take a look at some of the most innovative spatial transcriptome databases in 2024:
The CROST database is a treasure trove of spatial transcriptomics data. It houses information from over 1,000 samples, spanning eight different species, 35 types of tissue, and 56 diseases. Each sample comes with a wealth of data: gene expression details as well as precise spatial location information, often in the form of images. The best part? The database is designed with simplicity in mind, so researchers can easily browse and interact with the data.
But that's not all. CROST also offers two online tools that help users dive deeper into spatial transcriptomics, allowing for personalized data analysis. Now, let's break down the core features of the CROST database:
Database Access Link:
The sample overview module provides a high-level summary of the dataset, including general information regarding sample composition and tissue representation. This module enables users to gain an initial understanding of the breadth of data available for further exploration.
This module presents the results of clustering analyses, often complemented by dimensionality reduction techniques such as principal component analysis (PCA) or t-SNE. These methods allow for the visualization and interpretation of the underlying structure in high-dimensional transcriptomic data.
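To make the dimensionality-reduction step concrete, here is a minimal sketch of how the first principal component of a spot-by-gene matrix could be computed with power iteration in plain Python. It is an illustration only, not CROST's pipeline; the function name `first_principal_component` and the toy matrix are assumptions made for this example.

```python
import math

def first_principal_component(rows, iterations=100):
    """Return (direction, scores) of the first principal component.

    rows: list of spots, each a list of expression values (one per gene).
    Uses power iteration on the sample covariance matrix. The all-ones
    start vector is an assumption; it fails only in the rare case where
    it is exactly orthogonal to the leading component.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(p)]
    centered = [[r[k] - means[k] for k in range(p)] for r in rows]
    # Sample covariance between every pair of genes
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    vec = [1.0] * p
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break
        vec = [x / norm for x in nxt]
    # Project each centered spot onto the component
    scores = [sum(c[k] * vec[k] for k in range(p)) for c in centered]
    return vec, scores

# Toy data: 4 spots x 3 genes; gene 2 is always twice gene 1, gene 3 is silent
rows = [[1.0, 2.0, 0.0], [2.0, 4.0, 0.0], [3.0, 6.0, 0.0], [4.0, 8.0, 0.0]]
pc1, scores = first_principal_component(rows)
print(pc1)     # direction dominated by genes 1 and 2
print(scores)  # one coordinate per spot, usable for plotting or clustering
```

In practice the scores from the first few components are what get fed into a clustering step or a t-SNE embedding.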
Spatially variable gene analysis identifies genes that exhibit differential expression patterns across different spatial locations within the tissue. This provides critical insight into the functional architecture of tissues and how gene expression varies in response to spatial cues.
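One common way to quantify whether a gene's expression is spatially structured is Moran's I, a spatial autocorrelation statistic. The sketch below is a plain-Python illustration under simple assumptions (binary neighbor weights within a fixed radius, a toy 4x4 grid); the text does not say which method CROST itself uses.

```python
import math

def morans_i(values, coords, radius=1.0):
    """Global Moran's I with binary weights.

    values: expression of one gene per spot.
    coords: (x, y) position per spot.
    Two distinct spots are neighbors when their distance is <= radius.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        return 0.0  # constant expression carries no spatial signal
    num = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= radius:
                num += dev[i] * dev[j]
                weight_sum += 1.0
    if weight_sum == 0:
        return 0.0  # no neighbors at this radius
    return (n / weight_sum) * (num / denom)

# Toy 4x4 grid of spots
coords = [(x, y) for y in range(4) for x in range(4)]
# Gene expressed only in the left half of the tissue: spatially clustered
patterned = [1.0 if x < 2 else 0.0 for (x, y) in coords]
# Gene alternating like a checkerboard: neighbors always differ
checker = [float((x + y) % 2) for (x, y) in coords]

print(morans_i(patterned, coords))  # clearly positive
print(morans_i(checker, coords))    # close to -1
```

Values near +1 indicate that neighboring spots have similar expression, values near 0 suggest no spatial structure, and negative values indicate that neighbors tend to differ.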
In this module, transcriptomic data is annotated with information regarding the cellular composition of the tissue. By integrating single-cell RNA-sequencing data with spatial transcriptomics, the database assigns cell-type identities to different regions of the tissue, facilitating the study of cellular organization in its native context.
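As a rough illustration of the idea behind reference-based annotation, the sketch below labels each spot with the cell type whose single-cell reference profile correlates best with the spot's expression. This is a deliberately simplified stand-in: real deconvolution methods are far more elaborate, and the names `annotate_spots`, `reference`, and `spots` as well as all values are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def annotate_spots(spot_expr, reference):
    """Assign each spot the reference cell type with the highest correlation."""
    labels = {}
    for spot, profile in spot_expr.items():
        labels[spot] = max(reference,
                           key=lambda cell_type: pearson(profile, reference[cell_type]))
    return labels

# Hypothetical mean expression per cell type over the same four genes
reference = {
    "T cell": [9.0, 1.0, 0.5, 0.2],
    "Hepatocyte": [0.3, 8.0, 7.0, 0.4],
}
# Hypothetical spot-level expression over those genes
spots = {
    "spot_1": [7.5, 0.8, 1.0, 0.1],
    "spot_2": [0.5, 6.0, 9.0, 0.3],
}
labels = annotate_spots(spots, reference)
print(labels)
```

Because a spot often contains several cells, production tools usually estimate a mixture of cell types per spot rather than a single label.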
This analysis examines the correlation and spatial proximity of different cell types. Understanding these relationships is crucial for elucidating how cells interact in situ, especially in complex tissues where cellular interactions dictate tissue function and pathology.
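A simple way to think about spatial proximity between cell types is to count how often each pair of labels appears in neighboring spots. The following sketch does exactly that for a toy strip of tissue; the function name, the distance threshold, and the labels are assumptions for illustration, not CROST's implementation.

```python
import math
from collections import Counter
from itertools import combinations

def neighbor_pair_counts(labels, coords, radius=1.0):
    """Count unordered cell-type pairs among spots within `radius` of each other."""
    counts = Counter()
    for i, j in combinations(range(len(labels)), 2):
        if math.dist(coords[i], coords[j]) <= radius:
            pair = tuple(sorted((labels[i], labels[j])))
            counts[pair] += 1
    return counts

# Toy strip of six spots: T cells border the tumor, stroma sits on the far side
labels = ["T", "T", "Tumor", "Tumor", "Stroma", "Stroma"]
coords = [(float(x), 0.0) for x in range(6)]

counts = neighbor_pair_counts(labels, coords)
for pair, count in sorted(counts.items()):
    print(pair, count)
```

Comparing such observed counts with what random label shuffling would produce is the usual next step for deciding whether two cell types are enriched or depleted as neighbors.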
Spatial communication analysis investigates how different tissue regions communicate with each other, focusing on both the clustering of cells and the types of interactions that occur between them. This approach helps uncover the dynamic processes involved in tissue homeostasis and disease progression.
CROST offers several visualization tools to aid in the interpretation of spatial transcriptomics data. These visualizations include heatmaps, spatial maps of gene expression, and 3D tissue models that allow for an interactive exploration of gene expression in its native spatial context.
A recent study introduced SpatialScope, a method that integrates single-cell RNA sequencing (scRNA-seq) data with spatial transcriptomics (ST) data. This approach enhances the resolution of ST data to single-cell levels, demonstrating the potential for databases like CROST to complement advanced analytical techniques (Zhang et al., 2023).
The CROST database represents a vital resource for researchers investigating spatial transcriptomics. With its extensive dataset and user-friendly tools, it facilitates the exploration of complex biological questions related to tissue development and disease mechanisms. As new analytical methods emerge and integrate with such databases, our understanding of spatial biology will continue to deepen.
SpatialDB is a publicly accessible database dedicated to organizing spatial transcriptomics data derived from published scientific papers. This repository serves as an essential resource for researchers studying gene expression in spatial contexts across various organisms and tissues. Currently, SpatialDB includes 24 datasets generated using eight distinct spatial transcriptomics technologies. These datasets predominantly originate from human, mouse, Drosophila, Caenorhabditis elegans, and zebrafish tissues.
The database can be accessed via the following link:
The homepage of SpatialDB includes a comprehensive Search SpatialDB module (equivalent to the "Search" menu). This feature allows users to query gene expression data based on either gene symbols or ENSEMBL IDs. For instance, to search for the expression of the KCTD12 gene, one can enter the gene name and click the Submit button. The resulting output will display the expression data relevant to the query.
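For readers scripting their own lookups, the sketch below shows one way a query string could be classified as a gene symbol or an Ensembl gene ID before searching a local table. It does not call SpatialDB; the regular expression, the function name `classify_query`, and the placeholder ID are assumptions, and the pattern covers only Ensembl-style identifiers of the form used for species such as human, mouse, and zebrafish.

```python
import re

# Assumed pattern: "ENS", optional species letters, "G", 11 digits,
# and an optional version suffix such as ".5"
ENSEMBL_GENE_ID = re.compile(r"^ENS[A-Z]*G\d{11}(\.\d+)?$")

def classify_query(query):
    """Return 'ensembl_id' or 'gene_symbol' for a user-supplied query string."""
    cleaned = query.strip().upper()
    return "ensembl_id" if ENSEMBL_GENE_ID.match(cleaned) else "gene_symbol"

# Placeholder identifier built for illustration, not a real gene
placeholder_id = "ENSG" + "0" * 11

print(classify_query("KCTD12"))        # gene_symbol
print(classify_query(placeholder_id))  # ensembl_id
```

Normalizing the query type up front makes it easier to route the lookup to the right column of a downloaded expression table.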
By selecting the Browse button, users can access additional information regarding the datasets available in the database. This interface provides detailed descriptions of each dataset, including the associated tissue types, experimental conditions, and technologies used. Clicking the PMID link directs users to the corresponding published article, where further methodological details and contextual information can be found.
The database also includes tissue-specific expression maps. Within the Organism Atlas section, users can visualize the expression of selected genes across various tissue maps. For example, users can explore how the KCTD12 gene is expressed within different tissues by adjusting parameters such as tissue type, data processing methods, and the shape and size of scatter points on microscopic images. Additionally, the database allows for the download of both image files and quantitative expression data.
In one study, Huang et al. (2022) highlighted how spatial transcriptomics data from multiple species can be used to understand evolutionary differences in tissue organization and gene expression patterns. They noted that databases like SpatialDB are crucial for such comparative studies.
SpatialDB supports data sharing and exchange through its Upload and Download menus. Users have the option to upload their own spatial transcriptomics datasets for inclusion in the database. Furthermore, the database enables users to download datasets of interest for further analysis, thereby fostering collaborative research and data-driven investigations.
The SPASCER database is a cutting-edge resource that brings new possibilities to the study of spatial transcriptomics. Its primary focus is on understanding the complexities of tissue heterogeneity, the microenvironment of tissues, and the intricate web of intercellular interactions across diverse tissue structures. Within its framework, SPASCER houses 43 comprehensive studies, encompassing 1,082 datasets that capture the spatially resolved gene expression profiles of a wide array of tissues.
The database can be accessed here: SPASCER Database
The SPASCER platform is designed to facilitate deep exploration into the spatial dynamics of gene expression. Researchers will find several valuable resources for their studies:
The SPASCER database is intended to enhance the understanding of the spatial dynamics of gene expression within complex tissue architectures. By integrating data from various tissue types, this platform supports the study of how cellular interactions within specific microenvironments influence tissue function and structure. Researchers can utilize this resource to explore gene expression patterns, map cell-to-cell communication networks, and investigate gene regulatory pathways in spatially distinct regions of tissues.
CancerSRT is a specialized database that consolidates, organizes, and analyzes spatial transcriptomics (ST) data from 14 distinct human cancers. The database encompasses 46 datasets (347 sub-datasets), derived from five different spatial transcriptomics technologies. This resource aims to provide in-depth insights into the molecular mechanisms and spatial characteristics of cancerous tissues, facilitating a deeper understanding of tumor microenvironments and intercellular interactions.
The CancerSRT database can be accessed via the following link:
The CancerSRT portal supports a range of functionalities for data exploration, including dataset search, browsing, uploading, and downloading. Users can conduct multi-level analyses within the platform, exploring various dimensions of the data:
Gene-Level Analysis: Includes tasks such as marker gene identification and spatially variable gene detection, allowing for detailed exploration of gene expression patterns within different tissue regions.
Cell-Level Analysis: Users can investigate the spatial distribution of cells, annotate cell types, and infer interactions between immune and tumor cells. This feature enables a comprehensive view of cellular organization within the tumor microenvironment.
A study introduced SpaCET (Spatial Cellular Estimator for Tumors), which provides a framework to decompose cell identities in tumor ST data. By integrating gene expression changes and copy number alterations, SpaCET improves the accuracy of estimating cell types within tumors. This highlights the importance of databases like CancerSRT in providing essential data for such analyses.
Cancer-Relevant Analysis: This level focuses on cancer-specific evaluations, including:
Pan-Cancer Analysis: The platform offers tools for cross-cancer analysis, enabling:
In addition to the data exploration capabilities, CancerSRT also provides an online analysis module, which enables users to perform in-depth, customized analyses of the datasets available within the database.
CancerSRT serves as a crucial resource for cancer research, particularly in the field of spatial transcriptomics. By providing access to a wide array of datasets and offering comprehensive tools for multi-layered analysis, the platform aids researchers in exploring the spatial and molecular complexities of cancer. This database not only enhances our understanding of tumor biology but also facilitates the identification of novel biomarkers and therapeutic targets.
STOmicsDB, or the Spatial Transcript Omics DataBase, is like a treasure trove for anyone diving into the complex world of spatial transcriptomics. Imagine trying to map the exact locations where genes express themselves within a tissue. Sounds tricky, right? But that's exactly what STOmicsDB aims to make easier. It's more than just a repository of data; it's an integrated platform that connects researchers with a rich collection of spatial gene expression datasets. And these aren't just any datasets: they're curated and organized to help users dig deeper into how genes behave across different species and tissues.
Instead of seeing gene activity in a broad, global sense, you can now explore it in precise spatial contexts, seeing how cells in one part of a tissue might express a certain gene while cells in another part stay silent. It's like zooming in on the microscopic choreography of life itself, and it's invaluable for anyone exploring the nuances of tissue architecture, developmental biology, or disease progression.
Whether you're looking at human tissues, mouse models, or even plant biology, STOmicsDB offers a vast array of datasets to fuel new insights into how genes and cells interact in their natural environments. The level of precision and detail it offers could be a game-changer in understanding spatially-driven gene expression.
The STOmicsDB is accessible at the following link:
STOmicsDB integrates 221 manually curated datasets, covering 17 distinct species. These datasets have undergone detailed annotation, with particular emphasis on the cell types present in the samples. The platform facilitates efficient access to these datasets, enabling researchers to query specific data using keywords. For instance, a query such as "Mouse Heart spatial transcript" allows users to quickly identify and access relevant datasets. Additionally, a drop-down menu is available for conducting searches based on specific categories, enhancing the versatility of the search function.
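The keyword search described above can be pictured as matching every query term against each dataset's metadata. The sketch below applies that idea to a small in-memory list; the records, field names, and the function `search_datasets` are hypothetical and do not reflect STOmicsDB's actual schema or search engine.

```python
def search_datasets(datasets, query):
    """Return datasets whose metadata contains every term in the query.

    Matching is case-insensitive and substring-based, so "transcript"
    also matches "transcriptomics".
    """
    terms = query.lower().split()
    hits = []
    for record in datasets:
        text = " ".join(str(value) for value in record.values()).lower()
        if all(term in text for term in terms):
            hits.append(record)
    return hits

# Hypothetical metadata records for illustration only
datasets = [
    {"id": "DS001", "species": "Mouse", "tissue": "Heart",
     "title": "Spatial transcriptomics of the developing heart"},
    {"id": "DS002", "species": "Mouse", "tissue": "Brain",
     "title": "Spatial transcriptomics of the cortex"},
    {"id": "DS003", "species": "Human", "tissue": "Heart",
     "title": "Single-cell atlas of the adult heart"},
]

hits = search_datasets(datasets, "Mouse Heart spatial transcript")
print([record["id"] for record in hits])  # ['DS001']
```

Category-based filtering, like the drop-down menu, would correspond to matching a term against one specific field instead of the whole record.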
One of the standout features of STOmicsDB is its powerful visualization capabilities. Imagine being able to see gene expression not as just abstract data points, but as vivid, interactive maps that reveal the very composition of cells in a tissue. That's what the platform offers. When you search for a particular sample, the "Visualization" tab presents detailed cellular composition maps that show how genes are expressed across different regions of the tissue. It's like stepping inside the tissue itself and observing the action in real time.
But it doesn't stop there. The beauty of STOmicsDB's visualization tools is their flexibility. If you need to adjust the view to zoom in on a specific area, highlight certain cell types, or even tweak how the data is displayed, no problem. You can customize the maps to fit your specific research goals. This kind of interactivity is key, allowing you to truly explore the data from multiple angles and get a sense of how gene expression shifts within different spatial contexts.
Whether you're digging into complex interactions between cell types or just trying to get a clearer picture of how gene activity correlates with tissue architecture, these visualization tools give you the freedom to analyze the data in ways that make sense for your work.
Aquila is redefining how researchers approach spatial omics, bringing together both transcriptomics and proteomics in one seamless, cutting-edge platform. It's not just a database; it's an entire ecosystem designed to empower scientists to dig deep into the spatial intricacies of gene expression and protein distribution.
What makes Aquila so groundbreaking is its versatility. It doesn't simply provide access to data; it opens up a world where you can visualize and analyze gene activity alongside protein distribution, all in the precise locations within tissues where they occur. Whether you're studying a developing embryo, a complex organ, or even cancerous tissues, Aquila lets you explore how genes and proteins are expressed spatially, creating a detailed map of molecular activity in its natural setting.
The tools are designed to let you zoom in, adjust, and fine-tune your view of the molecular landscape. You're not just looking at static snapshots of data; you're actively engaged in exploring dynamic patterns that reveal how cellular components are orchestrated within their tissue homes. This could be invaluable in uncovering everything from developmental processes to disease mechanisms, where context is just as important as the data itself.
Schematic design of Aquila. (Yimin Zheng et al., 2022)
Diverse Dataset Coverage: Aquila encompasses studies from spatial transcriptome and proteome analyses, including both 2D and 3D experiments. The database integrates various technologies, making it a versatile tool for researchers in the field.
User-Friendly Visualization Tools: The platform offers multiple visualization formats, such as:
Spatial Cell Distribution: Visualizing how different cell types are distributed within a tissue sample.
Spatial Expression Maps: Displaying gene expression levels across different regions of the tissue.
Co-localization of Markers: Analyzing the spatial relationships between different proteins or genes.
Interactive Analysis Capabilities: Aquila allows users to submit their own data for interactive online analysis, enhancing the collaborative potential of research efforts. This feature empowers researchers to conduct personalized analyses tailored to their specific hypotheses.
Advanced Analytical Tools: The database includes tools for conducting spatial community analysis and spatial co-expression analyses, providing insights into the interactions and relationships between different cell types in their native environments.
Aquila's development is part of a broader trend in spatial omics research, where understanding the spatial organization of cells is crucial for deciphering complex biological systems. As highlighted in recent literature, databases like Aquila are essential for integrating diverse datasets and facilitating comprehensive analyses that can lead to new discoveries in areas such as cancer research, developmental biology, and tissue engineering.
Open-ST is an innovative open-source platform designed to create high-resolution three-dimensional (3D) maps of tissue samples using spatial transcriptomics technology. Developed by researchers from the Max Delbrück Center and various academic institutions, this platform aims to uncover the molecular mechanisms underlying tissue development and disease by providing detailed spatial maps of gene expression at subcellular precision.
Overview of Open-ST (Marie Schott et al., 2024)
Open-ST has shown promising results in various studies. For instance, it has been successfully applied to analyze tissue samples from patients with head and neck cancer, revealing important insights into tumor heterogeneity and the spatial organization of different cell types within the tumor microenvironment. The platform's ability to capture the diversity of immune, stromal, and tumor cell populations has potential implications for identifying new biomarkers and therapeutic targets. In addition to cancer research, Open-ST can be utilized in studies of other diseases and tissue types, making it a versatile tool for exploring the complexities of biological systems.
Open-ST represents a significant advancement in the field of spatial transcriptomics, providing researchers with a powerful tool for high-resolution 3D mapping of gene expression. Its cost-effectiveness, flexibility, and user-friendly design make it an attractive option for scientists aiming to explore the intricate relationships between cells in their native environments.
References: