The Most Cutting-Edge Spatial Transcriptome Database in 2024


At A Glance

01 CROST Database
02 SpatialDB: A Public Database for Spatial Transcriptomics Data
03 SPASCER Database
04 CancerSRT for Human Cancer
05 STOmicsDB
06 Aquila: A Comprehensive Spatial Omics Database
07 Open-ST 3D Spatial Transcriptomics

The field of spatial transcriptomics is moving at an incredible pace. New databases are popping up all the time, each offering unique insights to help researchers tackle the complexities of biological systems. If you're diving into this field, you're in luck: there are some fantastic resources available to make your work a bit easier. Let's take a look at some of the most innovative spatial transcriptome databases in 2024:

CROST Database

The CROST database is a treasure trove of spatial transcriptomics data. It houses information from over 1,000 samples, spanning eight different species, 35 types of tissue, and 56 diseases. Each sample comes with a wealth of data: gene expression details as well as precise spatial location info, often in the form of images. The best part? The database is designed with simplicity in mind, so researchers can easily browse and interact with the data.

Features of the CROST Database

But that's not all. CROST also offers two online tools that help users dive deeper into spatial transcriptomics, allowing for personalized data analysis. Now, let's break down the core features of the CROST database:

Sample Analysis: Here's where the magic happens. The results of spatial transcriptomics analyses are presented from seven key perspectives. These include everything from an overview of the sample to more detailed analyses like clustering, dimensionality reduction, spatially variable gene (SVG) analysis, and even cell-type annotations. There's also data on how cells correlate, co-localize, and communicate spatially, providing a holistic view of the sample.

Database Access Link:

CROST Database

Data Analysis Modules

1. Overview of Samples

The sample overview module provides a high-level summary of the dataset, including general information regarding sample composition and tissue representation. This module enables users to gain an initial understanding of the breadth of data available for further exploration.

2. Clustering and Dimensionality Reduction

This module presents the results of clustering analyses, often complemented by dimensionality reduction techniques such as principal component analysis (PCA) or t-SNE. These methods allow for the visualization and interpretation of the underlying structure in high-dimensional transcriptomic data.
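To make the idea concrete, here is a minimal sketch of the dimensionality-reduction step in plain Python. It is not CROST's pipeline, which the description above does not spell out. The function name, the toy spot-by-gene matrix, and the use of power iteration to find the first principal component are all illustrative assumptions; a real analysis would normally lean on an established library.

```python
import math


def first_principal_component(rows, n_iter=200):
    """Return (direction, scores) for the first principal component.

    rows: list of spots, each a list of expression values (same genes in
    the same order). Uses power iteration on the sample covariance matrix,
    which assumes the start vector is not orthogonal to the leading
    eigenvector.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(p)]
    centered = [[r[k] - means[k] for k in range(p)] for r in rows]
    # Sample covariance between every pair of genes (p x p matrix).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iter):
        nxt = [sum(cov[a][b] * vec[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0:
            raise ValueError("no variance along the start direction")
        vec = [v / norm for v in nxt]
    # Project each centered spot onto the component to get its score.
    scores = [sum(c[k] * vec[k] for k in range(p)) for c in centered]
    return vec, scores


# Hypothetical toy data: 4 spots x 3 genes; the third gene never varies.
toy = [[1.0, 1.0, 0.0], [2.0, 2.0, 0.0], [8.0, 8.0, 0.0], [9.0, 9.0, 0.0]]
pc1, scores = first_principal_component(toy)
# Crude stand-in for clustering: split the spots by the sign of their score.
groups = ["A" if s < 0 else "B" for s in scores]
```

In this toy case the two co-varying genes carry all the variance, so the component ignores the constant gene and the sign of each score separates the low-expression spots from the high-expression ones.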

3. Spatially Variable Gene (SVG) Analysis

Spatially variable gene analysis identifies genes that exhibit differential expression patterns across different spatial locations within the tissue. This provides critical insight into the functional architecture of tissues and how gene expression varies in response to spatial cues.
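One widely used way to score spatial variability is Moran's I, a spatial autocorrelation statistic. The description above does not say which method CROST applies, so treat the following as an illustrative sketch only: the function name, the toy 4x4 grid, and the choice of binary weights between directly adjacent spots are assumptions.

```python
def morans_i(values, coords):
    """Moran's I with binary weights.

    Two spots count as neighbors when they sit exactly one grid step apart
    horizontally or vertically (an assumption made for this sketch).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("expression is constant; Moran's I is undefined")
    num = 0.0
    weight_sum = 0
    for i in range(n):
        for j in range(n):
            dx = abs(coords[i][0] - coords[j][0])
            dy = abs(coords[i][1] - coords[j][1])
            if dx + dy == 1:  # adjacent spots only
                num += dev[i] * dev[j]
                weight_sum += 1
    if weight_sum == 0:
        raise ValueError("no neighboring spots found")
    return (n / weight_sum) * (num / denom)


# Hypothetical toy tissue: a 4x4 grid of spots.
coords = [(x, y) for x in range(4) for y in range(4)]
# Gene expressed only in the left half of the grid (spatially structured).
left_half = [1.0 if x < 2 else 0.0 for x, y in coords]
# Gene alternating like a checkerboard (neighbors always differ).
checkerboard = [float((x + y) % 2) for x, y in coords]

i_structured = morans_i(left_half, coords)
i_alternating = morans_i(checkerboard, coords)
```

Values near +1 indicate that neighboring spots have similar expression, values near -1 indicate that neighbors differ, and values near 0 suggest no spatial structure. On this grid the left-half gene scores about 0.67 and the checkerboard gene scores -1.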

4. Cell-Type Annotation

In this module, transcriptomic data is annotated with information regarding the cellular composition of the tissue. By integrating single-cell RNA-sequencing data with spatial transcriptomics, the database assigns cell-type identities to different regions of the tissue, facilitating the study of cellular organization in its native context.

5. Cell-Type Correlation and Co-localization Analysis

This analysis examines the correlation and spatial proximity of different cell types. Understanding these relationships is crucial for elucidating how cells interact in situ, especially in complex tissues where cellular interactions dictate tissue function and pathology.
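As a rough illustration of what a co-localization analysis measures, the sketch below counts how often pairs of cell types appear within a fixed distance of each other. It is a simplified stand-in, not the method CROST implements; the function name, the distance threshold, and the toy labels and coordinates are hypothetical.

```python
from collections import Counter


def neighbor_pair_counts(labels, coords, radius=1.0):
    """Count unordered cell-type pairs among spots within `radius`."""
    counts = Counter()
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            dx = coords[i][0] - coords[j][0]
            dy = coords[i][1] - coords[j][1]
            if dx * dx + dy * dy <= radius * radius:
                # Sort so that (a, b) and (b, a) land in the same bucket.
                pair = tuple(sorted((labels[i], labels[j])))
                counts[pair] += 1
    return counts


# Hypothetical toy sample: three spots close together, two far away.
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]
labels = ["T cell", "tumor", "tumor", "stroma", "stroma"]
pairs = neighbor_pair_counts(labels, coords)
```

In practice, raw counts like these are usually compared against what random label shuffling would produce, so that abundant cell types do not look co-localized simply because there are more of them.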

6. Spatial Communication Analysis (Cluster and Cell Type)

Spatial communication analysis investigates how different tissue regions communicate with each other, focusing on both the clustering of cells and the types of interactions that occur between them. This approach helps uncover the dynamic processes involved in tissue homeostasis and disease progression.

Visualizations

CROST offers several visualization tools to aid in the interpretation of spatial transcriptomics data. These visualizations include heatmaps, spatial maps of gene expression, and 3D tissue models that allow for an interactive exploration of gene expression in its native spatial context.

A recent study introduced SpatialScope, a method that integrates single-cell RNA sequencing (scRNA-seq) data with spatial transcriptomics (ST) data. This approach enhances the resolution of ST data to single-cell levels, demonstrating the potential for databases like CROST to complement advanced analytical techniques (Zhang et al., 2023).

The CROST database represents a vital resource for researchers investigating spatial transcriptomics. With its extensive dataset and user-friendly tools, it facilitates the exploration of complex biological questions related to tissue development and disease mechanisms. As new analytical methods emerge and integrate with such databases, our understanding of spatial biology will continue to deepen.

SpatialDB: A Public Database for Spatial Transcriptomics Data

SpatialDB is a publicly accessible database dedicated to organizing spatial transcriptomics data derived from published scientific papers. This repository serves as an essential resource for researchers studying gene expression in spatial contexts across various organisms and tissues. Currently, SpatialDB includes 24 datasets generated using eight distinct spatial transcriptomics technologies. These datasets predominantly originate from human, mouse, Drosophila, Caenorhabditis elegans, and zebrafish tissues.

The database can be accessed via the following link:

SpatialDB

Search Functionality

The homepage of SpatialDB includes a comprehensive Search SpatialDB module (equivalent to the "Search" menu). This feature allows users to query gene expression data based on either gene symbols or ENSEMBL IDs. For instance, to search for the expression of the KCTD12 gene, one can enter the gene name and click the Submit button. The resulting output will display the expression data relevant to the query.
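A search box that accepts either kind of identifier has to decide which one it was given. The sketch below shows one way to do that on the client side before submitting a query. It does not reflect SpatialDB's actual implementation: the function name is hypothetical, and the pattern only covers ENS-prefixed Ensembl gene IDs (for example, human ENSG and mouse ENSMUSG), which is an assumption made for this example.

```python
import re

# ENS + optional species letters + "G" (gene) + 11 digits.
# Assumption: other ID schemes are treated as gene symbols.
ENSEMBL_GENE_RE = re.compile(r"^ENS[A-Z]*G\d{11}$")


def classify_query(query):
    """Return ("ensembl_id", ID) or ("gene_symbol", symbol)."""
    cleaned = query.strip()
    if not cleaned:
        raise ValueError("empty query")
    if ENSEMBL_GENE_RE.match(cleaned.upper()):
        return ("ensembl_id", cleaned.upper())
    return ("gene_symbol", cleaned)


symbol_query = classify_query("KCTD12")
# IDs below are used only to exercise the format check.
human_id_query = classify_query(" ensg00000141510 ")
mouse_id_query = classify_query("ENSMUSG00000059552")
```

Normalizing whitespace and letter case up front avoids failed searches caused by copy-paste artifacts rather than by missing data.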

Browsing Dataset Information

By selecting the Browse button, users can access additional information regarding the datasets available in the database. This interface provides detailed descriptions of each dataset, including the associated tissue types, experimental conditions, and technologies used. Clicking the PMID link directs users to the corresponding published article, where further methodological details and contextual information can be found.

Gene Expression in Tissue Maps

The database also includes tissue-specific expression maps. Within the Organism Atlas section, users can visualize the expression of selected genes across various tissue maps. For example, users can explore how the KCTD12 gene is expressed within different tissues by adjusting parameters such as tissue type, data processing methods, and the shape and size of scatter points on microscopic images. Additionally, the database allows for the download of both image files and quantitative expression data.

In one study, Huang et al. (2022) highlighted how spatial transcriptomics data from multiple species can be used to understand evolutionary differences in tissue organization and gene expression patterns. They noted that databases like SpatialDB are crucial for such comparative studies.

Upload and Download Capabilities

SpatialDB supports data sharing and exchange through its Upload and Download menus. Users have the option to upload their own spatial transcriptomics datasets for inclusion in the database. Furthermore, the database enables users to download datasets of interest for further analysis, thereby fostering collaborative research and data-driven investigations.

SPASCER Database

The SPASCER database is a cutting-edge resource that brings new possibilities to the study of spatial transcriptomics. Its primary focus is on understanding the complexities of tissue heterogeneity, the microenvironment of tissues, and the intricate web of intercellular interactions across diverse tissue structures. Within its framework, SPASCER houses 43 comprehensive studies, encompassing 1,082 datasets that capture the spatially resolved gene expression profiles of a wide array of tissues.

The database can be accessed here: SPASCER Database

Key Features of SPASCER

The SPASCER platform is designed to facilitate deep exploration into the spatial dynamics of gene expression. Researchers will find several valuable resources for their studies:

Marker Gene Tables
These tables contain lists of genes identified as markers for specific cell types or tissue regions. They provide vital reference points for studying tissue organization at a molecular level.
Cell Communication Data
Understanding how cells communicate is pivotal. This feature includes information about the molecular signals exchanged between cells within their local microenvironments, offering insights into cellular coordination and interaction.
Gene Regulatory Networks
These networks illustrate the regulatory mechanisms governing gene interactions in the spatial context of tissues. Researchers can examine how genes influence each other's expression in situ, contributing to a more comprehensive understanding of tissue function.
Published Literature
SPASCER offers direct access to relevant research papers linked to the datasets housed in the database. This resource ensures that users can explore the academic backdrop of the data and delve into findings that have shaped current knowledge.
Datasets
The full datasets are available for download, providing detailed gene expression data with spatial resolution. This allows for advanced analysis and facilitates reproducibility across studies.

Applications and Utility

The SPASCER database is intended to enhance the understanding of the spatial dynamics of gene expression within complex tissue architectures. By integrating data from various tissue types, this platform supports the study of how cellular interactions within specific microenvironments influence tissue function and structure. Researchers can utilize this resource to explore gene expression patterns, map cell-to-cell communication networks, and investigate gene regulatory pathways in spatially distinct regions of tissues.

CancerSRT for Human Cancer

CancerSRT is a specialized database that consolidates, organizes, and analyzes spatial transcriptomics (ST) data from 14 distinct human cancers. The database encompasses 46 datasets (347 sub-datasets), derived from five different spatial transcriptomics technologies. This resource aims to provide in-depth insights into the molecular mechanisms and spatial characteristics of cancerous tissues, facilitating a deeper understanding of tumor microenvironments and intercellular interactions.

The CancerSRT database can be accessed via the following link:

CancerSRT Database

Core Features and Functionality

The CancerSRT portal supports a range of functionalities for data exploration, including dataset search, browsing, uploading, and downloading. Users can conduct multi-level analyses within the platform, exploring various dimensions of the data:

Gene-Level Analysis: Includes tasks such as marker gene identification and spatially variable gene detection, allowing for detailed exploration of gene expression patterns within different tissue regions.

Cell-Level Analysis: Users can investigate the spatial distribution of cells, annotate cell types, and infer interactions between immune and tumor cells. This feature enables a comprehensive view of cellular organization within the tumor microenvironment.

A study introduced SpaCET (Spatial Cellular Estimator for Tumors), which provides a framework to decompose cell identities in tumor ST data. By integrating gene expression changes and copy number alterations, SpaCET improves the accuracy of estimating cell types within tumors. This highlights the importance of databases like CancerSRT in providing essential data for such analyses.

Cancer-Relevant Analysis: This level focuses on cancer-specific evaluations, including:

Assessment of malignant tumor cells.
Evaluation of cancer functional states.
Immune profiling and characterization of tumor-immune interactions.
Recognition of tertiary lymphoid structures (TLS).
Analysis of the tumor core and tumor edge regions to explore regional differences in gene expression and cell interactions.

Pan-Cancer Analysis: The platform offers tools for cross-cancer analysis, enabling:

Functional activity spectrum analysis across multiple cancer types.
Cell type composition analysis in various cancers.
Examination of tertiary lymphoid structures across different cancers.

Additional Features

In addition to the data exploration capabilities, CancerSRT also provides an online analysis module, which enables users to perform in-depth, customized analyses of the datasets available within the database.

CancerSRT serves as a crucial resource for cancer research, particularly in the field of spatial transcriptomics. By providing access to a wide array of datasets and offering comprehensive tools for multi-layered analysis, the platform aids researchers in exploring the spatial and molecular complexities of cancer. This database not only enhances our understanding of tumor biology but also facilitates the identification of novel biomarkers and therapeutic targets.

STOmicsDB

STOmicsDB, or the Spatial Transcript Omics DataBase, is like a treasure trove for anyone diving into the complex world of spatial transcriptomics. Imagine trying to map the exact locations where genes express themselves within a tissue. Sounds tricky, right? But that's exactly what STOmicsDB aims to make easier. It's more than just a repository of data; it's an integrated platform that connects researchers with a rich collection of spatial gene expression datasets. And these aren't just any datasets: they're curated and organized to help users dig deeper into how genes are behaving across different species and tissues.

Instead of seeing gene activity in a broad, global sense, you can now explore it in precise spatial contexts, seeing how cells in one part of a tissue might express a certain gene while cells in another part stay silent. It's like zooming in on the microscopic choreography of life itself, and it's invaluable for anyone exploring the nuances of tissue architecture, developmental biology, or disease progression.

Whether you're looking at human tissues, mouse models, or even plant biology, STOmicsDB offers a vast array of datasets to fuel new insights into how genes and cells interact in their natural environments. The level of precision and detail it offers could be a game-changer in understanding spatially-driven gene expression.

The STOmicsDB is accessible at the following link:

STOmicsDB Database

Data Integration and Features

STOmicsDB integrates 221 manually curated datasets, covering 17 distinct species. These datasets have undergone detailed annotation, with particular emphasis on the cell types present in the samples. The platform facilitates efficient access to these datasets, enabling researchers to query specific data using keywords. For instance, a query such as "Mouse Heart spatial transcript" allows users to quickly identify and access relevant datasets. Additionally, a drop-down menu is available for conducting searches based on specific categories, enhancing the versatility of the search function.

Visualization and Analysis

One of the standout features of STOmicsDB is its powerful visualization capabilities. Imagine being able to see gene expression not as just abstract data points, but as vivid, interactive maps that reveal the very composition of cells in a tissue. That's what the platform offers. When you search for a particular sample, under the "Visualization" tab, you're instantly greeted with detailed cellular composition maps that show exactly how genes are expressed across different regions of the tissue. It's like stepping inside the tissue itself and observing the action in real-time.

But it doesn't stop there. The beauty of STOmicsDB's visualization tools is their flexibility. If you need to adjust the view to zoom in on a specific area, highlight certain cell types, or even tweak how the data is displayed, no problem. You can customize the maps to fit your specific research goals. This kind of interactivity is key, allowing you to truly explore the data from multiple angles and get a sense of how gene expression shifts within different spatial contexts.

Whether you're digging into complex interactions between cell types or just trying to get a clearer picture of how gene activity correlates with tissue architecture, these visualization tools give you the freedom to analyze the data in ways that make sense for your work.

Aquila: A Comprehensive Spatial Omics Database

Aquila is redefining how researchers approach spatial omics, bringing together both transcriptomics and proteomics in one seamless, cutting-edge platform. It's not just a database; it's an entire ecosystem designed to empower scientists to dig deep into the spatial intricacies of gene expression and protein distribution.

What makes Aquila so groundbreaking is its versatility. It doesn't simply provide access to data; it opens up a world where you can visualize and analyze gene activity alongside protein distribution, all in the precise locations within tissues where they occur. Whether you're studying a developing embryo, a complex organ, or even cancerous tissues, Aquila lets you explore how genes and proteins are expressed spatially, creating a detailed map of molecular activity in its natural setting.

The tools are designed to let you zoom in, adjust, and fine-tune your view of the molecular landscape. You're not just looking at static snapshots of data; you're actively engaged in exploring dynamic patterns that reveal how cellular components are orchestrated within their tissue homes. This could be invaluable in uncovering everything from developmental processes to disease mechanisms, where context is just as important as the data itself.

Figure: Schematic design of the Aquila spatial omics platform (Yimin Zheng et al., 2022).

Key Features of Aquila

Diverse Dataset Coverage: Aquila encompasses studies from spatial transcriptome and proteome analyses, including both 2D and 3D experiments. The database integrates various technologies, making it a versatile tool for researchers in the field.

User-Friendly Visualization Tools: The platform offers multiple visualization formats, such as:

Spatial Cell Distribution: Visualizing how different cell types are distributed within a tissue sample.

Spatial Expression Maps: Displaying gene expression levels across different regions of the tissue.

Co-localization of Markers: Analyzing the spatial relationships between different proteins or genes.

Interactive Analysis Capabilities: Aquila allows users to submit their own data for interactive online analysis, enhancing the collaborative potential of research efforts. This feature empowers researchers to conduct personalized analyses tailored to their specific hypotheses.

Advanced Analytical Tools: The database includes tools for conducting spatial community analysis and spatial co-expression analyses, providing insights into the interactions and relationships between different cell types in their native environments.

Aquila's development is part of a broader trend in spatial omics research, where understanding the spatial organization of cells is crucial for deciphering complex biological systems. As highlighted in recent literature, databases like Aquila are essential for integrating diverse datasets and facilitating comprehensive analyses that can lead to new discoveries in areas such as cancer research, developmental biology, and tissue engineering.

Open-ST 3D Spatial Transcriptomics

Open-ST is an innovative open-source platform designed to create high-resolution three-dimensional (3D) maps of tissue samples using spatial transcriptomics technology. Developed by researchers from the Max Delbrück Center and various academic institutions, this platform aims to uncover the molecular mechanisms underlying tissue development and disease by providing detailed spatial maps of gene expression at subcellular precision.

Figure: Overview of Open-ST high-resolution transcriptomics (Marie Schott et al., 2024).

Key Features of Open-ST

High-Resolution Imaging Capabilities: Open-ST allows researchers to visualize molecular and cellular structures that are often lost in traditional two-dimensional (2D) representations. This high resolution is crucial for studying complex tissues and understanding cellular interactions.
Integration with Other Molecular Data Types: The platform supports the integration of various molecular data types, enabling comprehensive analyses that combine transcriptomic information with other omics data.
Cost-Effective Solution: Unlike many commercially available spatial transcriptomics technologies, Open-ST is designed to be low-cost, utilizing standard laboratory equipment. This accessibility encourages wider adoption among researchers.
Modular and Flexible Design: Open-ST's modular design allows researchers to adapt the platform to their specific needs, making it suitable for a variety of research applications beyond cancer studies.
User-Friendly Online Resources: The developers have provided detailed experimental protocols and computational workflows online, facilitating easy implementation for researchers looking to utilize spatial transcriptomics in their work.

Applications and Impact

Open-ST has shown promising results in various studies. For instance, it has been successfully applied to analyze tissue samples from patients with head and neck cancer, revealing important insights into tumor heterogeneity and the spatial organization of different cell types within the tumor microenvironment. The platform's ability to capture the diversity of immune, stromal, and tumor cell populations has potential implications for identifying new biomarkers and therapeutic targets.

In addition to cancer research, Open-ST can be utilized in studies of other diseases and tissue types, making it a versatile tool for exploring the complexities of biological systems.

Open-ST represents a significant advancement in the field of spatial transcriptomics, providing researchers with a powerful tool for high-resolution 3D mapping of gene expression. Its cost-effectiveness, flexibility, and user-friendly design make it an attractive option for scientists aiming to explore the intricate relationships between cells in their native environments.

References:

Zheng, Y., et al. (2022). Aquila: A spatial omics database and analysis platform. Retrieved from https://www.researchgate.net/publication/364364077_Aquila_a_spatial_omics_database_and_analysis_platform
Schott, M., León-Periñán, D., Splendiani, E., et al. (2024). Open-ST: High-resolution spatial transcriptomics in 3D. Cell. https://doi.org/10.1016/j.cell.2024.01.001
Zhang, H., et al. (2023). Integrating spatial and single-cell transcriptomics data using deep generative models. Nature Communications, 14(1), 5029. https://doi.org/10.1038/s41467-023-43629-w
Huang, Y., et al. (2022). Comparative analysis of spatial transcriptomics data reveals evolutionary conservation of tissue architecture. Cell Reports, 38(12), 110423. https://doi.org/10.1093/gbe/evab160
Jiang, L., Chen, H., Pinello, L., & Yuan, G.-C. (2016). GiniClust: Detecting rare cell types from single-cell gene expression data using the Gini index. Genome Biology, 17, 144. https://doi.org/10.1186/s13059-016-1010-4
