New tool for simple and powerful integration of single-cell RNA data from heterogeneous sources

Facilitating robust and reliable cross-sample and cross-individual comparisons of single-cell transcriptomics data

Picture generated with Dall-E-3 inside Skype.

Integrating single-cell transcriptomics data from different experiments and individuals is central to data analysis, but it is challenging because of technical variability, or “batch effects”. Variations stemming from differences in sample processing and experimental protocols often impede comparative analyses, and standard batch effect correction methods can overcorrect, erasing true biological variability. A new method just published in Nature Communications offers an elegant and powerful solution: by leveraging prior knowledge in the form of cell type annotations, STACAS v2 ensures that critical distinctions between cell types and individuals are not lost during integration. In the authors' benchmark, the new method outperforms more complex methods, both supervised and unsupervised, and balances the mitigation of batch effects against the preservation of genuine biological variability, even when faced with incomplete or imprecise cell type annotations. We at Nexco can use this new tool in your projects involving data from heterogeneous sources.

The integration of single-cell transcriptomics data from different experiments is essential in any analysis pipeline, but it can be a daunting task due to the presence of technical variability or “batch effects”. These variations, stemming from differences in sample processing, in experimental protocols and in the samples themselves, often impede comparative analyses and can lead to overcorrection when standard batch effect correction methods are applied. Overcorrection removes true biological variability, sometimes to the point of blurring the relevant information in the datasets.

In response to this challenge, a rather simple yet highly effective solution has emerged in the form of STACAS v2, a semi-supervised single-cell RNA sequencing data integration method. Published by the Carmona lab from the Swiss Institute of Bioinformatics, STACAS leverages prior knowledge in the form of cell type annotations to preserve the biological variance within each dataset when several datasets are merged. The method strikes a balance between correcting batch effects and preserving information, which is essential for large-scale studies that combine data from different sources, and even for comparisons between individual samples.
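To make the semi-supervised idea more concrete, here is a minimal toy sketch in plain Python. It is not the STACAS v2 implementation (STACAS is an R package with its own anchor scoring); it only illustrates the general principle of using cell type labels to veto candidate links between batches. The function names, the 2D coordinates and the labels are all hypothetical, and candidate anchors are simplified to mutual nearest neighbors.

```python
import math

def nearest(query, reference, k):
    """Indices of the k reference points closest to `query` (Euclidean distance)."""
    order = sorted(range(len(reference)), key=lambda j: math.dist(query, reference[j]))
    return set(order[:k])

def mutual_nearest_pairs(a, b, k=1):
    """Candidate anchors: pairs (i, j) where a[i] and b[j] are in each other's k-NN."""
    a_to_b = [nearest(p, b, k) for p in a]
    b_to_a = [nearest(q, a, k) for q in b]
    return [(i, j) for i in range(len(a)) for j in sorted(a_to_b[i]) if i in b_to_a[j]]

def filter_by_label(pairs, labels_a, labels_b, unknown=None):
    """Keep anchors whose labels agree; an unlabeled cell never vetoes an anchor."""
    kept = []
    for i, j in pairs:
        la, lb = labels_a[i], labels_b[j]
        if la == unknown or lb == unknown or la == lb:
            kept.append((i, j))
    return kept

# Hypothetical toy data: two batches of cells in a 2D embedding.
batch_a = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
labels_a = ["T cell", "T cell", "B cell", "Monocyte"]
batch_b = [(0.2, 0.1), (3.6, 0.0), (9.0, 0.0)]
labels_b = ["T cell", "NK cell", None]  # None marks a cell without annotation

anchors = mutual_nearest_pairs(batch_a, batch_b, k=1)
kept = filter_by_label(anchors, labels_a, labels_b)

print(anchors)  # [(0, 0), (2, 1), (3, 2)]
print(kept)     # [(0, 0), (3, 2)]
```

In this toy case the B cell / NK cell link is discarded because the labels disagree, while the link involving the unannotated cell is kept. That second behavior is one simple way to tolerate incomplete annotations: missing labels add no constraint instead of blocking the integration.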

In an extensive benchmark, STACAS v2 outperformed more complex unsupervised and supervised methods, including Harmony, FastMNN, Seurat v4, scVI, Scanorama, scANVI, and scGen. Importantly, although the method relies on cell annotations, it is not very sensitive to incorrect priors. In fact, the article presenting the method and program demonstrates that STACAS v2 is robust to incomplete and imprecise cell type annotations.

As an example use case, the paper reporting the tool shows its successful application to constructing a high-resolution map of tumor-infiltrating CD8 T cells by integrating data from over 500,000 cells coming from almost 300 patients, without losing information about the diversity of T cell subtypes. This application showcases the program’s high scalability and suitability for real-world problems.

Applying STACAS v2 to your data

We at Nexco are well aware of the importance of properly curating and integrating data for transcriptomics analyses, and we know that STACAS v2 is a powerful tool for this.

We can use STACAS v2 to integrate your data from different sources and to handle different baselines, noise levels, and other sources of variability, as required by the complexities of real-world datasets and problems.

Reference

Semi-supervised integration of single-cell transcriptomics data - Nature Communications

  • Monday, Feb 19, 2024, 8:51 AM
  • single-cell-sequencing, single-cell-analysis, batch-normalization, bioinformatics