2. scRNA-seq in R (Seurat + reticulate)#
Open the R Markdown workflow
for a complete, runnable script. The steps mirror the Python notebooks but stay
inside a Seurat workflow by hopping into scbiot through reticulate.
Environment setup#
Load the required packages and (optionally) the demo PBMC dataset from
SeuratData:
library(Seurat)
library(SeuratData)
library(reticulate)
library(tidyverse)
library(sceasy) # converts Seurat objects to AnnData
data("ifnb")
alldata <- UpdateSeuratObject(ifnb)
alldata[["pct_mt"]] <- PercentageFeatureSet(alldata, pattern = "^MT-")
alldata <- subset(
alldata,
subset = nFeature_RNA > 200 & nFeature_RNA < 2500 & pct_mt < 5
)
Convert to AnnData#
scBIOT works with AnnData inputs. sceasy bridges Seurat and AnnData,
and reticulate exposes the Python API:
sc <- import("scanpy", convert = FALSE)
scb <- import("scbiot", convert = FALSE)
adata <- sceasy::convertFormat(
alldata,
from = "seurat",
to = "anndata",
main_layer = "counts",
drop_single_values = FALSE
)
sceasy::convertFormat(
alldata,
from = "seurat",
to = "anndata",
main_layer = "counts",
drop_single_values = FALSE,
outFile = "ifnb.h5ad"
)
Preprocess and run PCA with Scanpy#
Once the data sit in AnnData, call the same Scanpy preprocessing helpers we use in the Python quick start:
sc$pp$highly_variable_genes(
adata,
n_top_genes = as.integer(2000),
flavor = "seurat_v3",
batch_key = "stim"
)
sc$pp$normalize_total(adata)
sc$pp$log1p(adata)
sc$pp$scale(adata)
sc$tl$pca(adata, n_comps = as.integer(50), use_highly_variable = TRUE)
Run optimal transport + annotate neighbors#
Call the OT integrator and reuse Scanpy for graph construction and Leiden clustering:
res <- scb$ot$integrate(
adata,
preset = "rna",
obsm_key = "X_pca",
batch_key = "stim", # point this at your batch/condition column
out_key = "X_ot"
)
adata <- res[[0]]
metrics <- res[[1]]
sc$pp$neighbors(adata, use_rep = "X_ot")
sc$tl$umap(adata)
sc$tl$leiden(adata, resolution = 0.8, key_added = "leiden_X_ot")
adata
Visualise OT embeddings inside Seurat#
Mirror the OT UMAP coordinates inside the Seurat object for quick comparison:
ot <- as.matrix(adata$obsm["X_umap"])
rownames(ot) <- colnames(alldata)
alldata[["ot"]] <- CreateDimReducObject(
embeddings = ot,
key = "ot_umap_",
assay = DefaultAssay(alldata)
)
DimPlot(alldata, reduction = "ot", group.by = "stim", label = TRUE) +
ggtitle("Optimal Transport Integration")
Finishing up#
sessionInfo() records package versions when sharing results:
info <- sessionInfo()
info$loadedOnly <- NULL
print(info, locale = FALSE)
The linked R Markdown contains all of the above cells so it can be rendered via
rmarkdown::render() or opened interactively in RStudio / VS Code.