2. scRNA-seq in R (Seurat + reticulate)#

Open the R Markdown workflow for a complete, runnable script. The steps mirror the Python notebooks but stay inside a Seurat workflow by hopping into scbiot through reticulate.

Environment setup#

Load the required packages and (optionally) the demo PBMC dataset from SeuratData:

library(Seurat)
library(SeuratData)
library(reticulate)
library(tidyverse)
library(sceasy)  # converts Seurat objects to AnnData

data("ifnb")
alldata <- UpdateSeuratObject(ifnb)
alldata[["pct_mt"]] <- PercentageFeatureSet(alldata, pattern = "^MT-")
alldata <- subset(
  alldata,
  subset = nFeature_RNA > 200 & nFeature_RNA < 2500 & pct_mt < 5
)

Convert to AnnData#

scBIOT works with AnnData inputs. sceasy bridges Seurat and AnnData, and reticulate exposes the Python API:

sc <- import("scanpy", convert = FALSE)
scb <- import("scbiot", convert = FALSE)

adata <- sceasy::convertFormat(
  alldata,
  from = "seurat",
  to = "anndata",
  main_layer = "counts",
  drop_single_values = FALSE
)

sceasy::convertFormat(
  alldata,
  from = "seurat",
  to = "anndata",
  main_layer = "counts",
  drop_single_values = FALSE,
  outFile = "ifnb.h5ad"
)

Preprocess and run PCA with Scanpy#

Once the data sit in AnnData, call the same Scanpy preprocessing helpers we use in the Python quick start:

sc$pp$highly_variable_genes(
  adata,
  n_top_genes = as.integer(2000),
  flavor = "seurat_v3",
  batch_key = "stim"
)
sc$pp$normalize_total(adata)
sc$pp$log1p(adata)
sc$pp$scale(adata)
sc$tl$pca(adata, n_comps = as.integer(50), use_highly_variable = TRUE)

Run optimal transport + annotate neighbors#

Call the OT integrator and reuse Scanpy for graph construction and Leiden clustering:

res <- scb$ot$integrate(
  adata,
  preset = "rna",
  obsm_key = "X_pca",
  batch_key = "stim",  # point this at your batch/condition column
  out_key = "X_ot"
)
adata <- res[[0]]
metrics <- res[[1]]

sc$pp$neighbors(adata, use_rep = "X_ot")
sc$tl$umap(adata)
sc$tl$leiden(adata, resolution = 0.8, key_added = "leiden_X_ot")
adata

Visualise OT embeddings inside Seurat#

Mirror the OT UMAP coordinates inside the Seurat object for quick comparison:

ot <- as.matrix(adata$obsm["X_umap"])
rownames(ot) <- colnames(alldata)

alldata[["ot"]] <- CreateDimReducObject(
  embeddings = ot,
  key = "ot_umap_",
  assay = DefaultAssay(alldata)
)

DimPlot(alldata, reduction = "ot", group.by = "stim", label = TRUE) +
  ggtitle("Optimal Transport Integration")

Finishing up#

sessionInfo() records package versions when sharing results:

info <- sessionInfo()
info$loadedOnly <- NULL
print(info, locale = FALSE)

The linked R Markdown contains all of the above cells so it can be rendered via rmarkdown::render() or opened interactively in RStudio / VS Code.