scbiot.ot.integrate_ot


scbiot.ot.integrate_ot(adata, obsm_key='X_pca', batch_key='batch', out_key='scBIOT', strength=0.5, conservation=0.5, prototypes=0.5, sharpen=0.5, supervision=0.5, projector=0.5, approximate=False, centroid=False, reference='auto', label_key=None, unlabeled_category='unknown', use_gpu=True, gpu_device=0, ot_backend='torch', random_state=0, verbose=True, modality='auto', spatial_key=None, spatial_weight=0.5, prealign=None, prealign_strength=1.0, prealign_eps=0.001, prealign_max_points=20000, max_iter=15, n_centroids_per_batch=2048, max_samples_per_batch=500000, k_interp=8, chunk_size=500000, tmp_path=None, align_reference=False)

scBIOT optimal transport (OT) integration, tuned through semantic knobs that each take a value between 0 and 1.

Parameters

strength / conservation / prototypes / sharpen / supervision / projector

Semantic knobs, each in the range 0–1 (default 0.5), that control, respectively: integration aggressiveness, structure preservation, prototype capacity, sharpening, label supervision, and projector strength.
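
Since all six knobs share the same 0–1 range, it can be convenient to check them before starting a long run. The helper below is a minimal sketch under that assumption; check_knobs is a hypothetical name and not part of scBIOT, and this page does not say whether the library itself clips or rejects out-of-range values.

```python
from typing import Dict

# Knob names taken from the integrate_ot signature.
KNOBS = ("strength", "conservation", "prototypes",
         "sharpen", "supervision", "projector")


def check_knobs(**knobs: float) -> Dict[str, float]:
    """Return the knobs as floats; raise if a name is unknown or a value is outside [0, 1].

    Hypothetical helper for illustration only (not part of scBIOT).
    """
    checked: Dict[str, float] = {}
    for name, value in knobs.items():
        if name not in KNOBS:
            raise KeyError(f"unknown knob: {name!r}")
        value = float(value)
        # NaN fails this comparison too, so it is rejected as well.
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name}={value} is outside [0, 1]")
        checked[name] = value
    return checked


# A more aggressive setting for strength; unspecified knobs keep their defaults.
knobs = check_knobs(strength=0.8, conservation=0.5)
# adata, metrics = integrate_ot(adata, **knobs)  # needs scbiot and an AnnData object
```

Passing the checked dictionary with ** keeps the call site short and leaves every knob that was not mentioned at its signature default of 0.5.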

approximate / centroid

Boolean switches (both False by default): approximate enables approximate Sinkhorn, and centroid enables centroid-level OT.

obsm_key / batch_key / out_key / reference

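Keys and alignment semantics. obsm_key names the input embedding in adata.obsm, batch_key names the batch column in adata.obs, and out_key names the key for the integrated output. reference selects the alignment target; the signature default is 'auto', and the examples use 'union' and 'largest'.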

align_reference

When True, map query cells onto the reference subset (query→reference OT) and keep the reference fixed. If label_key is provided and exists in adata.obs, the reference/query split is inferred from the labels: labeled cells form the reference, and cells labeled unlabeled_category (or NA) form the query. Otherwise, the split is inferred from batch_key and reference.
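
The label-based split described above can be sketched in plain Python. This is an illustration of the stated rule, not scBIOT's implementation; split_reference_query and _is_missing are hypothetical names, and treating None and float NaN as "NA" is an assumption.

```python
import math
from typing import Any, Iterable, List, Tuple


def _is_missing(value: Any) -> bool:
    """Treat None and float NaN as missing labels (an assumption about 'NA')."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def split_reference_query(
    labels: Iterable[Any], unlabeled_category: Any = "unknown"
) -> Tuple[List[bool], List[bool]]:
    """Return (is_reference, is_query) boolean masks from per-cell labels.

    Query = missing or equal to unlabeled_category; reference = everything else.
    """
    is_query = [_is_missing(v) or v == unlabeled_category for v in labels]
    is_reference = [not q for q in is_query]
    return is_reference, is_query


# Toy labels standing in for adata.obs[label_key].
labels = ["T cell", "unknown", None, "B cell", float("nan")]
is_reference, is_query = split_reference_query(labels)
```

With a pandas column, the same query mask can be written as labels.isna() | (labels == unlabeled_category).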

max_iter

Maximum number of outer optimization iterations.

n_centroids_per_batch / max_samples_per_batch / k_interp / chunk_size / tmp_path

Centroid-level OT controls forwarded to integrate_centroids when centroid=True.
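
Because these five controls only take effect together with centroid=True, one option is to bundle them into a single keyword dictionary. The sketch below does that with defaults copied from the signature; centroid_options is a hypothetical helper, and the positivity check is an assumption rather than documented scBIOT behaviour.

```python
from typing import Any, Dict, Optional


def centroid_options(
    n_centroids_per_batch: int = 2048,
    max_samples_per_batch: int = 500_000,
    k_interp: int = 8,
    chunk_size: int = 500_000,
    tmp_path: Optional[str] = None,
) -> Dict[str, Any]:
    """Bundle centroid-mode keyword arguments for integrate_ot.

    Hypothetical helper; defaults mirror the integrate_ot signature.
    """
    sizes = (n_centroids_per_batch, max_samples_per_batch, k_interp, chunk_size)
    # Assumption: all size-like controls should be positive integers.
    if min(sizes) < 1:
        raise ValueError("centroid controls must be positive integers")
    return {
        "centroid": True,  # the controls are only forwarded when this is True
        "n_centroids_per_batch": n_centroids_per_batch,
        "max_samples_per_batch": max_samples_per_batch,
        "k_interp": k_interp,
        "chunk_size": chunk_size,
        "tmp_path": tmp_path,
    }


opts = centroid_options(n_centroids_per_batch=1024)
# adata, metrics = integrate_ot(adata, **opts)  # needs scbiot and an AnnData object
```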

Examples

RNA:

>>> from scbiot.ot import integrate_ot
>>> adata, metrics = integrate_ot(
...     adata,
...     obsm_key="X_pca",
...     batch_key="batch",
...     reference="union",
... )

ATAC:

>>> adata, metrics = integrate_ot(
...     adata,
...     obsm_key="X_lsi",
...     batch_key="batchname_all",
...     reference="largest",
... )

Parameters:
  • adata (Any)
  • obsm_key (str)
  • batch_key (str)
  • out_key (str)
  • strength (float)
  • conservation (float)
  • prototypes (float)
  • sharpen (float)
  • supervision (float)
  • projector (float)
  • approximate (bool)
  • centroid (bool)
  • reference (str)
  • label_key (str | None)
  • unlabeled_category (Any)
  • use_gpu (bool)
  • gpu_device (int)
  • ot_backend (str)
  • random_state (int)
  • verbose (bool)
  • modality (str)
  • spatial_key (str | None)
  • spatial_weight (float)
  • prealign (str | None)
  • prealign_strength (float)
  • prealign_eps (float)
  • prealign_max_points (int)
  • max_iter (int)
  • n_centroids_per_batch (int)
  • max_samples_per_batch (int)
  • k_interp (int)
  • chunk_size (int)
  • tmp_path (str | None)
  • align_reference (bool)

Return type:

Tuple[Any, Dict[str, float | int]]
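
The second element of the returned tuple is a flat dictionary of numeric values, so it can be logged directly. The following is a small sketch of one way to format it; format_metrics is a hypothetical helper, and the keys in the example dictionary are invented for illustration because the actual metric names are not listed on this page.

```python
from typing import Dict, Union

Number = Union[int, float]


def format_metrics(metrics: Dict[str, Number]) -> str:
    """Render a metrics dict as sorted 'key: value' lines (floats to 4 significant digits)."""
    lines = []
    for key in sorted(metrics):
        value = metrics[key]
        if isinstance(value, float):
            lines.append(f"{key}: {value:.4g}")
        else:
            lines.append(f"{key}: {value}")
    return "\n".join(lines)


# Hypothetical metrics; real keys come from integrate_ot and are not documented here.
example_metrics = {"n_iter": 15, "final_cost": 0.123456}
report = format_metrics(example_metrics)
# adata, metrics = integrate_ot(adata)   # needs scbiot and an AnnData object
# print(format_metrics(metrics))
```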