dotools_py.tl.run_seurat_integration

dotools_py.tl.run_seurat_integration(adata, batch_key, key_hvg='highly_variable', backend='r', use_rep='X_pca', key_corrected='X', method='pca', n_components=50, random_state=0, n_jobs=-1)

Run Seurat Integration methods.

The input object must contain a key in adata.var that flags the highly variable genes (HVGs); its name is given by key_hvg. With backend = 'python', a Python re-implementation of the R code is used. CCA integration has two modes. To reproduce the behavior of Seurat v4, set key_corrected = 'X'; the expression values of the HVGs are then corrected. If key_corrected is instead a key in adata.obsm (e.g., X_pca), the behavior of Seurat v5 is reproduced and the embedding is corrected.
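As a rough illustration of the two CCA modes described above, the sketch below maps a Seurat version to the keyword arguments one might pass. The helper `cca_mode_kwargs` is hypothetical and not part of dotools_py; it only uses parameter names from the signature above, and it assumes backend='python' because method is documented as applying to that backend.

```python
def cca_mode_kwargs(seurat_version, embedding_key="X_pca"):
    """Hypothetical helper: keyword arguments for the two CCA modes."""
    # Assumption: 'method' is only documented for the python backend,
    # so this sketch selects that backend explicitly.
    kwargs = {"backend": "python", "method": "cca"}
    if seurat_version == 4:
        # v4 behavior: correct the expression values of the HVGs.
        kwargs["key_corrected"] = "X"
    elif seurat_version == 5:
        # v5 behavior: correct an embedding; the key must exist in adata.obsm.
        kwargs["key_corrected"] = embedding_key
    else:
        raise ValueError("seurat_version must be 4 or 5")
    return kwargs


# Intended usage (needs dotools_py and a prepared AnnData object):
#   import dotools_py as do
#   do.tl.run_seurat_integration(adata, batch_key="batch", **cca_mode_kwargs(5))
print(cca_mode_kwargs(4))
```

Keeping the mode choice in one place makes it harder to combine key_corrected with an unintended method by accident.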

Warning

The Python backend is currently experimental.

Parameters:
adata AnnData

Annotated data matrix.

batch_key str

Key in adata.obs with batch information.

key_hvg str (default: 'highly_variable')

Key in adata.var holding a boolean that indicates whether each feature is highly variable.

backend Literal['python', 'r'] (default: 'r')

Backend to use. Currently, python is experimental.

use_rep str (default: 'X_pca')

Representation used to compute the within-batch KNN when finding anchors. Used when backend = 'python'.

key_corrected str (default: 'X')

If set to 'X', the expression values are corrected (Seurat v4 approach); otherwise, provide a key in adata.obsm (Seurat v5 approach).

method Literal['cca', 'pca', 'lsi', 'lsi-cca', 'rpca', 'rlsi'] (default: 'pca')

Seurat integration method to use. Used when backend = 'python'.

n_components int (default: 50)

Number of components to consider. Used when backend = 'python'.

random_state int (default: 0)

Random seed.

n_jobs int (default: -1)

Number of threads to use. Used when backend = 'python'.

Return type:

None

Returns:

Returns None. The corrected matrix will be saved in adata.obsm.
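Because the function depends on several keys being present, a small pre-flight check can catch mistakes before a long run. The sketch below is a hypothetical helper, not part of dotools_py; it relies only on the `obs`, `var` and `obsm` attributes described in the parameters above, and the "at least two batches" rule is an assumption about what a meaningful integration needs. A stand-in object is used so the snippet runs without AnnData.

```python
from types import SimpleNamespace


def check_integration_inputs(adata, batch_key, key_hvg="highly_variable",
                             key_corrected="X"):
    """Hypothetical pre-flight check; returns a list of problems (empty if none)."""
    problems = []
    if batch_key not in adata.obs:
        problems.append(f"batch_key={batch_key!r} not found in adata.obs")
    elif len(set(adata.obs[batch_key])) < 2:
        # Assumption: integrating a single batch is not meaningful.
        problems.append(f"adata.obs[{batch_key!r}] has fewer than two batches")
    if key_hvg not in adata.var:
        problems.append(f"key_hvg={key_hvg!r} not found in adata.var")
    if key_corrected != "X" and key_corrected not in adata.obsm:
        problems.append(
            f"key_corrected={key_corrected!r} is neither 'X' nor a key in adata.obsm"
        )
    return problems


# Stand-in with the same attribute names as AnnData, for illustration only.
toy = SimpleNamespace(
    obs={"batch": ["a", "a", "b"]},
    var={"highly_variable": [True, False]},
    obsm={},
)
print(check_integration_inputs(toy, "batch"))  # []
print(check_integration_inputs(toy, "batch", key_corrected="X_pca"))
```

The membership tests are written so that they should also work on a real AnnData object, where `in` on `adata.obs` and `adata.var` checks column names.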

Examples

>>> import dotools_py as do
>>> adata = do.dt.example_10x_processed()
>>> del adata.obsm
>>> adata.obsm_keys()
[]
>>> do.tl.run_seurat_integration(adata, batch_key="batch", backend="python")
2026-04-01 16:51:37,581 - This backend is currently experimental
2026-04-01 16:51:37,581 - Running CCA using Python backend
2026-04-01 16:51:37,585 - Finding anchors across datasets
Batches : 100%|██████████| 1/1 [00:15<00:00, 15.51s/it]
Batch alignment:   0%|          | 0/1 [00:00<?, ?it/s]
2026-04-01 16:51:53,101 - Integrating datasets
Batch alignment: 100%|██████████| 1/1 [00:01<00:00,  1.15s/it]
>>> adata.obsm_keys()
['X_CCA']
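The example above clears adata.obsm so that the new embedding is easy to spot. A less destructive alternative, sketched here with a hypothetical helper, is to snapshot the keys before the call and compare afterwards. The key lists below are illustrative stand-ins for the output of adata.obsm_keys().

```python
def new_obsm_keys(before, after):
    """Hypothetical helper: keys present in `after` but not in `before`, in order."""
    seen = set(before)
    return [key for key in after if key not in seen]


# Stand-ins for adata.obsm_keys() taken before and after the integration call.
before = ["X_pca", "X_umap"]
after = ["X_pca", "X_umap", "X_CCA"]
print(new_obsm_keys(before, after))  # ['X_CCA']
```

This keeps existing embeddings in place while still showing which key the integration added.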