dotools_py.tl.umap_clustering


dotools_py.tl.umap_clustering(adata, use_rep, batch_key='batch', compute_neighbors=True, compute_umap=True, compute_clusters=True, resolution=0.3, cluster_key='leiden', neighbors_kwg=None)

Compute UMAP embedding and identify clusters.

This function computes the neighbor graph, the UMAP embedding, and the clusters. Neighbors are computed from a low-dimensional representation stored in adata.obsm.
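Because the neighbors are built from a representation stored in adata.obsm, it can help to confirm that the key passed as use_rep exists before running the function. The helper below is a minimal sketch of such a check; require_representation is a hypothetical name, not part of dotools_py, and a plain dict stands in for adata.obsm.

```python
def require_representation(obsm, use_rep):
    """Return obsm[use_rep], or raise a KeyError listing the available keys.

    Hypothetical helper for illustration; not part of dotools_py.
    """
    if use_rep not in obsm:
        available = ", ".join(sorted(obsm.keys())) or "<none>"
        raise KeyError(f"{use_rep!r} not found in obsm; available keys: {available}")
    return obsm[use_rep]


# A plain dict stands in for adata.obsm in this sketch.
obsm = {"X_pca": [[0.1, 0.2]], "X_CCA": [[0.3, 0.4]]}
rep = require_representation(obsm, "X_CCA")
```

With a real AnnData object, the same call would presumably be require_representation(adata.obsm, "X_CCA"), since adata.obsm supports membership tests and key lookup like a mapping.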

Parameters:
adata AnnData

Annotated data matrix

use_rep str

Low-dimensional representation in adata.obsm used to compute neighbors.

batch_key str (default: 'batch')

Column in adata.obs with batch information

compute_neighbors bool (default: True)

If set to True, the neighbors will be computed.

compute_umap bool (default: True)

If set to True, the UMAP embeddings will be computed.

compute_clusters bool (default: True)

If set to True, the Leiden clustering algorithm will be run.

resolution float (default: 0.3)

Resolution to use for clustering

cluster_key str (default: 'leiden')

Key in adata.obs under which the cluster labels are stored

neighbors_kwg dict | None (default: None)

Additional parameters passed to sc.pp.neighbors()

Return type:

None

Returns:

Returns None. The neighbor graph, UMAP coordinates, and cluster labels are added to adata in place.
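The three compute_* flags suggest a pipeline of three optional stages. The sketch below shows one way such a flow could be written directly against scanpy. It is an assumption about the general shape, not the dotools_py source: umap_clustering_sketch and its sc argument are hypothetical, and batch_key is left out because the documentation does not say how it is used.

```python
def umap_clustering_sketch(adata, use_rep, compute_neighbors=True, compute_umap=True,
                           compute_clusters=True, resolution=0.3, cluster_key="leiden",
                           neighbors_kwg=None, sc=None):
    """Hypothetical stand-in for the three stages; not the dotools_py source."""
    if sc is None:
        # Imported lazily so that a stub module can be passed in instead.
        import scanpy as sc

    if compute_neighbors:
        # Build the neighbor graph from the chosen representation in adata.obsm;
        # extra keyword arguments are forwarded unchanged.
        sc.pp.neighbors(adata, use_rep=use_rep, **(neighbors_kwg or {}))

    if compute_umap:
        # Writes the embedding to adata.obsm["X_umap"].
        sc.tl.umap(adata)

    if compute_clusters:
        # Writes the cluster labels to adata.obs[cluster_key].
        sc.tl.leiden(adata, resolution=resolution, key_added=cluster_key)
```

Under these assumptions, setting compute_neighbors=False and compute_umap=False would rerun only the clustering on an existing graph, for example to try another resolution under a different cluster_key.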

Example

>>> import dotools_py as do
>>> adata = do.dt.example_10x_processed()
>>> del adata.obsm["X_umap"], adata.obs["leiden"]
>>> do.tl.umap_clustering(adata, "X_CCA")
2026-06-03 14:14:04,044 - Computing neighbors
computing neighbors
    finished: added to `.uns['neighbors']`
    `.obsp['distances']`, distances for each pair of neighbors
    `.obsp['connectivities']`, weighted adjacency matrix (0:00:06)
2026-06-03 14:14:10,621 - Computing UMAP
computing UMAP
    finished: added
    'X_umap', UMAP coordinates (adata.obsm)
    'umap', UMAP parameters (adata.uns) (0:00:00)
2026-06-03 14:14:11,579 - Computing clusters
running Leiden clustering
    finished: found 5 clusters and added
    'leiden', the cluster labels (adata.obs, categorical) (0:00:00)
>>> adata
AnnData object with n_obs × n_vars = 700 × 1851
    obs: 'batch', 'condition', 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts',
         'total_counts_mt', 'log1p_total_counts_mt', 'pct_counts_mt', 'total_counts_ribo', 'log1p_total_counts_ribo',
         'pct_counts_ribo', 'n_genes', 'n_counts', 'doublet_class', 'doublet_score', 'cell_type', 'autoAnnot',
         'celltypist_conf_score', 'annotation', 'annotation_recluster', 'leiden'
    var: 'mean', 'std', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'highly_variable_nbatches',
         'highly_variable_intersection'
    uns: 'annotation_colors', 'annotation_recluster_colors', 'batch_colors', 'hvg', 'leiden', 'leiden_colors',
         'log1p', 'neighbors', 'pca', 'umap'
    obsm: 'X_CCA', 'X_pca', 'X_umap'
    varm: 'PCs'
    layers: 'counts', 'logcounts'
    obsp: 'connectivities', 'distances'