dotools_py.bm.silhouette_batch
- dotools_py.bm.silhouette_batch(adata, batch_key, annotation_key, use_rep, metric='euclidean', scale=True, get_all=False)
Batch ASW.
Modified average silhouette width (ASW) of batch. This metric measures the silhouette of a given batch. It assumes that a silhouette width close to 0 represents perfect overlap of the batches, so the absolute value of the silhouette width is used to measure how well batches are mixed. If scale is set to True, the absolute ASW per group is subtracted from 1 before averaging, so that 0 indicates poorly mixed batches and 1 indicates optimal batch mixing.
- Parameters:
- adata
AnnData Annotated data matrix.
- batch_key
str Column in adata.obs with batch information.
- annotation_key
str Column in adata.obs with cell type or cluster information.
- use_rep
str Key in adata.obsm with the embedding.
- metric
str (default: 'euclidean')
- scale
bool (default: True) If set to True, scale the values between 0 and 1.
- get_all
bool (default: False) If set to True, returns the silhouette score, the average silhouette score per cluster and all the silhouette scores.
- Return type:
- Returns:
If get_all is set to True, returns 1) the average silhouette width (ASW), 2) the average silhouette score per cluster and 3) all silhouette scores; otherwise returns only the ASW.
Examples
>>> import dotools_py as do
>>> adata = do.dt.example_10x_processed()
>>> do.bm.silhouette_batch(adata, batch_key="batch", annotation_key="annotation", use_rep="X_CCA")
np.float64(0.8107897347900055)
>>> score, mean_score, all_scores = do.bm.silhouette_batch(adata, batch_key="batch", annotation_key="annotation", use_rep="X_CCA", get_all=True)
>>> mean_score
           silhouette_score
group
B_cells            0.795807
Monocytes          0.603867
NK                 0.878482
T_cells            0.961296
pDC                0.814496
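The scaled batch ASW described at the top of this page can be sketched with scikit-learn alone. This is a minimal illustration and not the dotools_py implementation: the helper name batch_asw_sketch, its arguments, and the rule for skipping groups with too few batches are assumptions made for the example. It computes per-cell silhouette widths with respect to the batch labels inside each annotation group, takes the absolute value, optionally subtracts it from 1, averages per group, and then averages over groups.

```python
import numpy as np
from sklearn.metrics import silhouette_samples


def batch_asw_sketch(embedding, batches, groups, metric="euclidean", scale=True):
    """Hypothetical helper: per-group mean of |s| (or 1 - |s|), then mean over groups."""
    embedding = np.asarray(embedding)
    batches = np.asarray(batches)
    groups = np.asarray(groups)

    per_group = {}
    for group in np.unique(groups):
        mask = groups == group
        n_batches = np.unique(batches[mask]).size
        # silhouette_samples requires 2 <= n_labels <= n_samples - 1,
        # so groups that cannot satisfy this are skipped (an assumption of this sketch).
        if n_batches < 2 or n_batches >= mask.sum():
            continue
        # Silhouette of each cell with respect to its batch, within this group only.
        sil = np.abs(silhouette_samples(embedding[mask], batches[mask], metric=metric))
        if scale:
            sil = 1.0 - sil  # 1 = batches overlap, 0 = batches fully separated
        per_group[str(group)] = float(sil.mean())

    overall = float(np.mean(list(per_group.values()))) if per_group else float("nan")
    return overall, per_group


# Synthetic check: two groups, each containing two equally sized batches.
rng = np.random.default_rng(0)
groups = np.repeat(["A", "B"], 100)
batches = np.tile(np.repeat(["b1", "b2"], 50), 2)

mixed = rng.normal(size=(200, 2))        # batches drawn from the same distribution
separated = mixed.copy()
separated[batches == "b2", 0] += 100.0   # shift one batch far away from the other

mixed_score, mixed_per_group = batch_asw_sketch(mixed, batches, groups)
separated_score, _ = batch_asw_sketch(separated, batches, groups)
print(mixed_score, separated_score)
```

With scale=True, the well-mixed embedding should score close to 1 and the shifted embedding close to 0, which matches the interpretation given in the description.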