dotools_py.tl.auto_annot

dotools_py.tl.auto_annot(adata, cluster_key, model='Healthy_Adult_Heart.pkl', key_added='autoAnnot', majority=True, convert=True, update_label=False, key_updated='annotation', verbose=False, update_models=False, dict_labels='default', pl_cell_prob=False, path=None, filename='Dotplot_CellProbabilities.svg')

Semi-automatic annotation based on CellTypist.

This function takes an AnnData object with log-transformed counts in X and annotates its clusters using one of the models available for CellTypist.

Parameters:
adata AnnData

Annotated data matrix.

cluster_key str

Metadata column in obs with cluster groups.

model str (default: 'Healthy_Adult_Heart.pkl')

CellTypist model to use for the prediction.

key_added str (default: 'autoAnnot')

New metadata column in obs to save the predicted cell types.

majority bool (default: True)

Whether to refine the predicted labels by running the majority voting classifier after over-clustering.

convert bool (default: True)

Convert the gene format of the model. If a human model is provided and this is set to True, genes in mouse format will be used, and vice versa.

update_label bool (default: False)

Add a new metadata column in obs with cell type labels updated based on dict_labels.

key_updated str (default: 'annotation')

Metadata column in obs to save the updated cell type labels. Ignored if update_label is set to False.

verbose bool (default: False)

Whether to print information about the analysis steps.

update_models bool (default: False)

Download the latest models.

dict_labels dict | str (default: 'default')

Dictionary mapping the label names in the CellTypist model to updated labels. Currently, a default dictionary is only available for the Healthy_Adult_Heart.pkl model. See dotools_py.dt.standard_ct_labels_heart().

pl_cell_prob bool (default: False)

Generate a Dotplot to visualize the cell probabilities for each cluster.

path str | PathLike[str] | Path (default: None)

Path to save the dotplot of cell probabilities.

filename str | None (default: 'Dotplot_CellProbabilities.svg')

Name of the file.
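
The effect of the majority option can be illustrated with a minimal sketch. This is not CellTypist's implementation, which first over-clusters the data. The helper majority_vote and the labels below are hypothetical and only show how the dominant label within a group replaces the per-cell predictions.

```python
from collections import Counter


def majority_vote(labels, clusters):
    """Give every cell the most common predicted label within its cluster."""
    votes = {}
    for label, cluster in zip(labels, clusters):
        votes.setdefault(cluster, Counter())[label] += 1
    # most_common(1) returns [(label, count)] for the top label of a cluster
    winners = {c: counts.most_common(1)[0][0] for c, counts in votes.items()}
    return [winners[c] for c in clusters]


# Hypothetical per-cell predictions and cluster assignments
per_cell = ["Fibroblast", "Fibroblast", "Endothelial",
            "Endothelial", "Endothelial", "Fibroblast"]
clusters = ["0", "0", "0", "1", "1", "1"]

refined = majority_vote(per_cell, clusters)
```

In this toy input, the single dissenting cell in each cluster takes on the label shared by the other two cells.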

Return type:

None

Returns:

Returns None. The following fields are set:

adata.obs['autoAnnot' | key_added]: pandas.Series (dtype category)

Array that stores the predicted annotation for each cell.

adata.obs['celltypist_conf_score']: pandas.Series (dtype float)

Array that stores the confidence scores for the prediction.

adata.obs['annotation' | key_updated]: pandas.Series (dtype category)

Only set if update_label is True. Stores the predicted annotation for each cell, relabeled according to the dictionary dict_labels.
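
As a rough illustration of what relabeling with dict_labels amounts to, the sketch below maps model label names to updated names. The helper relabel, the label names, and the choice to keep unmapped labels unchanged are assumptions made for this example, not documented behavior of auto_annot.

```python
def relabel(predicted, dict_labels):
    """Map each predicted label to its updated name."""
    # Assumption of this sketch: labels missing from the dictionary stay as they are
    return [dict_labels.get(label, label) for label in predicted]


# Hypothetical model labels and their updated names
dict_labels = {
    "EC1_cap": "Endothelial",
    "EC2_cap": "Endothelial",
    "FB1": "Fibroblast",
}
predicted = ["EC1_cap", "FB1", "EC2_cap", "Mast"]

updated = relabel(predicted, dict_labels)
```

Several fine-grained model labels can collapse into one broader label this way, which is the usual reason for keeping the updated labels in a separate column.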

Example

>>> import dotools_py as do
>>> adata = do.dt.example_10x_processed()
>>> do.tl.auto_annot(adata, "leiden", model="Healthy_COVID19_PBMC.pkl", pl_cell_prob=False, convert=False)
🔬 Input data has 700 cells and 1851 genes
🔗 Matching reference genes in the model
🧬 358 features used for prediction
⚖️ Scaling input data
🖋️ Predicting labels
✅ Prediction done!
🗳️ Majority voting the predictions
✅ Majority voting done!
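
The log above reports that only some of the input genes are used as features for the prediction. A minimal sketch of why gene format matters for this matching step is shown below. It assumes matching is done by exact gene name, and it uses a naive capitalization heuristic to switch between human-style and mouse-style symbols. Both are assumptions for illustration: the helpers match_genes and to_mouse_style are hypothetical, and a real conversion would typically rely on an ortholog table.

```python
def match_genes(input_genes, model_genes):
    """Return the input genes that also appear among the model features."""
    model_set = set(model_genes)
    return [gene for gene in input_genes if gene in model_set]


def to_mouse_style(gene):
    """Naive heuristic: 'MYH6' -> 'Myh6'. Not a true ortholog mapping."""
    return gene.capitalize()


# Hypothetical human-style model features and mouse-style input genes
model_genes = ["MYH6", "PECAM1", "COL1A1"]
mouse_input = ["Myh6", "Pecam1", "Actb"]

# Without conversion, no gene names overlap
without_conversion = match_genes(mouse_input, model_genes)

# After converting the model features, two genes can be matched
converted_model = [to_mouse_style(gene) for gene in model_genes]
with_conversion = match_genes(mouse_input, converted_model)
```

When the input data and the model come from the same species, as in the example call with convert=False, no conversion is needed before matching.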