dotools_py.tl.auto_annot
- dotools_py.tl.auto_annot(adata, cluster_key, model='Healthy_Adult_Heart.pkl', key_added='autoAnnot', majority=True, convert=True, update_label=False, key_updated='annotation', verbose=False, update_models=False, dict_labels='default', pl_cell_prob=False, path=None, filename='Dotplot_CellProbabilities.svg')
Semi-automatic annotation based on CellTypist.
This function takes an AnnData object with log-counts in X and annotates the clusters using a model available for CellTypist.
- Parameters:
- adata
AnnData Annotated data matrix.
- cluster_key
str Metadata column in obs with cluster groups.
- model
str (default: 'Healthy_Adult_Heart.pkl') CellTypist model to use for the prediction.
- key_added
str (default: 'autoAnnot') New metadata column in obs to save the predicted cell types.
- majority
bool (default: True) Whether to refine the predicted labels by running the majority voting classifier after over-clustering.
- convert
bool (default: True) Convert the gene format of the model. If a human model is provided and convert is set to True, genes in mouse format will be used, and vice versa.
- update_label
bool (default: False) Add a new metadata column in obs with cell type labels updated based on dict_labels.
- key_updated
str (default: 'annotation') Metadata column in obs to save the updated cell type labels. Ignored if update_label is set to False.
- verbose
bool (default: False) Whether to show information about the analysis steps.
- update_models
bool (default: False) Download the latest models.
- dict_labels
dict | str (default: 'default') Dictionary with the updated labels for the names in the CellTypist model. Currently, a dictionary is only available for the Healthy_Adult_Heart.pkl model. See dotools_py.dt.standard_ct_labels_heart().
- pl_cell_prob
bool (default: False) Generate a dotplot to visualize the cell probabilities for each cluster.
- path
str | PathLike[str] | Path (default: None) Path to save the dotplot of cell probabilities.
- filename
str | None (default: 'Dotplot_CellProbabilities.svg') Name of the file.
- Return type:
None
- Returns:
Returns None. The following fields will be set:
- adata.obs['autoAnnot' | key_added]
pandas.Series (dtype category) Array that stores the predicted annotation for each cell.
- adata.obs['celltypist_conf_score']
pandas.Series (dtype float) Array that stores the confidence scores for the prediction.
- adata.obs['annotation' | key_updated]
pandas.Series (dtype category) If update_label is set to True, this field will be set and contains an array that stores the predicted annotation for each cell, updated based on the dictionary dict_labels.
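To make the relationship between the predicted labels and the updated labels concrete, the following is a minimal sketch of relabelling with a dictionary, in the spirit of update_label and dict_labels. It is not the dotools_py implementation: the helper update_labels, the label names, and the choice to keep unmapped labels unchanged are all assumptions made for illustration.

```python
# Hypothetical mapping from model label names to harmonized names.
# These names are illustrative only; they are not taken from
# dotools_py.dt.standard_ct_labels_heart().
label_map = {
    "Ventricular_Cardiomyocyte": "Cardiomyocyte",
    "Atrial_Cardiomyocyte": "Cardiomyocyte",
    "Endothelial": "Endothelial cell",
}


def update_labels(predicted, mapping):
    """Return a new list with each label replaced via `mapping`.

    Assumption: labels that are missing from the mapping are kept as-is.
    """
    return [mapping.get(label, label) for label in predicted]


# Per-cell predicted labels, as they might appear in the key_added column.
predicted = ["Ventricular_Cardiomyocyte", "Endothelial", "Fibroblast"]

# Updated labels, as they might appear in the key_updated column.
updated = update_labels(predicted, label_map)
print(updated)
```

In this sketch the original predictions are left untouched and the updated labels are stored separately, mirroring the way the function writes to a second column (key_updated) rather than overwriting key_added.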
Example
>>> import dotools_py as do
>>> adata = do.dt.example_10x_processed()
>>> do.tl.auto_annot(adata, "leiden", model="Healthy_COVID19_PBMC.pkl", pl_cell_prob=False, convert=False)
🔬 Input data has 700 cells and 1851 genes
🔗 Matching reference genes in the model
🧬 358 features used for prediction
⚖️ Scaling input data
🖋️ Predicting labels
✅ Prediction done!
🗳️ Majority voting the predictions
✅ Majority voting done!
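The majority voting step reported in the log can be pictured with a simplified sketch: every cell in a group receives the label predicted most often within that group. This is only an illustration of the idea, assuming the groups are already given. CellTypist derives its own groups by over-clustering, and the helper majority_vote and the toy data below are hypothetical.

```python
from collections import Counter, defaultdict


def majority_vote(clusters, predictions):
    """Assign each cell the most frequent predicted label of its cluster.

    Simplification: clusters are taken as given, and ties are resolved by
    whichever label was seen first in the cluster.
    """
    counts_by_cluster = defaultdict(Counter)
    for cluster, label in zip(clusters, predictions):
        counts_by_cluster[cluster][label] += 1

    # Pick the dominant label for every cluster.
    winners = {
        cluster: counts.most_common(1)[0][0]
        for cluster, counts in counts_by_cluster.items()
    }
    return [winners[cluster] for cluster in clusters]


# Toy data: five cells in two clusters with per-cell predictions.
clusters = ["0", "0", "0", "1", "1"]
predictions = ["T cell", "T cell", "B cell", "Monocyte", "Monocyte"]

voted = majority_vote(clusters, predictions)
print(voted)
```

The single "B cell" call in cluster "0" is outvoted by its neighbours, which is the kind of refinement the majority parameter is meant to provide.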