dotools_py.get.subset
- dotools_py.get.subset(adata, obs_key=None, obs_groups=None, var_key=None, var_groups=None, comparison='include', copy=False)
Subset AnnData object.
Subset an AnnData object based on an obs or var column. Subsetting by multiple obs/var columns at the same time is not currently supported.
- Parameters:
- adata
AnnData AnnData object to subset.
- obs_key
str | None (default: None) Column in obs to subset on.
- obs_groups
str | list | float | bool | None (default: None) Groups or values in the obs_key column to include or filter on.
- var_key
str | None (default: None) Column in var to subset on.
- var_groups
str | list | float | bool | None (default: None) Groups or values in the var_key column to include or filter on.
- comparison
Literal['>=', '>', '==', '<', '<=', 'include', 'exclude'] (default: 'include') Method used to filter the AnnData object: 'include' or 'exclude' to keep or drop the given groups, or a comparison operator to apply against the given value.
- copy
bool (default: False) If set to True, a copy is returned; otherwise a view of the AnnData is returned.
- Return type:
AnnData
- Returns:
Returns a new AnnData object if copy is set to True, otherwise returns a view of the AnnData after subsetting.
Example
>>> import dotools_py as do
>>> adata = do.dt.example_10x_processed()
>>> tcells = do.get.subset(adata, obs_key="annotation", obs_groups="T_cells")
>>> tcells
View of AnnData object with n_obs × n_vars = 464 × 1851
    obs: 'batch', 'condition', 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'total_counts_mt', 'log1p_total_counts_mt', 'pct_counts_mt', 'total_counts_ribo', 'log1p_total_counts_ribo', 'pct_counts_ribo', 'n_genes', 'n_counts', 'doublet_class', 'doublet_score', 'leiden', 'cell_type', 'autoAnnot', 'celltypist_conf_score', 'annotation', 'annotation_recluster'
    var: 'mean', 'std', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'highly_variable_nbatches', 'highly_variable_intersection'
    uns: 'annotation_colors', 'annotation_recluster_colors', 'batch_colors', 'hvg', 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
    obsm: 'X_CCA', 'X_pca', 'X_umap'
    varm: 'PCs'
    layers: 'counts', 'logcounts'
    obsp: 'connectivities', 'distances'
>>> adata_subset = do.get.subset(adata, obs_key="total_counts", obs_groups=1000, comparison=">=", copy=True)
>>> adata_subset
AnnData object with n_obs × n_vars = 699 × 1851
    obs: 'batch', 'condition', 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'total_counts_mt', 'log1p_total_counts_mt', 'pct_counts_mt', 'total_counts_ribo', 'log1p_total_counts_ribo', 'pct_counts_ribo', 'n_genes', 'n_counts', 'doublet_class', 'doublet_score', 'leiden', 'cell_type', 'autoAnnot', 'celltypist_conf_score', 'annotation', 'annotation_recluster'
    var: 'mean', 'std', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'highly_variable_nbatches', 'highly_variable_intersection'
    uns: 'annotation_colors', 'annotation_recluster_colors', 'batch_colors', 'hvg', 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
    obsm: 'X_CCA', 'X_pca', 'X_umap'
    varm: 'PCs'
    layers: 'counts', 'logcounts'
    obsp: 'connectivities', 'distances'
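The following is a rough, standard-library-only sketch of how the documented comparison values could be turned into a keep/drop mask over a column. It is not the dotools_py implementation: the helper name build_mask, the _COMPARATORS table, and the toy annotation and total_counts lists are assumptions made up for illustration.

```python
import operator

# Hypothetical mapping from the documented `comparison` strings to predicates.
# Illustration only; this is not how dotools_py is known to implement it.
_COMPARATORS = {
    ">=": operator.ge,
    ">": operator.gt,
    "==": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
}


def build_mask(values, groups, comparison="include"):
    """Return one boolean per entry of `values` (True means keep)."""
    if comparison in ("include", "exclude"):
        # Accept either a single group or a collection of groups.
        wanted = set(groups) if isinstance(groups, (list, tuple, set)) else {groups}
        keep_if_member = comparison == "include"
        return [(value in wanted) == keep_if_member for value in values]
    if comparison in _COMPARATORS:
        compare = _COMPARATORS[comparison]
        return [compare(value, groups) for value in values]
    raise ValueError(f"Unsupported comparison: {comparison!r}")


# Toy stand-ins for an obs column of labels and an obs column of counts.
annotation = ["T_cells", "B_cells", "T_cells", "NK"]
total_counts = [1500, 800, 1000, 2300]

include_mask = build_mask(annotation, "T_cells")
exclude_mask = build_mask(annotation, ["T_cells", "NK"], comparison="exclude")
threshold_mask = build_mask(total_counts, 1000, comparison=">=")

print(include_mask)    # [True, False, True, False]
print(exclude_mask)    # [False, True, False, False]
print(threshold_mask)  # [True, False, True, True]
```

In this sketch, 'include' and 'exclude' test membership in the given groups, while the operator strings compare each value against a single threshold, mirroring the two calls in the example above.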