dotools_py.tl.rank_genes_condition

dotools_py.tl.rank_genes_condition(adata, groupby, subset_by=None, reference='rest', groups=None, method='wilcoxon', pval_cutoff=0.05, log2fc_cutoff=0.25, path=None, filename='DGE.xlsx', layer=None, covariates=None, get_results=True, key_added='rank_genes_condition')

Run differential gene expression (DGE) analysis.

Test for differentially expressed genes between the conditions in groupby. Besides the methods implemented in scanpy (wilcoxon, t-test, logreg and t-test_overestim_var), the MAST test can be used. If subset_by is provided, the DGE analysis is run separately for each group in that column. P-values are corrected for multiple testing with the Benjamini-Hochberg method.
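As an illustration of the multiple-testing correction mentioned above, the following is a minimal pure-Python sketch of the Benjamini-Hochberg procedure. The helper name benjamini_hochberg and the example p-values are hypothetical; this is not the code dotools_py uses internally, only a sketch of the standard procedure.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvals)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
pvals = [0.01, 0.04, 0.03, 0.20]
padj = benjamini_hochberg(pvals)
```

Each adjusted value is at least as large as its raw p-value, which is why a gene can pass a cutoff on pvals and still fail it on padj.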

If path is provided, the results are also written to an Excel file with three sheets: 1) AllGenes, containing all genes; 2) UpregGenes, containing upregulated genes; and 3) DownregGenes, containing downregulated genes. The up- and downregulated genes are filtered according to pval_cutoff and log2fc_cutoff. The results are also stored in the uns attribute of adata under key_added (default: 'rank_genes_condition').
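The split into the three sheets can be sketched with pandas on a toy results table. This sketch rests on assumptions the documentation does not state: that the cutoff is applied to padj rather than pvals, that upregulated means log2fc above log2fc_cutoff, and that downregulated means log2fc below its negative. The gene names and values are made up.

```python
import pandas as pd

# Toy results table using the documented column names (values are made up).
results = pd.DataFrame(
    {
        "GeneName": ["GeneA", "GeneB", "GeneC", "GeneD"],
        "padj": [0.001, 0.20, 0.01, 0.03],
        "log2fc": [1.5, 2.0, -0.8, 0.1],
    }
)

pval_cutoff = 0.05
log2fc_cutoff = 0.25

# Assumption: significance is judged on the adjusted p-value.
significant = results["padj"] < pval_cutoff

# Assumption: the fold-change cutoff is applied symmetrically around zero.
upreg = results[significant & (results["log2fc"] > log2fc_cutoff)]
downreg = results[significant & (results["log2fc"] < -log2fc_cutoff)]

# One entry per sheet name described in the documentation.
sheets = {"AllGenes": results, "UpregGenes": upreg, "DownregGenes": downreg}
```

In this toy table GeneB has a large fold change but fails the p-value cutoff, and GeneD is significant but falls below the fold-change cutoff, so neither lands in the filtered sheets.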

Parameters:

adata AnnData: Annotated data matrix.
groupby str: Column in obs with conditions to test.
subset_by str (default: None): Column in obs to subset by (e.g., a column with cell-type annotations).
reference str (default: 'rest'): Reference condition.
groups list | str (default: None): Alternative condition(s) to test against the reference.
method Literal['wilcoxon', 'mast', 't-test', 'logreg', 't-test_overestim_var'] (default: 'wilcoxon'): Method to test.
pval_cutoff float (default: 0.05): P-value cutoff used to filter genes when generating the Excel file.
log2fc_cutoff float (default: 0.25): Log2 fold-change cutoff used to filter genes when generating the Excel file.
path str | PathLike[str] | Path (default: None): Path where the Excel file is saved.
filename str (default: 'DGE.xlsx'): Name of the Excel file.
layer str (default: None): Layer of the AnnData to use.
covariates list (default: None): List of extra covariates to correct for in the MAST test.
get_results bool (default: True): Return a dataframe with results.
key_added str (default: 'rank_genes_condition'): Key to use in uns.
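A call using the parameters above might look like the following sketch. The wrapper run_dge, the obs column names "condition" and "cell_type", and the condition labels "healthy" and "disease" are hypothetical placeholders; only the function path and keyword names come from the signature on this page.

```python
def run_dge(adata):
    """Compare 'disease' against 'healthy' within each cell type (hypothetical labels)."""
    # Imported lazily so this sketch can be defined without the package installed.
    import dotools_py

    return dotools_py.tl.rank_genes_condition(
        adata,
        groupby="condition",      # hypothetical obs column holding the conditions
        subset_by="cell_type",    # hypothetical obs column holding cell-type labels
        reference="healthy",      # hypothetical reference condition
        groups=["disease"],       # hypothetical alternative condition
        method="wilcoxon",
        pval_cutoff=0.05,
        log2fc_cutoff=0.25,
        path=None,                # no Excel file is written when path is None
        get_results=True,         # return the results as a DataFrame
        key_added="rank_genes_condition",
    )
```

With get_results=True the call returns the results table described under Returns; the same results should also be available in adata.uns under the chosen key_added.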

Return type:

None | DataFrame

Returns:

If get_results is set to True, returns a DataFrame with the results of the differential expression analysis. If a path is provided, the DataFrame is also saved under the specified path. The following fields are included:

GeneName: Name of the gene.
pvals and padj: Raw and adjusted p-values. The adjusted p-value uses the Benjamini-Hochberg correction method.
log2fc: Log2 fold change.
pts_ref and pts_group: Percentage of cells expressing the gene in the reference and in the tested (e.g., disease) group, respectively.
groups: Column containing the group tested.
groupby: The column name is set to the value of groupby and contains the cluster groups.

adata.uns['rank_genes_condition' | key_added]: DataFrame with the results of the differential expression analysis.
