dotools_py.tl.rank_genes_condition


dotools_py.tl.rank_genes_condition(adata, groupby, subset_by=None, reference='rest', groups=None, method='wilcoxon', pval_cutoff=0.05, log2fc_cutoff=0.25, path=None, filename='DGE.xlsx', layer=None, covariates=None, get_results=True, key_added='rank_genes_condition')

Run DGE Analysis.

Run differential expression analysis. Besides the methods implemented in scanpy (wilcoxon, t-test, logreg and t-test_overestim_var), the MAST test can be used. If subset_by is provided, the DGE analysis is run separately for each group in that column. P-values are adjusted for multiple testing with the Benjamini-Hochberg method.
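For readers unfamiliar with the Benjamini-Hochberg adjustment mentioned above, the following is a minimal pure-Python sketch of the standard procedure. It is only an illustration and is not taken from the dotools_py source; the function name benjamini_hochberg and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Illustrative sketch only; not the dotools_py implementation.
    """
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes
padj = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

Each adjusted value is the raw p-value scaled by m / rank, then capped by the adjusted value of the next larger p-value, so adjusted values never decrease as raw p-values increase.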

After the analysis, if path is provided, an Excel file is written with three sheets: 1) AllGenes, containing all genes; 2) UpregGenes, containing upregulated genes; and 3) DownregGenes, containing downregulated genes. The up- and downregulated genes are selected using pval_cutoff and log2fc_cutoff. The results are also stored in the uns attribute under key_added (default: rank_genes_condition).
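The split into the three sheets can be pictured with the sketch below. It rests on assumptions the text above does not confirm: that the cutoff is applied to the adjusted p-value (padj) rather than the raw one, that comparisons are strict, and that downregulated genes are those with log2fc below the negated cutoff. The helper split_by_cutoffs and the gene records are hypothetical.

```python
def split_by_cutoffs(rows, pval_cutoff=0.05, log2fc_cutoff=0.25):
    """Group result rows into the three sheet names described above.

    Assumption: filtering uses padj and a symmetric log2fc threshold.
    """
    significant = [r for r in rows if r["padj"] < pval_cutoff]
    up = [r for r in significant if r["log2fc"] > log2fc_cutoff]
    down = [r for r in significant if r["log2fc"] < -log2fc_cutoff]
    return {"AllGenes": rows, "UpregGenes": up, "DownregGenes": down}


# Hypothetical results for four genes
rows = [
    {"GeneName": "GeneA", "padj": 0.001, "log2fc": 1.2},
    {"GeneName": "GeneB", "padj": 0.010, "log2fc": -0.8},
    {"GeneName": "GeneC", "padj": 0.300, "log2fc": 2.0},  # fails the p-value cutoff
    {"GeneName": "GeneD", "padj": 0.020, "log2fc": 0.1},  # fails the fold-change cutoff
]
sheets = split_by_cutoffs(rows)
```

With the default cutoffs, only GeneA lands in UpregGenes and only GeneB in DownregGenes, while AllGenes keeps every row.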

Parameters:
adata AnnData

Annotated data matrix.

groupby str

Column in obs with conditions to test.

subset_by str (default: None)

Column in obs to subset by. (e.g., column with cell-type annotation)

reference str (default: 'rest')

Reference condition.

groups list | str (default: None)

Alternative conditions.

method Literal['wilcoxon', 'mast', 't-test', 'logreg', 't-test_overestim_var'] (default: 'wilcoxon')

Method to test.

pval_cutoff float (default: 0.05)

P-value cutoff to filter when generating the ExcelSheet.

log2fc_cutoff float (default: 0.25)

Log2 fold-change cutoff to filter when generating the ExcelSheet.

path str | PathLike[str] | Path (default: None)

Path to save ExcelSheet.

filename str (default: 'DGE.xlsx')

Name of the ExcelSheet.

layer str (default: None)

Layer of the AnnData to use.

covariates list (default: None)

List of extra covariates to correct for in the MAST test.

get_results bool (default: True)

Return a dataframe with results.

key_added str (default: 'rank_genes_condition')

Key to use in uns.

Return type:

None | DataFrame

Returns:

Returns a DataFrame with the results of the differential expression analysis if get_results is set to True. If a path is provided, the DataFrame will be saved under the specified path. The following fields are included:

GeneName

Name of the genes

pvals and padj

Raw and adjusted p-values. The adjusted p-value uses the Benjamini-Hochberg correction method.

log2fc

Log2FoldChange

pts_ref and pts_group

Percentage of cells expressing the gene in the reference and in the tested group, respectively.

groups

Column containing the group tested

groupby

The column name is set to groupby and contains the cluster groups.

adata.uns['rank_genes_condition' | key_added]

Dataframe with results of the differential expression analysis.
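A minimal usage sketch based on the signature above. The wrapper run_condition_dge and the obs column and category names ("condition", "annotation", "healthy", "disease") are hypothetical placeholders; substitute the columns and categories present in your own AnnData object.

```python
def run_condition_dge(adata, out_dir=None):
    """Hypothetical wrapper showing one call to rank_genes_condition.

    Column and category names below are placeholders, not library defaults.
    """
    # Imported lazily so the sketch can be defined without the package installed
    import dotools_py as do

    results = do.tl.rank_genes_condition(
        adata,
        groupby="condition",      # obs column with the conditions to test
        subset_by="annotation",   # run the test within each cell type
        reference="healthy",      # reference condition
        groups=["disease"],       # alternative condition(s)
        method="wilcoxon",
        path=out_dir,             # if set, DGE.xlsx is written here
        filename="DGE.xlsx",
        get_results=True,         # return the results DataFrame
    )
    # With the default key_added, the same table is stored in adata.uns
    stored = adata.uns["rank_genes_condition"]
    return results, stored
```

Calling run_condition_dge(adata, out_dir="results") would be expected to return the results DataFrame together with the copy stored in adata.uns, and to write the Excel file under the given directory.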

See also

dotools_py.tl.rank_genes_pseudobulk()

run DEA at pseudobulk level between conditions for all clusters

dotools_py.tl.rank_genes_consensus()

run DEA at pseudobulk and single-cell level between conditions for all clusters