dotools_py.tl.rank_genes_condition
- dotools_py.tl.rank_genes_condition(adata, groupby, subset_by=None, reference='rest', groups=None, method='wilcoxon', pval_cutoff=0.05, log2fc_cutoff=0.25, path=None, filename='DGE.xlsx', layer=None, covariates=None, get_results=True, key_added='rank_genes_condition')
Run DGE Analysis.
Run differential gene expression (DGE) analysis between conditions. In addition to the methods implemented in scanpy (wilcoxon, t-test, logreg and t-test_overestim_var), the MAST test can be used. If subset_by is provided, the analysis is run separately within each group of that column. P-values are corrected for multiple testing with the Benjamini-Hochberg method.
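The Benjamini-Hochberg correction mentioned above can be illustrated with a minimal pure-Python sketch. This is not taken from the dotools_py source; the helper name `benjamini_hochberg` is hypothetical and only shows how adjusted p-values are derived from raw p-values under the standard procedure.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
padj = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw p-values grow.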
After the analysis, if path is provided, an Excel file is generated with three sheets: 1) AllGenes, containing all genes; 2) UpregGenes, containing upregulated genes; and 3) DownregGenes, containing downregulated genes. The up- and downregulated genes are filtered according to pval_cutoff and log2fc_cutoff. The results are saved in the uns attribute under rank_genes_condition (or under the value of key_added, if set).
- Parameters:
- adata
AnnData Annotated data matrix.
- groupby
str Column in obs with conditions to test.
- subset_by
str (default: None) Column in obs to subset by (e.g., a column with cell-type annotation).
- reference
str (default: 'rest') Reference condition.
- groups
list | str (default: None) Alternative conditions.
- method
Literal['wilcoxon', 'mast', 't-test', 'logreg', 't-test_overestim_var'] (default: 'wilcoxon') Statistical test to use.
- pval_cutoff
float (default: 0.05) P-value cutoff used to filter genes when generating the Excel file.
- log2fc_cutoff
float (default: 0.25) Log2 fold change cutoff used to filter genes when generating the Excel file.
- path
str | PathLike[str] | Path (default: None) Path where the Excel file is saved.
- filename
str (default: 'DGE.xlsx') Name of the Excel file.
- layer
str (default: None) Layer of the AnnData to use.
- covariates
list (default: None) Extra covariates to correct for in the MAST test.
- get_results
bool (default: True) Whether to return a DataFrame with the results.
- key_added
str (default: 'rank_genes_condition') Key under which the results are stored in uns.
- Return type:
- Returns:
- Returns a DataFrame with the results of the differential expression analysis if get_results is set to True. If a path is provided, the DataFrame is also saved under the specified path. The following fields are included:
GeneName: Name of the genes.
pvals and padj: P-value and adjusted p-value. The adjustment uses the Benjamini-Hochberg correction method.
log2fc: Log2 fold change.
pts_ref and pts_group: Percentage of cells expressing the gene in the reference and in the tested (e.g., disease) group.
groups: Column containing the group tested.
groupby: The column name is set to groupby and contains the cluster groups.
- adata.uns['rank_genes_condition' | key_added]: DataFrame with the results of the differential expression analysis.
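The filtering into upregulated and downregulated genes can be sketched on plain Python rows that reuse the documented field names (GeneName, padj, log2fc). The helper `split_by_cutoffs` and the example genes are hypothetical. The sketch assumes that the cutoff is applied to the adjusted p-value, that the log2 fold change threshold is symmetric, and that positive values mean higher expression in the tested group; the source does not state these details.

```python
def split_by_cutoffs(rows, pval_cutoff=0.05, log2fc_cutoff=0.25):
    """Split result rows into up- and downregulated genes."""
    # Assumption: significance is judged on the adjusted p-value (padj).
    significant = [r for r in rows if r["padj"] < pval_cutoff]
    # Assumption: positive log2fc means higher expression in the tested group.
    up = [r for r in significant if r["log2fc"] > log2fc_cutoff]
    down = [r for r in significant if r["log2fc"] < -log2fc_cutoff]
    return up, down


# Hypothetical result rows standing in for the returned DataFrame.
rows = [
    {"GeneName": "GeneA", "padj": 0.001, "log2fc": 1.2},
    {"GeneName": "GeneB", "padj": 0.020, "log2fc": -0.8},
    {"GeneName": "GeneC", "padj": 0.300, "log2fc": 2.0},
    {"GeneName": "GeneD", "padj": 0.010, "log2fc": 0.1},
]
up, down = split_by_cutoffs(rows)
up_names = [r["GeneName"] for r in up]
down_names = [r["GeneName"] for r in down]
```

With the default cutoffs, GeneC is dropped for its adjusted p-value and GeneD for its small fold change, which mirrors how the UpregGenes and DownregGenes sheets are described.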
See also
dotools_py.tl.rank_genes_pseudobulk(): Run DEA at the pseudobulk level between conditions for all clusters.
dotools_py.tl.rank_genes_consensus(): Run DEA at the pseudobulk and single-cell level between conditions for all clusters.
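A call using the signature documented above might look like the following sketch. The wrapper `run_condition_dge`, the obs column names ("condition", "cell_type"), the condition labels ("healthy", "disease") and the output directory are hypothetical placeholders; only the function path and keyword arguments come from this page.

```python
def run_condition_dge(adata):
    """Compare 'disease' against 'healthy' within each cell type (hypothetical labels)."""
    # Imported lazily so the sketch can be read without the package installed.
    from dotools_py.tl import rank_genes_condition

    return rank_genes_condition(
        adata,
        groupby="condition",    # hypothetical obs column holding the conditions
        subset_by="cell_type",  # hypothetical obs column with cell-type annotation
        reference="healthy",    # hypothetical reference condition
        groups=["disease"],     # hypothetical alternative condition
        method="wilcoxon",
        pval_cutoff=0.05,
        log2fc_cutoff=0.25,
        path="results",         # hypothetical output directory for the Excel file
        filename="DGE.xlsx",
        get_results=True,
        key_added="rank_genes_condition",
    )
```

Calling `df = run_condition_dge(adata)` on an AnnData object with matching obs columns would, per the description above, return the results as a DataFrame and also store them under `adata.uns['rank_genes_condition']`.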