dotools_py.get.mean_expr


dotools_py.get.mean_expr(adata, group_by, features=None, out_format='long', layer=None, logcounts=True, logmean=True)

Calculate the average expression of features in an AnnData object.

This function calculates the average expression of a set of features, grouped by one or several categories. The input is assumed to contain log-normalized counts. If logcounts is set to True, the log transformation is undone before the mean is calculated. By default (logmean=True), the reported mean expression is log1p-transformed.
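The order of operations described above (undo the log, average per group, re-apply the log) can be sketched in plain Python. This is only an illustration, not the library's implementation: the helper name grouped_log_mean is hypothetical, and it assumes the input was normalized with a natural-log log1p, so that expm1 inverts it.

```python
import math
from collections import defaultdict


def grouped_log_mean(values, groups, logcounts=True, logmean=True):
    """Average per group in linear space, then optionally re-apply log1p.

    Hypothetical helper for illustration; parameter names mirror the
    documented logcounts / logmean flags.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        # Assumption: the input was produced with log1p, so expm1 inverts it.
        linear = math.expm1(value) if logcounts else value
        sums[group] += linear
        counts[group] += 1
    means = {group: sums[group] / counts[group] for group in sums}
    if logmean:
        # Report the mean back in log space.
        means = {group: math.log1p(mean) for group, mean in means.items()}
    return means


# One gene measured in four cells, two cells per group (log1p-normalized).
expr = [math.log1p(1.0), math.log1p(3.0), math.log1p(0.0), math.log1p(4.0)]
labels = ["B_cells", "B_cells", "NK", "NK"]
result = grouped_log_mean(expr, labels)
```

Both groups have a linear-space mean of 2, so both entries of result equal log1p(2). Averaging the log values directly would give a smaller number, which is why the transformation is undone first.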

Parameters:
adata AnnData

Annotated data matrix.

group_by str | list

Metadata columns in obs to group by.

features list | str | None (default: None)

List of features in var_names to use. If not set, the mean is calculated over all genes.

out_format Literal['long', 'wide'] (default: 'long')

Format of the returned DataFrame, either wide or long.

layer str | None (default: None)

Layer of the AnnData to use. If not set, X is used.

logcounts bool (default: True)

Set to True if the input is in log space.

logmean bool (default: True)

If set to True, the calculated mean is log1p-transformed. For expression data, this returns LogMean(nUMI) if set to True and Mean(nUMI) if set to False.

Return type:

DataFrame

Returns:

Returns a DataFrame with the mean expression, log1p-transformed if logmean is set to True. If out_format is set to wide, the index is set to the gene names and the column names are set to the groups. If out_format is set to long, the following fields are included:

gene

Contains the gene names.

groupN

Contains the group labels. One such column is added for each metadata column in group_by (group0 for the first one, as in the example below).

expr

Contains the mean expression values (log1p-transformed if logmean is set to True).
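To make the relationship between the two layouts concrete, the following plain-Python sketch reshapes a few long-format records into a wide, gene-indexed mapping. It is an illustration only: the rows are hand-written, with values copied from the wide example output on this page, and it assumes both formats report the same numbers.

```python
# Long format: one record per (gene, group) pair.
long_rows = [
    {"gene": "AAK1", "group0": "B_cells", "expr": 0.000000},
    {"gene": "AAK1", "group0": "NK", "expr": 1.126293},
    {"gene": "ABAT", "group0": "B_cells", "expr": 0.182251},
    {"gene": "ABAT", "group0": "NK", "expr": 0.047404},
]

# Wide format: genes become the index, groups become the columns.
wide = {}
for row in long_rows:
    wide.setdefault(row["gene"], {})[row["group0"]] = row["expr"]
```

On an actual long-format DataFrame with a single grouping column, pandas' DataFrame.pivot(index="gene", columns="group0", values="expr") performs the same reshaping.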

Example

>>> import dotools_py as do
>>> adata = do.dt.example_10x_processed()
>>> df = do.get.mean_expr(adata, "annotation")
>>> df.head(5)
         gene   group0      expr
0  ATP2A1-AS1  B_cells  0.000000
1      STK17A  B_cells  1.453713
2    C19orf18  B_cells  0.000000
3        TPP2  B_cells  0.126846
4       MFSD1  B_cells  0.053630
>>> df = do.get.mean_expr(adata, "annotation", out_format="wide")
>>> df.head(5)
    group0   B_cells  Monocytes        NK   T_cells       pDC
gene
A4GALT  0.222505   0.000000  0.000000  0.000000  0.000000
AAK1    0.000000   0.364976  1.126293  1.143016  0.128019
ABAT    0.182251   0.146378  0.047404  0.045826  0.158761
ABCB4   0.062785   0.000000  0.000000  0.000000  0.000000
ABCB9   0.000000   0.000000  0.027683  0.057814  0.000000