dotools_py.get.expr


dotools_py.get.expr(adata, features, groups=None, out_format='long', layer=None)

Extract the expression of features.

This function extracts the expression from an AnnData object and returns a DataFrame. If layer is not specified, the expression in X is extracted. Metadata from obs can also be added to the DataFrame.

Parameters:
adata AnnData

Annotated data matrix.

features str | list | None

Name of the features in var_names to extract the expression of. If set to None, extract all genes.

groups str | list | None (default: None)

Metadata column in obs to include in the DataFrame.

out_format Literal['long', 'wide'] (default: 'long')

Format of the DataFrame. The wide format generates a DataFrame with shape n_obs x n_vars, while the long format generates an unpivoted version with one row per cell and gene.
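
The relationship between the two formats can be illustrated with plain pandas. This is a minimal sketch on a made-up table, not the implementation used by dotools_py: the barcodes, gene names and values are invented for illustration, and only the column names genes and expr are taken from the documented long format.

```python
import pandas as pd

# Toy wide table: rows are cells (obs), columns are genes (var_names).
# Barcodes, gene names and values are hypothetical.
wide = pd.DataFrame(
    {"CD4": [0.0, 1.2, 0.0], "CD8A": [2.1, 0.0, 0.4]},
    index=["cell_1", "cell_2", "cell_3"],
)
# Metadata column, as would be added when groups is given.
wide["annotation"] = ["B_cells", "NK", "T_cells"]

# Unpivot: one row per (cell, gene) pair, keeping the metadata column.
long = wide.melt(
    id_vars="annotation",
    value_vars=["CD4", "CD8A"],
    var_name="genes",
    value_name="expr",
)
```

Here wide has one row per cell, while long has one row per cell and gene, so the same three cells and two genes yield six rows.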

layer str | None (default: None)

Layer in the AnnData object to extract the expression from. If set to None the expression in X will be used.

Return type:

DataFrame

Returns:

Returns a DataFrame. If out_format is set to wide, the index holds the cell barcodes and the column names are the gene names. If groups is specified, the corresponding obs columns are added as extra columns. If out_format is set to long, the following fields are included:

genes

Contains the gene names.

expr

Contains the expression values extracted.

Example

>>> import dotools_py as do
>>> adata = do.dt.example_10x_processed()
>>> df = do.get.expr(adata, "CD4", "annotation")
>>> df.head(5)
  annotation genes  expr
0    B_cells   CD4   0.0
1         NK   CD4   0.0
2    T_cells   CD4   0.0
3    T_cells   CD4   0.0
4    T_cells   CD4   0.0
>>> df = do.get.expr(adata, "CD4", "annotation", out_format="wide")
>>> df.head(5)
                               CD4 annotation
CAAAGAATCAGATTGC-1-batch2  0.0    B_cells
AGCTTCCCAGTCAACT-1-batch1  0.0         NK
GAGAGGTTCCCTCTAG-1-batch1  0.0    T_cells
CTAACTTCAGATCATC-1-batch1  0.0    T_cells
CATGGTACAAACGGCA-1-batch1  0.0    T_cells
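
A long-format result is convenient for per-group summaries. The sketch below assumes a DataFrame with the documented annotation, genes and expr columns; the values are made up rather than taken from the example dataset, so that the snippet runs without dotools_py installed.

```python
import pandas as pd

# Hypothetical stand-in for the long-format output of do.get.expr(...).
long = pd.DataFrame(
    {
        "annotation": ["B_cells", "NK", "T_cells", "T_cells"],
        "genes": ["CD4", "CD4", "CD4", "CD4"],
        "expr": [0.0, 0.0, 1.0, 3.0],
    }
)

# Mean expression of each gene within each annotation group.
mean_expr = (
    long.groupby(["annotation", "genes"])["expr"]
    .mean()
    .reset_index()
)
```

The same groupby works unchanged when several features are requested, because each gene occupies its own rows in the genes column.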