dotools_py.pl.boxplot

dotools_py.pl.boxplot(adata, x_axis, feature, batch_key='batch', hue=None, hue_order=None, layer=None, pseudobulk=False, figsize=(3, 4.2), palette='tab10', title=None, title_fontproperties=None, xticks_order=None, xticks_rotation=45, ylabel='Log(nUMI)', legend_title=None, legend_fontproperties=None, legend_ncols=1, legend_loc='center left', path=None, filename='barplot.svg', show=True, ax=None, reference=None, groups=None, groups_pvals=None, test='wilcoxon', corr_method='benjamini-hochberg', line_offset=0.05, txt_size=13, txt='p = ', showfliers=False, scatter=False, marker_size=2, **kwargs)

Boxplot with statistical significance.

Show the distribution of the expression of a feature in var_names, or of a continuous value in obs, across the categories of a categorical variable, and test for significance.

Parameters:
adata AnnData

Annotated data matrix.

x_axis str

Name of a categorical column in adata.obs to group by.

feature str

A valid feature in adata.var_names or column in adata.obs with continuous values.

batch_key str (default: 'batch')

Name of a categorical column in adata.obs that contains the sample names.

hue str (default: None)

Name of a second categorical column in adata.obs used to further split each x_axis group.

hue_order list (default: None)

Order of the categories in hue. If not set, the order is inferred.

layer str (default: None)

Name of the AnnData layer to plot. By default, adata.X is plotted; if layer is set to a valid layer name, that layer is plotted instead.

pseudobulk bool (default: False)

If set to True, the distribution of the mean across samples is plotted.
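As a rough illustration of what plotting "the mean across samples" involves, the sketch below averages per-cell values within each sample and collects those means per group. It is not the library's implementation; the records, sample names, and the helper pseudobulk_means are hypothetical and only serve to show the idea.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-cell records: (sample, group, expression value).
cells = [
    ("s1", "healthy", 1.0),
    ("s1", "healthy", 3.0),
    ("s2", "healthy", 2.0),
    ("s3", "disease", 4.0),
    ("s3", "disease", 6.0),
    ("s4", "disease", 5.0),
]


def pseudobulk_means(records):
    """Average the values of all cells in each (group, sample) pair,
    then return the list of per-sample means for every group."""
    grouped = defaultdict(list)
    for sample, group, value in records:
        grouped[(group, sample)].append(value)

    per_group = defaultdict(list)
    for (group, _sample), values in grouped.items():
        per_group[group].append(mean(values))
    return dict(per_group)


# One value per sample instead of one value per cell.
sample_means = pseudobulk_means(cells)
```

With this kind of aggregation, each box summarizes one point per sample rather than one point per cell, so the number of observations per box equals the number of samples in that group.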

figsize tuple[float, float] (default: (3, 4.2))

Figure size, the format is (width, height).

palette str | dict | Colormap (default: 'tab10')

String denoting a matplotlib colormap. If not set, the function tries to access adata.uns[hue_colors | x_axis_colors]; if these are not available, the colormap do.utility.tab30() is used. A dictionary of the form {category: color} can also be provided, covering the categories in adata.obs[x_axis], or in adata.obs[hue] if hue is not None.
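A {category: color} dictionary can be built by hand or with a small helper. The sketch below is one possible way to do it; make_palette is a hypothetical helper (not part of dotools_py), the category names are placeholders, and the hex codes are the ten colors of matplotlib's tab10 colormap.

```python
from itertools import cycle

# Hex codes of matplotlib's "tab10" colors.
TAB10_HEX = [
    "#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd",
    "#8c564b", "#e377c2", "#7f7f7f", "#bcbd22", "#17becf",
]


def make_palette(categories, colors=TAB10_HEX):
    """Map each category to a color, reusing colors from the start
    of the list if there are more categories than colors."""
    return dict(zip(categories, cycle(colors)))


# Placeholder categories; in practice these would be the categories
# of adata.obs[x_axis] or adata.obs[hue].
palette = make_palette(["healthy", "disease"])
```

The resulting dictionary could then be passed as palette=palette, provided its keys match the categories being plotted.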

title str (default: None)

Title for the figure.

title_fontproperties Dict[Literal['size', 'weight'], str | int] (default: None)

Dictionary which should contain ‘size’ and ‘weight’ to define the fontsize and fontweight of the title of the figure.

xticks_order list (default: None)

Order for the categories in adata.obs[x_axis].

xticks_rotation int (default: 45)

Rotation of the X-axis ticks.

ylabel str (default: 'Log(nUMI)')

Label for the Y-axis.

legend_title str (default: None)

Title for the legend.

legend_fontproperties Dict[Literal['size', 'weight'], str | int] (default: None)

Dictionary which should contain ‘size’ and ‘weight’ to define the fontsize and fontweight of the title of the legend.

legend_ncols int (default: 1)

Number of columns for the legend.

legend_loc Literal['center left', 'center right', 'upper right', 'upper left', 'lower left', 'lower right', 'right', 'lower center', 'upper center', 'center'] (default: 'center left')

Location of the legend.

path str | PathLike[str] | Path (default: None)

Path to the folder to save the figure.

filename str (default: 'barplot.svg')

Name of file to use when saving the figure.

show bool (default: True)

If set to False, returns a dictionary with the matplotlib axes.

ax Axes (default: None)

Matplotlib axes to use for plotting. If not set, a new figure will be generated.

reference str (default: None)

Reference condition to use when testing for significance. When hue is set, the reference condition corresponds to one of the categories in hue, and the hue categories are tested within each x_axis category.

groups str | list (default: None)

Name, or list of names, of the groups to test against the reference.

groups_pvals float | list (default: None)

If provided, these values will be plotted. If not set, the p-values will be estimated. The order of the p-values should match the order of the categories in groups.

test Literal['wilcoxon', 't-test', 'kruskal', 'anova', 'logreg', 't-test_overestim_var'] (default: 'wilcoxon')

Name of the method to test for significance.

corr_method Literal['benjamini-hochberg', 'bonferroni'] (default: 'benjamini-hochberg')

Correction method for multiple testing.
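For readers unfamiliar with the two options, the sketch below shows the standard Bonferroni and Benjamini-Hochberg adjustments in plain Python. It illustrates the textbook procedures only and is not taken from the dotools_py source; the function names and the example p-values are made up for illustration.

```python
def bonferroni(pvals):
    """Multiply every p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (false discovery rate).

    Each p-value is scaled by m / rank, where rank is its position in
    ascending order; a running minimum taken from the largest p-value
    downwards keeps the adjusted values monotone.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values from four comparisons.
raw = [0.01, 0.04, 0.03, 0.20]
bonf = bonferroni(raw)
bh = benjamini_hochberg(raw)
```

Bonferroni controls the family-wise error rate and is the more conservative of the two; Benjamini-Hochberg controls the false discovery rate and typically yields smaller adjusted p-values when several comparisons are made.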

line_offset float (default: 0.05)

Offset for the brackets drawn to indicate significance. This offset represents a percentage.

txt_size int (default: 13)

Font size of the text indicating significance.

txt str (default: 'p = ')

Text to include before the p-value. If not set, only the p-value is shown.

showfliers bool (default: False)

Show the outliers beyond the caps.

scatter bool (default: False)

Plot the mean expression per sample on top of the boxplots.

marker_size float (default: 2)

Radius of the markers, in points.

kwargs

Other parameters are passed through to sns.boxplot.

Return type:

Axes | dict | None

Returns:

Depends on show: the plot if show is True, or a dictionary with the axes if show is False.

Example

Create a boxplot showing the expression of a given gene, annotated with the p-value to indicate whether the difference between groups is statistically significant.

import dotools_py as do
adata = do.dt.example_10x_processed()
do.pl.boxplot(adata, 'annotation', 'CD4', reference='pDC', groups=['B_cells'], xticks_rotation=45, scatter=False)

Setting the hue argument allows testing across conditions for several groups.

# Take only lymphoid cells
lymphoid = adata[adata.obs['annotation'].isin(['T_cells', 'NK', 'B_cells'])].copy()
do.pl.boxplot(lymphoid, 'annotation', 'RPL11', hue='condition', reference='healthy', groups=['disease'], hue_order=['healthy', 'disease'], xticks_rotation=45, figsize=(6, 4), scatter=True)

Plot a continuous value in adata.obs.

do.pl.boxplot(adata, 'annotation', 'total_counts', figsize=(6, 4), scatter=True)

Plot at the sample level.

do.pl.boxplot(adata, "condition", "RPL11", batch_key="annotation", pseudobulk=True, scatter=True, marker_size=5)