dotools_py.pl.boxplot
- dotools_py.pl.boxplot(adata, x_axis, feature, batch_key='batch', hue=None, hue_order=None, layer=None, pseudobulk=False, figsize=(3, 4.2), palette='tab10', title=None, title_fontproperties=None, xticks_order=None, xticks_rotation=45, ylabel='Log(nUMI)', legend_title=None, legend_fontproperties=None, legend_ncols=1, legend_loc='center left', path=None, filename='barplot.svg', show=True, ax=None, reference=None, groups=None, groups_pvals=None, test='wilcoxon', corr_method='benjamini-hochberg', line_offset=0.05, txt_size=13, txt='p = ', showfliers=False, scatter=False, marker_size=2, **kwargs)
Boxplot with statistical significance.
Show the distribution of the expression of a feature in var_names or of a continuous value in obs along different categorical values, and test for significance.
- Parameters:
- adata
AnnData Annotated data matrix.
- x_axis
str Name of a categorical column in adata.obs to group by.
- feature
str A valid feature in adata.var_names or a column in adata.obs with continuous values.
- batch_key
str(default:'batch') Name of a categorical column in adata.obs that contains the sample names.
- hue
str(default:None) Name of a second categorical column in adata.obs to group by, in addition to x_axis.
- hue_order
list(default:None) Order of the categories in hue. If not set, the order will be inferred.
- layer
str(default:None) Name of the AnnData layer to plot. By default, adata.X is plotted; if layer is set to a valid layer name, that layer is plotted instead.
- pseudobulk
bool(default:False) If set to True, the distribution of the mean across samples will be plotted.
- figsize
tuple[float,float] (default:(3, 4.2)) Figure size, the format is (width, height).
- palette
str|dict|Colormap(default:'tab10') String denoting a matplotlib colormap. If not set, it will try to access adata.uns[hue_colors | x_axis_colors]; if that is not available, the colormap do.utility.tab30() will be used. A dictionary with the categories available in adata.obs[x_axis], or in adata.obs[hue] if hue is not None, can also be provided. The format is {category: color}.
- title
str(default:None) Title for the figure.
- title_fontproperties
Dict[Literal['size','weight'],str|int] (default:None) Dictionary which should contain ‘size’ and ‘weight’ to define the fontsize and fontweight of the title of the figure.
- xticks_order
list(default:None) Order for the categories in adata.obs[x_axis].
- xticks_rotation
int(default:45) Rotation of the X-axis ticks.
- ylabel
str(default:'Log(nUMI)') Label for the Y-axis.
- legend_title
str(default:None) Title for the legend.
- legend_fontproperties
Dict[Literal['size','weight'],str|int] (default:None) Dictionary which should contain ‘size’ and ‘weight’ to define the fontsize and fontweight of the title of the legend.
- legend_ncols
int(default:1) Number of columns for the legend.
- legend_loc
Literal['center left','center right','upper right','upper left','lower left','lower right','right','lower center','upper center','center'] (default:'center left') Location of the legend.
- path
str|PathLike[str] |Path(default:None) Path to the folder to save the figure.
- filename
str(default:'barplot.svg') Name of file to use when saving the figure.
- show
bool(default:True) If set to False, returns a dictionary with the matplotlib axes.
- ax
Axes(default:None) Matplotlib axes to use for plotting. If not set, a new figure will be generated.
- reference
str(default:None) Reference condition to use when testing for significance. When hue is set, the reference condition corresponds to a category in hue. For each x_axis category, the different hue categories will be tested.
- groups
str|list(default:None) Name or list of names of the groups to test against.
- groups_pvals
float|list(default:None) If provided, these values will be plotted. If not set, the p-values will be estimated. The order of the p-values should match the order of the groups categories.
- test
Literal['wilcoxon','t-test','kruskal','anova','logreg','t-test_overestim_var'] (default:'wilcoxon') Name of the method to test for significance.
- corr_method
Literal['benjamini-hochberg','bonferroni'] (default:'benjamini-hochberg') Correction method for multiple testing.
- line_offset
float(default:0.05) Offset for the brackets drawn to indicate significance. This offset represents a percentage.
- txt_size
int(default:13) Font size of the text indicating significance.
- txt
str(default:'p = ') Text to include before the p-value. If not set, only the p-value is shown.
- showfliers
bool(default:False) Show the outliers beyond the caps.
- scatter
bool(default:False) Plot the mean expression per sample on top of the boxplots.
- marker_size
float(default:2) Radius of the markers, in points.
- kwargs
Other parameters are passed through to sns.boxplot.
- Return type:
- Returns:
Depending on show, shows the plot if set to True or returns a dictionary with the axes if set to False.
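The two corr_method options listed above are standard multiple-testing corrections. The sketch below shows what each adjustment does to a list of raw p-values. It is a plain-Python illustration with hypothetical function names and made-up p-values, not the code dotools_py runs internally.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def bonferroni(pvals):
    """Return Bonferroni adjusted p-values, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


# Made-up raw p-values for four comparisons.
raw = [0.01, 0.04, 0.03, 0.20]
bh = benjamini_hochberg(raw)
bonf = bonferroni(raw)
print(bh)
print(bonf)
```

Bonferroni multiplies every p-value by the number of tests, while Benjamini-Hochberg scales each p-value by the number of tests divided by its rank, which is less conservative. Since groups_pvals accepts precomputed values, p-values adjusted elsewhere could presumably be supplied through that parameter.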
Example
Create a boxplot showing the expression of a given gene, including the p-value that indicates whether there is a statistically significant difference between groups.
import dotools_py as do

adata = do.dt.example_10x_processed()
do.pl.boxplot(adata, 'annotation', 'CD4', reference='pDC', groups=['B_cells'], xticks_rotation=45, scatter=False)
Setting the hue argument allows testing across conditions for several groups.
# Take only lymphoid cells
lymphoid = adata[adata.obs['annotation'].isin(['T_cells', 'NK', 'B_cells'])].copy()
do.pl.boxplot(lymphoid, 'annotation', 'RPL11', hue='condition', reference='healthy', groups=['disease'], hue_order=['healthy', 'disease'], xticks_rotation=45, figsize=(6, 4), scatter=True)
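The palette parameter is documented as also accepting a dictionary in the {category: color} format. A minimal sketch of building such a dictionary for the hue categories used above follows. The helper name make_palette and the hex colors are illustrative assumptions, not part of dotools_py.

```python
def make_palette(categories, colors):
    """Pair each category with a color; raise if there are too few colors."""
    if len(colors) < len(categories):
        raise ValueError("need at least one color per category")
    return dict(zip(categories, colors))


# Hypothetical hue categories, matching the 'condition' column used above.
categories = ["healthy", "disease"]
# Hex colors picked for illustration only.
palette = make_palette(categories, ["#1f77b4", "#d62728"])
print(palette)
```

The resulting dictionary could then be passed as palette=palette in a call such as the one above.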
Plot a continuous value in adata.obs.
do.pl.boxplot(adata, 'annotation', 'total_counts', figsize=(6, 4), scatter=True)
Plot at the sample level.
do.pl.boxplot(adata, "condition", "RPL11", batch_key="annotation", pseudobulk=True, scatter=True, marker_size=5)
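To make the idea behind pseudobulk=True concrete, the sketch below reduces per-cell values to one mean per sample and groups those means by condition. It assumes that this is roughly what "the distribution of the mean across samples" refers to. The records and variable names are made up, and this is not the library's implementation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-cell records: (sample, condition, expression value).
cells = [
    ("s1", "healthy", 1.0),
    ("s1", "healthy", 3.0),
    ("s2", "healthy", 2.0),
    ("s3", "disease", 4.0),
    ("s3", "disease", 6.0),
    ("s4", "disease", 5.0),
]

# Step 1: collect the values of each (condition, sample) pair.
per_sample = defaultdict(list)
for sample, condition, value in cells:
    per_sample[(condition, sample)].append(value)

# Step 2: keep one mean per sample, grouped by condition.
pseudobulk = defaultdict(dict)
for (condition, sample), values in per_sample.items():
    pseudobulk[condition][sample] = mean(values)

print(dict(pseudobulk))
```

Each condition then contributes one point per sample rather than one point per cell, so the boxplot and any test on it reflect sample-to-sample variation.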