dotools_py.pp.run_cellbender

dotools_py.pp.run_cellbender(cellranger_path, output_path, samplenames=None, cuda=True, cpu_threads=15, epochs=150, lr=1e-05, estimator_multiple_cpu=False, log=True, conda_path=None, run_dropletutils=False)

Run CellBender to remove ambient RNA.

Assumes that the FASTQ files have already been mapped with Cell Ranger.

Warning

A GPU is recommended when running CellBender. Running it on CPU can take considerably longer.

Parameters:
cellranger_path str

path to the folder containing one subfolder per sample.

output_path str

output folder to save the H5 files with the corrected expression matrix.

samplenames list | None (default: None)

list with the names of the folders in cellranger_path. If not set, they will be inferred.

cuda bool (default: True)

set to True to use a GPU for training.

cpu_threads int (default: 15)

number of CPUs to use for training.

epochs int (default: 150)

number of epochs to train for. Higher values might lead to overfitting.

lr float (default: 1e-05)

learning rate.

estimator_multiple_cpu bool (default: False)

use multiple CPUs when generating the results. Not recommended for large datasets.

log bool (default: True)

generate a log file with the stdout from running CellBender.

conda_path str | None (default: None)

path to the conda environment with CellBender installed. If not provided, a conda environment will be created in ~/.venv/cellbender.

run_dropletutils bool (default: False)

run DropletUtils to estimate the expected number of cells and the total number of droplets, which are then used as priors for CellBender.

Return type:

None

Returns:

The H5 files with the corrected expression matrices are saved in the output folder.
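
Because the function returns None, the results exist only on disk. The following is a minimal sketch for collecting them after a run. It assumes the output files carry an .h5 extension (the exact file names are not stated here), and find_h5_outputs is a hypothetical helper, not part of dotools_py.

```python
from pathlib import Path


def find_h5_outputs(output_path):
    """Return all .h5 files found under output_path, sorted by path.

    Hypothetical helper: assumes the corrected matrices use the .h5 extension.
    """
    root = Path(output_path)
    # rglob also searches subfolders, in case results are grouped per sample.
    return sorted(p for p in root.rglob("*.h5") if p.is_file())
```

An empty list after a run would suggest that the run did not finish or that the files use a different extension. If log=True was set, the log file is the first place to look.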

Example

>>> import dotools_py as do
>>> in_path = "/path/to/cellranger"
>>> out_path = "/path/to/output"
>>> do.pp.run_cellbender(in_path, out_path)
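
The call above relies on the defaults. The sketch below passes the sample folders explicitly and runs on CPU only, using the parameters documented above. list_sample_folders and run_on_cpu are hypothetical helpers introduced for illustration, the thread count of 8 is arbitrary, and listing the immediate subfolders is only an assumption about what a sensible samplenames value looks like, not a description of how dotools_py infers it.

```python
from pathlib import Path


def list_sample_folders(cellranger_path):
    """Return the sorted names of the immediate subfolders of cellranger_path."""
    root = Path(cellranger_path)
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def run_on_cpu(cellranger_path, output_path):
    """Run CellBender without a GPU on every sample folder found."""
    # Imported lazily so that list_sample_folders works without dotools_py.
    import dotools_py as do

    samples = list_sample_folders(cellranger_path)
    do.pp.run_cellbender(
        cellranger_path,
        output_path,
        samplenames=samples,  # explicit instead of inferred
        cuda=False,           # CPU only; expect a longer running time
        cpu_threads=8,        # arbitrary value for this sketch
    )
```

Passing samplenames explicitly also makes it easy to process only a subset of samples, for example by slicing the list before the call.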