dotools_py.pp.run_cellbender
- dotools_py.pp.run_cellbender(cellranger_path, output_path, samplenames=None, cuda=True, cpu_threads=15, epochs=150, lr=1e-05, estimator_multiple_cpu=False, log=True, conda_path=None, run_dropletutils=False)
Run CellBender to remove ambient RNA.
Assumes that the FASTQ files have already been mapped with CellRanger.
Warning
A GPU is recommended when running CellBender. Running CellBender on CPU can lead to long run times.
- Parameters:
- cellranger_path
str path to folder containing subfolders for each sample.
- output_path
str output folder to save the H5 files with the corrected expression matrix.
- samplenames
list|None(default:None) list with the names of the folders in cellranger_path. If not set, it will be inferred.
- cuda
bool(default:True) set to True to use GPU for the training.
- cpu_threads
int(default:15) number of CPUs to use for training.
- epochs
int(default:150) number of epochs to train for. Higher values might lead to overfitting.
- lr
float(default:1e-05) learning rate.
- estimator_multiple_cpu
bool(default:False) use multiple CPUs for the generation of results. It is not recommended for big datasets.
- log
bool(default:True) generate a log file with the stdout from running CellBender.
- conda_path
str|None(default:None) path to the conda environment with CellBender installed. If not provided, a conda environment will be created in ~/.venv/cellbender.
- run_dropletutils
bool(default:False) run DropletUtils to estimate the expected number of cells and total number of droplets to use as a prior for cellbender.
- Return type:
- Returns:
H5 files with the corrected expression matrix are saved in the output folder.
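The samplenames parameter is described as a list of folder names inside cellranger_path that is inferred when not set. The source does not say how the inference is done; the sketch below shows one plausible reading, listing the immediate subfolders, so the sample names can be checked before a long run. The helper name list_sample_folders and the demo folder names are hypothetical and are not part of dotools_py.

```python
import tempfile
from pathlib import Path


def list_sample_folders(cellranger_path):
    """Return the sorted names of the immediate subfolders of cellranger_path.

    Hypothetical helper: an assumption about what "inferred" could mean,
    not the dotools_py implementation.
    """
    root = Path(cellranger_path)
    return sorted(p.name for p in root.iterdir() if p.is_dir())


# Demo on a temporary folder layout with two sample folders and one stray file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("sample_B", "sample_A"):
        (Path(tmp) / name).mkdir()
    (Path(tmp) / "notes.txt").write_text("not a sample")
    demo_names = list_sample_folders(tmp)

print(demo_names)  # ['sample_A', 'sample_B']
```

Passing such a list explicitly as samplenames avoids relying on the inference when cellranger_path also contains folders that are not samples.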
Example
>>> import dotools_py as do
>>> in_path = "/path/to/cellranger"
>>> out_path = "/path/to/output"
>>> do.pp.run_cellbender(in_path, out_path)
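On a machine without a GPU, the documented cuda flag can be set to False. The sketch below collects the keyword arguments in one place, using only parameter names and defaults from the signature above. The helper cellbender_kwargs is hypothetical, and the final call is left commented out because it needs dotools_py installed and real CellRanger output.

```python
def cellbender_kwargs(cellranger_path, output_path, use_gpu):
    """Build keyword arguments for do.pp.run_cellbender.

    Hypothetical helper: the keys mirror the documented signature,
    and the values other than cuda repeat the documented defaults.
    """
    return {
        "cellranger_path": str(cellranger_path),
        "output_path": str(output_path),
        "cuda": use_gpu,      # False trains on CPU, which is expected to be slow
        "cpu_threads": 15,    # documented default
        "epochs": 150,        # documented default
        "lr": 1e-05,          # documented default
        "log": True,          # keep the log file with CellBender's stdout
    }


kwargs = cellbender_kwargs("/path/to/cellranger", "/path/to/output", use_gpu=False)

# The actual call requires dotools_py and mapped data:
# import dotools_py as do
# do.pp.run_cellbender(**kwargs)
```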