crested.get_dataset
- crested.get_dataset(dataset)
Fetch an example dataset. This function retrieves a dataset of bigWig or BED files together with its associated regions file, downloading it if it is not already cached, and returns the paths to both.
- Provided examples:
‘mouse_cortex_bed’: the BICCN mouse cortex snATAC-seq dataset, processed as BED files per topic. For use in topic classification.
‘mouse_cortex_bigwig_coverage’: the BICCN mouse cortex snATAC-seq dataset, processed as pseudobulked bigWig coverage tracks per cell type. For use in peak regression.
‘mouse_cortex_bigwig_cut_sites’: the BICCN mouse cortex snATAC-seq dataset, processed as pseudobulked bigWig cut site tracks per cell type. For use in peak regression.
These two paths can be passed to crested.import_bigwigs() / crested.import_beds().
Note
The cache location can be changed by setting environment variable $CRESTED_DATA_DIR.
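As a minimal sketch of the note above, the variable can be set from Python before the dataset is fetched. The directory name `crested_data` and the helper `fetch_example` are hypothetical, chosen for illustration. Setting the variable before importing crested is a conservative assumption, since this page does not say at which point the variable is read.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical cache location; any writable directory would do.
cache_dir = Path(tempfile.gettempdir()) / "crested_data"
cache_dir.mkdir(parents=True, exist_ok=True)

# Set the variable before crested is imported, in case it is read at import
# time (an assumption; the documentation does not state when it is read).
os.environ["CRESTED_DATA_DIR"] = str(cache_dir)


def fetch_example(name="mouse_cortex_bed"):
    """Fetch an example dataset; crested is imported only after the variable is set."""
    import crested

    return crested.get_dataset(name)
```

Calling `fetch_example()` should then download into, or reuse, the directory named by `CRESTED_DATA_DIR`. In a shell, exporting the variable before starting Python would serve the same purpose.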
- Parameters:
dataset (str) – The name of the dataset to fetch. Available options:
‘mouse_cortex_bed’
‘mouse_cortex_bigwig_cut_sites’
‘mouse_cortex_bigwig_coverage’
‘mouse_cortex_bigwig’ (deprecated, same as ‘mouse_cortex_bigwig_coverage’)
- Returns:
A tuple consisting of the BED/bigWig-containing directory and the consensus regions file.
Example
>>> beds_folder, regions_file = crested.get_dataset("mouse_cortex_bed")
>>> adata = crested.import_beds(beds_folder=beds_folder, regions_file=regions_file)
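Since the returned paths go to either crested.import_beds() or crested.import_bigwigs() depending on the dataset, the choice can be sketched as a small lookup on the dataset name. The helpers `importer_for` and `load_example` are hypothetical and not part of crested. The keyword `bigwigs_folder` is assumed by analogy with `beds_folder` in the example above and should be checked against the crested.import_bigwigs() reference.

```python
def importer_for(dataset):
    """Return (import function name, folder keyword) for an example dataset name."""
    if dataset.endswith("_bed"):
        return "import_beds", "beds_folder"
    if "_bigwig" in dataset:
        # Keyword name assumed by analogy with beds_folder.
        return "import_bigwigs", "bigwigs_folder"
    raise ValueError(f"Unrecognized dataset name: {dataset!r}")


def load_example(dataset):
    """Fetch an example dataset and import it with the matching crested function."""
    import crested  # imported lazily so importer_for works without crested installed

    folder, regions_file = crested.get_dataset(dataset)
    func_name, folder_kwarg = importer_for(dataset)
    importer = getattr(crested, func_name)
    return importer(**{folder_kwarg: folder, "regions_file": regions_file})
```

For instance, `load_example("mouse_cortex_bigwig_coverage")` would route the bigWig directory to crested.import_bigwigs(), while `load_example("mouse_cortex_bed")` would reproduce the example above.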