crested.pp.filter_regions_on_specificity
- crested.pp.filter_regions_on_specificity(adata, gini_std_threshold=1.0, model_name=None)
Filter bed regions & targets/predictions based on high Gini score.
This function filters regions by specificity, measured with Gini scores. Regions with high Gini scores are retained, and the AnnData object is subset to those regions. If model_name is provided, the function looks for the corresponding predictions in adata.layers[model_name]; otherwise, it uses the targets in adata.X to decide which regions to keep.
- Parameters:
  adata (AnnData) – The AnnData object containing the matrix (celltypes, regions) to be filtered.
  gini_std_threshold (float (default: 1.0)) – The number of standard deviations above the mean Gini score used to determine the threshold for high variability.
  model_name (Optional[str] (default: None)) – The name of the model to look for in adata.layers[model_name] for predictions. If None, will use the targets in adata.X to select specific regions.
- Return type:
- Returns:
The AnnData object is filtered in place: it keeps only the retained regions, with the filtered matrix and updated variable names.
Example
>>> crested.pp.filter_regions_on_specificity(
...     adata,
...     gini_std_threshold=1.0,
... )
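
To illustrate how gini_std_threshold acts as a cutoff, the following is a minimal, dependency-free sketch of the idea described above: score each region by a Gini coefficient across cell types, then keep the regions whose score exceeds the mean plus gini_std_threshold standard deviations. It is not crested's implementation. The helper names (gini, select_specific_regions), the region names, the particular Gini formula, and the use of the population standard deviation are all assumptions made for illustration, and the library may compute its scores differently.

```python
from statistics import mean, pstdev


def gini(values):
    """Gini coefficient of non-negative values (0 = evenly spread)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted formula on sorted data, with ranks starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def select_specific_regions(matrix, region_names, gini_std_threshold=1.0):
    """Return the names of regions whose Gini score exceeds the cutoff.

    matrix is a list of rows, one per cell type, with one column per region
    (the same (celltypes, regions) orientation as the documented input).
    """
    # Transpose so each entry holds one region's values across cell types.
    per_region = list(zip(*matrix))
    scores = [gini(column) for column in per_region]
    # Assumed cutoff: mean score plus the requested number of std deviations.
    cutoff = mean(scores) + gini_std_threshold * pstdev(scores)
    return [name for name, s in zip(region_names, scores) if s > cutoff]


# Hypothetical toy data: 3 cell types x 6 regions; only the last region
# is concentrated in a single cell type.
toy_matrix = [
    [1, 1, 1, 1, 1, 9],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 1, 0],
]
toy_regions = [f"region_{i}" for i in range(1, 7)]

kept = select_specific_regions(toy_matrix, toy_regions, gini_std_threshold=1.0)
print(kept)
```

In this toy case only the region whose signal is concentrated in one cell type passes the cutoff. Raising gini_std_threshold makes the filter stricter, so fewer regions are kept.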