crested.pp.sort_and_filter_regions_on_specificity
- crested.pp.sort_and_filter_regions_on_specificity(adata, top_k, model_name=None, method='gini')
Sort BED regions and their targets/predictions by Gini or proportion score per column, keeping only the top k highest-scoring rows per column.
Combines them into a single AnnData object with extra columns indicating the original class name, the rank per column, and the score.
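To make the two scoring methods concrete, here is a minimal pure-Python sketch of a textbook Gini coefficient and a simple proportion score for one region's values across classes. It is an illustration under assumptions, not crested's implementation: the helper names `gini` and `proportions` and the toy values are hypothetical, and the exact formula the library applies is not stated on this page.

```python
def gini(values):
    """Textbook Gini coefficient of non-negative values.

    0 means the signal is spread evenly over all classes; values
    approaching 1 mean it is concentrated in few classes.
    """
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    ordered = sorted(values)
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum((2 * i - n - 1) * v for i, v in enumerate(ordered, start=1))
    return weighted / (n * total)


def proportions(values):
    """Share of the total signal that each class contributes."""
    total = sum(values)
    if total == 0:
        return [0.0] * len(values)
    return [v / total for v in values]


# Hypothetical signal of two regions across four classes.
specific_region = [0.0, 0.0, 0.0, 8.0]  # active in a single class
broad_region = [2.0, 2.0, 2.0, 2.0]     # equally active everywhere

specific_gini = gini(specific_region)
broad_gini = gini(broad_region)
specific_props = proportions(specific_region)
```

Under this definition the class-specific region scores higher than the broadly active one, which is the property a specificity filter relies on.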
- Parameters:
adata (AnnData) – The AnnData object containing the matrix (celltypes, regions) to be sorted.
top_k (int) – The number of top regions to keep per column.
model_name (Optional[str] (default: None)) – The name of the model to look for in adata.layers[model_name] for predictions. If None, will use the targets in adata.X to decide which regions to sort.
method (str (default: 'gini')) – The method to use for calculating scores, either 'gini' or 'proportion'. Default is 'gini'.
- Return type:
None
- Returns:
The AnnData object is modified in place: it holds the sorted and filtered matrix, plus extra columns indicating the original class name, the rank per column, and the score.
Example
>>> crested.pp.sort_and_filter_regions_on_specificity(
...     adata,
...     top_k=500,
...     method="gini",
... )
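The "top k per column" step described above can be sketched without any dependencies. This is a rough illustration, not the library's code: `top_k_per_class`, the record keys (`region`, `class_name`, `rank`, `score`), the region names, and the scores are all hypothetical and are not crested's actual column names or data.

```python
def top_k_per_class(scores, region_names, class_names, top_k):
    """Keep the top_k highest-scoring regions for every class.

    scores[r][c] is the score of region r for class c. Returns one
    record per kept (region, class) pair, with a 1-based rank.
    """
    records = []
    for c, class_name in enumerate(class_names):
        # Region indices ordered by descending score for this class.
        order = sorted(
            range(len(region_names)),
            key=lambda r: scores[r][c],
            reverse=True,
        )
        for rank, r in enumerate(order[:top_k], start=1):
            records.append(
                {
                    "region": region_names[r],
                    "class_name": class_name,
                    "rank": rank,
                    "score": scores[r][c],
                }
            )
    return records


# Hypothetical per-class scores for three regions and two classes.
region_names = ["chr1:100-600", "chr1:900-1400", "chr2:50-550"]
class_names = ["Astro", "Oligo"]
scores = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.6, 0.4],
]

records = top_k_per_class(scores, region_names, class_names, top_k=2)
```

In this sketch a region can be kept for more than one class (here `chr2:50-550` ranks second for both), so the combined result may contain the same region several times with different class names and ranks.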