crested.pp.normalize_peaks

crested.pp.normalize_peaks(adata, peak_threshold=0, gini_std_threshold=1.0, top_k_percent=0.01)

Normalize adata.X based on the variability of the top values per cell type.

This function applies a normalization factor to each cell type, focusing on regions with the most significant peaks above a defined threshold and considering the variability within those peaks. Use it only on continuous .X data. The input AnnData.X is modified in place.

Parameters:
  • adata (AnnData) – The AnnData object containing the matrix (celltypes, regions) to be normalized.

  • peak_threshold (int (default: 0)) – The minimum value for a peak to be considered significant for the Gini score calculation.

  • gini_std_threshold (float (default: 1.0)) – The number of standard deviations below the mean Gini score used to determine the threshold for low variability.

  • top_k_percent (float (default: 0.01)) – The fraction of top values to consider for the Gini score calculation (for example, 0.01 for the top 1%).

Return type:

None

Returns:

None. The input AnnData object is modified in place: .X holds the normalized matrix, and the cell type weights used for normalization are stored in the obsm attribute.

Example

>>> crested.pp.normalize_peaks(
...     adata,
...     peak_threshold=0,
...     gini_std_threshold=2.0,
...     top_k_percent=0.05,
... )
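
To make the three parameters more concrete, the following is a minimal, standard-library-only sketch of the quantities they refer to: selecting the top fraction of values above a peak threshold, computing a Gini score, and deriving a "low variability" cutoff as the mean Gini score minus a number of standard deviations. This is an illustration only and not CREsTed's implementation. The helper names (`top_fraction`, `gini`, `low_variability_threshold`), the use of the population standard deviation, and the choice to keep at least one value are assumptions made for this example.

```python
from statistics import mean, pstdev


def top_fraction(values, peak_threshold=0, top_k_percent=0.01):
    """Return the largest `top_k_percent` fraction of values above `peak_threshold`.

    Hypothetical helper; keeps at least one value when any value passes the threshold.
    """
    kept = sorted((v for v in values if v > peak_threshold), reverse=True)
    k = max(1, int(len(kept) * top_k_percent))
    return kept[:k]


def gini(values):
    """Gini coefficient of non-negative values (0.0 means perfectly even)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula on ascending values: sum_i (2i - n - 1) * x_i / (n * sum(x))
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


def low_variability_threshold(gini_scores, gini_std_threshold=1.0):
    """Cutoff placed `gini_std_threshold` standard deviations below the mean score.

    Assumption: population standard deviation is used here.
    """
    return mean(gini_scores) - gini_std_threshold * pstdev(gini_scores)


# Toy data: scores below the cutoff would count as "low variability".
scores = [gini([5, 5, 5, 5]), gini([1, 2, 3, 4]), gini([0, 0, 0, 4])]
cutoff = low_variability_threshold(scores, gini_std_threshold=1.0)
low_variability = [s for s in scores if s < cutoff]
```

A larger gini_std_threshold moves the cutoff further below the mean, so fewer scores qualify as low variability; a larger top_k_percent widens the set of top values that enter the calculation.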