crested.tl.modisco.process_patterns


crested.tl.modisco.process_patterns(matched_files, sim_threshold=3.0, trim_ic_threshold=0.1, discard_ic_threshold=0.1, verbose=False)

Process genomic patterns from matched HDF5 files, trim based on information content, and match to known patterns.

Parameters:
  • matched_files (dict[str, str | list[str] | None]) – Dictionary with class names as keys and paths to HDF5 files as values. Per the type annotation, each value is a single path, a list of paths, or None.

  • sim_threshold (float (default: 3.0)) – Similarity threshold for matching patterns, expressed as -log10(p-value), where the p-value is obtained through TOMTOM matching from tangermeme. The default of 3.0 corresponds to a p-value of 0.001.

  • trim_ic_threshold (float (default: 0.1)) – Information content threshold for trimming patterns.

  • discard_ic_threshold (float (default: 0.1)) – Information content threshold for discarding patterns.

  • verbose (bool (default: False)) – Flag to enable verbose output.
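To give a feel for what an information content threshold of 0.1 means, here is a minimal sketch of trimming low-information flanks from a position probability matrix. It is not CREsted's implementation: the helper names `position_ic` and `trim_by_ic`, the uniform background of 0.25, and the use of bits as the unit are all assumptions made for illustration.

```python
import math


def position_ic(ppm_row, background=0.25):
    """Information content (in bits) of one position, relative to a uniform background."""
    return sum(p * math.log2(p / background) for p in ppm_row if p > 0)


def trim_by_ic(ppm, threshold=0.1):
    """Drop leading and trailing positions whose information content is below `threshold`."""
    ic = [position_ic(row) for row in ppm]
    keep = [i for i, value in enumerate(ic) if value >= threshold]
    if not keep:
        return []  # no position passes the threshold
    return ppm[keep[0] : keep[-1] + 1]


# Toy 4-position pattern (rows are positions, columns are A, C, G, T).
ppm = [
    [0.25, 0.25, 0.25, 0.25],  # uninformative flank
    [0.97, 0.01, 0.01, 0.01],
    [0.01, 0.01, 0.97, 0.01],
    [0.25, 0.25, 0.25, 0.25],  # uninformative flank
]
trimmed = trim_by_ic(ppm, threshold=0.1)
print(len(trimmed))
```

In this toy case the two uniform flanks carry no information and are removed, leaving the two informative core positions.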

Return type:

dict[str, dict[str, str | list[float]]]

Returns:

All processed patterns with metadata.
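A call might look like the sketch below. The class names and file paths are hypothetical placeholders, and `pval_to_sim_threshold` and `run_process_patterns` are illustrative helpers that are not part of CREsted; only the function path and keyword arguments are taken from the signature above.

```python
import math


def pval_to_sim_threshold(pval):
    """Convert a p-value cutoff to the -log10 scale expected by sim_threshold."""
    return -math.log10(pval)


# Hypothetical class names and HDF5 paths, for illustration only.
matched_files = {
    "class_a": "modisco_results/class_a.h5",
    "class_b": [
        "modisco_results/class_b_rep1.h5",
        "modisco_results/class_b_rep2.h5",
    ],
}


def run_process_patterns(matched_files, pval=1e-3):
    """Call process_patterns with the documented keyword arguments."""
    import crested  # imported lazily so the helpers above work without CREsted installed

    return crested.tl.modisco.process_patterns(
        matched_files,
        sim_threshold=pval_to_sim_threshold(pval),
        trim_ic_threshold=0.1,
        discard_ic_threshold=0.1,
        verbose=True,
    )
```

Calling `run_process_patterns(matched_files)` should return the dictionary of processed patterns described above. Its inner keys are not listed in this reference, so inspect the result before relying on specific fields.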