crested.tl.modisco.tfmodisco

crested.tl.modisco.tfmodisco(contrib_dir='modisco_results', class_names=None, output_dir='modisco_results', max_seqlets=5000, window=500, n_leiden=2, report=False, meme_db=None, verbose=True)

Run tf-modisco on one-hot encoded sequences and contribution scores stored in .npz files.

Parameters:
  • contrib_dir (PathLike (default: 'modisco_results')) – Directory containing the .npz files with the contribution scores and the one-hot encoded regions.

  • class_names (Optional[list[str]] (default: None)) – List of class names to process. If None, all class names found in the contribution directory (contrib_dir) will be processed.

  • output_dir (PathLike (default: 'modisco_results')) – Directory where output files will be saved.

  • max_seqlets (int (default: 5000)) – Maximum number of seqlets per metacluster.

  • window (int (default: 500)) – Size of the window around the peak center that is considered for motif discovery.

  • n_leiden (int (default: 2)) – Number of Leiden clusterings to perform with different random seeds.

  • report (bool (default: False)) – Generate a modisco report.

  • meme_db (Optional[str] (default: None)) – Path to a MEME file (.meme) containing motifs. Required if report is True.

  • verbose (bool (default: True)) – Print verbose output.
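The constraints in the parameter list (an existing contrib_dir holding .npz files, and meme_db being required when report is True) can be checked before starting a long run. The following is a minimal sketch only: check_tfmodisco_args is a hypothetical helper, not part of CREsted, and it makes no assumption about how the .npz files are named.

```python
from pathlib import Path


def check_tfmodisco_args(contrib_dir="modisco_results", report=False, meme_db=None):
    """Hypothetical pre-flight check for a tfmodisco run (not part of CREsted).

    Returns the sorted names of the .npz files found in contrib_dir.
    """
    contrib_dir = Path(contrib_dir)
    if not contrib_dir.is_dir():
        raise FileNotFoundError(f"contrib_dir does not exist: {contrib_dir}")

    # Collect the .npz files that the run would read from.
    npz_files = sorted(p.name for p in contrib_dir.glob("*.npz"))
    if not npz_files:
        raise FileNotFoundError(f"no .npz files found in {contrib_dir}")

    # The parameter list states that meme_db is required when report is True.
    if report and meme_db is None:
        raise ValueError("meme_db must be set when report=True")

    # If a MEME file is given, make sure it is actually there.
    if meme_db is not None and not Path(meme_db).is_file():
        raise FileNotFoundError(f"MEME file not found: {meme_db}")

    return npz_files
```

Calling check_tfmodisco_args("modisco_results", report=True, meme_db="/path/to/motifs.meme") before crested.tl.modisco.tfmodisco(...) with the same arguments would surface a missing directory or MEME file early; the MEME path here is a placeholder.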

Examples

>>> evaluator = crested.tl.Crested(...)
>>> evaluator.load_model("/path/to/trained/model.keras")
>>> evaluator.tfmodisco_calculate_and_save_contribution_scores(
...     adata, class_names=["Astro", "Vip"], method="expected_integrated_grad"
... )
>>> crested.tl.modisco.tfmodisco(
...     contrib_dir="modisco_results",
...     class_names=["Astro", "Vip"],
...     output_dir="modisco_results",
...     window=1000,
... )