crested.tl.modisco.tfmodisco

crested.tl.modisco.tfmodisco(contrib_dir='modisco_results', class_names=None, output_dir='modisco_results', max_seqlets=5000, min_metacluster_size=100, min_final_cluster_size=20, window=500, n_leiden=2, report=False, meme_db=None, verbose=True, fdr=0.05, sliding_window_size=20, flank_size=5, top_n_regions=None)

Run tf-modisco on one-hot encoded sequences and contribution scores stored in .npz files.

Parameters:
  • contrib_dir (PathLike (default: 'modisco_results')) – Directory containing the .npz files with the contribution scores and the one-hot encoded regions.

  • class_names (Optional[list[str]] (default: None)) – List of class names to process. If None, all class names found in the output directory will be processed.

  • output_dir (PathLike (default: 'modisco_results')) – Directory where output files will be saved.

  • max_seqlets (int (default: 5000)) – Maximum number of seqlets per metacluster.

  • min_metacluster_size (int (default: 100)) – Minimum number of seqlets per metacluster.

  • min_final_cluster_size (int (default: 20)) – Minimum size of final cluster.

  • window (int (default: 500)) – The window surrounding the peak center that will be considered for motif discovery.

  • n_leiden (int (default: 2)) – Number of Leiden clusterings to perform with different random seeds.

  • report (bool (default: False)) – Generate a modisco report.

  • meme_db (Optional[str] (default: None)) – Path to a MEME file (.meme) containing motifs. Required if report is True.

  • verbose (bool (default: True)) – Print verbose output.

  • fdr (float (default: 0.05)) – False discovery rate of seqlet finding.

  • sliding_window_size (int (default: 20)) – Sliding window size for seqlet finding in tfmodisco-lite.

  • flank_size (int (default: 5)) – Flank size of seqlets.

  • top_n_regions (Optional[int] (default: None)) – Number of top regions, taken from the one-hot encoded sequences and contribution scores, to run modisco on. If None, no limit is applied.
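
To make the window parameter more concrete, here is a minimal sketch of how a window centered on a region could be located. The helper center_window_bounds is hypothetical and not part of crested, and the cropping used internally may differ in detail (for example in how odd lengths are rounded).

```python
def center_window_bounds(seq_len: int, window: int) -> tuple[int, int]:
    """Return (start, end) indices of a `window`-wide slice centered in a sequence.

    Hypothetical helper for illustration only; not part of crested.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if window > seq_len:
        raise ValueError("window cannot exceed the sequence length")
    center = seq_len // 2
    start = center - window // 2
    return start, start + window


# Example: a 2114 bp region cropped to the default window of 500 bp.
start, end = center_window_bounds(2114, 500)
print(start, end)  # prints: 807 1307
```

Only positions inside the returned bounds would be considered for motif discovery under this assumption, so a larger window trades run time for coverage of the region flanks.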

Examples

>>> evaluator = crested.tl.Crested(...)
>>> evaluator.load_model("/path/to/trained/model.keras")
>>> evaluator.tfmodisco_calculate_and_save_contribution_scores(
...     adata, class_names=["Astro", "Vip"], method="expected_integrated_grad"
... )
>>> crested.tl.modisco.tfmodisco(
...     contrib_dir="modisco_results",
...     class_names=["Astro", "Vip"],
...     output_dir="modisco_results",
...     window=1000,
... )
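
Because meme_db is required whenever report is True, it can be useful to check the arguments before starting a long run. The following is a sketch under stated assumptions: build_modisco_kwargs is a hypothetical helper that is not part of crested, the file name motifs.meme is a placeholder, and whether tfmodisco itself raises an error in this case is not stated here.

```python
def build_modisco_kwargs(contrib_dir, output_dir, report=False, meme_db=None, **extra):
    """Collect keyword arguments for tfmodisco and check the report/meme_db rule.

    Hypothetical helper for illustration only; not part of crested.
    """
    if report and meme_db is None:
        raise ValueError("meme_db must be provided when report=True")
    return {
        "contrib_dir": contrib_dir,
        "output_dir": output_dir,
        "report": report,
        "meme_db": meme_db,
        **extra,
    }


kwargs = build_modisco_kwargs(
    "modisco_results",
    "modisco_results",
    report=True,
    meme_db="motifs.meme",  # placeholder path to a MEME motif file
    class_names=["Astro", "Vip"],
    window=1000,
)
# The dictionary can then be unpacked into the call:
# crested.tl.modisco.tfmodisco(**kwargs)
```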