Modisco tl.modisco

TF-MoDISco utility functions. Requires the modisco-lite package to be installed.

tfmodisco([contrib_dir, class_names, ...])

Run TF-MoDISco on one-hot encoded sequences and contribution scores stored in .npz files.

match_h5_files_to_classes(contribution_dir, ...)

Match .h5 files in a given directory with a list of class names and return the mapping as a dictionary.
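
The general idea of pairing result files with class names can be sketched with the standard library alone. This is a minimal illustration, not the implementation behind match_h5_files_to_classes: the function name `match_files_to_classes_sketch`, the `suffix` parameter, and the rule that a file belongs to a class when the class name appears in the file stem are all assumptions made for the example.

```python
from pathlib import Path


def match_files_to_classes_sketch(directory, class_names, suffix=".h5"):
    """Hypothetical helper: map each class name to the first matching file.

    Assumption: a file matches a class when the class name is a substring
    of the file stem. The real naming convention may differ.
    """
    files = sorted(Path(directory).glob(f"*{suffix}"))
    mapping = {}
    for name in class_names:
        hits = [f for f in files if name in f.stem]
        # Keep None for classes without a file so callers can detect gaps.
        mapping[name] = hits[0] if hits else None
    return mapping
```

Substring matching is deliberately naive: a class name that is a prefix of another class name would match both files, so a stricter rule is advisable for real data.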

process_patterns(matched_files[, ...])

Process genomic patterns from matched HDF5 files, trim based on information content, and match to known patterns.
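
Trimming by information content can be illustrated as follows. This is a sketch under stated assumptions, not the code used by process_patterns: it assumes a position probability matrix with columns ordered A, C, G, T, a uniform background, and a hypothetical threshold `min_ic`; the function names and the `pseudocount` value are likewise invented for the example.

```python
import math


def information_content(ppm, pseudocount=1e-3):
    """Per-position information content in bits, assuming a uniform background."""
    ic = []
    for row in ppm:
        total = sum(row) + 4 * pseudocount
        probs = [(p + pseudocount) / total for p in row]
        entropy = -sum(p * math.log2(p) for p in probs)
        ic.append(2.0 - entropy)  # 2 bits is the maximum for 4 nucleotides
    return ic


def trim_by_ic(ppm, min_ic=0.5):
    """Return (start, end) of the span between the first and last informative positions."""
    ic = information_content(ppm)
    keep = [i for i, value in enumerate(ic) if value >= min_ic]
    if not keep:
        return 0, 0  # nothing informative: empty span
    return keep[0], keep[-1] + 1
```

For a pattern with uninformative flanks, `ppm[start:end]` then keeps only the core; positions inside the core that fall below the threshold are retained so the motif stays contiguous.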

create_pattern_matrix(classes, all_patterns)

Create a pattern matrix from classes and patterns, with optional normalization.

generate_nucleotide_sequences(all_patterns)

Generate nucleotide sequences from pattern data.

pattern_similarity(all_patterns, idx1, idx2)

Compute the similarity between two patterns.
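
One common way to compare two motif matrices is cosine similarity, taking the better of the forward and reverse-complement orientations. The sketch below illustrates that idea only; the metric actually used by pattern_similarity is not stated here, and the function names, the A, C, G, T column order, and the equal-length requirement are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equally sized matrices (lists of rows)."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same length")
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    dot = sum(x * y for x, y in zip(flat_a, flat_b))
    norm = math.sqrt(sum(x * x for x in flat_a)) * math.sqrt(sum(y * y for y in flat_b))
    return dot / norm if norm else 0.0


def reverse_complement(matrix):
    """Reverse positions and swap A<->T, C<->G (columns ordered A, C, G, T)."""
    return [row[::-1] for row in matrix[::-1]]


def pattern_similarity_sketch(a, b):
    """Best cosine similarity over the forward and reverse-complement orientation of b."""
    return max(cosine_similarity(a, b), cosine_similarity(a, reverse_complement(b)))
```

The sketch does not align patterns of different lengths or try relative shifts; a fuller comparison would typically pad the matrices and scan over offsets.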

find_pattern(pattern_id, pattern_dict)

Find the index of a pattern by its ID.

find_pattern_matches(all_patterns, html_paths)

Find and filter matches between the modisco-lite patterns and the motif database, using the corresponding HTML paths.

calculate_similarity_matrix(all_patterns)

Calculate the similarity matrix for the given patterns.

calculate_mean_expression_per_cell_type(...)

Read an AnnData object from an H5AD file and calculate the mean gene expression per cell type subclass.

generate_html_paths(all_patterns, classes, ...)

Generate HTML paths for each pattern in the filtered array.

read_motif_to_tf_file(file_path)

Read a TSV file mapping motifs to transcription factors (TFs) into a DataFrame.

create_pattern_tf_dict(pattern_match_dict, ...)

Create a dictionary mapping patterns to their associated transcription factors (TFs) and other metadata.

create_tf_ct_matrix(pattern_tf_dict, ...[, ...])

Create a tensor (matrix) of transcription factor (TF) expression and cell type contributions.