Tools tl

Crested(data[, model, config, project_name, ...])

Main class to handle training, testing, predicting and calculation of contribution scores.

TaskConfig(optimizer, loss, metrics)

Task configuration (optimizer, loss, and metrics) for use in tl.Crested.

default_configs(task[, num_classes])

Get default loss, optimizer, and metrics for an existing task.

predict(input, model[, genome, batch_size])

Make predictions using the model(s) on some input that represents sequences.

score_gene_locus(chr_name, gene_start, ...)

Score regions upstream and downstream of a gene locus using the model's prediction.

contribution_scores(input, target_idx, model)

Calculate contribution scores for the specified inputs using the given method.

contribution_scores_specific(input, ...[, ...])

Calculate contribution scores using the given method, restricted to the most specific regions per class.

extract_layer_embeddings(input, model, ...)

Extract embeddings from a specified layer for all inputs.

enhancer_design_in_silico_evolution(...[, ...])

Create synthetic enhancers for a specified class using in silico evolution (ISE).

enhancer_design_motif_insertion(patterns, ...)

Create synthetic enhancers using motif insertions.
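
The idea behind in silico evolution can be illustrated with a toy greedy loop: at each step, try every single-nucleotide substitution and keep the one that raises a score the most. The sketch below is not the implementation of `enhancer_design_in_silico_evolution`; the names `toy_score`, `evolve`, and `MOTIF` are hypothetical, and the score is a stand-in for a trained model's prediction for the target class.

```python
NUCLEOTIDES = "ACGT"
MOTIF = "GATA"  # hypothetical motif used only by the toy score


def toy_score(seq: str) -> float:
    """Stand-in for a model prediction: best per-window match count to MOTIF."""
    best = 0
    for start in range(len(seq) - len(MOTIF) + 1):
        window = seq[start:start + len(MOTIF)]
        matches = sum(a == b for a, b in zip(window, MOTIF))
        best = max(best, matches)
    return float(best)


def evolve(seq: str, score_fn, n_steps: int) -> str:
    """Greedy in silico evolution: apply the best single substitution per step."""
    for _ in range(n_steps):
        best_seq, best_score = seq, score_fn(seq)
        for pos in range(len(seq)):
            for nuc in NUCLEOTIDES:
                if nuc == seq[pos]:
                    continue
                candidate = seq[:pos] + nuc + seq[pos + 1:]
                candidate_score = score_fn(candidate)
                if candidate_score > best_score:
                    best_seq, best_score = candidate, candidate_score
        if best_seq == seq:
            break  # no substitution improves the score any further
        seq = best_seq
    return seq


start = "A" * 12
evolved = evolve(start, toy_score, n_steps=5)
```

In practice the score would come from a model's predicted activity for the chosen class, and scoring all candidates in one batch would be far cheaper than scoring them one by one as done here.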

Data

data.AnnDataModule(adata[, genome, ...])

DataModule class which defines how dataloaders should be loaded in each stage.

data.AnnDataLoader(dataset, batch_size[, ...])

PyTorch-like DataLoader class for AnnDataset with options for batching, shuffling, and one-hot encoding.

data.AnnDataset(anndata, genome[, split, ...])

Dataset class for combining genome files and AnnData objects.
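
As a rough illustration of the one-hot encoding mentioned for the data loader, the sketch below maps a DNA string to a list of four-channel rows. The channel order (A, C, G, T), the all-zero row for ambiguous bases such as N, and the function name `one_hot_encode` are assumptions made for this example; they are not taken from the library.

```python
CHANNELS = "ACGT"  # assumed channel order


def one_hot_encode(seq: str) -> list[list[int]]:
    """Encode a DNA string as rows of 4 ints; unknown bases become all zeros."""
    encoded = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        index = CHANNELS.find(base)
        if index >= 0:
            row[index] = 1
        encoded.append(row)
    return encoded


example = one_hot_encode("ACGTN")
```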

Model Zoo

zoo.basenji(seq_len, num_classes[, ...])

Construct a Basenji model.

zoo.borzoi(seq_len, num_classes[, ...])

Construct a fully replicated Borzoi model.

zoo.deeptopic_cnn(seq_len, num_classes[, ...])

Construct a DeepTopicCNN model.

zoo.deeptopic_lstm(seq_len, num_classes[, ...])

Construct a DeepTopicLSTM model.

zoo.dilated_cnn(seq_len, num_classes[, ...])

Construct a CNN using dilated convolutions.

zoo.dilated_cnn_decoupled(seq_len, num_classes)

Construct a CNN using dilated convolutions with a separate dense head per output class.

zoo.enformer(seq_len, num_classes[, ...])

Construct a fully replicated Enformer model.

zoo.simple_convnet(seq_len, num_classes[, ...])

Construct a Simple ConvNet with standard convolutional and dense blocks.

Losses

losses.CosineMSELoss([max_weight, name, ...])

Custom loss function that combines cosine similarity and mean squared error (MSE).

losses.CosineMSELogLoss([max_weight, name, ...])

Custom loss function combining logarithmic transformation, cosine similarity, and mean squared error (MSE).
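
To make the combination of cosine similarity and MSE concrete, here is a minimal pure-Python sketch for a single pair of vectors. It is not the formula used by `losses.CosineMSELoss`: the fixed `mse_weight` and the simple sum of the two terms are assumptions, and the library's `max_weight` behaviour is not reproduced.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error between two equal-length vectors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def cosine_similarity(y_true, y_pred):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(t * p for t, p in zip(y_true, y_pred))
    norm_true = math.sqrt(sum(t * t for t in y_true))
    norm_pred = math.sqrt(sum(p * p for p in y_pred))
    if norm_true == 0 or norm_pred == 0:
        return 0.0
    return dot / (norm_true * norm_pred)


def cosine_mse(y_true, y_pred, mse_weight=1.0):
    """Assumed combination: (1 - cosine similarity) + mse_weight * MSE."""
    return (1.0 - cosine_similarity(y_true, y_pred)) + mse_weight * mse(y_true, y_pred)


perfect = cosine_mse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
orthogonal = cosine_mse([1.0, 0.0], [0.0, 1.0])
```

The cosine term rewards predictions whose profile across classes has the right shape, while the MSE term penalizes errors in magnitude.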

Metrics

metrics.ConcordanceCorrelationCoefficient([name])

Concordance correlation coefficient metric.

metrics.PearsonCorrelation([name])

Pearson correlation metric.

metrics.PearsonCorrelationLog([name])

Log Pearson correlation metric.

metrics.ZeroPenaltyMetric([name])

Zero penalty metric.
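
The difference between the Pearson and concordance correlation coefficients is easiest to see on a small example. The sketch below uses the standard textbook definitions in plain Python; it is not the library's metric code, and the function names are chosen for this example only.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant inputs."""
    mean_x, mean_y = _mean(x), _mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def concordance(x, y):
    """Lin's concordance correlation coefficient (population variances)."""
    n = len(x)
    mean_x, mean_y = _mean(x), _mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    return 2 * cov / (var_x + var_y + (mean_x - mean_y) ** 2)


truth = [1.0, 2.0, 3.0, 4.0]
shifted = [2.0, 3.0, 4.0, 5.0]  # same shape, constant offset of 1

r = pearson(truth, shifted)
ccc = concordance(truth, shifted)
```

Here `r` is 1 because the two vectors are perfectly linearly related, while `ccc` is 2.5 / 3.5 (about 0.714) because the concordance coefficient also penalizes the constant offset.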

Modisco

modisco.tfmodisco([contrib_dir, ...])

Run tf-modisco on one-hot encoded sequences and contribution scores stored in .npz files.

modisco.match_h5_files_to_classes(...)

Match .h5 files in a given directory with a list of class names and return a dictionary mapping.

modisco.process_patterns(matched_files[, ...])

Process genomic patterns from matched HDF5 files, trim based on information content, and match to known patterns.

modisco.create_pattern_matrix(classes, ...)

Create a pattern matrix from classes and patterns, with optional normalization.

modisco.generate_nucleotide_sequences(...)

Generate nucleotide sequences from pattern data.

modisco.pattern_similarity(all_patterns, ...)

Compute the similarity between two patterns.

modisco.find_pattern(pattern_id, pattern_dict)

Find the index of a pattern by its ID.

modisco.find_pattern_matches(all_patterns, ...)

Find and filter pattern matches from the modisco-lite list of patterns to the motif database from the corresponding HTML paths.

modisco.calculate_similarity_matrix(all_patterns)

Calculate the similarity matrix for the given patterns.

modisco.calculate_mean_expression_per_cell_type(...)

Read an AnnData object from an H5AD file and calculate the mean gene expression per cell type subclass.

modisco.generate_html_paths(all_patterns, ...)

Generate html paths for each pattern in the filtered array.

modisco.read_motif_to_tf_file(file_path)

Read a TSV file mapping motifs to transcription factors (TFs) into a DataFrame.

modisco.create_pattern_tf_dict(...)

Create a dictionary mapping patterns to their associated transcription factors (TFs) and other metadata.

modisco.create_tf_ct_matrix(pattern_tf_dict, ...)

Create a tensor (matrix) of transcription factor (TF) expression and cell type contributions.
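
The description of `modisco.process_patterns` mentions trimming patterns based on information content. A minimal sketch of that idea for a position probability matrix is shown below. It assumes a uniform background, a hypothetical threshold `min_ic`, and a rule that trims only the low-information flanks; the library's actual trimming rule and defaults may differ.

```python
import math


def information_content(column):
    """Information content in bits of one position (uniform background assumed)."""
    entropy = -sum(p * math.log2(p) for p in column if p > 0)
    return 2.0 - entropy


def trim_by_ic(ppm, min_ic=0.5):
    """Drop leading and trailing positions whose information content is below min_ic."""
    ic = [information_content(column) for column in ppm]
    keep = [i for i, value in enumerate(ic) if value >= min_ic]
    if not keep:
        return []
    return ppm[keep[0]:keep[-1] + 1]


uniform = [0.25, 0.25, 0.25, 0.25]
ppm = [
    uniform,
    [1.0, 0.0, 0.0, 0.0],  # always A
    uniform,               # uninformative, but inside the pattern
    [0.0, 0.0, 1.0, 0.0],  # always G
    uniform,
]
trimmed = trim_by_ic(ppm)
```

Only the flanks are removed in this sketch, so the uninformative middle position is kept and the trimmed pattern spans three positions.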