crested.tl.modisco.create_tf_ct_matrix
- crested.tl.modisco.create_tf_ct_matrix(pattern_tf_dict, all_patterns, df, classes, log_transform=True, normalize_pattern_importances=False, normalize_gex=False, min_tf_gex=0, importance_threshold=0, pattern_parameter='seqlet_count', filter_correlation=False, zscore_threshold=2, correlation_threshold=0.2, verbose=False)
Create a tensor (matrix) of transcription factor (TF) expression and cell type contributions.
- Parameters:
pattern_tf_dict (dict) – A dictionary with pattern indices and their TFs. See crested.tl.modisco.create_pattern_tf_dict.
all_patterns (dict) – The patterns with their metadata. See crested.tl.modisco.process_patterns.
df (DataFrame) – A DataFrame containing gene expression data. See crested.tl.modisco.calculate_mean_expression_per_cell_type.
log_transform (bool, default: True) – Whether to apply a log transformation to the gene expression values. Default is True.
normalize_pattern_importances (bool, default: False) – Whether to normalize the contribution scores across the cell types. Default is False.
normalize_gex (bool, default: False) – Whether to normalize gene expression across the cell types. Default is False.
min_tf_gex (float, default: 0) – The minimum gene expression (GEX) value a TF must reach to be selected as a candidate. Default is 0.
importance_threshold (float, default: 0) – The minimum pattern importance value. Default is 0.
pattern_parameter (str, default: 'seqlet_count') – Parameter used to indicate the pattern's importance: the average contribution score ('contrib'), the number of pattern instances ('seqlet_count', default), or its log ('seqlet_count_log').
filter_correlation (bool, default: False) – Whether to filter based on the Pearson correlation between tf_gex and ct_contribs. Default is False.
zscore_threshold (float, default: 2) – Z-score threshold used for filtering TF candidates. If the maximum z-score over the cell types is below this threshold, the TF is discarded. Default is 2.
correlation_threshold (float, default: 0.2) – Minimum Pearson correlation between the expression and contribution profiles required to keep a column if filtering is enabled. Default is 0.2.
verbose (bool, default: False) – Whether to print intermediate debugging steps.
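
The two filtering parameters, zscore_threshold and correlation_threshold, can be illustrated with a small standalone sketch. This is not the library's implementation: the helper names (zscores, pearson, keep_candidate), the use of the population standard deviation, and the way the two checks are combined are all assumptions made for illustration. It only shows the kind of test the parameter descriptions suggest: a TF is kept when its expression peaks clearly in at least one cell type and its expression profile correlates with the pattern's contribution profile.

```python
import math


def zscores(values):
    """Z-scores of one TF's expression across cell types (population std; an assumption)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        # A flat profile has no cell type that stands out.
        return [0.0] * len(values)
    return [(v - mean) / std for v in values]


def pearson(x, y):
    """Pearson correlation between two equally long profiles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        # Correlation is undefined for a constant profile; treat it as 0 here.
        return 0.0
    return cov / (norm_x * norm_y)


def keep_candidate(tf_gex, ct_contribs, zscore_threshold=2, correlation_threshold=0.2):
    """Hypothetical helper: apply both thresholds to one TF/pattern pair."""
    passes_zscore = max(zscores(tf_gex)) >= zscore_threshold
    passes_correlation = pearson(tf_gex, ct_contribs) >= correlation_threshold
    return passes_zscore and passes_correlation


# A TF expressed almost exclusively in the last of six cell types,
# paired with a pattern that contributes only in that cell type.
specific_tf = [0.1, 0.1, 0.1, 0.1, 0.1, 5.0]
pattern_contribs = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
print(keep_candidate(specific_tf, pattern_contribs))  # True

# A TF with nearly uniform expression never reaches a z-score of 2.
uniform_tf = [1.0, 1.2, 0.9, 1.1, 1.0, 1.05]
print(keep_candidate(uniform_tf, pattern_contribs))  # False
```

With six cell types and a single outlier, the maximum population z-score is sqrt(5), roughly 2.24, which is why the first candidate clears the default threshold of 2. Whether the actual function applies the z-score check only when filter_correlation is enabled is not stated in the parameter descriptions above.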