crested.tl.score_gene_locus
- crested.tl.score_gene_locus(chr_name, gene_start, gene_end, target_idx, model, genome=None, strand='+', upstream=50000, downstream=10000, central_size=1000, step_size=50, **kwargs)
Score regions upstream and downstream of a gene locus using the model’s prediction.
The model predicts a value for the central region (central_size base pairs) of each window.
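To make the windowing idea concrete, here is a minimal sketch of how windows and their central regions could be laid out along a region. It is not the library's implementation: the helper name `tile_windows` and the `window_size` argument are hypothetical (the documented function does not take a window size), and 2114 is only an illustrative input length.

```python
import numpy as np


def tile_windows(region_start, region_end, window_size, central_size=1000, step_size=50):
    """Lay out sliding windows and the central region of each one.

    Hypothetical helper for illustration only. Returns an array with one row
    per window: (window_start, window_end, central_start, central_end).
    """
    # Window starts are spaced step_size apart; the last window must still fit.
    starts = np.arange(region_start, region_end - window_size + 1, step_size)
    # Assumption: the central region is centred within the window.
    offset = (window_size - central_size) // 2
    central_starts = starts + offset
    return np.stack(
        [starts, starts + window_size, central_starts, central_starts + central_size],
        axis=1,
    )


# Illustrative values only: a 3 kb region tiled with 2114 bp windows.
windows = tile_windows(0, 3000, window_size=2114)
```

With a small `step_size` relative to `central_size`, neighbouring central regions overlap heavily, so each base pair is covered by several windows.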
- Parameters:
chr_name – The chromosome name.
gene_start (int) – The start position of the gene locus (TSS for + strand).
gene_end (int) – The end position of the gene locus (TSS for - strand).
target_idx (int) – Index of the target class to score. You can usually get this from running list(anndata.obs_names).index(class_name).
model (Model | list[Model]) – A (list of) trained keras model(s) to make predictions with.
genome (Union[Genome, PathLike, None] (default: None)) – Genome or path to the genome file. Required if no genome is registered.
strand (str (default: '+')) – ‘+’ for positive strand, ‘-’ for negative strand. Default ‘+’.
upstream (int (default: 50000)) – Distance upstream of the gene to score.
downstream (int (default: 10000)) – Distance downstream of the gene to score.
central_size (int (default: 1000)) – Size of the central region that the model predicts for.
step_size (int (default: 50)) – Distance between consecutive windows.
**kwargs – Additional keyword arguments to pass to the keras.Model.predict method.
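The interaction between strand, upstream, and downstream can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the helper `scored_range` is hypothetical, it assumes upstream is measured from the TSS side of the gene and downstream from the opposite end, and it assumes the range is clamped at position 0.

```python
def scored_range(gene_start, gene_end, strand="+", upstream=50000, downstream=10000):
    """Return (min_loc, max_loc, tss_position) for a gene locus.

    Hypothetical helper for illustration only.
    """
    if strand == "+":
        # On the + strand, upstream lies at lower coordinates than the gene.
        min_loc = gene_start - upstream
        max_loc = gene_end + downstream
        tss_position = gene_start
    elif strand == "-":
        # On the - strand, upstream lies at higher coordinates than the gene.
        min_loc = gene_start - downstream
        max_loc = gene_end + upstream
        tss_position = gene_end
    else:
        raise ValueError("strand must be '+' or '-'")
    # Assumption: coordinates cannot go below the start of the chromosome.
    return max(0, min_loc), max_loc, tss_position


plus_range = scored_range(100000, 120000, strand="+")
minus_range = scored_range(100000, 120000, strand="-")
```

Under these assumptions, the same gene coordinates yield a different scored range and TSS depending on the strand, which is why gene_start and gene_end are described relative to the strand above.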
- Return type:
- Returns:
- scores
An array of prediction scores across the entire genomic range.
- coordinates
An array of tuples, each containing the chromosome name and the start and end positions of the sequence for each window.
- min_loc
Start position of the entire scored region.
- max_loc
End position of the entire scored region.
- tss_position
The transcription start site (TSS) position.
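As a sketch of how the returned values might be combined, the snippet below locates the highest-scoring position and expresses it relative to the TSS. It uses synthetic scores rather than real model output, the helper `peak_relative_to_tss` is hypothetical, and it assumes that scores holds one value per base pair with index 0 corresponding to min_loc.

```python
import numpy as np


def peak_relative_to_tss(scores, min_loc, tss_position, strand="+"):
    """Return (peak_position, signed_offset) for the highest score.

    Hypothetical helper for illustration only. A negative offset means the
    peak lies upstream of the TSS, a positive one downstream.
    """
    # Assumption: scores[i] corresponds to genomic position min_loc + i.
    peak_position = min_loc + int(np.argmax(scores))
    offset = peak_position - tss_position
    if strand == "-":
        # On the - strand, higher coordinates are upstream, so flip the sign.
        offset = -offset
    return peak_position, offset


# Synthetic stand-ins for the function's outputs (not real predictions).
min_loc, max_loc, tss_position = 50000, 130000, 100000
scores = np.zeros(max_loc - min_loc)
scores[45000] = 1.0

peak_position, offset = peak_relative_to_tss(scores, min_loc, tss_position)
```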
See also