crested.utils.fetch_sequences

crested.utils.fetch_sequences(regions, genome=None, uppercase=True)

Fetch sequences from a genome file for a list of regions using pysam.

Regions should be formatted as “chr:start-end”.

Parameters:
  • regions (str | list[str]) – A single region or a list of regions to fetch sequences for.

  • genome (Union[PathLike, Genome, None] (default: None)) – Path to the genome FASTA file, a Genome instance, or None. If None, a registered genome object is looked up and used.

  • uppercase (bool (default: True)) – If True, return sequences in uppercase.

Return type:

list[str]

Returns:

List of sequence strings for each region.

Examples

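>>> import crested
>>> genome_path = "path/to/genome.fa"  # placeholder; point this at your own FASTA
>>> regions = ["chr1:1000000-1000100", "chr1:1000100-1000200"]
>>> region_seqs = crested.utils.fetch_sequences(regions, genome_path)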
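
The documented contract (region strings in, one sequence string per region out, optional uppercasing) can be illustrated without a genome file. The following is a minimal pure-Python sketch, not the library's implementation: `TOY_GENOME`, `parse_region`, and `toy_fetch_sequences` are hypothetical names, the genome is an in-memory dict rather than a FASTA read through pysam, and both the 0-based, end-exclusive slicing and the wrapping of a single string into a one-item list are assumptions made for illustration.

```python
import re

# Hypothetical stand-in for a genome: chromosome name -> sequence.
# Lowercase letters mimic soft-masked bases in a real FASTA.
TOY_GENOME = {"chr1": "ACGTacgtNNGGCCTTAA"}

# Matches the documented "chr:start-end" region format.
REGION_RE = re.compile(r"^(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_region(region):
    """Split a "chr:start-end" string into (chrom, start, end)."""
    match = REGION_RE.match(region)
    if match is None:
        raise ValueError(f"Region not in 'chr:start-end' format: {region!r}")
    return match["chrom"], int(match["start"]), int(match["end"])


def toy_fetch_sequences(regions, genome=TOY_GENOME, uppercase=True):
    """Illustrative only: mirrors the documented inputs and outputs."""
    if isinstance(regions, str):
        # Assumption: a single region string is treated as a one-item list.
        regions = [regions]
    seqs = []
    for region in regions:
        chrom, start, end = parse_region(region)
        # Assumption: 0-based, end-exclusive coordinates.
        seq = genome[chrom][start:end]
        seqs.append(seq.upper() if uppercase else seq)
    return seqs
```

With `uppercase=False` the soft-masked (lowercase) bases of the toy genome are returned unchanged, which is the distinction the `uppercase` parameter controls.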