crested.Genome#
- class crested.Genome(fasta, chrom_sizes=None, annotation=None, name=None)#
A class that encapsulates information about a genome, including its FASTA sequence, its annotation, and chromosome sizes.
Adapted from kaizhang/SnapATAC2.
- Parameters:
fasta (Path) – The path to the FASTA file.
chrom_sizes (Union[dict[str, int], Path, None] (default: None)) – A path to a tab delimited chromsizes file or a dictionary containing chromosome names and sizes. If not provided, the chromosome sizes will be inferred from the FASTA file.
annotation (Optional[Path] (default: None)) – The path to the annotation file.
name (Optional[str] (default: None)) – Optional name of the genome.
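The two ways of supplying chromosome sizes can be illustrated with a minimal pure-Python sketch. It is not the package's implementation: the helper names `read_chrom_sizes` and `infer_chrom_sizes` are hypothetical, and the sketch assumes a two-column chromsizes file (name, length) and an uncompressed plain-text FASTA file.

```python
import tempfile
from pathlib import Path


def read_chrom_sizes(path):
    """Parse a tab-delimited chromsizes file into {name: length} (hypothetical helper)."""
    sizes = {}
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            name, length = line.split("\t")[:2]
            sizes[name] = int(length)
    return sizes


def infer_chrom_sizes(fasta_path):
    """Count bases per record in a plain-text FASTA file (hypothetical helper)."""
    sizes = {}
    current = None
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # Assumption: the record name is the first token of the header.
                current = line[1:].split()[0]
                sizes[current] = 0
            elif current is not None:
                sizes[current] += len(line)
    return sizes


# Toy files written to a temporary directory, for illustration only.
with tempfile.TemporaryDirectory() as tmp:
    fasta = Path(tmp) / "toy.fa"
    fasta.write_text(">chr1 toy record\nACGT\nAC\n>chr2\nGGG\n")
    sizes_file = Path(tmp) / "toy.chrom.sizes"
    sizes_file.write_text("chr1\t6\nchr2\t3\n")

    inferred = infer_chrom_sizes(fasta)
    parsed = read_chrom_sizes(sizes_file)

print(inferred)
print(parsed)
```

Both routes yield the same kind of mapping from chromosome name to length, which is the shape described for the chrom_sizes attribute.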
Examples
>>> from crested import Genome
>>> genome = Genome(
...     fasta="tests/data/test.fa",
...     chrom_sizes="tests/data/test.chrom.sizes",
... )
>>> print(genome.fasta)
<pysam.libcfaidx.FastaFile at 0x7f4d8b4a8f40>
>>> print(genome.chrom_sizes)
{'chr1': 1000, 'chr2': 2000}
>>> print(genome.name)
test
Attributes table#
annotation | The Path to the annotation file.
chrom_sizes | A dictionary with chromosome names as keys and their lengths as values.
fasta | The pysam FastaFile object for the FASTA file.
name | The name of the genome.
Methods table#
fetch | Fetch a sequence from a genomic region.
Attributes#
- Genome.annotation#
The Path to the annotation file.
Currently not used in the package.
- Returns:
The path to the annotation file.
- Genome.chrom_sizes#
A dictionary with chromosome names as keys and their lengths as values.
- Returns:
A dictionary of chromosome sizes.
- Genome.fasta#
The pysam FastaFile object for the FASTA file.
- Returns:
The pysam FastaFile object.
- Genome.name#
The name of the genome.
- Returns:
The name of the genome.
Methods#
- Genome.fetch(chrom=None, start=None, end=None, strand='+', region=None)#
Fetch a sequence from a genomic region.
Start and end denote 0-based, half-open intervals, following the bed convention.
- Parameters:
chrom (default: None) – The chromosome of the region to extract.
start (default: None) – The start of the region to extract. Assumes 0-indexed positions.
end (default: None) – The end of the region to extract, exclusive.
strand (default: '+') – The strand of the region. If ‘-’, the sequence is reverse-complemented. Default is “+”.
region (default: None) – Alternatively, a region string to parse. If supplied together with chrom/start/end, explicit coordinates take priority.
- Return type:
str
- Returns:
The requested sequence, as a string.
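The coordinate conventions described for fetch can be sketched in plain Python on an in-memory dictionary of sequences. This is an illustration under stated assumptions, not the method's actual code: `fetch_sketch`, `parse_region`, and `reverse_complement` are hypothetical names, the region string format `chrom:start-end` is assumed, and the per-field fallback from region to explicit coordinates is one possible reading of "explicit coordinates take priority".

```python
import re

# Complement table for the reverse-strand case (upper- and lower-case bases).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement each base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_region(region):
    """Parse a 'chrom:start-end' string (assumed format) into its parts."""
    match = re.fullmatch(r"([^:]+):(\d+)-(\d+)", region)
    if match is None:
        raise ValueError(f"Cannot parse region string: {region!r}")
    chrom, start, end = match.groups()
    return chrom, int(start), int(end)


def fetch_sketch(sequences, chrom=None, start=None, end=None, strand="+", region=None):
    """Return sequences[chrom][start:end], reverse-complemented on the '-' strand."""
    if region is not None:
        r_chrom, r_start, r_end = parse_region(region)
        # Assumption: explicit arguments win field by field over the region string.
        chrom = r_chrom if chrom is None else chrom
        start = r_start if start is None else start
        end = r_end if end is None else end
    if chrom is None or start is None or end is None:
        raise ValueError("Provide chrom, start and end, or a region string.")
    # 0-based, half-open: Python slicing already excludes the end position.
    seq = sequences[chrom][start:end]
    if strand == "-":
        seq = reverse_complement(seq)
    return seq


toy = {"chr1": "AACCGGTT"}  # toy data, for illustration only

forward = fetch_sketch(toy, "chr1", 0, 3)
reverse = fetch_sketch(toy, "chr1", 0, 3, strand="-")
from_region = fetch_sketch(toy, region="chr1:4-8")
overridden = fetch_sketch(toy, start=6, region="chr1:4-8")

print(forward, reverse, from_region, overridden)
```

Because the interval is half-open, the returned sequence always has length end - start, and adjacent intervals such as (0, 3) and (3, 6) neither overlap nor leave a gap.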