Command Line Interface

mutyper: ancestral kmer mutation types for variant data

usage: mutyper [-h] {ancestor,variants,targets,spectra,ksfs} ...

Sub-commands

ancestor

create an ancestral FASTA file using an outgroup alignment and a database of SNPs, optionally restricted by a callability mask

mutyper ancestor [-h] [--verbose] [--bed BED]
                 vcf reference outgroup chain output

Positional Arguments

vcf

VCF/BCF file, usually for a single chromosome (“-” for stdin)

reference

path to reference FASTA for one chromosome

outgroup

path to outgroup genome FASTA

chain

path to alignment chain file (reference to outgroup)

output

path for output ancestral FASTA for this chromosome

Named Arguments

--verbose

increase logging verbosity

--bed

path to BED file mask (“-” for stdin)
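As a rough illustration of the argument order above, the following Python sketch assembles an `ancestor` command line from the documented positional and named arguments. The helper name `ancestor_command` and all file names are hypothetical; the sketch only builds and prints the command and does not run mutyper.

```python
import shlex


def ancestor_command(vcf, reference, outgroup, chain, output, bed=None, verbose=False):
    """Build an argv list for `mutyper ancestor` (hypothetical helper)."""
    argv = ["mutyper", "ancestor"]
    if verbose:
        argv.append("--verbose")
    if bed is not None:
        argv += ["--bed", bed]
    # Positional arguments follow in the documented order.
    argv += [vcf, reference, outgroup, chain, output]
    return argv


# All file names below are placeholders chosen for illustration.
cmd = ancestor_command(
    "snps.chr1.vcf.gz",
    "ref.chr1.fa",
    "outgroup.fa",
    "ref_to_outgroup.chain",
    "ancestor.chr1.fa",
    bed="mask.chr1.bed",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` once the input files exist.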

variants

adds mutation_type to VCF/BCF INFO, polarizes REF/ALT/AC according to ancestral/derived states, and streams to stdout

mutyper variants [-h] [--verbose] [--k K] [--target TARGET] [--sep SEP]
                 [--chrom_pos CHROM_POS] [--strand_file STRAND_FILE]
                 [--strict]
                 fasta vcf

Positional Arguments

fasta

path to ancestral FASTA

vcf

VCF/BCF file, usually for a single chromosome (“-” for stdin)

Named Arguments

--verbose

increase logging verbosity

--k

k-mer context size (default 3)

--target

0-based mutation target position in k-mer (default middle)

--sep

field delimiter in FASTA headers (default “:”)

--chrom_pos

0-based chromosome field position in FASTA headers (default 0)

--strand_file

path to BED file with regions where the reverse strand defines the mutation context, e.g. direction of replication or transcription (default: collapse reverse complements)

--strict

only uppercase nucleotides in FASTA considered ancestrally identified
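To make the interplay of --k, --target, --strict and reverse-complement collapsing more concrete, here is a minimal Python sketch of how a k-mer mutation type could be derived for one SNP. It is not mutyper's implementation. The function name `mutation_type`, the `ANC>DER` label format, the reading of "default middle" as `k // 2`, the acceptance of lowercase bases when not strict, and the convention of collapsing so that the ancestral target base is A or C are all assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Reverse complement of an uppercase DNA string."""
    return kmer.translate(COMPLEMENT)[::-1]


def mutation_type(ancestral_seq, pos, derived, k=3, target=None, strict=False):
    """Return an 'ANC>DER' k-mer label for a SNP at 0-based pos, or None.

    Hypothetical sketch: None means the context is incomplete or contains
    a base that is not considered ancestrally identified.
    """
    if target is None:
        target = k // 2  # assumed meaning of "default middle" for odd k
    start = pos - target
    end = start + k
    if start < 0 or end > len(ancestral_seq):
        return None  # k-mer would run off the end of the sequence
    context = ancestral_seq[start:end]
    if not strict:
        # Assumption: without --strict, lowercase bases are also accepted.
        context = context.upper()
    if any(base not in "ACGT" for base in context):
        return None
    anc = context
    der = context[:target] + derived.upper() + context[target + 1:]
    if anc[target] not in "AC":
        # Assumed collapsing convention. Note that reverse complementing
        # moves the target to index k - 1 - target unless it is the middle.
        anc, der = reverse_complement(anc), reverse_complement(der)
    return f"{anc}>{der}"


seq = "GACGTTacA"
print(mutation_type(seq, 2, "T"))               # C site, kept as is
print(mutation_type(seq, 3, "A"))               # G site, collapsed
print(mutation_type(seq, 7, "T"))               # lowercase context accepted
print(mutation_type(seq, 7, "T", strict=True))  # lowercase context rejected
```

With a strand file, collapsing would presumably be skipped inside the listed regions and the strand given there would decide the context instead; that case is left out of the sketch.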

targets

compute k-mer target sizes and stream to stdout

mutyper targets [-h] [--verbose] [--k K] [--target TARGET] [--sep SEP]
                [--chrom_pos CHROM_POS] [--strand_file STRAND_FILE] [--strict]
                [--bed BED]
                fasta

Positional Arguments

fasta

path to ancestral FASTA

Named Arguments

--verbose

increase logging verbosity

--k

k-mer context size (default 3)

--target

0-based mutation target position in k-mer (default middle)

--sep

field delimiter in FASTA headers (default “:”)

--chrom_pos

0-based chromosome field position in FASTA headers (default 0)

--strand_file

path to BED file with regions where the reverse strand defines the mutation context, e.g. direction of replication or transcription (default: collapse reverse complements)

--strict

only uppercase nucleotides in FASTA considered ancestrally identified

--bed

path to BED file mask (“-” for stdin)
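The idea of a k-mer target size, and the role of --sep and --chrom_pos in picking the chromosome name out of a FASTA header, can be sketched in a few lines of Python. This is an illustrative approximation only: `read_fasta` and `kmer_targets` are hypothetical helpers, the header in the example is made up, and the sketch counts only uppercase k-mers without applying a BED mask or collapsing reverse complements.

```python
import io
from collections import Counter


def read_fasta(handle, sep=":", chrom_pos=0):
    """Yield (chromosome, sequence) pairs from a FASTA file handle.

    The chromosome name is taken as field `chrom_pos` (0-based) of the
    header after splitting on `sep`.
    """
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name = line[1:].split(sep)[chrom_pos]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def kmer_targets(seq, k=3):
    """Count every k-mer in seq that consists only of uppercase A, C, G, T."""
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if all(base in "ACGT" for base in kmer):
            counts[kmer] += 1
    return counts


# Made-up two-line record; the chromosome name sits in header field 1.
example = io.StringIO(">ancestor:chr1:example\nACGAC\nGN\n")
targets = {
    chrom: kmer_targets(seq)
    for chrom, seq in read_fasta(example, sep=":", chrom_pos=1)
}
for chrom, counts in targets.items():
    for kmer, n in sorted(counts.items()):
        print(chrom, kmer, n)
```

Counts of this kind are what make mutation counts comparable across k-mer contexts, since contexts that occur more often in the genome offer more opportunities to mutate.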

spectra

compute mutation spectra for each sample in VCF/BCF with mutation_type data and stream to stdout

mutyper spectra [-h] [--verbose] [--population] [--randomize] vcf

Positional Arguments

vcf

VCF/BCF file, usually for a single chromosome (“-” for stdin)

Named Arguments

--verbose

increase logging verbosity

--population

population-wise spectrum, instead of individual

--randomize

randomly assign each mutation to a single haplotype
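The three output modes can be pictured with a small Python sketch that works on already-parsed data rather than on a VCF. The exact counting rules are assumptions: here the individual spectrum counts derived allele copies per sample, the population spectrum counts each variant once, and the randomized mode assigns each variant to one randomly chosen derived haplotype. All function names and the input layout are hypothetical.

```python
import random
from collections import Counter


def individual_spectra(variants, samples):
    """Per-sample counts of derived allele copies for each mutation type."""
    spectra = {s: Counter() for s in samples}
    for mutation_type, derived_counts in variants:
        for sample, n in zip(samples, derived_counts):
            if n:
                spectra[sample][mutation_type] += n
    return spectra


def population_spectrum(variants):
    """Number of variants of each mutation type, ignoring genotypes."""
    return Counter(mutation_type for mutation_type, _ in variants)


def randomized_spectra(variants, samples, rng):
    """Assign each variant to one randomly chosen derived haplotype."""
    spectra = {s: Counter() for s in samples}
    for mutation_type, derived_counts in variants:
        # One list entry per derived allele copy, so samples carrying two
        # copies are twice as likely to be picked.
        carriers = [s for s, n in zip(samples, derived_counts) for _ in range(n)]
        if carriers:
            spectra[rng.choice(carriers)][mutation_type] += 1
    return spectra


samples = ["s1", "s2", "s3"]
# Each variant: (mutation type, derived allele count per sample in 0..2).
variants = [
    ("ACG>ATG", [1, 0, 2]),
    ("TCA>TTA", [0, 1, 0]),
    ("ACG>ATG", [0, 0, 1]),
]
per_sample = individual_spectra(variants, samples)
per_population = population_spectrum(variants)
per_random = randomized_spectra(variants, samples, random.Random(0))
print(per_sample)
print(per_population)
```

In the randomized mode every variant contributes exactly one count in total, so the per-sample spectra sum to the population spectrum.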

ksfs

compute the sample frequency spectrum for each mutation type from a VCF/BCF file with mutation_type data (i.e. output from the variants subcommand) and stream to stdout

mutyper ksfs [-h] [--verbose] vcf

Positional Arguments

vcf

VCF/BCF file, usually for a single chromosome (“-” for stdin)

Named Arguments

--verbose

increase logging verbosity
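A sample frequency spectrum per mutation type can be sketched as a table with one row per mutation type and one column per derived allele count. The Python below is a simplified illustration on pre-parsed input, not mutyper's code: the helper name `ksfs_table`, the input layout, and the choice to drop sites whose derived allele is absent or fixed in the sample are assumptions.

```python
from collections import defaultdict


def ksfs_table(variants, n_haplotypes):
    """Map each mutation type to its sample frequency spectrum.

    Entry i of each spectrum is the number of variants of that type whose
    derived allele count equals i + 1, for counts 1 .. n_haplotypes - 1.
    """
    table = defaultdict(lambda: [0] * (n_haplotypes - 1))
    for mutation_type, derived_count in variants:
        # Assumption: keep only sites that segregate in the sample.
        if 0 < derived_count < n_haplotypes:
            table[mutation_type][derived_count - 1] += 1
    return dict(table)


# Each variant: (mutation type, derived allele count among 4 haplotypes).
variants = [
    ("ACG>ATG", 1),
    ("ACG>ATG", 1),
    ("ACG>ATG", 3),
    ("TCA>TTA", 2),
    ("TCA>TTA", 4),  # fixed in the sample, dropped
]
table = ksfs_table(variants, n_haplotypes=4)
for mutation_type, spectrum in sorted(table.items()):
    print(mutation_type, spectrum)
```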