Germline Variant Filtration
Implementation of the variant_filtration step.
This step takes the annotated variants from variant_annotation as input and performs the following filtration and postprocessing operations:
- filter to high-confidence variants
- apply quality filter sets
- filter for consistency between different callers
- filter to compatible mode of inheritance
- filter by population/cohort frequency, remove polymorphisms
- filter by region
- filter by scores (e.g., conservation)
- filter for compound heterozygous inheritance or keep all
#
# 1
#   stringent
#   loose
# 2
#   $qual.denovo
#   $qual.dom
#   $qual.rec_hom
# 3
#   $qual.denovo.denov_freq
#   $qual.dom.dom_freq
#   $qual.dom.rec_freq
#   $qual.rec_hom.rec_freq
# 4
#   $qual.denovo.denov_freq.$region
#   $qual.dom.dom_freq.$region
#   $qual.dom.rec_freq.$region
#   $qual.rec_hom.rec_freq.$region
# 5
#   $qual.denovo.denov_freq.$region.$scores
#   $qual.dom.dom_freq.$region.$scores
#   $qual.dom.rec_freq.$region.$scores
#   $qual.rec_hom.rec_freq.$region.$scores
# 6
#   $qual.denovo.denov_freq.$region.keep_all
#   $qual.dom.dom_freq.$region.keep_all
#   $qual.dom.rec_freq.$region.$scores.same_gene
#   $qual.dom.rec_freq.$region.$scores.same_tad
#   $qual.dom.rec_freq.$region.$scores.itv_500bp
#   $qual.rec_hom.rec_freq.$region.keep_all
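The listing above suggests that the name at each stage is the name from the previous stage with one more dot-separated component appended. A minimal Python sketch of that naming scheme follows. The helper `stage_names` is hypothetical and is not taken from the pipeline code, and the assumption that every stage is identified by such a cumulative prefix is inferred from the listing only.

```python
from itertools import accumulate


def stage_names(combination):
    """Return the cumulative dotted prefix for each filter stage.

    Hypothetical helper: assumes each stage is named by joining the
    components chosen so far with dots.
    """
    parts = combination.split(".")
    # accumulate() yields "a", then "a.b", then "a.b.c", and so on.
    return list(accumulate(parts, lambda prefix, part: f"{prefix}.{part}"))


names = stage_names("conservative.dominant.recessive_freq.whole_genome.coding.gene")
for level, name in enumerate(names, start=1):
    print(level, name)
```

Each printed line corresponds to one stage, from the quality filter set alone down to the full six-part combination.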
Filtration Steps
The combinations of filters are given in the configuration setting filter_combinations as dot-separated values, e.g., AA.BB.CC.
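To make the dot-separated format concrete, here is a small Python sketch that splits one combination into named parts. The six field names follow the comment in the default configuration ({thresholds}.{inherit}.{freq}.{region}.{score}.{het_comp}). The class `FilterCombination` and the function `parse_combination` are hypothetical names introduced for illustration; the pipeline itself may parse these values differently.

```python
from typing import NamedTuple


class FilterCombination(NamedTuple):
    """One entry of filter_combinations, split into its six parts."""

    thresholds: str
    inherit: str
    freq: str
    region: str
    score: str
    het_comp: str


def parse_combination(value):
    """Split a dot-separated combination and check that it has six parts."""
    parts = value.split(".")
    expected = len(FilterCombination._fields)
    if len(parts) != expected:
        raise ValueError(f"expected {expected} dot-separated parts, got {len(parts)}: {value!r}")
    return FilterCombination(*parts)


combo = parse_combination(
    "conservative.de_novo.dominant_freq.whole_genome.coding.passthrough"
)
print(combo.inherit, combo.region, combo.het_comp)
```

Rejecting values with the wrong number of parts early is a design choice of this sketch; it keeps a typo in the configuration from silently selecting the wrong filter.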
Step Input
TODO
Step Output
TODO
Global Configuration
TODO
Default Configuration
The default configuration is as follows.
step_config:
  variant_filtration:
    #path_variant_annotation: ../variant_annotation
    #
    # defaults to ngs_mapping tool
    #tools_ngs_mapping: []
    #
    # defaults to variant_annotation tool
    #tools_variant_calling: []
    #
    # quality filter sets, "keep_all" implicitly defined
    #thresholds:
    #  conservative:
    #    min_gq: 40
    #    min_dp_het: 10
    #    min_dp_hom: 5
    #    include_expressions:
    #    - "'MEDGEN_COHORT_INCONSISTENT_AC=0'"
    #  relaxed:
    #    min_gq: 20
    #    min_dp_het: 6
    #    min_dp_hom: 3
    #    include_expressions:
    #    - "'MEDGEN_COHORT_INCONSISTENT_AC=0'"
    #frequencies:
    #
    #  # AF (allele frequency) values
    #  af_dominant: 0.001
    #
    #  # AF (allele frequency) values
    #  af_recessive: 0.01
    #
    #  # AC (allele count in gnomAD) values
    #  ac_dominant: 3
    #
    # regions to filter to, "whole_genome" implicitly defined
    #region_beds: {}
    # Examples:
    #   all_tads: /fast/projects/medgen_genomes/static_data/GRCh37/hESC_hg19_allTads.bed
    #   all_genes: /fast/projects/medgen_genomes/static_data/GRCh37/gene_bed/ENSEMBL_v75.bed.gz
    #   limb_tads: /fast/projects/medgen_genomes/static_data/GRCh37/newlimb_tads.bed
    #   lifted_enhancers: /fast/projects/medgen_genomes/static_data/GRCh37/all_but_onlyMB.bed
    #   vista_enhancers: /fast/projects/medgen_genomes/static_data/GRCh37/vista_limb_enhancers.bed
    #score_thresholds:
    #  coding:
    #    require_coding: true
    #    require_gerpp_gt2: false
    #    min_cadd:
    #  conservative:
    #    require_coding: false
    #    require_gerpp_gt2: false
    #    min_cadd: 0
    #  conserved:
    #    require_coding: false
    #    require_gerpp_gt2: true
    #    min_cadd:
    #
    # dot-separated {thresholds}.{inherit}.{freq}.{region}.{score}.{het_comp}
    #filter_combinations: []
    # Examples:
    #   conservative.de_novo.dominant_freq.lifted_enhancers.all_scores.passthrough
    #   conservative.de_novo.dominant_freq.lifted_enhancers.conserved.passthrough
    #   conservative.de_novo.dominant_freq.limb_tads.all_scores.passthrough
    #   conservative.de_novo.dominant_freq.limb_tads.coding.passthrough
    #   conservative.de_novo.dominant_freq.limb_tads.conserved.passthrough
    #   conservative.de_novo.dominant_freq.vista_enhancers.all_scores.passthrough
    #   conservative.de_novo.dominant_freq.vista_enhancers.conserved.passthrough
    #   conservative.de_novo.dominant_freq.whole_genome.all_scores.passthrough
    #   conservative.de_novo.dominant_freq.whole_genome.coding.passthrough
    #   conservative.de_novo.dominant_freq.whole_genome.conserved.passthrough
    #   conservative.dominant.dominant_freq.lifted_enhancers.all_scores.passthrough
    #   conservative.dominant.dominant_freq.lifted_enhancers.conserved.passthrough
    #   conservative.dominant.dominant_freq.limb_tads.all_scores.passthrough
    #   conservative.dominant.dominant_freq.limb_tads.coding.passthrough
    #   conservative.dominant.dominant_freq.limb_tads.conserved.passthrough
    #   conservative.dominant.dominant_freq.vista_enhancers.all_scores.passthrough
    #   conservative.dominant.dominant_freq.vista_enhancers.conserved.passthrough
    #   conservative.dominant.dominant_freq.whole_genome.all_scores.passthrough
    #   conservative.dominant.dominant_freq.whole_genome.coding.passthrough
    #   conservative.dominant.dominant_freq.whole_genome.conserved.passthrough
    #   conservative.dominant.recessive_freq.lifted_enhancers.all_scores.intervals500
    #   conservative.dominant.recessive_freq.lifted_enhancers.conserved.intervals500
    #   conservative.dominant.recessive_freq.lifted_enhancers.conserved.tads
    #   conservative.dominant.recessive_freq.limb_tads.all_scores.intervals500
    #   conservative.dominant.recessive_freq.limb_tads.coding.gene
    #   conservative.dominant.recessive_freq.limb_tads.conserved.intervals500
    #   conservative.dominant.recessive_freq.limb_tads.conserved.tads
    #   conservative.dominant.recessive_freq.vista_enhancers.all_scores.intervals500
    #   conservative.dominant.recessive_freq.vista_enhancers.conserved.intervals500
    #   conservative.dominant.recessive_freq.vista_enhancers.conserved.tads
    #   conservative.dominant.recessive_freq.whole_genome.all_scores.intervals500
    #   conservative.dominant.recessive_freq.whole_genome.coding.gene
    #   conservative.dominant.recessive_freq.whole_genome.conserved.intervals500
    #   conservative.dominant.recessive_freq.whole_genome.conserved.tads
    #   conservative.recessive_hom.recessive_freq.lifted_enhancers.all_scores.passthrough
    #   conservative.recessive_hom.recessive_freq.lifted_enhancers.conserved.passthrough
    #   conservative.recessive_hom.recessive_freq.limb_tads.all_scores.passthrough
    #   conservative.recessive_hom.recessive_freq.limb_tads.coding.passthrough
    #   conservative.recessive_hom.recessive_freq.limb_tads.conserved.passthrough
    #   conservative.recessive_hom.recessive_freq.vista_enhancers.all_scores.passthrough
    #   conservative.recessive_hom.recessive_freq.vista_enhancers.conserved.passthrough
    #   conservative.recessive_hom.recessive_freq.whole_genome.all_scores.passthrough
    #   conservative.recessive_hom.recessive_freq.whole_genome.coding.passthrough
    #   conservative.recessive_hom.recessive_freq.whole_genome.conserved.passthrough
    #   conservative.dominant.recessive_freq.whole_genome.coding.passthrough
    #   conservative.dominant.recessive_freq.whole_genome.conserved.passthrough
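As an illustration of how the quality filter sets in the configuration above might be read, the following Python sketch checks a single genotype call against one set. The numeric values are copied from the default configuration. Everything else is an assumption: the function `passes_quality` is hypothetical, the reading that min_dp_het applies to heterozygous calls and min_dp_hom to homozygous calls is inferred from the option names, and include_expressions is left out entirely.

```python
# Values copied from the default configuration; "keep_all" is implicit.
THRESHOLDS = {
    "conservative": {"min_gq": 40, "min_dp_het": 10, "min_dp_hom": 5},
    "relaxed": {"min_gq": 20, "min_dp_het": 6, "min_dp_hom": 3},
}


def passes_quality(set_name, is_het, gq, dp):
    """Return True if a genotype call passes the named quality filter set.

    Assumption: min_dp_het is the depth cutoff for heterozygous calls and
    min_dp_hom the cutoff for homozygous calls.
    """
    if set_name == "keep_all":
        return True  # implicitly defined set that filters nothing
    limits = THRESHOLDS[set_name]
    min_dp = limits["min_dp_het"] if is_het else limits["min_dp_hom"]
    return gq >= limits["min_gq"] and dp >= min_dp


# A heterozygous call with GQ 45 and depth 8 under each set.
for name in ("conservative", "relaxed", "keep_all"):
    print(name, passes_quality(name, is_het=True, gq=45, dp=8))
```

Under these assumptions the same call fails the conservative set on depth but passes the relaxed set, which is the practical difference between the two.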
Reports
Currently, no reports are generated.