Germline Variant De Novo Filtration
Implementation of the variant_denovo_filtration step.
This step filters variants down to de novo variants. It was introduced for the “Ionizing Radiation” study in ca. 2016, and its aim is to obtain a set of high-confidence de novo sequence variants (both SNVs and indels, although the latter turned out to be less reliable). Further, if the variants are phased, assignment to the paternal or maternal allele can be attempted, which makes it possible to study paternal age effects.
Note that, in contrast to variant_calling and variant_annotation but consistent with variant_phasing, the central individuals here are the children and not the index individuals of the pedigrees.
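Because the step is centred on children, a pedigree has to be scanned for individuals whose father and mother are both part of it. The following is a minimal sketch of that selection, not the pipeline's actual code; the `Donor` class and the `children_with_both_parents` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Donor:
    """Hypothetical pedigree member; parents are referenced by name."""

    name: str
    father: Optional[str] = None
    mother: Optional[str] = None


def children_with_both_parents(pedigree: List[Donor]) -> List[Donor]:
    """Return the donors whose father and mother are both in the pedigree."""
    names = {donor.name for donor in pedigree}
    # A missing parent is None, which is never a member name, so it is excluded.
    return [d for d in pedigree if d.father in names and d.mother in names]


pedigree = [
    Donor("father"),
    Donor("mother"),
    Donor("child1", father="father", mother="mother"),
    # The mother of child2 is named but not part of the pedigree.
    Donor("child2", father="father", mother="absent_mother"),
]
eligible = children_with_both_parents(pedigree)
```

In this toy pedigree only `child1` would be eligible, since `child2` has just one parent present.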
Step Input
The step reads in the variant call files from one of the following steps:
variant_calling
variant_annotation
variant_phasing
Of course, assignment to the parental allele can only be performed on phased variants. Further, filtering is only really useful on annotated variants, as one wants to exclude variants in problematic genomic regions.
Step Output
For all children with both parents present, variant de novo annotation will be attempted on the primary DNA NGS library of that child. The name of this library will be used as the identification token in the output file and file name. For each read mapper, variant caller, and pedigree, the following files will be generated:
{mapper}.{var_caller}.{annotation}.{phasing}.de_novos.{lib_name}.vcf.gz
{mapper}.{var_caller}.{annotation}.{phasing}.de_novos.{lib_name}.vcf.gz.tbi
{mapper}.{var_caller}.{annotation}.{phasing}.de_novos.{lib_name}.vcf.gz.md5
{mapper}.{var_caller}.{annotation}.{phasing}.de_novos.{lib_name}.vcf.gz.tbi.md5
{mapper}.{var_caller}.{annotation}.{phasing}.de_novos_hard.{lib_name}.vcf.gz
{mapper}.{var_caller}.{annotation}.{phasing}.de_novos_hard.{lib_name}.vcf.gz.tbi
{mapper}.{var_caller}.{annotation}.{phasing}.de_novos_hard.{lib_name}.vcf.gz.md5
{mapper}.{var_caller}.{annotation}.{phasing}.de_novos_hard.{lib_name}.vcf.gz.tbi.md5
{mapper}.{var_caller}.{annotation}.{phasing}.de_novos_hard.{lib_name}.summary.txt
{mapper}.{var_caller}.{annotation}.{phasing}.de_novos_hard.{lib_name}.summary.txt.md5
The annotation and phasing parts will only be present when the input is read from the variant_annotation or variant_phasing steps, respectively.
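The naming scheme above, including the optional annotation and phasing parts, can be sketched as follows. This is an illustrative helper under stated assumptions rather than the pipeline's implementation: the function name `de_novo_output_names` and the value `my_annotation` are hypothetical, and the optional parts are simply left out when they are not given.

```python
from typing import List, Optional


def de_novo_output_names(
    mapper: str,
    var_caller: str,
    lib_name: str,
    annotation: Optional[str] = None,
    phasing: Optional[str] = None,
) -> List[str]:
    """Build the ten output file names for one child library."""
    tokens = [mapper, var_caller]
    # The annotation and phasing parts only appear when the input has them.
    if annotation:
        tokens.append(annotation)
    if phasing:
        tokens.append(phasing)
    prefix = ".".join(tokens)

    names: List[str] = []
    for kind in ("de_novos", "de_novos_hard"):
        base = f"{prefix}.{kind}.{lib_name}"
        for ext in (".vcf.gz", ".vcf.gz.tbi"):
            names.append(base + ext)
            names.append(base + ext + ".md5")
    # Only the hard-filtered set comes with a summary file.
    summary = f"{prefix}.de_novos_hard.{lib_name}.summary.txt"
    names.extend([summary, summary + ".md5"])
    return names


plain = de_novo_output_names("bwa", "gatk3_hc", "P001-N1-DNA1-WES1")
annotated = de_novo_output_names(
    "bwa", "gatk3_hc", "P001-N1-DNA1-WES1", annotation="my_annotation"
)
```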
For example, it might look as follows for the example from above:
output/
+-- bwa.gatk3_hc.de_novos.P001-N1-DNA1-WES1
|   `-- out
|       |-- bwa.gatk3_hc.de_novos.P001-N1-DNA1-WES1.vcf.gz
|       |-- bwa.gatk3_hc.de_novos.P001-N1-DNA1-WES1.vcf.gz.md5
|       |-- bwa.gatk3_hc.de_novos.P001-N1-DNA1-WES1.vcf.gz.tbi
|       |-- bwa.gatk3_hc.de_novos.P001-N1-DNA1-WES1.vcf.gz.tbi.md5
|       |-- bwa.gatk3_hc.de_novos_hard.P001-N1-DNA1-WES1.vcf.gz
|       |-- bwa.gatk3_hc.de_novos_hard.P001-N1-DNA1-WES1.vcf.gz.md5
|       |-- bwa.gatk3_hc.de_novos_hard.P001-N1-DNA1-WES1.vcf.gz.tbi
|       |-- bwa.gatk3_hc.de_novos_hard.P001-N1-DNA1-WES1.vcf.gz.tbi.md5
|       |-- bwa.gatk3_hc.de_novos_hard.P001-N1-DNA1-WES1.summary.txt
|       `-- bwa.gatk3_hc.de_novos_hard.P001-N1-DNA1-WES1.summary.txt.md5
[...]
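Each output file is accompanied by an .md5 file, which can be used to check that a file was not truncated or altered after it was written. The sketch below shows one way to do such a check. It assumes that the .md5 file follows the common `md5sum` layout, with the hex digest as the first whitespace-separated token; that layout and the helper names `md5_of` and `matches_sidecar` are assumptions, not something this documentation specifies.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sidecar(path: Path) -> bool:
    """Compare a file against the digest stored in '<file name>.md5'."""
    sidecar = path.with_name(path.name + ".md5")
    # Assumption: the digest is the first token, as written by `md5sum`.
    expected = sidecar.read_text().split()[0]
    return md5_of(path) == expected.lower()
```

Calling `matches_sidecar(Path("some.vcf.gz"))` would then return `True` only when the stored and the recomputed digest agree.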
Global Configuration
No global configuration is in use.
Default Configuration
The default configuration is as follows.
step_config:
  variant_denovo_filtration:
    #path_variant_phasing: ''
    #path_variant_annotation: ''
    #path_variant_calling: ''
    #path_ngs_mapping: ../ngs_mapping
    #
    # defaults to ngs_mapping tool
    #tools_ngs_mapping: []
    #
    # defaults to variant_annotation tool
    #tools_variant_calling: []
    #
    # optional INFO keys with reliable regions
    #info_key_reliable_regions: []
    #
    # optional INFO keys with unreliable regions
    #info_key_unreliable_regions: []
    #params_besenbacher:
    #  min_gq: 50
    #  min_dp: 10
    #  max_dp: 120
    #  min_ab: 0.2
    #  max_ab: 0.9
    #  max_ad2: 1
    #bad_region_expressions: []  # Examples: ["'UCSC_CRG_MAPABILITY36 == 1'", "'UCSC_SIMPLE_REPEAT == 1'"]
    #
    # whether or not to collect MSDN (requires GATK HC+UG)
    #collect_msdn: true
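To give an idea of how the params_besenbacher thresholds might act on a single candidate site, here is a small sketch. The meaning attached to each parameter is an assumption inferred from its name and is not confirmed by this documentation: min_gq as a minimum genotype quality for all trio members, min_dp and max_dp as depth bounds for all trio members, min_ab and max_ab as bounds on the child's alternative allele fraction, and max_ad2 as the maximum number of alternative reads tolerated in each parent. The `Call` class and the `passes_filter` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BesenbacherParams:
    """Defaults copied from the configuration above."""

    min_gq: int = 50
    min_dp: int = 10
    max_dp: int = 120
    min_ab: float = 0.2
    max_ab: float = 0.9
    max_ad2: int = 1


@dataclass(frozen=True)
class Call:
    """Hypothetical per-sample values at one variant site."""

    gq: int  # genotype quality
    dp: int  # total read depth
    alt_reads: int  # reads supporting the alternative allele


def passes_filter(
    child: Call,
    father: Call,
    mother: Call,
    params: BesenbacherParams = BesenbacherParams(),
) -> bool:
    """Apply the assumed threshold semantics to one trio at one site."""
    for call in (child, father, mother):
        if call.gq < params.min_gq:
            return False
        if not params.min_dp <= call.dp <= params.max_dp:
            return False
    if child.dp == 0:
        return False  # avoid division by zero if min_dp is set to 0
    allele_balance = child.alt_reads / child.dp
    if not params.min_ab <= allele_balance <= params.max_ab:
        return False
    # Assumption: parents may show at most max_ad2 alternative reads each.
    return father.alt_reads <= params.max_ad2 and mother.alt_reads <= params.max_ad2


candidate_ok = passes_filter(Call(99, 40, 18), Call(99, 35, 0), Call(99, 38, 1))
```

Under these assumptions, the candidate above would pass, while the same site with two or more alternative reads in a parent, or with a low genotype quality in the child, would be rejected.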
Reports
Currently, no reports are generated.