Germline Variant Phasing
Implementation of the germline variant_phasing step.
This step takes the result of the variant_annotation step and phases the variants using the GATK tools. Note that the GATK tools implementing this step have some known issues:
The PhaseByTransmission tool changes the genotype of some variants, which is problematic when trying to phase de novo variants.
Read-backed phasing is also not fully reliable at the moment.
Thus, the functionality of the tools is made available by this pipeline step, but it is not as fully integrated as it could be because it is unclear how useful it is for clinical studies. Also, so far only the GATK variant caller results can be phased.
Also note that this step generates one output file for each child in a pedigree where both parents have been sequenced.
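The rule for which pedigree members receive an output file can be sketched as follows. This is a minimal illustration and not the pipeline's own code: the `Member` type and the `phasable_children` helper are hypothetical names, the library names are made up, and "sequenced" is modeled simply as having an NGS library assigned. The sketch also assumes that the child itself must be sequenced.

```python
from typing import NamedTuple, Optional


class Member(NamedTuple):
    """Hypothetical pedigree member; not a type from the pipeline itself."""

    name: str
    father: Optional[str]  # name of the father, None for founders
    mother: Optional[str]  # name of the mother, None for founders
    ngs_library: Optional[str]  # None if the member was not sequenced


def phasable_children(pedigree):
    """Return sequenced members whose father and mother were both sequenced."""
    sequenced = {m.name for m in pedigree if m.ngs_library is not None}
    return [
        m
        for m in pedigree
        if m.ngs_library is not None
        and m.father in sequenced
        and m.mother in sequenced
    ]


pedigree = [
    Member("father", None, None, "father-N1-DNA1-WGS1"),
    Member("mother", None, None, "mother-N1-DNA1-WGS1"),
    Member("child1", "father", "mother", "child1-N1-DNA1-WGS1"),
    Member("child2", "father", "mother", None),  # child without sequencing data
]
result = [m.name for m in phasable_children(pedigree)]
print(result)
```

In this toy pedigree only `child1` qualifies, so a single output file would be expected.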
Step Input
The variant phasing step uses the output of the following CUBI pipeline steps:
ngs_mapping
variant_annotation
Step Output
For each input VCF file (i.e., for each mapper and pedigree), a directory output/{mapper}.{caller}.{phaser}.{index_ngs_library}/out will be created with the following output files.
The {phaser} placeholder can take the values gatk_phase_by_transmission, gatk_read_backed_phasing, and gatk_phased_both (for the latter, phasing by transmission is performed first, followed by read-backed phasing).
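To make the directory pattern concrete, the following sketch expands it for all three phaser values. The mapper name `bwa`, the caller name `gatk_hc`, and the library name are illustrative assumptions only, and `output_dirs` is a hypothetical helper rather than a function of the pipeline.

```python
from itertools import product

# Directory pattern as given in the text above.
PATTERN = "output/{mapper}.{caller}.{phaser}.{index_ngs_library}/out"

# Phaser values as listed in the text above.
PHASERS = (
    "gatk_phase_by_transmission",
    "gatk_read_backed_phasing",
    "gatk_phased_both",
)


def output_dirs(mappers, callers, phasers, index_libraries):
    """Expand the pattern for every combination of the given values."""
    return [
        PATTERN.format(
            mapper=mapper,
            caller=caller,
            phaser=phaser,
            index_ngs_library=library,
        )
        for mapper, caller, phaser, library in product(
            mappers, callers, phasers, index_libraries
        )
    ]


# Assumed example values; replace with the tools and libraries of a real project.
dirs = output_dirs(["bwa"], ["gatk_hc"], PHASERS, ["child1-N1-DNA1-WGS1"])
for path in dirs:
    print(path)
```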
Global Configuration
static_data_config/reference/path must be set appropriately.
Default Configuration
The default configuration is as follows.
step_config:
  variant_phasing:
    #path_ngs_mapping: ../ngs_mapping
    #path_variant_annotation: ../variant_annotation  # Examples: ../variant_annotation
    #
    # expected tools for ngs mapping
    #tools_ngs_mapping: []
    #
    # expected tools for variant calling
    #tools_variant_calling: []
    #phasings:
    #  - gatk_phasing_both
    #
    # patterns of chromosome names to ignore
    #ignore_chroms:
    #  - NC_007605
    #  - hs37d5
    #  - chrEBV
    #  - '*_decoy'
    #  - HLA-*
    #gatk_read_backed_phasing:
    #
    #  # quality threshold for phasing
    #  phase_quality_threshold: 20.0
    #
    #  # split input into windows of this size, each triggers a job
    #  window_length: 5000000
    #
    #  # number of windows to process in parallel
    #  num_jobs: 1000
    #
    #  # use Snakemake profile for parallel processing
    #  use_profile: true
    #
    #  # number of times to re-launch jobs in case of failure
    #  restart_times: 0
    #
    #  # throttling of job creation
    #  max_jobs_per_second: 10
    #
    #  # throttling of status checks
    #  max_status_checks_per_second: 10
    #
    #  # truncation to first N tokens (0 for none)
    #  debug_trunc_tokens: 0
    #
    #  # keep temporary directory, {always, never, onerror}
    #  keep_tmpdir: never  # Options: 'always', 'never', 'onerror'
    #
    #  # memory multiplier
    #  job_mult_memory: 1.0
    #
    #  # running time multiplier
    #  job_mult_time: 1.0
    #
    #  # memory multiplier for merging
    #  merge_mult_memory: 1.0
    #
    #  # running time multiplier for merging
    #  merge_mult_time: 1.0
    #gatk_phase_by_transmission:
    #
    #  # use 1e-6 when interested in phasing de novos
    #  de_novo_prior: 1e-08
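The interplay of ignore_chroms and window_length can be illustrated with a short sketch. It assumes that the ignore patterns are shell-style globs and that windows are 0-based and half-open; the pipeline's actual matching and coordinate conventions may differ. The helper names `is_ignored` and `windows` and the contig lengths are hypothetical.

```python
from fnmatch import fnmatchcase

# Default values copied from the configuration above.
IGNORE_CHROMS = ["NC_007605", "hs37d5", "chrEBV", "*_decoy", "HLA-*"]
WINDOW_LENGTH = 5_000_000


def is_ignored(contig, patterns=IGNORE_CHROMS):
    """Return True if the contig name matches any glob pattern (assumed semantics)."""
    return any(fnmatchcase(contig, pattern) for pattern in patterns)


def windows(contigs, window_length=WINDOW_LENGTH, patterns=IGNORE_CHROMS):
    """Yield (contig, start, end) windows, 0-based and half-open, skipping ignored contigs."""
    for name, length in contigs:
        if is_ignored(name, patterns):
            continue
        for start in range(0, length, window_length):
            yield name, start, min(start + window_length, length)


# Made-up contig lengths for illustration only.
contigs = [
    ("1", 12_000_000),
    ("chrEBV", 171_823),
    ("chrUn_KN707606v1_decoy", 2_200),
    ("HLA-A*01:01:01:01", 3_503),
]
result = list(windows(contigs))
for window in result:
    print(window)
```

With these inputs, only contig `1` is kept and it is split into three windows, the last one shorter than window_length. According to the configuration comments, each such window triggers one job.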
Reports
Currently, no reports are generated.