Germline Build Target Sequence gCNV Model
Implementation of the helper_gcnv_model_targeted step
The helper_gcnv_model_targeted step takes the results of the ngs_mapping step (aligned germline reads) as input and builds a model that GATK4 gCNV can use for a particular library kit.
Step Input
The step uses Snakemake sub-workflows to access the results of the ngs_mapping step (BAM files of aligned reads).
Step Output
All donors are used to generate the two parts of the required gCNV model: ploidy-model and cnv_calls-model. Both are required to run gCNV in CASE mode.
For example, the relevant directories might look as follows:
work/
+-- bwa.gcnv_contig_ploidy.<library_kit_name>
    `-- out
        `-- bwa.gcnv_contig_ploidy.<library_kit_name>
            |-- SAMPLE_0
            |   |-- contig_ploidy.tsv
            |   |-- global_read_depth.tsv
            |   |-- mu_psi_s_log__.tsv
            |   |-- sample_name.txt
            |   `-- std_psi_s_log__.tsv
            |-- [...]
            `-- bwa.gcnv_contig_ploidy.<library_kit_name>
                `-- ploidy-model
                    |-- contig_ploidy_prior.tsv
                    |-- gcnvkernel_version.json
                    |-- interval_list.tsv
                    |-- mu_mean_bias_j_lowerbound__.tsv
                    |-- mu_psi_j_log__.tsv
                    |-- ploidy_config.json
                    |-- std_mean_bias_j_lowerbound__.tsv
                    `-- std_psi_j_log__.tsv
+-- bwa.gcnv_call_cnvs.<library_kit_name>.***_of_***
    `-- out
        `-- bwa.gcnv_call_cnvs.<library_kit_name>.***_of_***
            |-- cnv_calls-calls
            |   |-- SAMPLE_0
            |       `-- [...]
            |   |-- [...]
            |-- cnv_calls-model
            |   |-- denoising_config.json
            |   |-- gcnvkernel_version.json
            |   |-- interval_list.tsv
            |   |-- log_q_tau_tk.tsv
            |   |-- mu_W_tu.tsv
            |   |-- mu_ard_u_log__.tsv
            |   |-- mu_log_mean_bias_t.tsv
            |   |-- mu_psi_t_log__.tsv
            |   |-- std_W_tu.tsv
            |   |-- std_ard_u_log__.tsv
            |   |-- std_log_mean_bias_t.tsv
            |   `-- std_psi_t_log__.tsv
            `-- cnv_calls-tracking
                `-- [...]
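To check that both model parts are present before running gCNV in CASE mode, the directories can be located programmatically. The following is a minimal sketch, not part of the pipeline: `find_model_dirs` is a hypothetical helper, and its glob patterns are assumptions derived only from the example layout above.

```python
from pathlib import Path


def find_model_dirs(work_dir, mapper, library_kit):
    """Return (ploidy_model_dirs, cnv_calls_model_dirs) as sorted lists.

    Hypothetical helper: the patterns mirror the example tree and are
    assumptions, not guaranteed pipeline paths.
    """
    work = Path(work_dir)
    ploidy_prefix = f"{mapper}.gcnv_contig_ploidy.{library_kit}"
    calls_prefix = f"{mapper}.gcnv_call_cnvs.{library_kit}"
    # Search recursively, because the example tree nests ploidy-model
    # one level deeper than the per-sample directories.
    ploidy = sorted((work / ploidy_prefix / "out").rglob("ploidy-model"))
    # The example shows one cnv_calls-model directory per "*_of_*" shard.
    calls = sorted(work.glob(f"{calls_prefix}.*_of_*/out/*/cnv_calls-model"))
    return ploidy, calls
```

A call such as `find_model_dirs("work", "bwa", "<library_kit_name>")` returns two lists of paths. If either list is empty, the corresponding model part has not been built yet. Library kit names containing glob metacharacters such as `[` would need escaping first.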
Global Configuration
At the moment, no global configuration is used.
Default Configuration
The default configuration is as follows.
step_config:
  helper_gcnv_model_targeted:
    #path_ngs_mapping: ../ngs_mapping
    gcnv: # REQUIRED
      # Path to interval block list with PAR region for contig calling.
      # path_par_intervals: ''
      #
      # path to BED file with uniquely mappable regions.
      path_uniquely_mapable_bed: # REQUIRED
      # The following allows to define one or more set of target intervals. This is only used by gcnv.
      # Example:
      # - name: "Agilent SureSelect Human All Exon V6"
      #   pattern: "Agilent SureSelect Human All Exon V6.*"
      #   path: "path/to/targets.bed"
      path_target_interval_list_mapping: # REQUIRED
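Each entry of path_target_interval_list_mapping pairs a `pattern` with the `path` of a target BED file. The sketch below shows one way such a lookup could work. It assumes that `pattern` is a regular expression that must match the whole library kit name; the configuration comments do not state the matching rule, so this is an assumption. `resolve_target_bed` is a hypothetical helper, not a pipeline function, and the kit name in the usage line is made up for illustration.

```python
import re


def resolve_target_bed(library_kit, mapping):
    """Return (name, path) of the first entry matching the kit, else None.

    Hypothetical helper; assumes `pattern` is a regular expression that
    must match the entire library kit name.
    """
    for entry in mapping:
        if re.fullmatch(entry["pattern"], library_kit):
            return entry["name"], entry["path"]
    return None


# Same entry as the commented example in the default configuration.
mapping = [
    {
        "name": "Agilent SureSelect Human All Exon V6",
        "pattern": "Agilent SureSelect Human All Exon V6.*",
        "path": "path/to/targets.bed",
    }
]

# Illustrative kit name with a suffix; the trailing ".*" lets it match.
print(resolve_target_bed("Agilent SureSelect Human All Exon V6r2", mapping))
```

Under this assumption, the trailing `.*` in the pattern lets one entry cover kit name variants that share a common prefix, while unrelated kit names yield `None`.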