Somatic Gene Fusion Calling

Implementation of the somatic_gene_fusion_calling step

The somatic_gene_fusion_calling step detects gene fusions from RNA-seq data in cancer samples. The wrapped tools start from the raw RNA-seq reads and generate filtered lists of predicted gene fusions.

Step Input

Gene fusion calling starts from the raw RNA-seq reads, so the input is very similar to that of the ngs_mapping step.

See the Step Input section of the ngs_mapping step for more information.

Step Output

Note

TODO

Default Configuration

The default configuration is as follows.

step_config:
  somatic_gene_fusion_calling:
    path_link_in: ""  # OPTIONAL Override data set configuration search paths for FASTQ files
    tools: ['fusioncatcher', 'jaffa', 'arriba', 'defuse', 'hera', 'pizzly', 'star_fusion']  # REQUIRED, available: 'fusioncatcher', 'jaffa', 'arriba', 'defuse', 'hera', 'pizzly', 'star_fusion'.
    fusioncatcher:
      data_dir: REQUIRED   # REQUIRED
      configuration: null  # optional
      num_threads: 16
    pizzly:
      kallisto_index: REQUIRED    # REQUIRED
      transcripts_fasta: REQUIRED # REQUIRED
      annotations_gtf: REQUIRED       # REQUIRED
      kmer_size: 31
    hera:
      path_index: REQUIRED   # REQUIRED
      path_genome: REQUIRED  # REQUIRED
    star_fusion:
      path_ctat_resource_lib: REQUIRED
    defuse:
      path_dataset_directory: REQUIRED
    arriba:
      path_index: REQUIRED       # REQUIRED  Path to the STAR index (preferably STAR 2.7.10 or later)
      features: REQUIRED         # REQUIRED  Gene features (e.g. ENCODE or ENSEMBL) in GTF format
      blacklist: ""              # optional (provided in the arriba distribution, see /fast/work/groups/cubi/projects/biotools/static_data/app_support/arriba/v2.3.0)
      known_fusions: ""          # optional
      tags: ""                   # optional (can be set to the same path as known_fusions)
      structural_variants: ""    # optional
      protein_domains: ""        # optional
      num_threads: 8
      trim_adapters: false
      num_threads_trimming: 2
      star_parameters:
      - " --outFilterMultimapNmax 50"
      - " --peOverlapNbasesMin 10"
      - " --alignSplicedMateMapLminOverLmate 0.5"
      - " --alignSJstitchMismatchNmax 5 -1 5 5"
      - " --chimSegmentMin 10"
      - " --chimOutType WithinBAM HardClip"
      - " --chimJunctionOverhangMin 10"
      - " --chimScoreDropMax 30"
      - " --chimScoreJunctionNonGTAG 0"
      - " --chimScoreSeparation 1"
      - " --chimSegmentReadGapMax 3"
      - " --chimMultimapNmax 50"
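
In practice, a project configuration usually overrides only part of these defaults. The fragment below is a minimal sketch of such an override that selects a single caller, arriba, and fills in its two REQUIRED entries. The file paths are hypothetical placeholders rather than values from this documentation, and the sketch assumes that settings left out fall back to the defaults shown above.

```yaml
step_config:
  somatic_gene_fusion_calling:
    tools: ['arriba']  # one of the available callers listed in the default configuration
    arriba:
      path_index: /path/to/star_index     # hypothetical path, replace with your STAR index
      features: /path/to/annotation.gtf   # hypothetical path, gene features in GTF format
      num_threads: 8                      # same value as the default
```

The optional arriba entries (blacklist, known_fusions, tags, structural_variants, protein_domains) can be added in the same way when the corresponding files are available.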

Available Gene Fusion Callers

  • fusioncatcher