ngs checkmate workflow

Introduction

Based on the tool from https://github.com/parklab/NGSCheckMate: "NGSCheckMate uses depth-dependent correlation models of allele fractions of known single-nucleotide polymorphisms (SNPs) to identify samples from the same individual." This repository contains preprocessing tools and workflows, as well as a workflow for batch processing.

Workflows

Reference files can be obtained from https://cavatica.sbgenomics.com/u/kfdrc-harmonization/kf-references

bcf_call.cwl

Creates input vcfs for ngs checkmate. Especially useful to run when inputs are large WGS bam files.

inputs

inputs:
  input_align: File[]
  chr_list: File
  reference_fasta: File
  snp_bed: File

Suggested inputs:

chr_list: chr_list.txt
snp_bed: SNP_hg38_liftover_wChr.bed
reference_fasta: Homo_sapiens_assembly38.fasta
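As a rough illustration, the inputs above could be supplied through a CWL job file along these lines. This is a sketch, not a file shipped with the repository: every path is a placeholder, the sample names are hypothetical, and it assumes the index files (.bai/.crai for the alignments, .fai for the fasta) sit next to the files they belong to.

```yaml
# Hypothetical job file for bcf_call.cwl; all paths below are placeholders.
input_align:
  - class: File
    path: /path/to/sample_A.bam    # assumed index: /path/to/sample_A.bai
  - class: File
    path: /path/to/sample_B.cram   # assumed index: /path/to/sample_B.cram.crai
chr_list:
  class: File
  path: /path/to/chr_list.txt
reference_fasta:
  class: File
  path: /path/to/Homo_sapiens_assembly38.fasta   # assumed index: .fasta.fai
snp_bed:
  class: File
  path: /path/to/SNP_hg38_liftover_wChr.bed
```

With a CWL runner such as cwltool (an assumption, any compliant runner should work), the job file is passed as the second argument after the workflow file.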

outputs

outputs:
  bcf_called_vcf: {type: 'File[]', outputSource: bcf_filter/bcf_call}

ngs_checkmate_wf.cwl

Runs ngscheckmate in vcf mode - requires output from bcf_call step.

inputs

inputs:
  input_vcf:
    type:
        type: array
        items:
            type: array
            items: File
  
  snp_bed: File
  output_basename: string[]
  ram: 
    type: ['null', int]
    default: 4000

ram is an optional input giving memory in megabytes (default 4000). Set it if you plan on batching roughly 20 or more vcfs.
input_vcf is an array of arrays: each inner array is a group of vcfs that you'd like to see evaluated together.
output_basename is an array of output file prefixes. It should line up, in order, with the first level of array elements in input_vcf, one prefix per group.
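The nesting of input_vcf and its pairing with output_basename can be sketched as a job file like the one below. Group names, sample names, paths, and the ram value are all hypothetical, and the bed file is assumed to be the same one suggested for bcf_call.cwl.

```yaml
# Hypothetical job file for ngs_checkmate_wf.cwl; paths and prefixes are placeholders.
input_vcf:
  # First group: evaluated together, paired with the first output_basename entry
  - - class: File
      path: /path/to/sample_A.bcf.called.vcf
    - class: File
      path: /path/to/sample_B.bcf.called.vcf
  # Second group: paired with the second output_basename entry
  - - class: File
      path: /path/to/sample_C.bcf.called.vcf
    - class: File
      path: /path/to/sample_D.bcf.called.vcf
snp_bed:
  class: File
  path: /path/to/SNP_hg38_liftover_wChr.bed
output_basename:
  - family_1
  - family_2
ram: 8000   # optional, in megabytes; omit to use the default of 4000
```

Going by the output globs in ngs_checkmate_vcf.cwl, a job like this would be expected to produce family_1_all.txt and family_1_corr_matrix.txt for the first group, and the family_2 equivalents for the second.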

outputs

outputs:
  match_results: {type: 'File[]', outputSource: ngs_checkmate/match_results}
  correlation_matrix: {type: 'File[]', outputSource: ngs_checkmate/correlation_matrix}

Tools:

bcf_filter.cwl

Used by bcf_call.cwl. Accepts either cram or bam as input.

inputs

inputs:
  input_align:
    type: File
    secondaryFiles: |
        ${
          if (inputs.input_align.nameext == '.cram') {
            return inputs.input_align.basename + '.crai';
          }
          else {
            return inputs.input_align.nameroot + '.bai';
          }
        }
  chr_list: File
  reference_fasta:
    type: File
    secondaryFiles: ['.fai']
  snp_bed: File

outputs

outputs:
  bcf_call:
    type: File
    outputBinding:
      glob: "$(inputs.input_align.nameroot).bcf.called.vcf"

ngs_checkmate_vcf.cwl

Takes an array of vcfs generated from bams using bcftools, and outputs match results and a correlation matrix.

inputs

inputs:
  input_vcf:
    type: File[]
  snp_bed: File
  output_basename: string
  ram:
    type: ['null', int]
    default: 4000

outputs

outputs:
  match_results:
    type: File
    outputBinding:
      glob: "$(inputs.output_basename)_all.txt"

  correlation_matrix:
    type: File
    outputBinding:
      glob: "$(inputs.output_basename)_corr_matrix.txt"

rna_tx2genome.cwl

In the rare event that your input is a transcriptome bam, this tool converts it to the sorted genome bam that the other steps require.

inputs

inputs:
  input_bam: {type: File, doc: "Input transcriptome bam"}
  genomeDir: {type: File, doc: "rsem tar gzipped reference"}

outputs

outputs:
  genome_bam:
    type: File
    outputBinding:
      glob: "$(inputs.input_bam.nameroot).converted.bam"
    secondaryFiles: [^.bai]

samtools_sort_index.cwl

If your bam is not coordinate-sorted, run this tool first.

inputs

inputs:
  input_align: File

outputs

outputs:
  sorted_bam:
    type: File
    outputBinding:
      glob: "$(inputs.input_align.nameroot).sorted.bam"
  sorted_bai:
    type: File
    outputBinding:
      glob: "$(inputs.input_align.nameroot).sorted.bai"
