generated from jhudsl/OTTR_Template
03-first-task.Rmd
```{r, include = FALSE}
ottrpal::set_knitr_image_path()
```
# The first task
Before we write any sort of WDL -- whether for somatic mutation calling, as we will be doing here, or for any other bioinformatics task -- we need to understand the building blocks of WDL: tasks!
As mentioned in the first part of this course, every WDL workflow is made up of at least one task. A task typically has inputs, outputs, runtime attributes, and a command section. You can think of a task as a discrete step in a workflow. It can involve a single call to a single bioinformatics tool, a sequence of bash commands, an inline Python script... almost anything you can do non-interactively in a terminal, you can do in a WDL task. In this section, we will go over the parts of a WDL task in more detail to help us write a task for somatic mutation calling.
## Inputs
The inputs of a task are the files and/or variables you will be passing into your task's command section. Typically, you will want to include at least one File input in a task, but that isn't a requirement. You can pass most WDL variable types into a task. In our example workflow, we are starting with a single fastq file per sample, and we know we will need to convert it into a sam file. A sam file is an alignment, so we will need a reference genome to align our fastqs against. We also want to be able to control the threading for this task. Our first task's inputs will therefore start out looking like this:
```
task some_aligner {
input {
File input_fastq
File ref_fasta
}
[...]
}
```
For some aligners, this would be a sufficient set of inputs, but we have decided to use `bwa mem` in particular to take us from `.fastq` to `.sam`. `bwa mem` requires several index files, which we will also need as inputs. These index files could be specified via an array, but for now we'll list everything separately to make sure nothing is left out.
```
task BwaMem {
input {
# main input
File input_fastq
# reference files
File ref_fasta
File ref_fasta_index
File ref_dict
File ref_amb
File ref_ann
File ref_bwt
File ref_pac
File ref_sa
}
[...]
}
```
### Referencing inputs in the command section
The command section of a WDL task is a bash script that will be run [non-interactively](https://tldp.org/LDP/abs/html/intandnonint.html) by the WDL executor. Although it is helpful to think of tasks as discrete steps in a workflow, that does not mean each task needs to be a single line of code or use only one piece of software. You could, for example, call a bioinformatics tool and then reprocess the outputs in the same WDL task.
A WDL task's input variables are generally referenced in the command section with a tilde and curly braces -- `~{variable_name}` -- when using heredoc syntax.
<details>
<summary> <b>Why use heredoc syntax?</b></summary>
You may see WDLs that use this notation for the command section in a task:
```
task do_something_curly_braces {
input {
String some_string
}
command { ## note the curly braces
some_other_string="FOO"
echo ${some_string}
echo $some_other_string
}
}
```
We recommend using heredoc-style syntax instead:
```
task do_something_carrots {
input {
String some_string
}
command <<< ## note the '<<<'
some_other_string="FOO"
echo ~{some_string}
echo $some_other_string
>>> ## closing '>>>'
}
```
Heredoc-style syntax for command sections can be easier to read than the alternative, as it makes a clearer distinction between bash variables and WDL variables. This is especially helpful for complicated bash scripts. Heredoc-style syntax is also what the WDL 1.1 spec recommends in most cases. However, the older non-heredoc style is still perfectly functional for a lot of use cases.
</details>
To prevent issues with spaces in String and File types, it is often a good idea to put quotation marks around String and File variables, like so:
```
task cowsay {
input {
String some_string
}
command <<<
cowsay -t "~{some_string}"
>>>
}
```
<details>
<summary><b>What can happen if we don't use quotation marks around String or File variables?</b></summary>
If `some_string` is "hello world" then the command section of this task is interpreted as the following:
```
cowsay -t "hello world"
```
What would happen if we had not wrapped `~{some_string}` in quotation marks? If `some_string` were just "hello", it wouldn't matter. But because `some_string` is two words with a space in between, the script would be interpreted as `cowsay -t hello world` and cause an error, because the cowsay program thinks `world` is a separate argument. By including quotation marks, `cowsay -t "~{some_string}"` is interpreted as `cowsay -t "hello world"` and you will correctly get a cow's greeting instead of an error.
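The same word-splitting behavior can be reproduced in plain bash, outside of WDL entirely. Here, `count_args` is a throwaway helper (not part of any workflow) that reports how many arguments it receives:

```
# count_args reports how many arguments it was called with.
count_args() { echo "arg count: $#"; }
msg="hello world"
count_args $msg    # unquoted: bash splits on the space, prints "arg count: 2"
count_args "$msg"  # quoted: stays one argument, prints "arg count: 1"
```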
</details>
Let's see how we can reference our inputs in the command section of our task.
```
task BwaMem {
input {
File input_fastq
File ref_fasta
# these variables may look as though they are unused... but bwa mem needs them!
File ref_fasta_index
File ref_dict
File ref_amb
File ref_ann
File ref_bwt
File ref_pac
File ref_sa
}
command <<<
# warning: this will not run on all backends! see below for an explanation!
bwa mem \
-p -v 3 -t 16 -M -R '@RG\tID:foo\tSM:foo2' \
"~{ref_fasta}" "~{input_fastq}" > my_nifty_output.sam
>>>
}
```
If we were to run this task in a workflow as-is, we might expect it to run on any backend that can handle the hardware requirements. Those hardware requirements are a bit steep -- the `-t 16` part specifically requests 16 threads, for example -- but besides that, it may look like a perfectly functional task.
::: {.notice data-latex="warning"}
We have written this task to use 16 threads, as indicated by `-t 16` in the `bwa mem` call. If you are running on a backend that cannot provide 16 threads, you may need to set the number of threads to a lower value. Later on in this course, we will discuss how to account for lots of different backends using common workarounds and optional variables.
:::
Unfortunately, even on backends that can provide the necessary computing power, it is quite likely this task will not run as expected. This is because of how inputs work in WDL -- or, more specifically, how input files get localized when working with WDL.
### File localization
When running a WDL, a WDL executor will typically place duplicates of the input files in a brand-new subfolder of the task's [working directory](https://www.ibm.com/docs/en/zos/3.1.0?topic=directories-working-directory). You typically don't know the name of that directory before runtime -- it varies depending on the backend you are running on and the WDL executor itself. Thankfully, at runtime, File-type variables such as `~{input_fastq}` and `~{ref_fasta}` will be replaced with paths to their respective files.
For example, if you were to run this workflow on a laptop using miniwdl, `~{ref_fasta}` would likely end up turning into `./_miniwdl_inputs/0/ref.fa` at runtime. On the other hand, if you were running the exact same workflow with Cromwell, `~{ref_fasta}` would turn into something like `/cromwell-executions/BwaMem/97c9341e-9322-9a2f-4f54-4114747b8fff/call-test_localization/inputs/-2022115965/ref.fa`. Keep in mind that these are the paths of *copies* of the input files, and that sometimes input files can be in different subfolders. For example, it's possible `~{input_fastq}` would be `./_miniwdl_inputs/0/sample.fastq` while `~{ref_fasta}` may be `./_miniwdl_inputs/1/ref.fa`.
For many programs, an input file being at `./ref.fa` versus `/_miniwdl_inputs/0/ref.fa` is inconsequential. However, this aspect of WDL can occasionally cause issues. `bwa mem` is a great example of the type of command where this sort of thing can go haywire without proper planning, due to the program making an assumption about some of your input files. Specifically, `bwa mem` assumes that the reference fasta that you pass in shares the same folder as the other reference files (ref_amb, ref_ann, ref_bwt, etc), and it does not allow you to specify otherwise.
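To make the problem concrete: bwa is only given the path of the fasta, and it derives the paths of its index files from that path. A minimal shell sketch of that derivation (the directory shown is just an illustrative miniwdl-style input path):

```
# bwa receives only the fasta path and appends suffixes to find its indexes.
ref="/_miniwdl_inputs/0/ref.fa"
echo "${ref}.amb"   # bwa will look for /_miniwdl_inputs/0/ref.fa.amb
echo "${ref}.bwt"   # and for /_miniwdl_inputs/0/ref.fa.bwt
```

If the executor localized `ref_amb` into a different subfolder, such as `/_miniwdl_inputs/1/`, bwa would fail to find it.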
<details>
<summary><b>Another example of a file localization issue</b></summary>
bwa is not the only program that makes assumptions about where files are located, and these assumptions do not only affect reference genome files. Bioinformatics programs that take in some sort of index file frequently assume that the index file is located in the same directory as the non-index input. For example, if you were to pass `SAMN1234.bam` into [covstats](https://github.com/brentp/goleft/tree/master/covstats), it would expect an index file named `SAMN1234.bam.bai` or `SAMN1234.bai` in the same directory as the bam file, [as seen in the source code here](https://github.com/brentp/goleft/blob/fa6b00d20d1f73a068ffbab49a5769d173cae56d/covstats/covstats.go#L239). As there is no way to specify the index file manually, you need to take that into consideration when writing WDLs involving covstats, bwa, and other similar tools.
</details>
Thankfully, the solution here is simple: Move all of the input files directly into the working directory.
```
task BwaMem {
input {
File input_fastq
File ref_fasta
File ref_fasta_index
File ref_dict
File ref_amb
File ref_ann
File ref_bwt
File ref_pac
File ref_sa
}
command <<<
set -eo pipefail
# This can also be done by creating an array and then looping over that array,
# but we'll do it one line at a time for clarity's sake.
mv "~{ref_fasta}" .
mv "~{ref_fasta_index}" .
mv "~{ref_dict}" .
mv "~{ref_amb}" .
mv "~{ref_ann}" .
mv "~{ref_bwt}" .
mv "~{ref_pac}" .
mv "~{ref_sa}" .
bwa mem \
[...]
>>>
}
```
::: {.notice data-latex="warning"}
Some backends/executors do not support `mv` acting on input files. If you are running into problems with this and are working with miniwdl, the `--copy-input-files` flag will usually allow `mv` to work. You could also simply use `cp` to copy the files instead of move them, although this may not be an efficient use of disk space, so consider using `mv` if your target backends and executors can handle it.
:::
With our files now all in the working directory, we can turn our attention to the bwa task itself. We can no longer directly pass in `~{ref_fasta}` or any of the other files we moved into the working directory, because those variables will point to a non-existent file in a now-empty input directory. There are several ways to solve this problem:
* Assuming the filename of an input is constant, which might be a safe assumption for reference files
* Using the `basename` shell utility
* Using the WDL built-in `basename()` function along with private variables
We recommend the last option, as it works for essentially any input and may be more intuitive than the shell utility. [OpenWDL explains](https://docs.openwdl.org/en/stable/WDL/basename/) how `basename()` works. The next section will provide an example of using it alongside private variables.
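WDL's `basename()` mirrors the command-line utility of the same name, including the optional second argument that strips a suffix. A quick shell illustration (the path is made up, and does not need to exist):

```
# basename strips the directory part; an optional suffix argument is also removed.
basename "/_miniwdl_inputs/0/sample.fastq"            # prints: sample.fastq
basename "/_miniwdl_inputs/0/sample.fastq" ".fastq"   # prints: sample
```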
### Private variables
Is there a variable you wish to use in your task that is based on another input variable, or that you do not want users of your workflow to be able to overwrite directly? You can define variables outside the `input {}` section to create variables that function like private variables. In our case, we create `String ref_fasta_local` as `ref_fasta`'s base file name so we can refer to the files we have moved into the working directory. We also create `String base_file_name` as `input_fastq`'s base file name and use it to name our output files, such as `"~{base_file_name}.sorted_query_aligned.bam"`. The variables `read_group_id`, `sample_name`, and `platform_info` are created similarly.
```
task BwaMem {
input {
File input_fastq
File ref_fasta
File ref_fasta_index
File ref_dict
File ref_amb
File ref_ann
File ref_bwt
File ref_pac
File ref_sa
}
# basename() is a built-in WDL function that acts like bash's basename
String base_file_name = basename(input_fastq, ".fastq")
String ref_fasta_local = basename(ref_fasta)
String read_group_id = "ID:" + base_file_name
String sample_name = "SM:" + base_file_name
String platform_info = "PL:illumina"
command <<<
set -eo pipefail
mv "~{ref_fasta}" .
mv "~{ref_fasta_index}" .
mv "~{ref_dict}" .
mv "~{ref_amb}" .
mv "~{ref_ann}" .
mv "~{ref_bwt}" .
mv "~{ref_pac}" .
mv "~{ref_sa}" .
bwa mem \
-p -v 3 -t 16 -M -R '@RG\t~{read_group_id}\t~{sample_name}\t~{platform_info}' \
~{ref_fasta_local} ~{input_fastq} > ~{base_file_name}.sam
samtools view -1bS -@ 15 -o ~{base_file_name}.aligned.bam ~{base_file_name}.sam
samtools sort -@ 15 -o ~{base_file_name}.sorted_query_aligned.bam ~{base_file_name}.aligned.bam
>>>
}
```
## Runtime attributes
The runtime attributes of a task tell the WDL executor important information about how to run the task. For a `bwa mem` task, we want to make sure we have plenty of hardware resources available. We also need to reference the Docker image we want the task to actually run in.
```
runtime {
memory: "48 GB"
cpu: 16
docker: "ghcr.io/getwilds/bwa:0.7.17"
disks: "local-disk 100 SSD"
}
```
In WDL 1.0, the interpretation of runtime attributes by different executors and backends is extremely varied. The [WDL 1.0 spec](https://github.com/openwdl/wdl/blob/main/versions/1.0/SPEC.md#runtime-section) allows for arbitrary values here:
> Individual backends will define which keys they will inspect so a key/value pair may or may not actually be honored depending on how the task is run. Values can be any expression and it is up to the engine to reject keys and/or values that do not make sense in that context.
This can lead to some pitfalls:
* Some of the attributes in your task's `runtime` section may be silently ignored, such as the `memory` attribute when running Cromwell on the Fred Hutch HPC (as of February 2024)
* Some runtime attributes are unique to particular backends, such as the Fred Hutch HPC's `walltime` attribute
* The same runtime attribute may work differently on different backends, such as `disks` behaving differently in Cromwell depending on whether it is running on AWS or GCP
When writing WDL 1.0 workflows with specific hardware requirements, keep in mind what your backend and executor are able to interpret. It is also helpful to consider that other people running your workflow may be doing so on different backends and executors. More information can be found in the appendix, where we talk about designing WDLs for specific backends. For now, we will stick with `memory`, `cpu`, `docker`, and `disks`, as this group of four runtime attributes will let this workflow run on the majority of backends and executors. Even though the Fred Hutch HPC will ignore the `memory` and `disks` attributes, for instance, including them will not cause the workflow to fail there, and it will allow the workflow to run on Terra.
<details>
<summary><b>Some differences between WDL 1.0 and 1.1 on Runtime attributes</b></summary>
Although the focus of this course is on WDL 1.0, it is worth noting that in the [WDL 1.1 spec](https://github.com/openwdl/wdl/blob/main/versions/1.1/SPEC.md#runtime-section), a very different approach to runtime attributes is taken:
> There are a set of reserved attributes (described below) that must be supported by the execution engine, and which have well-defined meanings and default values. Default values for all optional standard attributes are directly defined by the WDL specification in order to encourage portability of workflows and tasks; execution engines should NOT provide additional mechanisms to set default values for when no runtime attributes are defined.
If you are writing WDLs under the WDL 1.1 standard, you may have more flexibility with runtime attributes. Be aware that as of February 2024, Cromwell does not support WDL 1.1.
</details>
### Docker images and containers
WDL is built to make use of Docker as it makes handling software dependencies much simpler. Docker images can help address all of these situations:
* Some software is difficult to install or compile on certain systems
* Some programs have conflicting dependencies
* You may not want to directly install software on your system to prevent it from breaking existing software
* You may not have permission to install software if you are using an institute HPC or other shared resource
When you run a WDL task that has a `docker` runtime attribute, your task will be executed in a Docker container: a sandboxed environment derived from a template called a *Docker image*, which packages installed software in a special file system. This is one of the main benefits of a Docker image -- because it packages the software you need, you can skip much of the installation and dependency trouble associated with using new software. And because you take actions within a container sandbox, you are unlikely to "mess up" your main system's files. Although a Docker container is, strictly speaking, not the same as a virtual machine, it is helpful to think of it as one if you are new to Docker. Docker containers are managed by a container runtime such as Docker Engine or Apptainer, and the official Docker GUI is called Docker Desktop.
<details>
<summary><b>More information on finding and developing Docker images</b></summary>
Although you will generally need to be able to run Docker in order to run WDLs, you do not need to know how to create Dockerfiles -- plaintext files that define how Docker images are built via `docker build` -- to write your own WDLs. Most popular bioinformatics software packages already have ready-to-use Docker images available, which you can typically find on [Docker Hub](https://hub.docker.com/search?q=). Other registries include quay.io and the Google Container Registry. That said, if you would like to create your own Docker images, there are many tutorials and [guidelines](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/) available online. You can also learn more about the details of Docker containers (and why they technically aren't virtual machines) in [Docker's official curriculum](https://docker-curriculum.com/#introduction).
</details>
## Outputs
The outputs of a task are defined in the `output` section of your task. Typically, this will take the form of directly outputting a file that was created in the command section. When these file outputs are referenced in the `output` section, you can refer to their path in the Docker container directly. You can also make outputs a function of input variables, including private variables. This can be helpful if you intend to run this WDL on many different files -- each one will get a unique filename based on the input fastq, instead of every sample ending up with a generic name like "converted.sam". For our `bwa mem` task, one way to write the output section would be as follows:
```
output {
File analysisReadySorted = "~{base_file_name}.sorted_query_aligned.bam"
}
```
Another way of writing this is with string concatenation. This is equivalent to what we wrote above -- choose whichever version you prefer.
```
output {
File analysisReadySorted = base_file_name + ".sorted_query_aligned.bam"
}
```
If the output was not in the working directory, we would need to change the output to point to the file's path relative to the working directory, such as `File analysisReadySorted = "some_folder/~{base_file_name}.sorted_query_aligned.bam"`.
Below are some additional ways you can handle task outputs.
<details>
<summary><b>Outputs as functions of other outputs in the same task</b></summary>
Outputs can (generally, see warning below) also be functions of other outputs in the same task, as long as those outputs are declared first.
```
task add_one {
input {
Int some_integer
}
command <<<
echo ~{some_integer} > a.txt
echo "1" > b.txt
>>>
output {
Int a = read_int("a.txt")
Int b = read_int("b.txt")
Int c = a + b
}
}
```
::: {.notice data-latex="warning"}
Cromwell does not fully support outputs being a function of the same task's other outputs. On the Terra backend, the above code example would cause an error.
:::
</details>
<details>
<summary><b>Grabbing multiple outputs at the same time</b></summary>
To grab multiple outputs at the same time, use glob() to create an array of files. We'll also take this opportunity to demonstrate iterating through a bash array created from an Array[String] input -- for more information on this data type, see chapter six of this course.
```
task one_word_per_file {
input {
Array[String] a_big_sentence
}
command <<<
ARRAY_OF_WORDS=(~{sep=" " a_big_sentence})
i=0
for word in "${ARRAY_OF_WORDS[@]}"
do
i=$((i+1))
echo "$word" > "$i.txt"
done
>>>
output {
Array[File] several_words = glob("*.txt")
}
}
```
`glob()` can also be used to grab just one file: `glob("*.txt")[0]` returns the first match. This is usually only necessary when you know the extension of an output but have no way of predicting the rest of its filename. Be aware that if anything else in the working directory has the extension you are searching for, you might accidentally grab that file instead of the one you are looking for!
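As a quick illustration of that caveat in plain shell (the file names here are made up):

```
# The task meant to produce one .txt output, but a stray file also matches.
mkdir -p glob_demo && cd glob_demo
touch intended_output.txt stray_notes.txt
echo *.txt   # prints: intended_output.txt stray_notes.txt
```

The WDL equivalent `glob("*.txt")[0]` would grab `intended_output.txt` here only because it happens to sort first; a differently named stray file could just as easily win.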
</details>
## The whole task
We've now designed a `bwa mem` task that can run on almost any backend that supports WDL and can handle the hardware requirements. Issues involving `bwa mem` expecting reference files to be in the same folder and/or putting output files into input folders have been sidestepped thanks to careful design and consideration. The runtime section clearly defines the expected hardware requirements, and the outputs section defines what we expect the task to give us when all is said and done. We're now ready to continue with the rest of our workflow.
```
task BwaMem {
input {
File input_fastq
referenceGenome refGenome
}
String base_file_name = basename(input_fastq, ".fastq")
String ref_fasta_local = basename(refGenome.ref_fasta)
String read_group_id = "ID:" + base_file_name
String sample_name = "SM:" + base_file_name
String platform_info = "PL:illumina"
command <<<
set -eo pipefail
mv "~{refGenome.ref_fasta}" .
mv "~{refGenome.ref_fasta_index}" .
mv "~{refGenome.ref_dict}" .
mv "~{refGenome.ref_amb}" .
mv "~{refGenome.ref_ann}" .
mv "~{refGenome.ref_bwt}" .
mv "~{refGenome.ref_pac}" .
mv "~{refGenome.ref_sa}" .
bwa mem \
-p -v 3 -t 16 -M -R '@RG\t~{read_group_id}\t~{sample_name}\t~{platform_info}' \
~{ref_fasta_local} ~{input_fastq} > ~{base_file_name}.sam
samtools view -1bS -@ 15 -o ~{base_file_name}.aligned.bam ~{base_file_name}.sam
samtools sort -@ 15 -o ~{base_file_name}.sorted_query_aligned.bam ~{base_file_name}.aligned.bam
>>>
output {
File analysisReadySorted = "~{base_file_name}.sorted_query_aligned.bam"
}
runtime {
memory: "48 GB"
cpu: 16
docker: "ghcr.io/getwilds/bwa:0.7.17"
}
}
```
## Putting the workflow together
A workflow is needed to run the `BwaMem` task we just built. The workflow's input variables are supplied via the input JSON file and are then passed along as inputs in our `BwaMem` call. When the `BwaMem` call is complete, the workflow's output File variable is defined based on the task's output. Lastly, we have a `parameter_meta` section in our workflow that documents each workflow input variable.
For the workflow to actually "see" the task, the task will either need to be imported at the top of the workflow file (just under the `version 1.0` statement) or included in the same file as the workflow. For simplicity, we will put the workflow and the task in the same file.
<script src="https://gist.github.com/fhdsl-robot/e0c75399546cd4557cab717d6b6aa109.js"></script>
## Testing your first task
To test your first task and your workflow, you should have an expectation of what the output will be. For this first `BwaMem` task, we just care that a BAM file is created containing aligned reads. You can use `samtools view output.sorted_query_aligned.bam` to examine the reads and pipe the result to `wc -l` to count them. This number should be nearly identical to the number of reads in your input FASTQ file. In other tasks, we might have a more precise expectation of what the output file should contain, such as the specific somatic mutation call that we have curated.
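Since a FASTQ file stores each read as four lines, counting reads with `wc -l` means dividing by four. A tiny shell sketch using a made-up two-read FASTQ:

```
# Build a toy FASTQ with two reads; real data would come from your sequencer.
printf '@r1\nACGT\n+\nIIII\n@r2\nTTAA\n+\nIIII\n' > tiny.fastq
echo $(( $(wc -l < tiny.fastq) / 4 ))   # 8 lines / 4 reads each = prints 2
```

The analogous count for the BAM output would be `samtools view output.sorted_query_aligned.bam | wc -l`.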
Here is an example JSON with the [test data](https://figshare.com/articles/dataset/WDL_101_Dataset/25447528) needed to run this single-task workflow:
```
{
"mutation_calling.sampleFastq": "/path/to/Tumor_2_EGFR_HCC4006_combined.fastq",
"mutation_calling.ref_fasta": "/path/to/Homo_sapiens_assembly19.fasta",
"mutation_calling.ref_fasta_index": "/path/to/Homo_sapiens_assembly19.fasta.fai",
"mutation_calling.ref_dict": "/path/to/Homo_sapiens_assembly19.dict",
"mutation_calling.ref_pac": "/path/to/Homo_sapiens_assembly19.fasta.pac",
"mutation_calling.ref_sa": "/path/to/Homo_sapiens_assembly19.fasta.sa",
"mutation_calling.ref_amb": "/path/to/Homo_sapiens_assembly19.fasta.amb",
"mutation_calling.ref_ann": "/path/to/Homo_sapiens_assembly19.fasta.ann",
"mutation_calling.ref_bwt": "/path/to/Homo_sapiens_assembly19.fasta.bwt"
}
```
<details>
<summary><b>The example JSON using the Fred Hutch HPC</b></summary>
```
{
"mutation_calling.sampleFastq": "/fh/fast/paguirigan_a/pub/ReferenceDataSets/workflow_testing_data/WDL/wdl_101/HCC4006_final.fastq",
"mutation_calling.ref_fasta": "/fh/fast/paguirigan_a/pub/ReferenceDataSets/genome_data/human/hg19/Homo_sapiens_assembly19.fasta",
"mutation_calling.ref_fasta_index": "/fh/fast/paguirigan_a/pub/ReferenceDataSets/genome_data/human/hg19/Homo_sapiens_assembly19.fasta.fai",
"mutation_calling.ref_dict": "/fh/fast/paguirigan_a/pub/ReferenceDataSets/genome_data/human/hg19/Homo_sapiens_assembly19.dict",
"mutation_calling.ref_pac": "/fh/fast/paguirigan_a/pub/ReferenceDataSets/genome_data/human/hg19/Homo_sapiens_assembly19.fasta.pac",
"mutation_calling.ref_sa": "/fh/fast/paguirigan_a/pub/ReferenceDataSets/genome_data/human/hg19/Homo_sapiens_assembly19.fasta.sa",
"mutation_calling.ref_amb": "/fh/fast/paguirigan_a/pub/ReferenceDataSets/genome_data/human/hg19/Homo_sapiens_assembly19.fasta.amb",
"mutation_calling.ref_ann": "/fh/fast/paguirigan_a/pub/ReferenceDataSets/genome_data/human/hg19/Homo_sapiens_assembly19.fasta.ann",
"mutation_calling.ref_bwt": "/fh/fast/paguirigan_a/pub/ReferenceDataSets/genome_data/human/hg19/Homo_sapiens_assembly19.fasta.bwt"
}
```
</details>
<iframe src="https://docs.google.com/forms/d/e/1FAIpQLSeEKGWTJOowBhFlWftPUjFU8Rfj-d9iXIHENyd8_HGS8PM7kw/viewform?embedded=true" width="640" height="886" frameborder="0" marginheight="0" marginwidth="0">Loading…</iframe>