Releases · 4dn-dcic/tibanna
0.5.8
- `invoke log` can be used to stream the log or the postrun json file:
  - `invoke log --job-id=<jobid>` for the log
  - `invoke log --job-id=<jobid> -p` for the postrun json
- The postrun json file now contains CloudWatch metrics for memory/CPU and disk space for all jobs, e.g.

```json
"Job": {
  "status": "0",
  "Metrics": {
    "max_cpu_utilization_percent": 86.4,
    "max_mem_used_MB": 14056.421875,
    "max_mem_utilization_percent": 45.124831006539534,
    "max_disk_space_utilization_percent": 72.0912267060547,
    "total_mem_MB": 31150.08203125,
    "max_mem_available_MB": 17093.66015625,
    "max_disk_space_used_GB": 64.4835815429688
  },
  ...
}
```
- `invoke rerun` now has config override options such as `--instance-type`, `--shutdown-min`, `--ebs-size` and `--key-name` (and `--overwrite-input-extra` for Pony) to rerun a job with a different configuration.
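For example, a rerun on a larger instance might look like the following sketch (the `--exec-arn` identifier and the specific values are assumptions for illustration; check `invoke rerun --help` for the exact arguments):

```
invoke rerun --exec-arn=<execution_arn> --instance-type=c5.4xlarge --ebs-size=100 --shutdown-min=30
```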
Pony
- `encode-atacseq-aln` and `encode-atacseq-postaln` qc handling
- The `awsem_job_id` field of the workflow run metadata is automatically filled.
0.5.7
- Spot instances are now supported. To use a spot instance, add `"spot_instance": true` to the `config` field of the input execution json (see also the combined sketch below), e.g.

```json
"spot_instance": true,
"spot_duration": 360
```
- Email notification can be enabled by adding `"email": true` to the `config` field of the input execution json. This currently works only for Pony.
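A `config` block combining the options above might look like the following minimal sketch (the instance type and EBS size are illustrative placeholders, not values from this release):

```json
"config": {
  "instance_type": "t3.medium",
  "ebs_size": 20,
  "spot_instance": true,
  "spot_duration": 360,
  "email": true
}
```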
0.5.6
- CloudWatch setup permission error fixed.
- `invoke kill` now works with a job id (previously it worked only with an execution arn).

```
invoke kill --job-id=<jobid> [--sfn=<stepfunctionname>]
```
- More comprehensive monitoring with `invoke stat -v`, which prints out instance ID, IP, instance status, ssh key and password.
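For example, to check runs on a specific step function (assuming the `--sfn` option is accepted here as it is for `invoke kill`):

```
invoke stat --sfn=<stepfunctionname> -v
```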
To update an existing Tibanna on AWS, do the following:

```
invoke setup_tibanna_env --buckets=<bucket1>,<bucket2>,...
invoke deploy_tibanna --sfn-type=unicorn --usergroup=<usergroup_name>
```

e.g.

```
invoke setup_tibanna_env --buckets=leelab-datafiles,leelab-tibanna-log
invoke deploy_tibanna --sfn-type=unicorn --usergroup=default_3225
```
0.5.5
- Memory, disk space and CPU utilization are now reported to CloudWatch at 1-minute intervals from the Awsem instance.
- To turn on the CloudWatch Dashboard (a collective visualization of all of the metrics combined), add `"cloudwatch_dashboard": true` to the `config` field of the input execution json.
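A minimal sketch of where this flag goes (the `log_bucket` key is shown only for context and is an assumption about the surrounding config):

```json
"config": {
  "log_bucket": "my-tibanna-log-bucket",
  "cloudwatch_dashboard": true
}
```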
0.5.4
- Problem of EBS mounting with newer instances (e.g. c5, t3, etc.) fixed.
- A common AMI is now used for CWL v1, CWL draft-3 and WDL, handled by `awsf/aws_run_workflow_generic.sh`.
- To use the new features, redeploy the `run_task_awsem` lambda:

```
git pull
invoke deploy_core run_task_awsem --usergroup=<usergroup>  # e.g. usergroup=default_3046
```
0.5.3
- For WDL workflow executions, a more comprehensive log is now collected and sent to the log bucket. The log can be found in `<logbucket>/<jobid>.debug.tar.gz`.

```
$ aws s3 ls s3://tibanna-output/R3sq7ImGXXfe
2018-12-03 06:03:14     126294 R3sq7ImGXXfe.debug.tar.gz
2018-12-03 06:03:11          0 R3sq7ImGXXfe.error
2018-12-03 02:35:50          0 R3sq7ImGXXfe.job_started
2018-12-03 06:03:13      67838 R3sq7ImGXXfe.log
2018-12-03 06:03:12       5576 R3sq7ImGXXfe.postrun.json
```
- A file named `<jobid>.input.json` is now sent to the log bucket at the start of all Pony executions, to allow an easy rerun in the future after the step function information expires.

```
$ aws s3 ls s3://tibanna-output/lP1n1EaZOTKO
2018-12-04 16:19:45       1954 lP1n1EaZOTKO.input.json
2018-12-04 16:20:28          0 lP1n1EaZOTKO.job_started
2018-12-04 16:43:52      29879 lP1n1EaZOTKO.log
2018-12-04 16:43:52      11023 lP1n1EaZOTKO.postrun.json
2018-12-04 16:43:52          0 lP1n1EaZOTKO.success
```
- Space usage info is added at the end of the log file for WDL executions.
- `bigbed` files are registered to Higlass (Pony).
- Benchmark for `encode-chipseq` is supported. This includes double-nested array input support for Benchmark.
- `quality_metric_chipseq` and `quality_metric_atacseq` are created automatically (Pony).
- An empty extra file array can now be handled (Pony).
- When Benchmark fails, Tibanna now reports which file is missing.
0.5.2
- User permission error for setting postrun jsons public fixed (at `invoke setup_tibanna_env`).
- Added a `--no-randomize` option for the `invoke setup_tibanna_env` command to turn off adding a random number at the end of the usergroup name (see the example below).
- Throttling error upon mass file upload for md5/fastqc trigger fixed.
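For example (bucket names are placeholders):

```
invoke setup_tibanna_env --buckets=<bucket1>,<bucket2> --no-randomize
```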
0.5.1
- Support for conditional alternative output for Unicorn. To specify alternative names for a conditional output, the input json should contain the `alt_cond_output_argnames` field. This is useful for WDL, where different outputs are generated depending on conditions.

```json
"output_target": {
  "cond_merge.cond_merged": "some_sub_dirname/my_first_cond_merged_file"
},
"alt_cond_output_argnames": {
  "cond_merge.cond_merged": ["cond_merge.paste.pasted", "cond_merge.cat.concatenated"]
},
```

- Pony's md5 trigger elasticbeanstalk throttling errors upon mass file upload partially fixed.
0.5.0
0.4.9
- Input files can now be renamed upon download to the EC2 instance where a workflow will be executed. This can be specified by adding a `rename` field to the input file dictionary, which is useful when the user has a certain file naming system on S3 and the workflow itself requires a different file name as input.

ex)

```
'args': {
    ...
    'input_files': {
        'bwaIndex': {
            'bucket_name': 'some_bucket',
            'object_key': 'some_directory/5sd4flvlgrxje.bwaIndex.tgz',
            'rename': 'some_renamed_directory/hg38.bwaIndex.tgz'
        }
    }
    ...
}
```
- Pony:
  - A workflow type that creates an extra file of an input file (instead of creating a new processed file object) can now be handled.
  - The md5 trigger now checks extra file status as well.