
Releases: 4dn-dcic/tibanna

0.5.8

04 Feb 18:33
7689acb
  • invoke log can be used to stream the log or the postrun json file.
    • invoke log --job-id=<jobid> for log
    • invoke log --job-id=<jobid> -p for postrun json
  • The postrun json file now contains CloudWatch metrics for memory/CPU utilization and disk space for all jobs.
    "Job": {
        "status": "0", 
        "Metrics": {
            "max_cpu_utilization_percent": 86.4, 
            "max_mem_used_MB": 14056.421875, 
            "max_mem_utilization_percent": 45.124831006539534, 
            "max_disk_space_utilization_percent": 72.0912267060547, 
            "total_mem_MB": 31150.08203125, 
            "max_mem_available_MB": 17093.66015625, 
            "max_disk_space_used_GB": 64.4835815429688
        },
  • invoke rerun has config override options such as --instance-type, --shutdown-min, --ebs-size and --key-name (and --overwrite-input-extra for Pony) to rerun a job with a different configuration (example below).
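
e.g. (a sketch; the --exec-arn option identifying the original execution is an assumption for illustration, while the override options are from this release)

invoke rerun --exec-arn=<execution-arn> --instance-type=c5.xlarge --ebs-size=30 --shutdown-min=30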

Pony

  • QC handling for encode-atacseq-aln and encode-atacseq-postaln
  • The awsem_job_id field of the workflow run metadata is automatically filled.

0.5.7

17 Jan 04:59
  • Spot instances are now supported. To use a spot instance, add "spot_instance": true to the config field of the input execution json.

e.g.

"spot_instance": true,
"spot_duration": 360
  • Email notification can be enabled by adding "email": true to the config field of the input execution json (example below). This currently works only for Pony.
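
e.g.

"email": true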

0.5.6

21 Dec 17:41
  • Fixed a CloudWatch setup permission error.
  • invoke kill now works with a job id (previously it worked only with an execution arn).
invoke kill --job-id=<jobid> [--sfn=<stepfunctionname>]
  • More comprehensive monitoring via invoke stat -v, which prints out instance ID, IP, instance status, ssh key and password (example below).
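
e.g. (the --sfn option for selecting a step function is assumed by analogy with invoke kill above)

invoke stat -v --sfn=<stepfunctionname>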

To update an existing Tibanna on AWS, do the following:

invoke setup_tibanna_env --buckets=<bucket1>,<bucket2>,...
invoke deploy_tibanna --sfn-type=unicorn --usergroup=<usergroup_name>

e.g.

invoke setup_tibanna_env --buckets=leelab-datafiles,leelab-tibanna-log
invoke deploy_tibanna --sfn-type=unicorn --usergroup=default_3225

0.5.5

14 Dec 21:57
  • Memory, disk space, and CPU utilization are now reported to CloudWatch at 1-minute intervals from the Awsem instance.
  • To turn on the CloudWatch dashboard (a collective visualization of all these metrics combined), add "cloudwatch_dashboard": true to the config field of the input execution json (example below).
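
e.g.

"cloudwatch_dashboard": true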

0.5.4

14 Dec 16:57
  • Fixed a problem with EBS mounting on newer instance types (e.g. c5, t3).
  • A common AMI is now used for CWL v1, CWL draft-3 and WDL, handled by awsf/aws_run_workflow_generic.sh.
    • To use the new features, redeploy the run_task_awsem lambda:
    git pull
    invoke deploy_core run_task_awsem --usergroup=<usergroup>  # e.g. usergroup=default_3046
    

0.5.3

04 Dec 22:51
  • For WDL workflow executions, a more comprehensive log is now collected and sent to the log bucket. The log can be found at <logbucket>/<jobid>.debug.tar.gz.
$ aws s3 ls s3://tibanna-output/R3sq7ImGXXfe
2018-12-03 06:03:14     126294 R3sq7ImGXXfe.debug.tar.gz
2018-12-03 06:03:11          0 R3sq7ImGXXfe.error
2018-12-03 02:35:50          0 R3sq7ImGXXfe.job_started
2018-12-03 06:03:13      67838 R3sq7ImGXXfe.log
2018-12-03 06:03:12       5576 R3sq7ImGXXfe.postrun.json
  • A file named <jobid>.input.json is now sent to the log bucket at the start of all Pony executions, to allow an easy rerun in the future after the step function information expires.
$ aws s3 ls s3://tibanna-output/lP1n1EaZOTKO
2018-12-04 16:19:45       1954 lP1n1EaZOTKO.input.json
2018-12-04 16:20:28          0 lP1n1EaZOTKO.job_started
2018-12-04 16:43:52      29879 lP1n1EaZOTKO.log
2018-12-04 16:43:52      11023 lP1n1EaZOTKO.postrun.json
2018-12-04 16:43:52          0 lP1n1EaZOTKO.success
  • Space usage info is added at the end of the log file for WDL executions.
  • bigBed files are registered to HiGlass (Pony).
  • Benchmark support for encode-chipseq added; this includes double-nested array input support for Benchmark.
  • quality_metric_chipseq and quality_metric_atacseq created automatically (Pony).
  • An empty extra file array can now be handled (Pony).
  • When Benchmark fails, Tibanna now reports which file is missing.

0.5.2

21 Nov 01:35
b38f7b7
  • Fixed a user permission error when making postrun jsons public (in invoke setup_tibanna_env).
  • Added a --no-randomize option to the invoke setup_tibanna_env command to turn off appending a random number
    to the end of the usergroup name (example below).
  • Fixed a throttling error in the md5/fastqc trigger upon mass file upload.
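
e.g.

invoke setup_tibanna_env --buckets=<bucket1>,<bucket2> --no-randomize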

0.5.1

19 Nov 22:39
4d5a745
  • Support for conditional alternative output for Unicorn

    • To specify alternative names for a conditional output, the input json should contain the alt_cond_output_argnames field. This is useful for WDL, where different outputs are generated depending on conditions.

      "output_target": {
        "cond_merge.cond_merged": "some_sub_dirname/my_first_cond_merged_file"
      },
      "alt_cond_output_argnames": {
        "cond_merge.cond_merged": ["cond_merge.paste.pasted", "cond_merge.cat.concatenated"]
      },
      
  • Partially fixed Elastic Beanstalk throttling errors in Pony's md5 trigger upon mass file upload.

0.5.0

09 Nov 05:26
11151aa
  • Tibanna Pony now supports double-nested array input and WDL workflow metadata.

0.4.9

07 Nov 22:00
e80d790
  • Input files can now be renamed upon download to the EC2 instance where the workflow will be executed, by adding a rename field to the input file dictionary. This can be useful when the user has a certain file naming system on S3 but the workflow itself requires a different input file name.
    e.g.
"args": {

    ...

    "input_files": {
        "bwaIndex": {
            "bucket_name": "some_bucket",
            "object_key": "some_directory/5sd4flvlgrxje.bwaIndex.tgz",
            "rename": "some_renamed_directory/hg38.bwaIndex.tgz"
        }
    }

    ...

}
  • Pony:
    • A workflow type that creates an extra file of an input file (instead of creating a new processed file object) can now be handled.
    • The md5 trigger now checks extra file status as well.