Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release v011 #249

Merged
merged 20 commits into from
Jul 9, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 7 additions & 2 deletions .github/workflows/run_workflows.yml
Original file line number Diff line number Diff line change
Expand Up @@ -168,13 +168,18 @@ jobs:

- name: cwl-docker-extract (i.e. recursively docker pull)
if: always()
run: cd workflow-inference-compiler/ && pytest -k test_cwl_docker_extract
run: cd workflow-inference-compiler/ && pytest tests/test_examples.py -k test_cwl_docker_extract
# For self-hosted runners, make sure the docker cache is up-to-date.

- name: PyTest Run Workflows
if: always()
# NOTE: Do NOT add coverage to PYPY CI runs https://github.com/tox-dev/tox/issues/2252
run: cd workflow-inference-compiler/ && pytest -k test_run_workflows_on_push --workers 8 --cwl_runner cwltool # --cov
run: cd workflow-inference-compiler/ && pytest tests/test_examples.py -k test_run_workflows_on_push --workers 8 --cwl_runner cwltool # --cov

- name: PyTest Run REST Core Tests
if: always()
# NOTE: Do NOT add coverage to PYPY CI runs https://github.com/tox-dev/tox/issues/2252
run: cd workflow-inference-compiler/ && pytest tests/test_rest_core.py -k test_rest_core --cwl_runner cwltool

# NOTE: The steps below are for repository_dispatch only. For all other steps, please insert above
# this comment.
Expand Down
29 changes: 9 additions & 20 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,54 +1,43 @@
# Workflow Inference Compiler
# Sophios

[![doc-buid-status](https://readthedocs.org/projects/workflow-inference-compiler/badge/?version=latest)](https://workflow-inference-compiler.readthedocs.io/en/latest/)

Scientific computing can be difficult in practice due to various complex software issues. In particular, chaining together software packages into a computational pipeline can be very error prone. Using the [Common Workflow Language](https://www.commonwl.org) (CWL) greatly helps, but like many other workflow languages users still need to explicitly specify how to connect inputs & outputs. The Workflow Inference Compiler allows users to specify computational protocols at a very high level of abstraction, it automatically infers almost all connections between inputs & outputs, and it compiles to CWL for execution.
Scientific computing can be difficult in practice due to various complex software issues. In particular, chaining together software packages into a computational pipeline can be very error prone. Using the [Common Workflow Language](https://www.commonwl.org) (CWL) greatly helps, but like many other workflow languages users still need to explicitly specify how to connect inputs & outputs. Sophios allows users to specify computational protocols at a very high level of abstraction, it automatically infers almost all connections between inputs & outputs, and it compiles to CWL for execution.

## Documentation
The documentation is available on [readthedocs](https://workflow-inference-compiler.readthedocs.io/en/latest/).
## Example Workflows
The following repositories contain example workflows:

[Molecular Modeling Workflows](https://github.com/PolusAI/mm-workflows)

[Image Workflows](https://github.com/PolusAI/image-workflows)

Like CWL, the compiler is general purpose and is not limited to any specific domain.
You do not need to install these to use wic. They are completely optional.

(But obviously if you're just getting started and you don't have any workflows of your own, you probably want to install at least one of them.)
## Quick Start
See the [installation guide](docs/installguide.md) for more details, but:

For pip users:

`pip install wic` # Please read the next sentence
`pip install sophios`

Unlike conda, **pip cannot install the binary system dependencies needed to actually run most workflows!**
In order to execute the CWL workflows that are generated by `sophios`, `cwltool` and all of its underlying dependencies need to be present in the system. Unfortunately
`pip` has no capability to resolve and install these dependencies. PLease refer to the `cwltool` [installation guide](https://cwltool.readthedocs.io/en/latest/#install) to prepare the system to run CWL workflows.

If you want to actually run workflows, you (or your sysadmin) will have to manually install and configure additional software!

For conda users / developers:

See the [installation guide for developers](docs/dev/installguide.md)

```
wic --yaml ../workflow-inference-compiler/docs/tutorials/helloworld.wic --graphviz --run_local --quiet
sophios --yaml ../workflow-inference-compiler/docs/tutorials/helloworld.wic --graphviz --run_local --quiet
```

The Workflow Inference Compiler is a [Domain Specific Language](https://en.wikipedia.org/wiki/Domain-specific_language) (DSL) based on the [Common Workflow Language](https://www.commonwl.org). CWL is fantastic, but explicitly constructing the Directed Acyclic Graph (DAG) associated with a non-trivial workflow is not so simple. Instead of writing raw CWL, you can write your workflows in a much simpler yml DSL. For technical reasons edge inference is far from unique, so ***`users should always check that edge inference actually produces the intended DAG`***.
Sophios is a [Domain Specific Language](https://en.wikipedia.org/wiki/Domain-specific_language) (DSL) based on the [Common Workflow Language](https://www.commonwl.org). CWL is fantastic, but explicitly constructing the Directed Acyclic Graph (DAG) associated with a non-trivial workflow is not so simple. Instead of writing raw CWL, users can write workflows in a much simpler yml DSL. For technical reasons edge inference is far from unique, so ***`users should always check that edge inference actually produces the intended DAG`***.

## Edge Inference

The key feature is that in most cases, you do not need to specify any of the edges! They will be automatically inferred for you based on types, file formats, and naming conventions. For more information, see the [user guide](docs/userguide.md#edge-inference-algorithm) If for some reason edge inference fails, there is a syntax for creating [explicit edges](docs/userguide.md#explicit-edges).
The key feature is that in most cases, users do not need to specify any of the edges! They will be automatically inferred for users based on types, file formats, and naming conventions. For more information, see the [user guide](docs/userguide.md#edge-inference-algorithm) If for some reason edge inference fails, there is a syntax for creating [explicit edges](docs/userguide.md#explicit-edges).

## Subworkflows

Subworkflows are very useful for creating reusable, composable building blocks. As shown above, recursive subworkflows are fully supported, and the edge inference algorithm has been very carefully constructed to work across subworkflow boundaries.

## Explicit CWL

Since the yml DSL files are automatically compiled to CWL, users should not have to know any CWL. However, the yml DSL is secretly CWL that is simply missing almost all of the tags! In other words, the compiler merely adds missing information to the files, and so if you know CWL you are free to explicitly add the information yourself. Thus, the yml DSL is intentionally a [leaky abstraction](https://en.wikipedia.org/wiki/Leaky_abstraction).
Since the yml DSL files are automatically compiled to CWL, users should not have to know any CWL. However, the yml DSL is secretly CWL that is simply missing almost all of the tags! In other words, the compiler merely adds missing information to the files, and so if the users know CWL, they are free to explicitly add the information themselves. Thus, the yml DSL is intentionally a [leaky abstraction](https://en.wikipedia.org/wiki/Leaky_abstraction).

## Python API
In addition to the underlying declarative yaml syntax, there is an API for writing WIC workflows in python. The python API is philosophically the exact opposite: users should not have to know any CWL, and in fact all CWL features are hidden unless explicitly exposed.
71 changes: 70 additions & 1 deletion docs/dev/devguide.md
Original file line number Diff line number Diff line change
Expand Up @@ -90,4 +90,73 @@ INFO [job test.cwl] completed success
]
```

As can be seen from the output json blob the order returned is not the same as input order.
As can be seen from the output json blob the order returned is not the same as input order.


## Partial Failures
When the partial failures feature is enabled although the subprocess for the workflow step itself will pass, the post-processing javascript can potentially crash as seen below. The Sophios compiler only semantically understands Sophios/CWL. It is theoretically impossible to correct mistakes in the embedded JS of any arbitrary workflow. The corresponding cwl snippet is also shown.
```
outputs:

topology_changed:
type: boolean
outputBinding:
glob: valid.txt
loadContents: true
outputEval: |
${
// Read the contents of the file
const lines = self[0].contents.split("\n");
// Read boolean value from the first line
const valid = lines[0].trim() === "True";
return valid;
}
```
```
stdout was: ''
stderr was: 'evalmachine.<anonymous>:45
const lines = self[0].contents.split("\n");
^
TypeError: Cannot read properties of undefined (reading 'contents')
```
To fix this the developer needs to add a javascript snippet to check if the self object being globbed exists, shown below.
```
outputs:

topology_changed:
type: boolean
outputBinding:
glob: valid.txt
loadContents: true
outputEval: |
${
// check if self[0] exists
if (!self[0]) {
return null;
}
// Read the contents of the file
const lines = self[0].contents.split("\n");
// Read boolean value from the first line
const valid = lines[0].trim() === "True";
return valid;
}
```

## Workflow Development
When adding new .cwl or .wic files its best to remove the .wic folder containing paths to .cwl and .yml files
```
rm -r ~/wic
```

## Singularity
When building images with Singularity its best to clean the cache to avoid potential errors with cwltool or cwl-docker-extract.
```
singularity cache clean
```

## Toil
When working with toil be sure to clean the working state as well as the configuration file, otherwise if you change input flags the configuration file will not be updated.
```
toil clean
rm -r ~/.toil
```
2 changes: 1 addition & 1 deletion docs/dev/installguide.md
Original file line number Diff line number Diff line change
Expand Up @@ -92,7 +92,7 @@ You should now have the `wic` executable available in your terminal.
To test your installation, you can run the example in README.md:

```
wic --yaml ../workflow-inference-compiler/docs/tutorials/helloworld.wic --run_local --quiet
sophios --yaml ../workflow-inference-compiler/docs/tutorials/helloworld.wic --run_local --quiet
```

You can also run the automated test suite. Note that the tests are based on the workflows; if you have more workflows, the tests will take longer.
Expand Down
2 changes: 1 addition & 1 deletion docs/index.rst
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
Workflow Inference Compiler documentation
Sophios documentation
=======================================================

.. toctree::
Expand Down
7 changes: 3 additions & 4 deletions docs/installguide.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,11 +2,10 @@

For pip users:

`pip install wic` # Please read the next sentence
`pip install sophios`

Unlike conda, **pip cannot install the binary system dependencies needed to actually run most workflows!**

If you want to actually run workflows, you (or your sysadmin) will have to manually install and configure additional software!
In order to execute the CWL workflows that are generated by `sophios`, `cwltool` and all of its underlying dependencies need to be present in the system. Unfortunately
`pip` has no capability to resolve and install these dependencies. PLease refer to the `cwltool` [installation guide](https://cwltool.readthedocs.io/en/latest/#install) to prepare the system to run CWL workflows.

For conda users / developers:

Expand Down
10 changes: 10 additions & 0 deletions docs/tutorials/conditional_example.wic
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
steps:
toString:
in:
input: !ii 27
out:
- output: !& string_int
echo:
when: '$(inputs.message < "27")'
in:
message: !* string_int
Loading
Loading