Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scatter over maps #106

Closed
antonkulaga opened this issue Apr 28, 2017 · 20 comments
Closed

scatter over maps #106

antonkulaga opened this issue Apr 28, 2017 · 20 comments
Labels
enhancement An enhancement to the WDL language.

Comments

@antonkulaga
Copy link

It would be nice to get methods to iterate over maps with scatter. It will also be good to have access to map.keySet and map.valueSet to be able to process keys or values of maps separately.

@vdauwera
Copy link

Totally agree with this -- input in a ticket to be able to get map keys and values a while back.

@vdauwera vdauwera added the enhancement An enhancement to the WDL language. label Aug 6, 2017
@cjllanwarne
Copy link
Contributor

cjllanwarne commented Jan 8, 2018

Just noticed this. I totally agree (and actually Cromwell already allows this so it's hopefully possible to just add to the spec and move straight to "implemented")

@cjllanwarne
Copy link
Contributor

eg in Cromwell you can do this:

workflow foo {
  Map[String, Int] map
  scatter (pair in map) {
    String key = pair.left
    Int value = pair.right
  }

  output {
    # Automatically gathered from inside the scatter:
    Array[String] keys = key
    Array[Int] values = value
  }
}

@patmagee
Copy link
Member

patmagee commented Mar 8, 2018

@cjllanwarne @geoffjentry is this something thats already supported as @cjllanwarne suggest? does it just need documentation?

@antonkulaga
Copy link
Author

Yes, I opened that issue a year ago when such feature did not exist. Now it works, I will check docs, if docs are fine I will close this

@orodeh
Copy link
Contributor

orodeh commented Apr 25, 2019

@cjllanwarne, this doesn't work on Cromwell v40. Could you take a look please?

@DavyCats
Copy link
Contributor

Just as a note here, the spec seems to specifaclly indicate the scattering can only be done over Arrays:

The $scatter_iteration_statement has two parts: the "item" and the "collection". For example, scatter(x in y) would define x as the item, and y as the collection. The item is always an identifier, while the collection is an expression that MUST evaluate to an Array type. The item will represent each item in that expression. For example, if y evaluated to an Array[String] then x would be a String.

This seems to be the case for draft-2, 1.0 and development.

@EvanTheB
Copy link
Contributor

EvanTheB commented Jun 6, 2019

@cjllanwarne @orodeh Here is the error I get on 38:

java -jar /home/dice/tmp/womtool-38.jar validate joint-discovery-gatk3.wdl 
Failed to process workflow definition 'JointGenotyping' (reason 1 of 1): Invalid type for scatter variable 'chr': Map[String, Array[String]]```

From this:

Map[String, Array[String]] unpadded_intervals = read_json(output_intervals)
scatter (chr in unpadded_intervals) {

@jdidion
Copy link
Collaborator

jdidion commented Jun 12, 2019

@DavyCats I read that as the expression must be coerceable to an array type. The coercion table (https://github.com/openwdl/wdl/blob/master/versions/1.0/SPEC.md#type-coercion) suggests Array[T] can accept an array-like type in which elements are coercible to T. Map elements should be coercible to Pair[K, V]. Indeed, in the definition of the flatten function, the WDL spec explicitly states "...Map[X, Y] can be coerced to Array[Pair[X, Y]]."

@DavyCats
Copy link
Contributor

@jdidion The comment under the flatten function indeed supports the notion that Maps are Array-like and its elements can be coerced to Pairs, but this comment seems wildly out of place.

I don't believe the spec anywhere explicitly states that Maps are 'Array-like' (although I might be wrong). Array-like in the table refers to the various versions of list/array implementations in different programming languages, not to other types in WDL. ("The below table references String-like, Integer-like, etc to refer to values in a particular programming language.")
Also note that there is no explicit coercion between Map and Array mentioned in the type-coercion table, whereas the coercions between String and File are explicilty mentioned. So I don't think the section on type-coercion actually supports Map-Array coercions.

There seems to be a discrepancy (or at least some unclearity/ambiguity) here.

Please note that I have no issue with allowing coercion between Maps and Arrays, I just don't believe the 1.0 specs supports it. I actually added a number of functions to facilitate these coercions more explicilty in the developmental specs some time ago (#219).

@jdidion
Copy link
Collaborator

jdidion commented Feb 22, 2021

This is now solved by as_pairs and as_map functions in v1.1.

@jdidion jdidion closed this as completed Feb 22, 2021
@SHuang-Broad
Copy link

Is there a workaround in the meantime for folks using v1.0?
Thanks!

@jdidion
Copy link
Collaborator

jdidion commented Sep 2, 2021

AFAIK Cromwell supports iteration over maps in WDL 1.0. But keep in mind that will not be portable to other implementations.

@SHuang-Broad
Copy link

Hmm, that will be great, but

    Map[String, String] xxx = read_map(<2-col-tsv-file>)
    scatter (pair in xxx) {
        String key= pair.left
        String val = pair.right
    }

gives Invalid type for scatter variable 'pair': Map[String, String].
I'm using womtool v67 for validation.

@gileshall
Copy link

gives Invalid type for scatter variable 'pair': Map[String, String]. I'm using womtool v67 for validation.

Bump, this is still the case for v84. @jdidion any thoughts about @SHuang-Broad question about a workaround for 1.0?

@ba3lwi
Copy link

ba3lwi commented Feb 24, 2023

Bumping it again, supporting @gileshall comment for v85

@summerghw
Copy link

still got this.
java -jar /data1/tools/womtool-85.jar validate RNAseq.wdl
Failed to process workflow definition 'RNAseq' (reason 1 of 1): Invalid type for scatter variable 'pair': Map[String, String]

@patmagee
Copy link
Member

@summerghw @ba3lwi just a note that this is not the Cromwell repo and the Cromwell team may not regularly look here. This was also not explicitly solved until v1.1 of wdl which Cromwell does not yet support.

You are unlikely to get any support from them by commenting on this ticket. I would suggest you file a bug on their issue bird if you feel like this is a problem. Otherwise you could try using miniwdl check to see if it works?

@summerghw
Copy link

@summerghw @ba3lwi just a note that this is not the Cromwell repo and the Cromwell team may not regularly look here. This was also not explicitly solved until v1.1 of wdl which Cromwell does not yet support.

You are unlikely to get any support from them by commenting on this ticket. I would suggest you file a bug on their issue bird if you feel like this is a problem. Otherwise you could try using miniwdl check to see if it works?

@summerghw
Copy link

@patmagee thank you for reply. I use array(pair)slove the problom in wdl version 1.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement An enhancement to the WDL language.
Projects
None yet
Development

No branches or pull requests