Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes for output discovery #17266

Merged
merged 2 commits into from
Jan 11, 2024

Conversation

bernt-matthias
Copy link
Contributor

@bernt-matthias bernt-matthias commented Jan 11, 2024

Fixes: #17263 .. guess both bugs are there forever (>10 years)

  1. primary_output_assigned needs to be reset for each output

    lets say there are two outputs: datasets (a normal dataset)
    and discovered (discovered datasets). if datasets is
    processed after discovered then primary_output_assigned
    is still true from the first loop (where discovered was precessed).
    this leads to overwriting of info (e.g. name) of the dataset output
    (in the if primary_output_assigned branch at the end of the outer
    loop).

  2. assign_primary_output must be obeyed for all outputs (not just the
    first).

    sticking to the above example, assign_primary_output would
    work only if dataset is processed after discovered

I added a test case for the 2nd part. No idea how to test the first (I manually tested it with planemo serve).

#16203 is still broken

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@github-actions github-actions bot added this to the 24.0 milestone Jan 11, 2024
1. `primary_output_assigned` needs to be reset for each output

   lets say there are two outputs: `datasets` (a normal dataset)
   and `discovered` (discovered datasets). if `datasets` is
   processed after `discovered` then `primary_output_assigned`
   is still true from the first loop (where `discovered` was precessed).
   this leads to overwriting of info (e.g. name) of the `dataset` output
   (in the `if primary_output_assigned` branch at the end of the outer
   loop).

2. assign_primary_output must be obeyed for all outputs (not just the
   first).

   sticking to the above example, `assign_primary_output` would
   work only if `dataset` is processed after `discovered`
@jmchilton
Copy link
Member

This looks right to me. I wonder if this was meant as a niche feature that should have just not been documented and only worked with one output - but given the investment we've put into it, I think this looks really good.

@jmchilton jmchilton merged commit b3c256d into galaxyproject:dev Jan 11, 2024
54 checks passed
@bernt-matthias bernt-matthias deleted the topic/assign_primary branch January 11, 2024 14:18
@bernt-matthias
Copy link
Contributor Author

Thanks for the review. Any suggestions regarding backporting?

@mvdbeek
Copy link
Member

mvdbeek commented Jan 15, 2024

First check that this didn't cause any regressions ? And then I'd say not far, since there can't really be any tools relying on this. 23.1 ?

@bernt-matthias bernt-matthias restored the topic/assign_primary branch January 15, 2024 10:58
@bernt-matthias
Copy link
Contributor Author

To me 23.2 would be sufficient. Most important to me is that galaxyproject/tools-iuc#5699 can continue. Wondering if discover_datasets or from_workdir is prefered to cover the case where a single file is to be discovered via a some pattern.

@mvdbeek
Copy link
Member

mvdbeek commented Jan 15, 2024

discover_datasets is always inferior to something static like from_workdir or mv thing*.txt $output

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

discover_datasets broken if the tool has other outputs
3 participants