Collection jobs record dataset collection elements instead of HDAs #15729
Comments
Nah, we just don't know how to display non-dataset elements in the UI. I don't think that's why you can't rerun, but if you can share a history that'd be helpful. |
What do you mean. It is supposed to be a dataset element - that's kind of the point. |
SRRblabla is a dataset collection element, this has been that way ever since we've shown input datasets in the job info page. But if you can share a history that'd be great. |
Which element in that history should I look at? |
Any from any simple list. They are all broken. |
That's better of course, puh. So just an annoying bug then. |
Why is this only happening for WFs then, and not for regular job runs on collections? |
It also happens (and has been happening) for normal jobs, just less frequently. |
I'm afraid I don't follow. Why is this about frequencies? Shouldn't it be either/or? |
We used to track the HDA associated with a DCE. You've probably noticed that if you re-run a job from a list, you'd re-run on the HDA, because the element identifier of the element you consume is gone. This change fixes that (among other things). We've always tracked DCEs when the input parameter was a collection and the actual input was a nested collection (e.g. a list input on a list:list HDCA). We have now removed the special casing for HDAs and pass through the element instead. So now you see this more often. |
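To make the change described above concrete, here is a minimal sketch of the old and new recording behaviour. It is not Galaxy's actual code: the `HDA` and `DCE` classes and the `recorded_input_old` / `recorded_input_new` functions are hypothetical stand-ins, and they assume that a leaf element wraps exactly one dataset while a nested element wraps child elements.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class HDA:
    """Hypothetical stand-in for a history dataset."""
    id: int
    name: str


@dataclass
class DCE:
    """Hypothetical stand-in for a dataset collection element."""
    element_identifier: str
    hda: Optional[HDA] = None                      # set for leaf elements
    child_elements: List["DCE"] = field(default_factory=list)  # set for nested elements


def recorded_input_old(element: DCE) -> Union[HDA, DCE]:
    # Old behaviour (assumed): leaf elements are special-cased and only the
    # HDA is recorded, so the element identifier is lost on rerun.
    if element.hda is not None:
        return element.hda
    # Nested elements were already recorded as elements.
    return element


def recorded_input_new(element: DCE) -> DCE:
    # New behaviour (assumed): no special casing, the element is passed through.
    return element


leaf = DCE("SRR1", hda=HDA(1, "SRR1.fastq"))
nested = DCE("sample1", child_elements=[leaf])
```

Under these assumptions, the old path returns the bare dataset for `leaf` but the element for `nested`, while the new path returns the element in both cases, which is why elements now show up as job inputs more often.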
I see, but I thought I had tested earlier today that if you manually run a tool over a collection (not an individual job), the UI has no issue with showing the provenance correctly (that's actually why I thought it can't be the UI in the first place). On my phone now so hard to check again, but that's what I think I remember. |
I'll check that, but this should be all the same. |
Addresses part of galaxyproject#15729
Any chance that this can still get fixed before Smorgasbord 3? |
Yeah, I'll pick this back up. |
So I think I fixed the job info page now. Is rerunning these jobs really broken, or did it just seem that way because of the text where you'd expect a dataset? The import of your example history takes forever, but checking on my instance, that all seems to work correctly. |
It is really broken on .eu. I can create a minimal example there for you I guess. |
That would be helpful, thanks! |
Ok, WF to reproduce the issue: history created with it: You should be able to observe the following:
|
Should be fixed by #15744, I'll probably update .org later today |
Edit by @mvdbeek
Nothing is wrong here. We've improved the provenance tracking, but the UI and job rerun routes don't know how to deal with dataset collection elements. These are minor tweaks, not severe bugs.
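As a rough illustration of what handling a dataset collection element could mean for a display or rerun route, the sketch below resolves a recorded input back to its datasets and builds a label from it. All names here (`HDA`, `DCE`, `datasets_for`, `display_label`) are hypothetical and do not reflect Galaxy's real models or API; the sketch assumes a leaf element wraps one dataset and a nested element wraps child elements.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class HDA:
    """Hypothetical stand-in for a history dataset."""
    id: int
    name: str


@dataclass
class DCE:
    """Hypothetical stand-in for a dataset collection element."""
    element_identifier: str
    hda: Optional[HDA] = None
    child_elements: List["DCE"] = field(default_factory=list)


def datasets_for(recorded: Union[HDA, DCE]) -> List[HDA]:
    """Return every dataset reachable from a recorded job input."""
    if isinstance(recorded, HDA):
        return [recorded]
    if recorded.hda is not None:
        return [recorded.hda]
    found: List[HDA] = []
    for child in recorded.child_elements:
        found.extend(datasets_for(child))
    return found


def display_label(recorded: Union[HDA, DCE]) -> str:
    """Build a label that keeps the element identifier visible."""
    if isinstance(recorded, HDA):
        return recorded.name
    names = ", ".join(hda.name for hda in datasets_for(recorded))
    return f"{recorded.element_identifier} ({names})"
```

A route written along these lines would accept either kind of recorded input, so the job info page could show the underlying dataset and a rerun could keep the element identifier.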
Describe the bug
When you run any workflow that maps over collections, Galaxy records as the input for each collection element not the dataset ID, but a malformed version of the element identifier. With that, the possibility to rerun jobs is completely lost!
Galaxy Version and/or server at which you observed the bug
Galaxy Version: 23.0rc1 on .org and .eu
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Each element should have its input dataset from the input collection recorded, but in reality what gets recorded is what's shown below
Screenshots