File output of multiple tasks? #110
One of our users in Rennes mentioned that he had problems related to the fact that, for some DAX or JSON workflows, a single file is the output of multiple tasks! This causes problems: he would see weird behaviors in FileRegistry services, since one task would produce a file called "foo", and then another task would find it already registered somewhere even though it is supposed to produce it.

I just assumed those were bogus workflow descriptions, and added a simple check in the WorkflowTask::addOutputFile method to throw an exception if a file is already the output of another task. To my surprise, that broke one of our tests! The JSON workflow in the WorkflowLoadFromJSONTest.LoadValidJSON test exhibits exactly this behavior. The error message from my exception is explicit:

```
Trying to set file 'columns.txt' as output of task 'individuals_ID0000002', but this file is already the output of task 'individuals_ID0000001'
```

So, is it valid to have a single file be the output of multiple tasks??? Or is the JSON in that test actually bogus?

Comments
That should not be the case, so it is definitely a bug. On a side note, when implementing the WorkQueue simulator, I noticed that WRENCH does not allow an input file to be used by more than one job. Is there a reason for that? |
An input file cannot be used by more than one job? Perhaps you mean "one task"? But regardless, that should not be the case, and I am quite sure that we have tests in which this is the case... |
I removed the addFileToMap() helper function, which was useless: it was used only twice, and it tried to do too much, which led to problems. (This is from way back when I was learning C++ :) ) I double-checked that the code allows a single file to be input to multiple tasks, but checks that a file cannot be the output of multiple tasks. So I believe the code does the right thing now. This is not committed yet (see below).

But now the test WorkflowLoadFromJSONTest.LoadValidJSON fails, because in that JSON we have a file that's the output of multiple tasks. More specifically, the file "columns.txt" is listed as the output of 40+ tasks!!! (For instance, it is the output of both individuals_ID0000001 and individuals_ID0000002.) So, I am not sure what to do. For the test, of course, we can rename this file in 40+ different ways. But since this is a real Pegasus JSON file, I am concerned. Is this allowed???? And if it is, what does it mean... |
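For concreteness, here is a minimal, self-contained sketch of the invariant being enforced, using small stand-in classes rather than the real WRENCH ones (the `output_of` field and the method bodies are hypothetical; the actual implementation differs):

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Stand-ins for WRENCH's WorkflowFile/WorkflowTask (hypothetical fields;
// the real classes are much richer).
struct WorkflowTask;

struct WorkflowFile {
    std::string id;
    WorkflowTask *output_of = nullptr;  // the unique producing task, if any
};

struct WorkflowTask {
    std::string id;
    std::map<std::string, WorkflowFile *> input_files;   // many tasks may read a file
    std::map<std::string, WorkflowFile *> output_files;  // only one task may write it

    explicit WorkflowTask(std::string task_id) : id(std::move(task_id)) {}

    // A file may be input to any number of tasks.
    void addInputFile(WorkflowFile *f) { input_files[f->id] = f; }

    // But a file may be the output of at most one task.
    void addOutputFile(WorkflowFile *f) {
        if (f->output_of != nullptr && f->output_of != this) {
            throw std::invalid_argument(
                "Trying to set file '" + f->id + "' as output of task '" + id +
                "', but this file is already the output of task '" +
                f->output_of->id + "'");
        }
        f->output_of = this;
        output_files[f->id] = f;
    }
};
```

With such a guard, a file can appear in any number of input_files maps, but setting it as the output of a second task throws exactly the kind of exception quoted in this issue.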
Just merged a branch into master that addresses the above in the following way:
The changes above likely break the wrench-pegasus implementation, because the code in tools/pegasus currently ignores the "auxiliary" and "transfer" tasks (there are TODOs in there). Furthermore, the WorkflowLoadFromJSONTest.LoadValidJSON test is now disabled, as it is also broken. What remains to be done is:
|
I've been thinking about this, and thought I'd write down my thoughts here so as not to forget. The WRENCH model: the workflow consists of tasks that read/write files, thus defining data dependencies, and can also incorporate control dependencies that are not data dependencies. In Pegasus workflows we find, in addition to the above, "auxiliary" and "transfer" tasks that are not part of the abstract workflow.
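To make the model concrete, here is roughly what it looks like through the WRENCH API (a sketch only: the task/file names and numeric arguments are placeholders, and the exact addTask signature has varied across WRENCH versions):

```cpp
#include <wrench-dev.h>

// Sketch of the WRENCH workflow model: tasks, files, data dependencies,
// and an explicit control dependency.
void buildExampleWorkflow() {
    wrench::Workflow workflow;

    // Two compute tasks (flop counts and core counts are placeholders).
    auto *t1 = workflow.addTask("task_A", 1000.0, 1, 1, 1.0);
    auto *t2 = workflow.addTask("task_B", 1000.0, 1, 1, 1.0);

    // Files (placeholder sizes in bytes).
    auto *in  = workflow.addFile("input.txt", 1000.0);
    auto *out = workflow.addFile("output.txt", 1000.0);

    // Data dependencies: a file may be input to many tasks...
    t1->addInputFile(in);
    t2->addInputFile(in);
    // ...but should be the output of only one task.
    t1->addOutputFile(out);

    // A control dependency that is not implied by any data dependency.
    workflow.addControlDependency(t1, t2);
}
```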
Part of the impedance mismatch above is that Pegasus workflows encode implementation decisions applied to abstract workflows. This really cannot be part of WRENCH, as it is not WMS-generic but a Pegasus-specific approach.

Goal: how do we keep WRENCH simple/generic, yet make it possible to simulate Pegasus easily, AND allow people who do not care about Pegasus to import sensible Pegasus workflows (as abstract workflows), since Pegasus is a good source of workflows?

Questions: here is a list of questions, and if the answers are what I think they are, then at the end of this post I have a proposal for a solution.
Proposal:
The challenge for step #2 above is how to include this Pegasus-specific stuff in WRENCH without making WRENCH too Pegasus-specific. This was done before by creating special task types, but that seems strange because it wouldn't be relevant to other WMSs. The WorkflowTask is now just one type. BUT, the WMS developer can of course extend the WorkflowTask class and encode any workflow-specific things (e.g., auxiliary/transfer/whatever tasks). And so, a Pegasus simulator will inspect the dynamic type and take appropriate action if the task is not a "normal" WRENCH task. |
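For example (a sketch with a minimal stand-in base class; PegasusAuxiliaryTask and its field are hypothetical names a Pegasus simulator might choose, not part of WRENCH):

```cpp
#include <iostream>
#include <string>

// Minimal stand-in for WRENCH's WorkflowTask base class.
struct WorkflowTask {
    std::string id;
    virtual ~WorkflowTask() = default;
};

// Hypothetical subclass carrying Pegasus-specific task information.
struct PegasusAuxiliaryTask : WorkflowTask {
    std::string kind;  // e.g., "auxiliary" or "transfer"
};

// The simulator inspects the dynamic type and takes appropriate action
// when a task is not a "normal" WRENCH task.
void handleTask(WorkflowTask *task) {
    if (auto *aux = dynamic_cast<PegasusAuxiliaryTask *>(task)) {
        std::cout << "Pegasus-specific " << aux->kind << " task: " << aux->id << "\n";
        // ... e.g., simulate a file transfer or registration step
    } else {
        std::cout << "Normal compute task: " << task->id << "\n";
        // ... e.g., submit it as a regular WRENCH job
    }
}
```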
Here are some comments on your questions and proposal:

Questions
Proposal
|
OK, this all sounds good:
I'll start working on implementing the above, and see where that leads us. |
Looking at a few DAX files, I don't see any "auxiliary" or "transfer" jobs like in the JSON. Is it the case that DAX files only show abstract workflows? |
The DAX used for simulation is an extension of the actual abstract workflow – i.e., it only contains the compute jobs, extended with task runtimes and, in some cases, the sizes of input/output files. The JSON file is extracted from the actual execution, so it contains all tasks of the executable workflow. |
I just pushed changes to master (and, again, forgot to include the issue number in the commit message). Given that DAX and JSON files are qualitatively different, the API now has 3 methods, roughly along the lines sketched below:
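A hedged sketch of what these three methods might look like (the names and signatures below are illustrative guesses, not the actual WRENCH API):

```cpp
#include <string>

// Illustrative guesses at the three loaders (hypothetical names).
class Workflow {
public:
    // 1. Load an abstract workflow from a Pegasus DAX file
    //    (compute tasks only).
    void loadFromDAX(const std::string &filename);

    // 2. Load a workflow from a Pegasus JSON execution trace as an
    //    abstract workflow, stripping auxiliary and transfer tasks.
    void loadFromJSON(const std::string &filename);

    // 3. Load the full executable workflow from the JSON trace,
    //    keeping the Pegasus-specific tasks (to be implemented).
    void loadExecutableFromJSON(const std::string &filename);
};
```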
The first two methods above are implemented (with the JSON one removing auxiliary and transfer tasks... we'll see how it needs to be modified if/when it fails on some workflows). The 3rd one is still to be implemented, and we'll have to figure out how to do it. Here are some options:
|
After discussion with Rafael, option 1 above is the best choice. So, what remains to be done:
|
I am closing this issue since the Pegasus simulator (https://github.com/wrench-project/pegasus) has a parser (…). |