-
Notifications
You must be signed in to change notification settings - Fork 192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix bug in upload_calculation
for CalcJobs
with local codes
#3707
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I assume you checked that the test fails before the fix?
aiida/engine/daemon/execmanager.py
Outdated
with NamedTemporaryFile(mode='wb+') as handle: | ||
handle.write(code.get_object_content(f, mode='rb')) | ||
handle.flush() | ||
handle.seek(0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the seek
really necessary here? Doesn't transport.put
open it itself anyway? If it's necessary, might warrant a comment as to why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, it is not necessary, I will remove it. The code is a copy of a block a few lines lower. There I already put a comment that it should be improved using node.open
once transport.put
supports filelike objects to be passed, which is issue #2579 . I decided not to copy the comment again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed the PR because of the request, but I am adding the disclaimer that I'm still not very familiar with the objects handled here, so take my comments with a grain of salt.
_, node = launch.run_get_node(ArithmeticAddCalculation, **inputs) | ||
uploaded_files = os.listdir(node.dry_run_info['folder']) | ||
for filename in self.local_code.list_object_names(): | ||
self.assertTrue(filename in uploaded_files) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I seem to understand that the test is comparing the files uploaded (node.dry_run_info['folder']
) with the ones set up in the code (local_code.list_object_names()
), but I couldn't find anything that tests that the configuration is correct (i.e. line 46 orm.Code(...)
), or at least couldn't grep any orm.Code
in aiida/backends/tests/orm/*
and I didn't find anything in aiida/backends/tests/engine/test_process_builder
. So if there is a problem with that it might not trigger this assertion? Perhaps we could either add another test or compare explicitly with the expected files?
handle.write(code.get_object_content(f, mode='rb')) | ||
handle.flush() | ||
handle.seek(0) | ||
transport.put(handle.name, f) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be because I'm still not familiar with transport
and code
objects, but from the current commit description (which only mentions that get_folder_list
no longer exists) I don't understand why the change in the parameters of transport.put(...)
. It seems to change the loop over folders for a loop over objects (file names, correct?), but then I don't understand why you make a temporary file and pass the name of that instead of something related to the looping objects. Also, I don't understand the process of writing content, flushing and rewinding said temporary file.
Again, I'm not sure if this merits any code comment, but maybe just a brief explanation for this changes in the commit message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That change is because the new API doesn't expose directly an absolute path to the file to be put on the remote computer (which wouldn't be possible on an object store backend). Since transport.put
wants a filename, the content is first copied to a temporary file.
This part was clear to me, but I have been doing a fair bit of migrating between these two.. so maybe the commit could indeed be more explicit about that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will include the reasoning for this and a reference to #2579 which will allow this hack to be removed.
Yes, it failed. |
@@ -139,8 +139,12 @@ def upload_calculation(node, transport, calc_info, folder, inputs=None, dry_run= | |||
for code in input_codes: | |||
if code.is_local(): | |||
# Note: this will possibly overwrite files | |||
for f in code.get_folder_list(): | |||
transport.put(code.get_abs_path(f), f) | |||
for f in code.list_object_names(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't work if list_object_names
contains directories, right? Is that ensured not to happen elsewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it should never contain directotries. If you look at the Code.set_files
implementation, which is called to store the files of the local code, it will copy all the files to top level without nesting.
def set_files(self, files):
"""
Given a list of filenames (or a single filename string),
add it to the path (all at level zero, i.e. without folders).
Therefore, be careful for files with the same name!
I just noticed that I changed that code when the repository interface changed, and I am not sure it does 100% the same. The new implementation checks os.path.isfile
which wasn't there explicitly. Old implementation:
for f in files:
self.repository.add_path(f, os.path.split(f)[1])
However that also does not seem compatible with the docstring which says that you can pass both filenames and directories. I don't think that add_path
was doing the correct hing either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, keeping it as files-only seems like a safe thing to do for now -- we can always add directory compatibility if it's needed. Might be good to assert it in the current piece of code, so it's a bit more explicit this is not supported?
Side note: This Therefore, be careful for files with the same name!
could easily be improved by just checking first there are no duplicate file names, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but that seems a job for another time. If ever anyone actually starts even using local codes..
This code path was not tested at all and so the remaining occurrence of `get_folder_list` which is part of the old repository interface that was removed in `v1.0.0` went unnoticed. The fix was forced to write files to temporary files and flush them instead of using a filelike object, because `Transport.put` only accepts absolute filepaths for the moment.
66653f2
to
422c0ff
Compare
Fixes #3705
This code path was not tested at all and so the remaining occurrence of
get_folder_list
which is part of the old repository interface that wasremoved in
v1.0.0
went unnoticed.