-
Notifications
You must be signed in to change notification settings - Fork 230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix WSL permisson bug for renaming directories #1796
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1796 +/- ##
==========================================
+ Coverage 44.2% 44.22% +0.02%
==========================================
Files 83 83
Lines 21529 21539 +10
Branches 5644 5644
==========================================
+ Hits 9517 9526 +9
+ Misses 10959 10943 -16
- Partials 1053 1070 +17
Continue to review full report at Codecov.
|
rmgpy/rmg/main.py
Outdated
@@ -1268,7 +1268,16 @@ def make_seed_mech(self, first_time=False): | |||
if os.path.exists(seed_dir): # This is a seed from a previous RMG run. Delete it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why you delete the seed from a previous RMG run while you move the seed from a previous iteration to a temporary directory in case you run into errors. Probably I am not fully aware how restart and seed works, but what happens if I want to use a seed from a previous RMG run and do not copy it out of the seed folder? Will it be deleted at this point and my seed from the previous RMG run is gone for good? In this case wouldn't it be good to also just move the seed to a temporary directory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question! Currently, if the user restarts from the seed in this folder, we make a copy in a new folder called previous_restart
so that the seed doesn't get lost. See below.
Lines 488 to 499 in 6483418
# Copy the restart files to a separate folder so that the job does not overwrite it | |
restart_dir = os.path.join(self.output_directory, 'previous_restart') | |
core_restart = os.path.join(restart_dir, 'restart') | |
edge_restart = os.path.join(restart_dir, 'restart_edge') | |
filters_restart = os.path.join(restart_dir, 'filters') | |
util.make_output_subdirectory(self.output_directory, 'previous_restart') | |
shutil.copytree(self.core_seed_path, core_restart) | |
shutil.copytree(self.edge_seed_path, edge_restart) | |
os.mkdir(filters_restart) | |
shutil.copyfile(self.filters_path, os.path.join(filters_restart, 'filters.h5')) | |
shutil.copyfile(self.species_map_path, os.path.join(filters_restart, 'species_map.yml')) | |
If the user instead were to run a new RMG job from this folder the seed from the previous job would be lost. This was the behavior before I implemented the restart stuff, but I am willing to consider other options if you think this is worth while.
rmgpy/rmg/main.py
Outdated
@@ -1268,7 +1268,16 @@ def make_seed_mech(self, first_time=False): | |||
if os.path.exists(seed_dir): # This is a seed from a previous RMG run. Delete it | |||
shutil.rmtree(seed_dir) | |||
else: # This is a seed from the previous iteration. Move it to a temporary directory in case we run into errors | |||
os.rename(seed_dir, os.path.join(temp_seed_dir)) | |||
try: | |||
os.rename(seed_dir, temp_seed_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you considered the tempdirectory option? I am curious if it would have advantages in this case. https://docs.python.org/3/library/tempfile.html#tempfile.tempdir
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had not, thanks for the suggestion! It looks interesting, but I am not sure if I can move existing files inside of a temporary directory. Since we already have the previous seed files in a non-temporary place, it should be much faster to rename/move the files (which does not actually move the data on the disk) rather than copying. So if we can move these files into the temporary directory without making a copy then it might be interesting to implement. The only added advantage I think is just the automatic cleanup, but it would be cleaner code.
eb99475
to
62afb90
Compare
62afb90
to
b67fd8b
Compare
I have refactored how we handle the first_time code in saving the seed mechanism. I need to verify still that this change doesn't break functionality but I think it is at least ready for a first review. @mliu49 do you also have time to take a quick look? |
rmgpy/rmg/main.py
Outdated
os.rename(seed_dir, os.path.join(temp_seed_dir)) | ||
try: | ||
os.rename(seed_dir, os.path.join(temp_seed_dir)) | ||
except PermissionError: # The Windows Subsystem for Linux (WSL) can have problems with renaming |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know if the permissions error is with overwriting or renaming? For example, could the issue potentially be resolved by deleting the temp_seed_dir before renaming?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I have seen I believe this is an issue with os.rename
itself in the WSL, and occurs regardless of if we are trying to overwrite an existing directory. In fact, in this function the renaming is not being used to overwrite an existing directory. If I can find the links I based this on I will post them here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't find the original issues I was looking at, but these links seem relevant:
https://github.com/microsoft/WSL/issues/1529
https://github.com/microsoft/WSL/issues/3395
https://github.com/microsoft/vscode-remote-release/issues/208
https://github.com/Microsoft/WSL/issues/3738
https://code.visualstudio.com/docs/remote/wsl#_i-see-eaccess-permission-denied-error-trying-to-rename-a-folder-in-the-open-workspace
https://github.com/Microsoft/WSL/issues/2097
I am putting these in a code block so that this PR doesn't get linked to all of these issues
rmgpy/rmg/main.py
Outdated
with open(os.path.join(self.output_directory, 'restart_from_seed.py'), 'w') as f: | ||
f.write('restartFromSeed(path=\'seed\')\n\n') | ||
with open(self.input_file, 'r') as input_file: | ||
f.write(''.join(input_file.readlines())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you do f.write(input_file.read())
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks for the catch!
rmgpy/rmg/main.py
Outdated
causes RMG to make a seed mechanism out of the current chem_annotated.inp and species_dictionary.txt | ||
this seed mechanism is outputted in a seed folder within the run directory and automatically | ||
added to as the (or replaces the current) 'Seed' thermo and kinetics libraries in database | ||
|
||
`initialize_seed_mech` should be called once before this function is ever called. | ||
|
||
This also writes the filter tensors to the `filters` sub-folder for restarting an RMG job from a seed mechanism |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you reword this docstring? It's written a bit weirdly and isn't really accurate.
I think it should start with "This method saves a seed mechanism..." or "Save a seed mechanism...". It is also not generating it from the chem_annotated.inp and species_dictionary.txt but rather the actual reaction model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated! I should have updated this the last time I modified this function in a PR
WSL can occasionally throw permission errors when trying to rename non-empty directories, even when the user owns all of the files and directories.
Move all first_time only functions to a separate function that is called during the job initialization. This improves performance and readability
b67fd8b
to
5839d70
Compare
I'm not sure if this is related to the changes in this PR, but I just tested with the superminimal example, and running the restart job leads to the following error:
|
This probably means that the filters were never saved. Which iteration was the restart file from (i.e. when you ran the super-minimal job the first time, how many iterations did you let it go through before you stopped the job?) I had tested this on the minimal example and it seemed to work, so I am surprised that the super-minimal job is having this error. I might need to re-think how I test the restart functionality |
Quick update: I am able to replicate the issue on my end. I ran the job to completion and then restarted the job, which should work. I am looking into what is going on now. |
I think I have figured it out. The super-minimal job does not use filtering, so the filters are never saved. This should be handled by the try/except block in 1561-1593 of main.py, but the error thrown is an AttributeError instead of a key error. I am pretty sure that this would have been tested when we did the original restart PR, but perhaps it was overlooked. Additionally, it is possible that the behavior of h5py has changed so that it throws a different error. Either way, this issue is not a result of this PR, but it should be an easy fix. @mliu49 would you prefer me to include the necessary changes in this PR or a separate one? I am planning on adding a commit with the fix, along with some functional tests for restarting the minimal job and the super-minimal job |
Please open a separate PR. If you've confirmed that this PR does not change behavior, then I think we can go ahead and merge this. |
I just checked current master and can confirm that this issue also affects it, so this error is not a result of the changes in this PR. I'll open up an issue and a separate PR to fix this. Thanks for the catch @mliu49 ! |
Motivation or Problem
fixes #1786
As noted in the issue this fixes, the Windows Subsystem for Linux sometimes throws a permission error
when trying to renaming non-empty directories, even when the user has the permissions set properly on the directories and files to do so. This is problematic, because we rename the seed directory folder temporarily every time we save a new seed mechanism so that the previous seed mechanism is not lost in case we encounter an error when saving the new one. Renaming the directory does not require copying large amounts of data, so this is the preferred solution in general, but as noted can cause problems for WSL users.
There appear to be several issues opened over a few years complaining about this issue if I am not mistaken, so it appears to be a reoccurring problem. Given this, it is best to handle this case ourselves.
Description of Changes
A try/except statement was added that avoids renaming the directory in the event of a permission error and instead copies the directory to a new location, which does not throw a permission error as far as I can tell in WSL. Note that this solution is much slower than renaming, so it should only be done as a last resort, which is exactly what we have implemented here.
Testing
See the issue. We have tested this branch to confirm that at the very least the WSL bug goes away. It might be worth double checking that the files are copied over and then later deleted as desired.