Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite of storage backend using fs #145

Merged
merged 4 commits into from
Oct 8, 2020

Conversation

rhodes73
Copy link
Contributor

Thanks for creating imagetagger. It's a great tool. I use it together with Google Cloud Storage and I therefore did a rather large refactoring, abstracting the file storage layer of imagetagger.

Now many other filesystems like S3 or GCS are supported through PyFilesystem2 (see here for a list of supported file systems.)

There were no tests for the file storage layer and I did not write any new ones with this refactoring. I did some manual testing and am successfully using this filesystem abstraction in my Google Cloud deployment. Feel free to write some tests yourself. I might also add some later on.

I thought I contribute this potentially useful change back in case it is useful for other people. However, feel free to adapt or completely reject this pull request.

@rhodes73
Copy link
Contributor Author

There are still 3 open TODOs related to setting group permissions, where some input from @Akasch would be appreciated.

image_file.seek(0)
imageset_dir.upload(img_fname, image_file)
# TODO rfrigg: Check if necessary
# shutil.chown(file_new_path, group=settings.UPLOAD_FS_GROUP)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Akasch Can you explain why this is needed? I guess it might be needed for an OS filesystem, but not for a cloud filesystem like GCS. I can look into adding some code that adds this line back in case an OS filesystem is used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ftsell Shall we remove these TODOs since you tested everything and it works as expected?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes although I'm not 100% sure either, I think it is safe to do so.

@rhodes73
Copy link
Contributor Author

rhodes73 commented Jul 9, 2020

I'm closing this PR since there seems to be no interest in contributions.

@rhodes73 rhodes73 closed this Jul 9, 2020
@ftsell
Copy link
Member

ftsell commented Jul 9, 2020

It's not that we are not interested but rather that there is currently not much activity because of COVID. I agree that we should have at least responded though. Be assured that it was not meant as a personal offense.

I'm reopening this because we still are interested in integrating this feature.

@ftsell ftsell reopened this Jul 9, 2020
@rhodes73
Copy link
Contributor Author

rhodes73 commented Jul 9, 2020

@ftsell No offense taken :) Thanks for re-opening in this case.

@ftsell ftsell self-requested a review October 2, 2020 09:39
Copy link
Member

@ftsell ftsell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general these are really nice improvements and I appreciate the effort you have put in; thank you very much!

I have tested the proposed changes by running manage.py directly and inside a kubernetes deployment but only with a local file-system. I do not expect cloud based storage to behave any different since that is the whole point of using fs.
In both scenarios I was able to verify that uploading, downloading, viewing and editing of tools and images work as expected. The zip-daemon threw a minor error which was easily correctable.

I have some minor suggestions which mostly have to do with where settings are stored.

I have also attached patch files which merge master back into your branch and fix the problems/suggestions I have found.

Roman Frigg and others added 2 commits October 2, 2020 14:41
Now many other filesystems like S3 or GCS are supported through the fs
filesystem abstraction layer.
previously, the zip-daemon would sometimes run into an error when the
tmp-directory for a specific imageset did not already exist. A fix has
been included to create the directory in such a case.
@rhodes73
Copy link
Contributor Author

rhodes73 commented Oct 2, 2020

@ftsell I rebased my branch on the latest master and applied your patch 008 for the zip daemon. Patches 001 until 006 are not different from the commits on master and were automatically introduced because of the rebase. 007 is unrelated to this PR, so I didn't apply it. 009 is about moving things from base_settings.py to settings.py.example, but this patch failed when I tried to apply it. Maybe easier if you just add separate PRs for patches 007 and 009.

@ftsell
Copy link
Member

ftsell commented Oct 3, 2020

@roman-captural

007 is unrelated to this PR

Not exactly because the additional dependency is only required with the addition of fs. And because you are introducing the dependency to fs in your PR I would very much like to make sure that the application continues to work even in the docker deployment.

Opening another PR for improving the settings structure is something I can agree with so you can leave out patch 009 if you want to.

the dependency was only added to the Dockerfile and not requirements.txt
since it is present in non-minimized python environments.
@rhodes73
Copy link
Contributor Author

rhodes73 commented Oct 3, 2020

Not exactly because the additional dependency is only required with the addition of fs. And because you are introducing the dependency to fs in your PR I would very much like to make sure that the application continues to work even in the docker deployment.

I see, sorry about this. It wasn't clear from the commit message. I added another commit with patch 007

@ftsell
Copy link
Member

ftsell commented Oct 3, 2020

@roman-captural

It wasn't clear from the commit message.

true

@ftsell ftsell self-requested a review October 3, 2020 10:12
Copy link
Member

@ftsell ftsell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alright; looks fine to me 👍

I spoke with @timonegk and he wanted to do a review as well so that's why this is only a comment and not approval.

@timonegk
Copy link
Member

timonegk commented Oct 4, 2020

Rebasing 1a0264f introduced a small problem in settings.py.example: the paths given for DATA_PATH, TOOLS_PATH and IMAGE_PATH appear to be absolute paths but are instead interpreted relative to FS_URL in the code.

@ftsell
Copy link
Member

ftsell commented Oct 8, 2020

Rebasing 1a0264f introduced a small problem in settings.py.example: the paths given for DATA_PATH, TOOLS_PATH and IMAGE_PATH appear to be absolute paths but are instead interpreted relative to FS_URL in the code.

@timonegk is right but something simple like this should be enough though.

@rhodes73
Copy link
Contributor Author

rhodes73 commented Oct 8, 2020

@timonegk Thanks for your review and for testing this again after the rebase. I'm currently a bit short on time to look at it in detail. I applied @ftsell's patch (thanks!). I hope everything is functional now, so we can get this PR merged.

@timonegk
Copy link
Member

timonegk commented Oct 8, 2020

Looks great now, thank you for the feature and sorry again for the long delay!

@ftsell ftsell merged commit 6a4e5e0 into bit-bots:master Oct 8, 2020
@rhodes73 rhodes73 deleted the abstract-filesystems branch October 8, 2020 14:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants