PB-617 Introduce parallel bucket setup #423
Merged
schtibe force-pushed the feat-pb-617-parallel-buckets branch from 37b7394 to 4814b05 on June 20, 2024 14:59
schtibe force-pushed the feat-pb-617-parallel-buckets branch 24 times, most recently from 8c48f72 to 19449d3 on June 24, 2024 12:08
schtibe commented Jun 24, 2024
schtibe commented Jun 24, 2024
schtibe force-pushed the feat-pb-617-parallel-buckets branch from 19449d3 to 9d5bea6 on June 24, 2024 12:11
schtibe commented Jun 24, 2024
These patterns have to go to the managed S3 bucket instead of the existing (legacy) one.
The variables weren't pointing to the correct settings, so pylint in VS Code wasn't working properly.
The env wasn't loaded properly in pylint, so some of the configuration couldn't be found. This led to errors when VS Code executed pylint, and as a result the lint errors weren't shown in the editor.
Parameterized is needed for parametrizing unit tests.
Test whether the correct bucket is selected based on the given patterns.
The parallel bucket tests don't touch the API, therefore it doesn't make sense to have them in either tests_10 or tests_09
We need django-environ for parsing the list for the managed_patterns. Since we have this library, we can safely drop our own implementation that parses strings from the environment and use django-environ
Have the means to provide a list of (regex) patterns in the environment for selecting the correct s3 bucket
Have an extendable dict for the configuration of the buckets. This simplifies the settings access and makes it more versatile
Split the boto3 storage implementation into two separate versions in order to target the correct s3 bucket
Introduce a new file field that dynamically selects the storage of the file based on the collection's name.
Instead of having to use assume_role explicitly, we use the proper way of accessing the bucket via a service account.
Improve the names of the settings. Make them more unified (everything connected to s3 starts with S3_).
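As a rough illustration of the commits above, here is a minimal sketch of selecting a bucket from a list of regex patterns and a dict-based bucket configuration. The names `S3_BUCKETS`, `MANAGED_PATTERNS`, and `select_bucket`, the example patterns, and the use of `re.search` are all assumptions made for this sketch; they are not taken from the service-stac code.

```python
import re

# Hypothetical bucket configuration: one entry per bucket type, so further
# buckets can be added without touching the selection logic.
S3_BUCKETS = {
    "legacy": {"bucket_name": "legacy-bucket"},
    "managed": {"bucket_name": "managed-bucket"},
}

# Hypothetical patterns; in a real deployment these would be read from the
# environment as a list.
MANAGED_PATTERNS = [r"^ch\.managed\.", r"-managed$"]


def select_bucket(collection_name, patterns=MANAGED_PATTERNS):
    """Return the key of the bucket a collection's files should go to."""
    # A collection goes to the managed bucket if any pattern matches its name.
    if any(re.search(pattern, collection_name) for pattern in patterns):
        return "managed"
    return "legacy"


def bucket_config(collection_name):
    """Look up the configuration dict for the selected bucket."""
    return S3_BUCKETS[select_bucket(collection_name)]
```

With these example patterns, a collection named `ch.managed.example` would resolve to the managed bucket configuration, while any non-matching name falls back to the legacy one.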
schtibe force-pushed the feat-pb-617-parallel-buckets branch 3 times, most recently from 4032e37 to 22d2f8d on July 8, 2024 16:13
It's impossible to reproduce the service-account situation with MinIO, so we access it the legacy way, with an access key.
schtibe force-pushed the feat-pb-617-parallel-buckets branch from 22d2f8d to 60cc43c on July 9, 2024 06:50
boecklic approved these changes Jul 9, 2024
🪣 🪣 🎉
(approved after intense personal discussion)
Dual bucket setup
These changes allow for having two buckets in service-stac. Some of the uploads will go to the "managed" bucket based on the file patterns.
The changes concern the multipart upload. Depending on the collection, the presigned URLs are generated for the corresponding bucket.
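To make the idea concrete, the following is a small sketch of generating one presigned URL per part of a multipart upload against whichever bucket was selected. The function name `presign_part_urls` and its parameters are hypothetical; the client is assumed to be a boto3 S3 client (or anything offering the same `generate_presigned_url` method), and this is not the actual service-stac implementation.

```python
def presign_part_urls(s3_client, bucket_name, key, upload_id,
                      number_of_parts, expires_in=3600):
    """Return one presigned 'upload_part' URL per part, in part order."""
    urls = []
    # S3 part numbers start at 1.
    for part_number in range(1, number_of_parts + 1):
        url = s3_client.generate_presigned_url(
            "upload_part",
            Params={
                "Bucket": bucket_name,
                "Key": key,
                "UploadId": upload_id,
                "PartNumber": part_number,
            },
            ExpiresIn=expires_in,
        )
        urls.append(url)
    return urls
```

Passing the client and bucket name in as arguments keeps the function independent of how the bucket was chosen, so the same code can serve both the legacy and the managed bucket. With boto3, the `upload_id` would typically come from `create_multipart_upload(Bucket=..., Key=...)["UploadId"]` on the same client.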
Other things included in this PR: