Refactor store creation from URL #71

Merged
65 commits
399ec86
Rebase store creation onto main
Nov 7, 2022
8ccc4f2
precommit
Nov 7, 2022
28b481c
copy tests
Nov 7, 2022
00e8661
refactor imports
Nov 7, 2022
8222eeb
fix s3 test setup
Nov 7, 2022
ca78166
make tests pass
Nov 7, 2022
a56fd00
fix imports
Nov 7, 2022
d372280
fix boto tests
Nov 7, 2022
bc33155
document boto3
Nov 8, 2022
73a8601
Document gcstore and add live testing
Nov 8, 2022
f25bfb8
Add permissions to CI
Nov 8, 2022
2d7c040
where is the env variable?
Nov 8, 2022
905d8bc
hmmm?
Nov 8, 2022
fa73acf
base64 urlsafe
Nov 8, 2022
31f7a5c
what kind of credential file is this?
Nov 8, 2022
b0efb6c
try creation from id token
Nov 8, 2022
f3692aa
just use dict for auth
Nov 8, 2022
ae550df
load project from env
Nov 8, 2022
f33534b
Maybe load credentials like this?
Nov 8, 2022
0742e8e
simplest approach
Nov 8, 2022
6b00270
switch to id token??
Nov 8, 2022
7e4344b
add scopes
Nov 8, 2022
1ec0744
access token with scope?
Nov 8, 2022
f68af5b
use path for credentials
Nov 9, 2022
17dc1b5
idtoken instead?
Nov 9, 2022
a29e4c9
add audience
Nov 9, 2022
cc5737f
just google default?
Nov 9, 2022
879a9cf
another binding
Nov 9, 2022
05e919d
use random bucket name
Nov 9, 2022
78a6fd8
clean up gcs management
Nov 9, 2022
b3a7caf
complete testrun
Nov 9, 2022
7bd2e71
Check store equality
Nov 9, 2022
d295b2b
remove old stuff from ci.yml
Nov 9, 2022
c8b8fbd
Document gcstore
Nov 10, 2022
f96ef74
redis & documentation
Nov 10, 2022
bc7bf08
pre-commit
Nov 10, 2022
47d0564
Test more stores
Nov 10, 2022
6383b32
Only attempt google auth if we have permission to
Nov 10, 2022
6595f2a
precommit
Nov 10, 2022
c22088a
Documentation
Nov 11, 2022
1441b94
skip live tests if no credentials are available
Nov 11, 2022
fd5681c
Why can't we auth to google?
Nov 11, 2022
775c74c
wut
Nov 11, 2022
0c1ac6e
like this??
Nov 11, 2022
aa6225d
Check with -z
Nov 11, 2022
665bad3
Check for step outcome
Nov 11, 2022
acd597b
Put miniconda back in
Nov 11, 2022
0ac7d03
Document env variable check
Nov 11, 2022
ff637de
Merge branch 'simonmain' into new-store-creation
Jan 17, 2023
d3d550b
remove bucket creation when only creating reference
Jan 17, 2023
bb944bc
Add environment variable setting back in
Jan 18, 2023
a721347
trigger
Jan 18, 2023
a143936
test wrappers in more detail
Jan 18, 2023
73084c4
expand s3fs testing
Jan 18, 2023
a917f9c
Resolve old wrapper popping
Jan 19, 2023
392cb8c
Add init.py
Jan 19, 2023
dc52161
Fix splitting into host and port
Jan 19, 2023
106e4c1
fix testing imports
Jan 19, 2023
cbf8caa
Cleanup
Jan 19, 2023
b45467e
Update changelog
Jan 19, 2023
c96bf86
Remove live GCS testing
Jan 19, 2023
72522ba
Trigger CI
Jan 19, 2023
540bbaa
Trigger CI
Jan 19, 2023
ec3e27f
Don't create credentials if we have them in the environment
Jan 19, 2023
a3702cb
Move credential stuff to different branch
Jan 19, 2023
7 changes: 6 additions & 1 deletion .gitignore
@@ -110,5 +110,10 @@ Pipfile.lock

# minimalkv
store/
old/

# Exploratory code
exp.py
exploration_scripts/

# terraform
terraform/.terraform*
8 changes: 8 additions & 0 deletions docs/changes.rst
@@ -1,6 +1,14 @@
 Changelog
 *********

+1.7.0
+=====
+* Deprecated ``get_store``, ``url2dict``, ``_parse_userinfo``, and ``extract_params``.
+
+* ``get_store_from_url`` should be used to create stores from a URL
+
+* Added ``from_url`` and ``from_parsed_url`` to each store.
+
 1.6.0
 =====
1 change: 1 addition & 0 deletions environment.yml
@@ -6,6 +6,7 @@ dependencies:
   - azure-storage-blob
   - boto
   - boto3
+  - mypy-boto3-s3
   - docker-compose
   - dulwich
   - fsspec
4 changes: 2 additions & 2 deletions minimalkv/__init__.py
@@ -8,9 +8,9 @@
 from minimalkv._get_store import get_store, get_store_from_url
 from minimalkv._key_value_store import KeyValueStore, UrlKeyValueStore
 from minimalkv._mixins import CopyMixin, TimeToLiveMixin, UrlMixin
-from minimalkv._store_creation import create_store
+from minimalkv._old_store_creation import create_store
+from minimalkv._old_urls import url2dict
 from minimalkv._store_decoration import decorate_store
-from minimalkv._urls import url2dict

 try:
     import pkg_resources
24 changes: 17 additions & 7 deletions minimalkv/_boto.py
@@ -10,13 +10,23 @@ def _get_s3bucket(
     from boto.s3.connection import S3ResponseError  # type: ignore
     from boto.s3.connection import OrdinaryCallingFormat, S3Connection

-    s3con = S3Connection(
-        aws_access_key_id=access_key,
-        aws_secret_access_key=secret_key,
-        host=host,
-        is_secure=False,
-        calling_format=OrdinaryCallingFormat(),
-    )
+    s3_connection_params = {
+        "aws_access_key_id": access_key,
+        "aws_secret_access_key": secret_key,
+        "is_secure": False,
+        "calling_format": OrdinaryCallingFormat(),
+    }
+
+    # Split up the host into host and port.
+    if ":" in host:
+        host, port = host.split(":")
+        s3_connection_params["host"] = host
+        s3_connection_params["port"] = int(port)
+    else:
+        s3_connection_params["host"] = host
+
+    s3con = S3Connection(**s3_connection_params)
+
     # add access key prefix to bucket name, unless explicitly prohibited
     if force_bucket_suffix and not bucket.lower().endswith("-" + access_key.lower()):
         bucket = bucket + "-" + access_key.lower()
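For reviewers who want to try the host/port split in isolation, here is a minimal stand-alone sketch, assuming the input is a plain `host` or `host:port` string. `split_host_port` is a hypothetical helper, not part of minimalkv or boto. It uses `rpartition` where the diff uses `host.split(":")`, so only the last colon is treated as the port separator.

```python
from typing import Optional, Tuple


def split_host_port(host: str) -> Tuple[str, Optional[int]]:
    """Split "host:port" into (host, port); port is None when absent."""
    name, sep, port = host.rpartition(":")
    if not sep:
        # No colon at all: the whole string is the host name.
        return host, None
    return name, int(port)


print(split_host_port("localhost:9000"))
print(split_host_port("s3.example.org"))
```

A non-numeric port still raises `ValueError` from `int()`, which matches the behaviour of the code in the diff.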
138 changes: 132 additions & 6 deletions minimalkv/_get_store.py
@@ -1,18 +1,25 @@
 from functools import reduce
-from typing import Any
+from typing import Any, Dict, List, Optional, Type
+from warnings import warn
+
+from uritools import SplitResult, urisplit

 from minimalkv._key_value_store import KeyValueStore
-from minimalkv._urls import url2dict


-def get_store_from_url(url: str) -> KeyValueStore:
+def get_store_from_url(
+    url: str, store_cls: Optional[Type[KeyValueStore]] = None
+) -> KeyValueStore:
     """
     Take a URL and return a minimalkv store according to the parameters in the URL.

     Parameters
     ----------
     url : str
         Access-URL, see below for supported formats.
+    store_cls : Optional[Type[KeyValueStore]]
+        The class of the store to create.
+        If the URL scheme doesn't match the class, a ValueError is raised.

     Returns
     -------
@@ -52,7 +59,118 @@ def get_store_from_url(url: str) -> KeyValueStore:
         json_b64_encoded = base64.urlsafe_b64encode(b).decode()

     """
-    return get_store(**url2dict(url))
+    from minimalkv._hstores import (
+        HAzureBlockBlobStore,
+        HDictStore,
+        HFilesystemStore,
+        HGoogleCloudStore,
+        HS3FSStore,
+    )
+    from minimalkv.fs import FilesystemStore
+    from minimalkv.memory import DictStore
+    from minimalkv.memory.redisstore import RedisStore
+    from minimalkv.net.azurestore import AzureBlockBlobStore
+    from minimalkv.net.gcstore import GoogleCloudStore
+    from minimalkv.net.s3fsstore import S3FSStore
+
+    scheme_to_store: Dict[str, Type[KeyValueStore]] = {
+        "azure": AzureBlockBlobStore,
+        "hazure": HAzureBlockBlobStore,
+        "s3": S3FSStore,
+        "hs3": HS3FSStore,
+        "boto": HS3FSStore,
+        "gcs": GoogleCloudStore,
+        "hgcs": HGoogleCloudStore,
+        "fs": FilesystemStore,
+        "file": FilesystemStore,
+        "hfs": HFilesystemStore,
+        "hfile": HFilesystemStore,
+        "filesystem": HFilesystemStore,
+        "memory": DictStore,
+        "hmemory": HDictStore,
+        "redis": RedisStore,
+    }
+
+    parsed_url = urisplit(url)
+    # Wrappers can be used to add functionality to a store, e.g. encryption.
+    # Wrappers are separated by `+` and can be specified in two ways:
+    # 1. As part of the scheme, e.g. "s3+readonly://..." (old style)
+    # 2. As the fragment, e.g. "s3://...#wrap:readonly" (new style)
+    wrappers = extract_wrappers(parsed_url)
+
+    # Remove wrappers from scheme
+    scheme_parts = parsed_url.getscheme().split("+")
+    # pop off the type of the store
+    scheme = scheme_parts[0]
+
+    if scheme not in scheme_to_store:
+        raise ValueError(f'Unknown storage type "{scheme}"')
+
+    store_cls_from_url = scheme_to_store[scheme]
+    if store_cls is not None and store_cls_from_url != store_cls:
+        raise ValueError(
+            f"URL scheme {scheme} does not match store class {store_cls.__name__}"
+        )
+
+    query_listdict: Dict[str, List[str]] = parsed_url.getquerydict()
+    # We will just use the last occurrence for each key
+    query = {k: v[-1] for k, v in query_listdict.items()}
+
+    store = store_cls_from_url.from_parsed_url(parsed_url, query)
+
+    # apply wrappers/decorators:
+    from minimalkv._store_decoration import decorate_store
+
+    wrapped_store = reduce(decorate_store, wrappers, store)
+
+    return wrapped_store
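The dispatch in `get_store_from_url` can be tried out without minimalkv installed. The following is a rough sketch only: `Store`, `MemoryStore`, `FileStore`, `SCHEME_TO_STORE`, and `store_from_url` are hypothetical stand-ins, and the standard library's `urllib.parse` is swapped in for `uritools`. It shows the three steps the new code performs: look up the class by scheme, reject a mismatching `store_cls`, and keep the last occurrence of each query key.

```python
from typing import Dict, Optional, Type
from urllib.parse import SplitResult, parse_qs, urlsplit


class Store:
    """Hypothetical stand-in for KeyValueStore."""

    def __init__(self, query: Dict[str, str]) -> None:
        self.query = query

    @classmethod
    def from_parsed_url(cls, parsed_url: SplitResult, query: Dict[str, str]) -> "Store":
        return cls(query)


class MemoryStore(Store):
    pass


class FileStore(Store):
    pass


# Hypothetical registry; the real mapping is the scheme_to_store dict in the diff.
SCHEME_TO_STORE: Dict[str, Type[Store]] = {"memory": MemoryStore, "fs": FileStore}


def store_from_url(url: str, store_cls: Optional[Type[Store]] = None) -> Store:
    parsed = urlsplit(url)
    # Anything after the first "+" in the scheme is an old-style wrapper name.
    scheme = parsed.scheme.split("+")[0]
    if scheme not in SCHEME_TO_STORE:
        raise ValueError(f'Unknown storage type "{scheme}"')

    cls_from_url = SCHEME_TO_STORE[scheme]
    if store_cls is not None and cls_from_url is not store_cls:
        raise ValueError(
            f"URL scheme {scheme} does not match store class {store_cls.__name__}"
        )

    # parse_qs returns a list per key; keep only the last occurrence.
    query = {k: v[-1] for k, v in parse_qs(parsed.query).items()}
    return cls_from_url.from_parsed_url(parsed, query)


store = store_from_url("memory://?ttl=1&ttl=2")
print(type(store).__name__, store.query)
```

Wrapper handling is left out of this sketch on purpose; it only covers scheme lookup and query flattening.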


+def extract_wrappers(parsed_url: SplitResult) -> List[str]:
+    """
+    Extract wrappers from a parsed URL.
+
+    Wrappers allow you to add additional functionality to a store, e.g. encryption.
+    Wrappers are specified in the fragment part of the URL, e.g. "s3://...#wrap:readonly+urlencode"
+
+    Wrappers can also be specified as part of the scheme, e.g. "s3+readonly+urlencode://...".
+    This is deprecated and will be removed in a future version.
+
+    Parameters
+    ----------
+    parsed_url: SplitResult
+        The parsed URL.
+
+    Returns
+    -------
+    wrappers: List[str]
+        The list of wrappers.
+    """
+    # split off old-style wrappers, if any:
+    parts = parsed_url.getscheme().split("+")
+    # pop off the type of the store
+    parts.pop(0)
+    old_wrappers = list(reversed(parts))
+
+    # find new-style wrappers, if any:
+    fragment = parsed_url.getfragment()
+    fragments = fragment.split("#") if fragment else []
+    wrap_spec = list(filter(lambda s: s.startswith("wrap:"), fragments))
+    if wrap_spec:
+        fragment_wrappers = wrap_spec[-1].partition("wrap:")[
+            2
+        ]  # remove the 'wrap:' part
+        new_wrappers = list(fragment_wrappers.split("+"))
+    else:
+        new_wrappers = []
+
+    # can't have both:
+    if old_wrappers and new_wrappers:
+        raise ValueError(
+            "Adding store wrappers via store type as well as via wrap parameter are not allowed. Preferably use wrap."
+        )
+
+    return old_wrappers + new_wrappers
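To make the two wrapper spellings concrete, here is a small sketch of the same extraction logic built on `urllib.parse.urlsplit` instead of `uritools`. `wrappers_from_url` is a hypothetical name and the error message is shortened; the ordering rules are copied from the function above.

```python
from typing import List
from urllib.parse import urlsplit


def wrappers_from_url(url: str) -> List[str]:
    parsed = urlsplit(url)

    # Old style: "s3+readonly+urlencode://..." -> everything after the store type,
    # reversed as in the diff.
    old_wrappers = list(reversed(parsed.scheme.split("+")[1:]))

    # New style: "...#wrap:readonly+urlencode" -> the last "wrap:" fragment wins.
    fragments = parsed.fragment.split("#") if parsed.fragment else []
    wrap_specs = [f for f in fragments if f.startswith("wrap:")]
    new_wrappers = wrap_specs[-1][len("wrap:"):].split("+") if wrap_specs else []

    if old_wrappers and new_wrappers:
        raise ValueError("Specify wrappers in the scheme or in the fragment, not both.")
    return old_wrappers + new_wrappers


print(wrappers_from_url("s3+readonly+urlencode://bucket"))
print(wrappers_from_url("s3://bucket#wrap:readonly+urlencode"))
```

Note that the old-style list comes back reversed while the new-style list keeps the written order, so the two spellings of the same wrapper names are not interchangeable.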


 def get_store(
@@ -118,12 +236,20 @@ def get_store(
         Key value store of type ``type`` as described in ``kwargs`` parameters.

     """
-    from minimalkv._store_creation import create_store
+    warn(
+        """
+        get_store will be removed in the next major release.
+        If you want to create a KeyValueStore from a URL, use get_store_from_url.
+        """,
+        DeprecationWarning,
+        stacklevel=2,
+    )
+    from minimalkv._old_store_creation import create_store
     from minimalkv._store_decoration import decorate_store

     # split off old-style wrappers, if any:
     parts = type.split("+")
-    type = parts.pop(-1)
+    type = parts.pop(0)
     decorators = list(reversed(parts))

     # find new-style wrappers, if any:
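The deprecation pattern used for `get_store` can be checked in isolation. This is a minimal sketch with hypothetical functions `old_api` and `new_api`; only `warnings.warn`, `DeprecationWarning`, and `stacklevel=2` are taken from the diff.

```python
import warnings


def new_api(value: int) -> int:
    return value * 2


def old_api(value: int) -> int:
    warnings.warn(
        "old_api will be removed in the next major release; use new_api instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to old_api itself
    )
    return new_api(value)


with warnings.catch_warnings(record=True) as caught:
    # DeprecationWarning is filtered out in many contexts; force it to be recorded.
    warnings.simplefilter("always")
    result = old_api(21)

print(result, [str(w.message) for w in caught])
```

A test suite could use the same `catch_warnings(record=True)` block to assert that callers of the old entry point see the warning.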
10 changes: 10 additions & 0 deletions minimalkv/_hstores.py
@@ -5,8 +5,10 @@
 from minimalkv.memory import DictStore
 from minimalkv.memory.redisstore import RedisStore
 from minimalkv.net.azurestore import AzureBlockBlobStore
+from minimalkv.net.boto3store import Boto3Store
 from minimalkv.net.botostore import BotoStore
 from minimalkv.net.gcstore import GoogleCloudStore
+from minimalkv.net.s3fsstore import S3FSStore


 class HDictStore(ExtendedKeyspaceMixin, DictStore):  # noqa D
@@ -39,6 +41,14 @@ def size(self, key: str) -> bytes:
         return k.size


+class HS3FSStore(ExtendedKeyspaceMixin, S3FSStore):  # noqa D
+    pass
+
+
+class HBoto3Store(ExtendedKeyspaceMixin, Boto3Store):  # noqa D
+    pass
+
+
 class HGoogleCloudStore(ExtendedKeyspaceMixin, GoogleCloudStore):  # noqa D
     pass
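The new `H*` classes are empty bodies that combine a mixin with a concrete store. What `ExtendedKeyspaceMixin` actually does is not shown in this diff, so the sketch below only assumes that its job is to loosen key validation. All class names and both regular expressions here are hypothetical and are not minimalkv's.

```python
import re


class DictStoreSketch:
    """Hypothetical minimal store that validates keys against a regex."""

    KEY_RE = re.compile(r"^[A-Za-z0-9_.-]+$")

    def __init__(self) -> None:
        self._data = {}

    def _check_key(self, key: str) -> None:
        if not self.KEY_RE.match(key):
            raise ValueError(f"{key!r} contains illegal characters")

    def put(self, key: str, value: bytes) -> str:
        self._check_key(key)
        self._data[key] = value
        return key


class SlashKeysMixin:
    """Hypothetical mixin: widens the key pattern to allow "/" separators."""

    KEY_RE = re.compile(r"^[A-Za-z0-9_./-]+$")


class HDictStoreSketch(SlashKeysMixin, DictStoreSketch):
    # The mixin is listed first so its KEY_RE wins in the method resolution order.
    pass


print(HDictStoreSketch().put("folder/key", b"value"))
```

Listing the mixin before the store class is what lets an empty class body change behaviour; with the bases reversed, the store's own attribute would take precedence.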

48 changes: 46 additions & 2 deletions minimalkv/_key_value_store.py
@@ -1,6 +1,8 @@
 from io import BytesIO
 from types import TracebackType
-from typing import IO, Iterable, Iterator, List, Optional, Type, Union
+from typing import IO, Dict, Iterable, Iterator, List, Optional, Type, Union
+
+from uritools import SplitResult

 from minimalkv._constants import VALID_KEY_RE
 from minimalkv._mixins import UrlMixin
@@ -98,7 +100,7 @@ def get_file(self, key: str, file: Union[str, IO]) -> str:
         implement a specialized function if data needs to be written to disk or streamed.

         If ``file`` is a string, contents of ``key`` are written to a newly created file
-        with the filename ``file``. Otherwise the data will be written using the
+        with the filename ``file``. Otherwise, the data will be written using the
         ``write`` method of ``file``.

         Parameters
@@ -462,6 +464,48 @@ def __exit__(
         """
         self.close()

+    @classmethod
+    def from_url(cls, url: str) -> "KeyValueStore":
+        """Create a Store from a URL.
+
+        Parameters
+        ----------
+        url : str
+            URL to create store from.
+
+        Returns
+        -------
+        store : KeyValueStore
+            Store created from URL.
+        """
+        from minimalkv import get_store_from_url
+
+        store = get_store_from_url(url, store_cls=cls)
+        if not isinstance(store, cls):
+            raise ValueError(f"Expected {cls}, got {type(store)}")
+        return store
+
+    @classmethod
+    def from_parsed_url(
+        cls, parsed_url: SplitResult, query: Dict[str, str]
+    ) -> "KeyValueStore":
+        """
+        Build a KeyValueStore from a parsed URL.
+
+        Parameters
+        ----------
+        parsed_url: SplitResult
+            The parsed URL.
+        query: Dict[str, str]
+            Query parameters from the URL.
+
+        Returns
+        -------
+        store : KeyValueStore
+            The created KeyValueStore.
+        """
+        raise NotImplementedError
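The base class only declares the `from_parsed_url` hook, and each store is expected to override it. The sketch below shows what such an override might look like. `BaseStore` and `HostPortStore` are hypothetical, `urllib.parse` stands in for `uritools`, and `from_url` calls the hook directly rather than going through `get_store_from_url` as the diff does.

```python
from typing import Dict
from urllib.parse import SplitResult, parse_qs, urlsplit


class BaseStore:
    """Hypothetical stand-in for KeyValueStore."""

    @classmethod
    def from_url(cls, url: str) -> "BaseStore":
        parsed = urlsplit(url)
        query = {k: v[-1] for k, v in parse_qs(parsed.query).items()}
        store = cls.from_parsed_url(parsed, query)
        if not isinstance(store, cls):
            raise ValueError(f"Expected {cls}, got {type(store)}")
        return store

    @classmethod
    def from_parsed_url(cls, parsed_url: SplitResult, query: Dict[str, str]) -> "BaseStore":
        # Subclasses are expected to override this hook.
        raise NotImplementedError


class HostPortStore(BaseStore):
    """Hypothetical store configured by host, port, and a ``db`` query parameter."""

    def __init__(self, host: str, port: int, db: int) -> None:
        self.host, self.port, self.db = host, port, db

    @classmethod
    def from_parsed_url(cls, parsed_url: SplitResult, query: Dict[str, str]) -> "HostPortStore":
        return cls(
            host=parsed_url.hostname or "localhost",
            port=parsed_url.port or 6379,
            db=int(query.get("db", "0")),
        )


store = HostPortStore.from_url("kv://example.org:6380?db=3")
print(store.host, store.port, store.db)
```

Calling `BaseStore.from_url(...)` directly raises `NotImplementedError`, which mirrors the default in the diff.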


 class UrlKeyValueStore(UrlMixin, KeyValueStore):
     """Class is deprecated. Use the :class:`.UrlMixin` instead.