-
-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Improved source management #10
[WIP] Improved source management #10
Conversation
Mappings for differnt sources
sorry I've been slow to look at this, will review tomorrow. 🙏 |
Thank you! I'll also take in a few improvement suggestions from the issues after you've reviewed the general approach. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps we can revert the renaming of main.py -> settings.py, that would make this PR much easier to review.
Then I'll review properly, the changed you've explained in the PR body look broadly sensible, but I'm a bit lost on code changes at this point.
Makefile
Outdated
@@ -25,7 +25,7 @@ mypy: | |||
|
|||
.PHONY: test | |||
test: | |||
coverage run -m pytest --durations=10 | |||
coverage run -m pytest --durations=10 -W ignore::DeprecationWarning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we shouldn't ignore all DeprecationWarning
s, also please move any warning ignores here
pydantic_settings/__init__.py
Outdated
@@ -1,6 +1,6 @@ | |||
import warnings | |||
|
|||
from .main import BaseSettings | |||
from .settings import BaseSettings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the rationale behind renaming main
to settings
? I'd prefer that we kept the old name, also if we do rename, let's do so in a separate PR so the other changes in the file are easier to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing serious TBH, just semantics. Will change!
Preliminary work to add support for multiple secrets dirs as discussed in issue. The field name is taken in the same sequence as the directories provided. |
Any chance you could look at it soon? @samuelcolvin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the slow response.
Having thought about this, I think we should make a change to how sources work, here's my proposal:
We have a PydanticBaseSource
abc which sources should inherit from.
Here's a very rough sketch of what the abc might look like:
from abc import ABC, abstractmethod
class PydanticBaseSource(ABC):
def __init__(self, settings_cls: type[BaseSettings]):
self.settings_cls = settings_cls
self.config = settings_cls.__config__
self.case_sensitive = self.config.case_sensitive
...
# anything else that is generally useful for subclasses
@abstractmethod
def get_field_value(self, names: tuple[str, ...], field: ModelField) -> Any:
"""
Get the value for a field, should return `Empty` if the field is not available.
"""
pass
def prepare_field(self, value: Any, field: ModelField) -> Any:
# default implementation does `json.loads` for "complex" fields
...
def __call__(self) -> dict[str, Any]:
# default implementation iterates over fields and calls `get_field_value`
# and `prepare_field`
...
I'm flexible on how PydanticBaseSource
is implemented, if you think there's a better way, let me know.
We would then define subclasses of PydanticBaseSource
like EnvSource
, SecretSource
, DotEnvSource
.
Advantages of this:
- we can split out as much common logic into methods which mean the logic can be adjusted, or the default implementation can be used easily
- it should make it easier to customise the existing sources while duplicating as little code as possible
This should allow SourceMapper
and the source providers to be combined into one cleaner abstraction.
I hope this makes sense, I'm sorry for the change of direction, but having read your PR and thought about this a bit more, I think this is the best way forward.
Let me know what you think?
) | ||
if sources: | ||
return deep_update(*reversed([source(self) for source in sources])) | ||
def get_attribute(name: str) -> Any: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think better that this becomes a method on the class. I know that will mean passing lots of arguments, but it'll be much easier to read and maintain.
if sources: | ||
return deep_update(*reversed([source(self) for source in sources])) | ||
def get_attribute(name: str) -> Any: | ||
attr_with_fallback: Dict[str, Tuple[Any, str]] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be defined in __init__
, you might need to use PrivateAttr
.
if not value and name in attr_with_fallback: | ||
value, warning = attr_with_fallback.get(name, (None, None)) | ||
if warning: | ||
warnings.warn(warning, DeprecationWarning) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better to check attr_with_fallback
once, e.g. something like
if not value:
try:
value, warning = attr_with_fallback[name]
except KeyError:
...
else:
...
parameter_value = parameter.default | ||
if parameter.kind in {parameter.POSITIONAL_OR_KEYWORD, parameter.KEYWORD_ONLY}: | ||
kwargs[parameter.name] = parameter_value | ||
elif parameter.kind in {parameter.VAR_KEYWORD, parameter.VAR_POSITIONAL}: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See pydantic/pydantic#4673, Config
will become a dict in V2, I think we should accept **kwargs
and just pass the entire Config
dict.
But happy to defer this until after that change is made and V2 is in beta.
# init_kwargs are treated separately from all other sources. The | ||
# Config.sources need not include the init_kwargs. If the user wishes to | ||
# override the init_kwargs, they can do so with customize_sources | ||
_mappers: List[SettingsSourceCallable] = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a particular reason we're creating this on every call, rather than when creating the class?
Hey @samuelcolvin, If the mapping logic isn't consistent, the current implementation would be better for addressing that issue. That way, we can mix and match the mappers with the sources. eg. Although creating ABCs for the sources with the common mapping methods and implementing each source as its implementation sounds excellent, there are some caveats:
Having said that, if you still think having ABC for the sources is better. Let me know. I'll create a new PR. Cheers & happy new year, |
@adriangb @PrettyWood @hramezani @tiangolo please let me us know what you all think? If there's a consensus between the 6 of us, I'm happy to accept it. |
I'm slightly inclined towards subclasses rather than list of dicts, here's my rationale. Some of my points are probably invalid just because I'm not understanding the problem correctly or because I'm missing something. First:
DXAbout developer experience: When asking users to write code, I tend to prefer to ask them to write functions rather than classes because it's the simplest callable primitive they could have to write, this would be a +1 for a list of dicts with source providers and mappers. But, given that the parameters of the functions have to be complex (e.g. a So, not sure about this for both options, I think in this case I would be inclined to the subclasses. FlexibilityAbout flexibility, I guess a list of dicts with source providers and mappers would allow better composition, allowing to mix and match functions, that's also why I tend to prefer functions rather than classes in most cases. Nevertheless, for each of the dicts, it would require defining the provider and the mapper in the If the alternative ended up being some type of mixin classes, I would be strongly against classes with mixins. But I don't think that would be necessary or even feasible. I think probably more important would be to focus on use cases and the probability that something would be used or not. Although I agree that a list of dicts with explicit provider and mapper would be more flexible, how much would it be expected that it's needed to mix and match different functions for providers and mappers? If the expectation is more towards all the combinations of each provider with each mapping (many to many), then it would make more sense to have a list of dicts, as this would allow combining those if the alternative with classes would need a provider from one class and a mapper from another, because in that case, using classes, it would be needed to inherit one thing from one class (e.g. the provider) but override the other (e.g. the mapper) duplicating that code (if that code already lives in another class). Nevertheless... I would actually expect that the use cases would be more a one to many or one to one between providers and mappers, not many to many. And if that's true, having subclasses would cover most/all use cases and wouldn't require defining explicitly the combinations of provider and mapper in the list of dicts. And here's where I think it would make sense to consider use cases to guide the decisions a bit. Use casesI'm listing some use cases I can imagine. There might be some key use case that discards my thinking and moves the needle towards a list of dicts instead of subclasses, but up to now, it seems to me that subclasses cover everything I can imagine. Env varsThe first and original use case. And I think this is the one with the most custom logic. The provider would be quite simple, just the environ dict. The mapper would be the most complex, parsing strings into JSON data, handling upper / lower cases. Env var prefixes, possibly lists of values as comma separated in a single env var (CSV). I don't really see another use case that would need this complex mapper apart from this one for env vars and that would also need a complex provider from another use case. So, this provider would probably only be used with this mapper, so, one to one or one to many, not many to many. Dot env fileThis would be the same mapper as the one from Env vars, and the provider would be overridden to read from a file. But here the provider would be used only with the same mapper, so, again, one to many. Docker / Kubernetes secretsThis would read from a file or set of files, so, a custom provider. But as the file would most probably be JSON or YAML, no custom mapper would be needed, just the standard Pydantic serialization. So, not many to many. Remote storageThis would be Redis, Hashicorp Vault, even a SQL DB, or S3 storage, or remote secret/password management system. But in all the cases, the custom logic would be just the provider, because whatever is the remote thing, it would have native data types. The mapper would be the default Pydantic serialization. So, not many to many. I haven't imagined a use case that really needs a complex provider AND a complex mapper, and that could benefit from being able to re-use the two methods from different sub-classes, which is the main reason why I would think that a list of dicts would be better than subclasses. All the use cases I can think of can be well served by just subclasses (or default classes). Subclasses concernsSome arguments about the current concerns about subclasses:
But as this is already the case for using Pydantic models or BaseSettings, I wouldn't think this is particularly problematic or strange for users, they already have to create subclasses to use other parts of Pydantic. I would imagine that not requiring users to create subclasses would be an advantage mainly if that would allow them to reuse some code or function they might already have, apart from the extra import and extra line/complexity. But the main restriction is already in that users have to create functions with a very specific signature, to receive a value, and a very Pydatnic-specific
This is already the case for Pydantic's
Yes, but if my thought process is right, it will provide a much better developer experience, safer code (with better static analysis support), etc. which is one of the main reasons to not want to mix concerns, to have better code, but in this case, I think the benefits of subclasses up to now seem better than the drawbacks. The funny thing is, I tend to not like classes, tend to avoid them, and prefer plain functions when possible, even more when requiring users to write them. But in this case, I think the benefits of subclasses, up to now, seem to be better than their drawbacks. In particular, I think the strongest benefit for subclasses is having the chance of the editor autocompleting the signature of the method I have to override. |
amazing thoughtful response @tiangolo, thank you. I agree with you, but I'll let others reply before making an decision. |
Hey @tiangolo , Before the start of the discussion, we consider the following as given:
First, here's the architecture of the Here's a summary of what an individual Component does, how it is implemented and what it means for the user: Option 1: Componentsa. BaseSettings
b. SourceMapper
c. SourceProvider
Perhaps we want to reduce the abstractions to not overengineer from the get-go. If that's the case, we've now two options. Option 2: Merging BaseSetting and SourceMapping
Option 3: Creating an ABC with the logic of SourceMapping and implementing each source as a subclass of this ABC.
Perhaps those contexts would help us to make a proper decision. Let me know if I've missed something and if my understanding needs to be readjusted. Cheers & Happy New Year to you all. |
Sorry for the slow response, on balance I still think we should go with @Geosynopsis I totally understand if you don't agree with this and would prefer to step back from this piece as it's no going the way you thought best, but if you'd be willing to continue to work on it, that would be absolutely great. Let me know. |
Hey @samuelcolvin, no worries, if you consider going with ABC is best for it; I will modify and send you PR as soon as I find some time to spare. |
👍 Awesome, thank you. |
Hey @samuelcolvin,
|
Additionally, looking at the changes coming with V2, |
@Geosynopsis we used your excellent idea and impelemented it in 0ecac22 |
Great work, @hramezani! 🎉 |
Hey @samuelcolvin,
Could you have a look at the changes? It's still a work in progress. But it would be great to have your input on the general ideas.
Here's what the changes include:
Separation between the source provider, source mapper and Base Settings
[Breaking-Change] as of now, but can add a legacy layer. The SourceMapper call takes
Iterator(ModelFields)
instead ofBaseSettings
.The
Config
exposes thesource_providers
attribute as an interface to manipulate the source_providers to be used. The BaseSettings calls the providers in the same order except,init_kwargs
(at this point in time) which gets the first priority. Butcustomize_sources
can still be used to manipulate order.Further Improvements
Config
to provide an option for specific use cases when the user wants case-sensitivity, nestings etc. for one source but not for another.Reference: #6 (comment)