Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add references support #8662

Closed
wants to merge 54 commits into from
Closed

Add references support #8662

wants to merge 54 commits into from

Conversation

bartv
Copy link
Contributor

@bartv bartv commented Jan 20, 2025

Description

Add references support (to the dataclasses branch)

@bartv bartv requested review from wouterdb and sanderr January 20, 2025 14:12
Copy link
Contributor

@wouterdb wouterdb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass

Comment on lines +337 to +338
if references.is_reference_of(value, float):
return True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the main open issue I think

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand correctly that the issue is that we don't always want to allow references, so this is too accepting?

We could have a Reference(value_type=Type) type, who's validate checks exactly this. Then validate callers could always be explicit in whether they want to allow references? I'd have to double check which callers exist, but I expect that for each of them being explicit would be both feasible and more clear.

Or perhaps a validate_reference method on Type. Would have the advantage that we can rely on polymorphism for the Float -> float mapping etc. But it's less flexible in the sense that adding support for more special cases (Unknown etc) requires a special method for each.


for type_name, handler_definition in Commander.get_providers():
code_manager.register_code(type_name, handler_definition)
types.add(type_name)

# Register all reference and mutator code to all resources. This is very coarse grained and can be optimized once
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it register everything that is loaded, even if not in use. There is something to say for that, but it is very greedy.

Comment on lines +61 to +156
class Argument(pydantic.BaseModel):
"""Base class for reference (resolver) arguments"""

type: str
""" The type of the argument. It is used a discriminator to select the correct class
"""

name: str
""" The name of the argument in the reference constructor
"""

@abc.abstractmethod
def get_arg_value(self, resource: "inmanta.resources.Resource") -> object:
"""Get the value for the argument to be able to construct the reference again

:param resource: The resource on which the reference has been defined
"""


class LiteralArgument(Argument):
"""Literal argument to a reference"""

type: typing.Literal["literal"] = "literal"
value: PrimitiveTypes

def get_arg_value(self, resource: "inmanta.resources.Resource") -> object:
return self.value


class ReferenceArgument(Argument):
"""Use the value of another reference as an argument for the reference"""

type: typing.Literal["reference"] = "reference"
id: uuid.UUID

def get_arg_value(self, resource: "inmanta.resources.Resource") -> object:
return resource.get_reference_value(self.id)


class GetArgument(Argument):
"""Get a value from the resource body as value"""

type: typing.Literal["get"] = "get"
dict_path_expression: str

def get_arg_value(self, resource: "inmanta.resources.Resource") -> object:
return None


class PythonTypeArgument(Argument):
"""Use the python type as an argument. This is mostly used for references that are generic for a
type, such as core::AttributeReference
"""

type: typing.Literal["python_type"] = "python_type"
value: str

def get_arg_value(self, resource: "inmanta.resources.Resource") -> object:
return getattr(builtins, self.value)


class ResourceArgument(Argument):
"""This argument provides the resource itself to the reference"""

type: typing.Literal["resource"] = "resource"

def get_arg_value(self, resource: "inmanta.resources.Resource") -> object:
return resource


ArgumentTypes = typing.Annotated[
LiteralArgument | ReferenceArgument | GetArgument | PythonTypeArgument | ResourceArgument,
pydantic.Field(discriminator="type"),
]
""" A list of all specific types of arguments. Pydantic uses this to instantiate the correct argument class
"""


class BaseModel(pydantic.BaseModel):
"""A base model class for references and mutators"""

type: ReferenceType
args: list[ArgumentTypes]


class ReferenceModel(BaseModel):
"""A reference"""

id: uuid.UUID


class MutatorModel(BaseModel):
"""A mutator"""


C = typing.TypeVar("C", bound="Base")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand, this defines the serialized form of the reference

type: typing.ClassVar[ReferenceType]

def __init__(self, **kwargs: object) -> None:
self._arguments: typing.Mapping[str, object] = kwargs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it wouldn't make more sense to e.g. pick up all values from __dict__ that don't start with _

I.e. this requires all arguments to be passed down, while this information is already available?

model = value.serialize()
arguments.append(ReferenceArgument(name=name, id=model.id))

case "resource", _, inmanta.resources.Resource:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the third argument match in case of subtyping?

def serialize(self) -> BaseModel:
"""Serialize to be able to add them to a resource"""

def serialize_arguments(self) -> Tuple[uuid.UUID, list[ArgumentTypes]]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need some more error reporting in a later stage I think

case _:
raise TypeError(f"Unable to serialize argument `{name}` of `{self}`")

data = json.dumps({"type": self.type, "args": arguments}, default=util.api_boundary_json_encoder, sort_keys=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may not be entirely stable, because the argument could be re-ordered when manipulating _arguments

return self._model


class DataclassRefeferenMeta(type):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class DataclassRefeferenMeta(type):
class DataclassReferenceMeta(type):

@@ -292,7 +386,12 @@ def object_to_id(

@classmethod
def map_field(
cls, exporter: Optional["export.Exporter"], entity_name: str, field_name: str, model_object: "proxy.DynamicProxy"
cls,
exporter: Optional["export.Exporter"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect the exporter object exposed to every export function to enabled the resource to register a reference even if it has no mutator (for plugins that have built-in, not transparent support)

self._references_model[model.id] = model

for mutator in self.mutators: # type: ignore
mutator = references.mutator.get_class(mutator["type"]).deserialize(references.MutatorModel(**mutator), self)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to define a base class resource to signal serialization failure , just to guide developer towards cleaner exceptions

Copy link
Contributor

@sanderr sanderr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

intermediate comments

Comment on lines +394 to +395
RESOURCE_ATTRIBUTE_REFERENCES: typing.Final[str] = "references"
RESOURCE_ATTRIBUTE_MUTATORS: typing.Final[str] = "mutators"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should not be exposed in the model, should they?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd need to check if references are allowed somewhere in the arg validation.

src/inmanta/ast/type.py Show resolved Hide resolved
Comment on lines +337 to +338
if references.is_reference_of(value, float):
return True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand correctly that the issue is that we don't always want to allow references, so this is too accepting?

We could have a Reference(value_type=Type) type, who's validate checks exactly this. Then validate callers could always be explicit in whether they want to allow references? I'd have to double check which callers exist, but I expect that for each of them being explicit would be both feasible and more clear.

Or perhaps a validate_reference method on Type. Would have the advantage that we can rely on polymorphism for the Float -> float mapping etc. But it's less flexible in the sense that adding support for more special cases (Unknown etc) requires a special method for each.

Comment on lines +48 to +49
An instance of this class is used to indicate that this value is a reference and can only be resolved
at execution time in agent.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can only be resolved at execution time in agent.

Not entirely true iirc, though probably so in 90% of the use cases.

Comment on lines +313 to +314
# TODO: we need to make sure that the executor knows it should load the mutator and executor code before running the handler
# of a resource that uses references.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done now, isn't it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file probably needs some stable_api annotations



class mutator:
"""This decorator register a mutator under a specific name"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not meant to be exposed to end users for now?

return type.__getattr__(cls) # type: ignore


class Reference[T: RefValue](Base):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, the main interface starts here, and everything (except type vars etc) above this line is internals? Might be more readable to move this to the top, or move the internals to a submodule?

@@ -559,7 +559,7 @@ def _custom_json_encoder(o: object) -> Union[ReturnTypes, "JSONSerializable"]:

from inmanta.data.model import BaseModel

if isinstance(o, BaseModel):
if isinstance(o, (BaseModel, pydantic.BaseModel)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this one. It may be safe, but I'd need to double check what exactly is custom about our own models. I think we may not be able to guarantee round-trip compatibility (or timezone-awareness) of native pydantic models.

Copy link
Contributor

@sanderr sanderr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More comments.

Note to self: to review: references.py and resources.py.

Comment on lines +281 to +288
if not hasattr(container, "__getitem__"):
return False

if set and not hasattr(container, "__setitem__"):
return False

if remove and not hasattr(container, "__delitem__"):
return False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer isinstance checks on the ABCs here, but iirc there' a separate ticket to implement this properly.

model = value.serialize()
arguments.append(ReferenceArgument(name=name, id=model.id))

case "resource", _, inmanta.resources.Resource:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we special-case this, we should make sure to document that.

Comment on lines +325 to +327
:param name: This name is used to indicate the type of the reference
"""
self.name = name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will need to be changed somewhat with what we discussed earlier today. i.e. we'll want to accept a Python type and convert it with to_dsl_type. Would be even nicer if we can derive the Python type from the type subscription (i.e. the value for the generic type var) of the Reference, but I'm not sure how feasible that would be tbh.

Copy link
Contributor

@sanderr sanderr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall, added some comments.

:param path: The current path we are working on in the tree
"""
match value:
case list() | proxy.SequenceProxy():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, but we might be able to replace this simply with

Suggested change
case list() | proxy.SequenceProxy():
case Sequence:

Comment on lines +481 to +484
const.RESOURCE_ATTRIBUTE_MUTATORS not in obj_map
or obj_map[const.RESOURCE_ATTRIBUTE_MUTATORS] is None
or const.RESOURCE_ATTRIBUTE_REFERENCES not in obj_map
or obj_map[const.RESOURCE_ATTRIBUTE_REFERENCES] is None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const.RESOURCE_ATTRIBUTE_MUTATORS not in obj_map
or obj_map[const.RESOURCE_ATTRIBUTE_MUTATORS] is None
or const.RESOURCE_ATTRIBUTE_REFERENCES not in obj_map
or obj_map[const.RESOURCE_ATTRIBUTE_REFERENCES] is None
obj_map.get(const.RESOURCE_ATTRIBUTE_MUTATORS) is None
or obj_map.get(const.RESOURCE_ATTRIBUTE_REFERENCES) is None

Comment on lines +248 to +249
case dict() | proxy.DictProxy():
return {key: collect_references(value_reference_collector, value, f"{path}.{key}") for key, value in value.items()}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, we're recursing on data we don't control, right? i.e. the attribute as specified by the user. Doesn't that mean that we can't assume that key matches dictpaths invariants? e.g. what if it contains . or * or other characters with special meaning?

self._add_reference(reference)

self.mutators.append(
references.ReplaceValue(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made my mind up, I like this approach.

Comment on lines +196 to +202
case list():
for v in value:
self._add_reference(v)

case dict():
for k, v in value.values():
self._add_reference(v)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a leftover from a previous implementation? As far as I can tell, the way this method is called, it would always be a Reference.

self.reference = reference

def resolve(self) -> T:
return typing.cast(T, getattr(self.reference, self.attribute_name))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we transparently skip what would otherwise be model validation (e.g. when assigned to a resource's attribute, should we add some validation here?

@inmantaci inmantaci deleted the branch dataclasses February 6, 2025 15:32
@inmantaci inmantaci closed this Feb 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants