Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

openshift_checks: enable variable conversion #4922

Merged
merged 1 commit into from
Aug 9, 2017
Merged

openshift_checks: enable variable conversion #4922

merged 1 commit into from
Aug 9, 2017

Conversation

sosiouxme
Copy link
Member

@sosiouxme sosiouxme commented Jul 28, 2017

get_var can now do type conversion and raise an error if it doesn't work.

Later this could be integrated with defined check parameters to give the user more info about what was expected. Also check parameters could have indications for whether they're deprecated and defined/used.

Finally somewhere in here we could deal with the lazy evaluation of jinja2 in variables.

Copy link
Contributor

@rhcarvalho rhcarvalho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 👍 👍 good stuff

else:
raise OpenShiftCheckException(
'There is a bug in this check. '
'It is trying to convert a variable to unknown type "{}"'.format(convert)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work really well for ~50% of the values of convert:

In [1]: str(str)
Out[1]: "<class 'str'>"

try:
value = reduce(operator.getitem, keys, self.task_vars)
except (KeyError, TypeError):
if "default" in kwargs:
return kwargs["default"]
raise OpenShiftCheckException("'{}' is undefined".format(".".join(map(str, keys))))

convert = kwargs.get("type")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion:

convert = kwargs.get("type", lambda x: x)
try:
    return convert(value)
except ValueError:
    raise OpenShiftCheckException(...)

Wherever we use .get_var(...), we'd then always pass a function to the type optional argument, and never a string. Passing a string would be more cumbersome anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like a good idea. I was thinking we might want logic more complicated to e.g. extract version strings or convert byte suffixes or something, but I guess passing a function would cover that too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

although, note the logic under bool... part of the reason for doing this is to not have to repeat that everywhere.

I suppose we could still special-case known conversion functions like bool though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note the logic under bool...

Hmm, why would we add that YAML-like conversion here?

For bool, any non-empty string converts to True.

Now this gets me thinking that "covert" is normally from A to B, can we say that A is always of type str?

Copy link
Member Author

@sosiouxme sosiouxme Jul 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bool("false") is True. Ansible inventory variables typically come in as strings, except where we've set them with facts already (although users can also pass in json). This has already caused multiple bugs. See the change in logging/logging.py - someone who set openshift_hosted_deploy_logging=False in their inventory would see the checks expecting logging to be deployed. It's a really easy mistake to make.

@@ -81,6 +81,7 @@ class TestCheck(OpenShiftCheck):
@pytest.mark.parametrize("keys,expected", [
(("foo",), 42),
(("bar", "baz"), "openshift"),
(("bar.baz",), "openshift"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC we had this before and then removed the "extra feature" in favor of a single way of doing it.
Are you intending to use both get_var('openshift.foo.bar') and get_var('openshift', 'foo', 'bar')?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not? It's two extra lines and it matches what we do elsewhere in ansible code. Personally I would like the dotted version better but I don't want to go through the code and change them all.

I'm not as set on the "single obvious way to do it" ethos of python...

@pytest.mark.parametrize("keys, convert", [
(("foo",), "bogus"),
(("bar", "baz"), "int"),
(("bar.baz"), "float"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Funny that this test uses all strings, and the other uses all functions as the convert argument. I think dropping strings altogether makes the design and usage simpler.

@sosiouxme
Copy link
Member Author

@rhcarvalho updated

elif convert in ["float", float]:
return float(value)
elif convert in ["bool", bool]:
if convert is bool:
return str(value).lower() in ["true", "yes", "1"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this again. What behavior are we trying to mimic here?

For YAML 1.1, there are a bunch of values that are interpreted as a boolean: http://yaml.org/type/bool.html
I think that's what Ansible refer to in their docs, when they mention the yes / no case for instance.

For YAML 1.2, only true and false are the canonical forms, though playing with ruamel.yaml I see it also considers True and TRUE to be bool, while TRue parses as a string.

For the Ansible Inventory format, there is nothing explicit about bools. All we know is "The format for /etc/ansible/hosts is an INI-like format".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot about y|n and on|off.

Ansible is sort of a mishmash of ini and yaml, since you have the inventory as well as code, group variable files, etc.

My intuition is less about following a standard and more about not surprising the users. Which is a little bit chicken and egg, since following a standard should lead to consistent user expectations, in theory.

In the end, we should probably not implement our own here, and just re-use whatever the jinja bool filter already does. Then it will at least be consistent with what the rest of Ansible does, and "not our problem."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it would be possible to delegate/reuse any part of Ansible, jinja, ruamel.yaml, etc, it would be advantageous for us 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ansible implements its own bool filter. It's not exactly a standard but... we can be consistent.

@@ -107,30 +107,27 @@ def get_var(self, *keys, **kwargs):
return kwargs["default"]
raise OpenShiftCheckException("'{}' is undefined".format(".".join(map(str, keys))))

convert = kwargs.get("type")
convert = kwargs.get("convert", lambda x: x)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps the default is really to have strings... in which case str as default would be a good deal.

Copy link
Member Author

@sosiouxme sosiouxme Jul 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are cases where the user can pass in JSON - disk availability for instance... but it might be best anyway to be explicit about expected values for those. convert=None could be used if it's too cumbersome to bundle an actual function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh my, even JSON... I forgot about that ;)

The best would be to exterminate the INI format and command line flags in the next Ansible version, and made convert it all to Haskell -- hhaha, or maybe not :P

elif convert in ["float", float]:
return float(value)
elif convert in ["bool", bool]:
if convert is bool:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: instead of changing what bool (the function) means behind the scenes, we could have an explicit conversion function that handles all the corner cases we want, that would be similar to how API like Django Models and other Python libraries work.

We'd essentially declare type=SpecialBool (surely name SpecialBool anything different than that...) for the variables where we want the special handling. That also leaves room for us to use bool in the future without breaking any then existing code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really doubt there is an actual use case for using the built-in bool as the converter. In this context it is essentially non_empty so calling it bool would be highly confusing and lead to bugs.

'{error}'.format(var=".".join(keys), type=convert, error=str(error))
'Cannot convert inventory variable to expected type:\n'
' "{var}={value}"\n'
'{error}'.format(var=".".join(keys), value=value, error=str(error))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the str around error is unnecessary?

' "{var}={value}"\n'
'the given converter cannot be used or failed unexpectedly:\n'
'{error}'.format(var=".".join(keys), value=value, error=error)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this means that if the function returns something that is not a ValueError we want to report a slightly different message?!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ValueError ought to mean the user passed in a wrong value.

Any other kind of error indicates the conversion function writer did not handle the value correctly or at least gracefully (KeyError and similar). The user might have done something wrong too, but if it's not obvious to them then they can blame our code.

Are there other error categories that should be included / called out?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I think about it, OpenShiftCheckException (from any custom converters we write) should probably just be passed through.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I think about it, OpenShiftCheckException (from any custom converters we write) should probably just be passed through.

I hope we will not get to that? I thought the conversion functions would be more like pure-Python simple functions, just to operate on basic data types.

The ValueError + others separation is good, we'll just need to keep it in mind and let our custom functions raise ValueError on bad input and not something else accidental like you mention KeyError and similar. Again, I hope perhaps there will be nothing custom apart from the boolean conversion?

Copy link
Member Author

@sosiouxme sosiouxme Aug 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could see other converters being useful; code we already have could use a standard byte size converter and deployment_type validator. I don't think it's too much scope creep / YAGNI to just reraise OpenShiftCheckException as a third category of "what can go wrong".

@@ -15,7 +15,7 @@ class LoggingCheck(OpenShiftCheck):
logging_namespace = "logging"

def is_active(self):
logging_deployed = self.get_var("openshift_hosted_logging_deploy", type=bool, default=False)
logging_deployed = self.get_var("openshift_hosted_logging_deploy", convert=bool, default=False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

convert=bool reads well, though convert=str is a bit strange since we perhaps expect the input to be a string already?! I liked type, though it is the name of a builtin function and may make pylint complain ;-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"type" is nice and intuitive for the most of what we're doing now with values coming out of task_args. I don't think pylint complains since it's never used as a name. But I think we may come up with more interesting standardized conversion and validation functions that don't just translate to a type. For instance, byte size converters, or validation that deployment_type is one of the known values.

So it made sense to me to give it a more general name that doesn't imply a simple typecasting. I'm not set on convert though, suggestions welcome... I could easily see make, transform, expect, ensure ... or accepting multiple names (type, convert, validate), to support not having to remember which it is :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or accepting multiple names (type, convert, validate), to support not having to remember which it is :)

Hahahahha 🤣

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still open to other names... normalize... coerce ....

@@ -97,13 +97,37 @@ def get_var(self, *keys, **kwargs):
OpenShiftCheckException when a key is not found or returning a default value
provided as a keyword argument.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new "convert" kwarg merits a comment.

' "{var}={value}"\n'
'the given converter cannot be used or failed unexpectedly:\n'
'{error}'.format(var=".".join(keys), value=value, error=error)
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ValueError ought to mean the user passed in a wrong value.

Any other kind of error indicates the conversion function writer did not handle the value correctly or at least gracefully (KeyError and similar). The user might have done something wrong too, but if it's not obvious to them then they can blame our code.

Are there other error categories that should be included / called out?

@sosiouxme
Copy link
Member Author

aos-ci-test

@sosiouxme
Copy link
Member Author

@rhcarvalho you ok with this? Can I get a LGTM?

@openshift-bot
Copy link

success: "aos-ci-jenkins/OS_3.6_NOT_containerized, aos-ci-jenkins/OS_3.6_NOT_containerized_e2e_tests" for 4e6bf28 (logs)

@openshift-bot
Copy link

success: "aos-ci-jenkins/OS_3.6_containerized, aos-ci-jenkins/OS_3.6_containerized_e2e_tests" for 4e6bf28 (logs)

Copy link
Contributor

@rhcarvalho rhcarvalho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine, LGTM

@@ -18,7 +18,7 @@ class LoggingCheck(OpenShiftCheck):
logging_namespace = "logging"

def is_active(self):
logging_deployed = self.get_var("openshift_hosted_logging_deploy", default=False)
logging_deployed = self.get_var("openshift_hosted_logging_deploy", convert=bool, default=False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this fixing an existing bug?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I filed a bug. But yeah, it's a minor bug that logging is considered always on if defined...

@sosiouxme
Copy link
Member Author

[merge]

@sosiouxme
Copy link
Member Author

yum flakes openshift/origin#10162 and openshift/origin#8571

[merge]

@openshift-bot
Copy link

Evaluated for openshift ansible merge up to 4e6bf28

@openshift-bot
Copy link

continuous-integration/openshift-jenkins/merge FAILURE (https://ci.openshift.redhat.com/jenkins/job/merge_pull_request_openshift_ansible/817/) (Base Commit: 5fe4c8c) (PR Branch Commit: 4e6bf28)

@sosiouxme
Copy link
Member Author

@sdodson sdodson merged commit 6528031 into openshift:master Aug 9, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants