Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update BooleanFieldMapper(boolean fields) to support ignore_malformed… #29522

Closed
wants to merge 1 commit into from
Closed

Update BooleanFieldMapper(boolean fields) to support ignore_malformed… #29522

wants to merge 1 commit into from

Conversation

mP1
Copy link

@mP1 mP1 commented Apr 15, 2018

…=true

  • Also skips passed arrays and objects where boolean expected
  • Lots of tests for all data types and strict/lenient parser
  • Introduced and use pkg private constants replacing literals within BooleanFieldMapper
  • Have you signed the contributor license agreement?
    YES

  • Have you followed the contributor guidelines?
    YES

  • If submitting code, have you built your formula locally prior to submission with gradle check?
    YES

  • If submitting code, is your pull request against master? Unless there is a good reason otherwise, we prefer pull requests against master and will backport as needed.
    YES, against sundays master, will rebase again as required.

  • If submitting code, have you checked that your submission is for an OS that we support?
    YES (im OSX)

  • If you are submitting this code for a class then read our policy for that.
    NO not in a class

@cbuescher cbuescher added review :Search Foundations/Mapping Index mappings, including merging and defining field types labels Apr 16, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search-aggs

value = fieldType().nullValue();
}
} else {
if (indexCreatedVersion.onOrAfter(Version.V_6_0_0_alpha1)) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when i grabbed master the other day this was included, and now its gone. Im not sure if i should keep the lenient parsing support.

Will remove if required.

Copy link
Member

@cbuescher cbuescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mP1 thanks for opening this PR. I left a few questions and remarks, I really like the detailed work on the tests but didn't go into depth on comments there yet because I want to confirm a few things about the general goal of this PR first.

You mention this change relates to #11498 but I couldn't quiet get the relation yet. If I understand correctly, the issue deals with ignore_malformed on integer fields while this is about the boolean fields? Could you elaborate?

I think the PR looks good on many levels, but I'm not completely sure if adding this option to boolean fields is something that was discussed in any issue already and is something that we really want to do, if you can point me to any comment or discussion about this that would be helpful. Otherwise I think we should discuss the generall goals before further diving into further reviews.

@@ -115,6 +144,9 @@ public BooleanFieldMapper build(BuilderContext context) {
}
builder.nullValue(XContentMapValues.nodeBooleanValue(propNode, name + ".null_value"));
iterator.remove();
} else if (propName.equals("ignore_malformed")) {
builder.ignoreMalformed(XContentMapValues.nodeBooleanValue(propNode,name + ".ignore_malformed"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: whitespace between propNode,name

if (context.indexSettings() != null) {
return new Explicit<>(IGNORE_MALFORMED_SETTING.get(context.indexSettings()), false);
}
return NumberFieldMapper.Defaults.IGNORE_MALFORMED;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably introduce an own default value for BooleanFieldMapper, that way the two defaults are not tied to each other (or should they be?)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed introduced Defaults.IGNORE_MALFORMED for BooleanFieldMapper.Defaults.

// need to consume array/object, otherwise parser doesnt skip to the next field as expected
parser.skipChildren();

throw new IllegalArgumentException(parser.currentToken().toString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the reason for the try/catch that was introduced? If so, can the control flow somehow be done withouth thrrowing the exception here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see my longer comment slightly below. if the legacy stuff is removed this will be simplified and you will need to double check again.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider the throw a nasty way to keep flow control while i await your commentary whether i should keep/remove legacy lenient support that was recently removed.

value = fieldType().nullValue();
}
} else {
if (indexCreatedVersion.onOrAfter(Version.V_6_0_0_alpha1)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering what the general goal is for backwards-compatiblity here? Should this change got to some 6.x branch? If yes, I think its version should be used here, otherwise probably V_7_0_0?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When i grabbed master the other day, this entire block was present with all the version checking and logging.

When i rebased earlier today, the legacy check/handling was removed.

Should i remove the legacy code and my associated tests ? Just wondering if the legacy should be remove entirely ?

Your comments for lines above/below with this block are all legacy code which i have kept. I only added the try/catch and skipChildren bit to fix "skipping" over the object/array so the parser would be the proper spot to continue parsing the next field.


if (!parser.isBooleanValue()) {
String rawValue = parser.text();
deprecationLogger.deprecated("Expected a boolean for property [{}] but got [{}]",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain why you log a deprecation her? Might be missing some context, but I thought this PR wasn't about deprecation but about adding some option to the field mapper.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see my comment above about legacy lenient support recently removed.

@cbuescher cbuescher self-assigned this Apr 16, 2018
@mP1
Copy link
Author

mP1 commented Apr 16, 2018

@cbuescher

You mention this change relates to #11498 but I couldn't quiet get the relation yet. If I understand correctly, the issue deals with ignore_malformed on integer fields while this is about the boolean fields? Could you elaborate?

yeh probably right was probably a mistake to quote 11498.

I think the PR looks good on many levels, but I'm not completely sure if adding this option to boolean fields is something that was discussed in any issue already and is something that we really want to do, if you can point me to any comment or discussion about this that would be helpful. Otherwise I think we should discuss the generall goals before further diving into further reviews.

Ignore the mention of 11498.

Its just a simple fix for BooleanFieldMapper(boolean fields) so ignore_malformed=true defined for a single field mapping works as expected.

@mP1
Copy link
Author

mP1 commented Apr 16, 2018

@cbuescher

The real issue before you review too hard is whether i should remove the lenient handling which i kept in my commit. As i said before when i grabbed master the other day it was still there, and then it disappeared and i wasnt sure whether i should keep it and it looked kind of useful and removing it would break legacy support others might be expecting.

  1. keep legacy handling of booleans (ive got tests for that now)
  2. delete legacy handling of booleans (remove my tests)

Which should i do 1) or 2) ?

The remove legacy support commit is 2018/5/11 Adrien Grand #29224 4918924.

Pretty sure you wont the legacy stuff removed, but please confirm.

…=true

- Also skips passed arrays and objects where boolean expected
- Lots of tests for all data types and strict/lenient parser
- Introduced and use pkg private constants replacing literals within BooleanFieldMapper
@mP1
Copy link
Author

mP1 commented Apr 16, 2018

@cbuescher

Hi mate, i removed the legacy lenient boolean support which makes my BooleanFieldMapper work as expected but with the improvement mentioned in the PR description.

Please forget the previous comments/questions and lets start again!

Copy link
Member

@cbuescher cbuescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for removing the legacy checks, I think this code should then go to master only first. If we backport, those would need to be take care of separately.
I agree with most of the code changes and I really like the level of detail in the added tests, but I think the test code is a bit too scattered, making each test hard to read. I left a few comments as an example but think this applies to other places as well. Before diving into those changes again, I'd like to make sure we agree on adding the ignore_malformed option to boolean fields at all. If there is no issue yet that discussed this we need to get an agreement of this first.

parser.skipChildren();
}
} else {
// if value is really an array/object/number let parser crash and burn!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

;-) nice comment, I'd like it to stay but could you make an addition that this should be the exceptional case? Maybe something like "Usually we expect a boolean here, but ..."

@@ -130,26 +132,176 @@ public void testSerialization() throws IOException {
assertEquals("{\"field\":{\"type\":\"boolean\",\"doc_values\":false,\"null_value\":true}}", Strings.toString(builder));
}

public void testParsesBooleansStrict() throws IOException {
String mapping = Strings.toString(XContentFactory.jsonBuilder()
public void test_null() throws Exception{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the level of detail in these tests, the degree of splitting them in into so many test methods and helpers is maybe a little bit hard to read. The following are just some suggestions of how to maybe group some of the tests.
e.g the null, true and false test could be grouped into one case that checks allowed fields fort the allowMalformed field.

this.parseFails("0");
}

public void test_1() throws Exception{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you group all the failure tests for different input strings into one tests that e.g. loops over all of the values you want to check?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This group has several values and its better to test them individually, so we get failures for each case, if it becomes one test, then all we know is that it fails the first which is less helpful.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By creating these helpers its very simple to add a new test for a new input, most of each test is boilerplate, that leaves each test method to be down to the bare minimum.

Take an input(value, ignore_malformed(Y/N), execute, and some expectation (fail or expected value).

Super clean, super concise, super easy to add more tests for more inputs as necessary.

.endObject());
}

public void test_object_IgnoreMalformed() throws Exception{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although the code is clear, the level of indirection here makes it hard for the reader to figure out that this (and similar other) test does. As a suggestion, what about combining test_object/test_object_IgnoreMalformed, then the code from sourceWithObject can be inlined in the test case. Also I would make the assertion on field1 and field2 explicit, even if that means a few more lines of code. In this case I would trade repetition for readability.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a suggestion, what about combining test_object/test_object_IgnoreMalformed, then the code from sourceWithObject can be inlined in the test case.

Again if you combine them into one method, and it fails you dont know if the "2nd" test is also broken because the test fails and aborts on the first.

More small tests are better than one bigger test that fails on the first.

@mP1
Copy link
Author

mP1 commented Apr 16, 2018

Before diving into those changes again, I'd like to make sure we agree on adding the ignore_malformed option to boolean fields at all

yes thats the purpose of this commit. Its concentrating only on boolean fields.

@cbuescher
Copy link
Member

If there is no issue yet that discussed this we need to get an agreement of this first

I just found the summary of a longer discussion of ignore_malformed in #12366 #12366 (comment), I think the same arguments apply here and we need to get to an agreement whether we want to support this option on boolean fields first. Given the amount of discussions this might take some while, so I'm marking this as "stalled" in the meantime.
@mP1 thanks for the PR, the fix itself and the testing looks good so far but we need to get agreement on this first.

@mP1
Copy link
Author

mP1 commented Apr 16, 2018

I just found the summary of a longer discussion of ignore_malformed in #12366 #12366 (comment), I think the same arguments apply here and we need to get to an agreement whether we want to support this option on boolean fields first.

Currently ignore_malformed is a concept or feature that is supported on some types and thats confusing/annoying because we want it for all types to help solve the problem of skipping bad field values.

Some people in #12366 for example have suggested other hack(unofficial) solutions, which might disappear and cause them future problems. It seems to me that ignore_malformed with all its nasty reasons exists and should be present on all field types, if the user choses to use it.

There are a few other tickets discussing ignore_malformed for other types, and im going to guess they all probably want this feature for all types. I know i do and to try and help ive created another PR for text -> #29532

@cbuescher cbuescher removed the discuss label Apr 20, 2018
@cbuescher
Copy link
Member

cbuescher commented Apr 20, 2018

This was discussed internally and we agreed that we only want to proceed once #29494 has been fully discussed and implemented.

@javanna
Copy link
Member

javanna commented Aug 16, 2018

I am removing the stalled label as the #29494 is now closed.

@javanna javanna removed the stalled label Aug 16, 2018
@rjernst rjernst removed the review label Oct 10, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Search Foundations/Mapping Index mappings, including merging and defining field types
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants