Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TextFieldMapper honour ignore_malformed and handle object/array values. #29532

Closed
wants to merge 1 commit into from
Closed

TextFieldMapper honour ignore_malformed and handle object/array values. #29532

wants to merge 1 commit into from

Conversation

mP1
Copy link

@mP1 mP1 commented Apr 16, 2018

  • Have you signed the contributor license agreement?
    YES

  • Have you followed the contributor guidelines?
    YES

  • If submitting code, have you built your formula locally prior to submission with gradle check?
    YES
    Only a few org.elasticsearch.test.rest.IntegTestZipClientYamlTestSuiteIT tests but they seem unrelated, as exceptions report unrelated probs. Guessing this testsuite is fragile/being fixed.

  • If submitting code, is your pull request against master? Unless there is a good reason otherwise, we prefer pull requests against master and will backport as needed.
    YES

  • If submitting code, have you checked that your submission is for an OS that we support?
    YES(osx)

  • If you are submitting this code for a class then read our policy for that.
    NOT part of a class

- correct consumes object/array when ignore_malformed=true, so indexing
  can continue
- fixes #12366
@cbuescher cbuescher added review :Search Foundations/Mapping Index mappings, including merging and defining field types labels Apr 16, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search-aggs

@cbuescher
Copy link
Member

As mentioned already in #29522, there is quiet some discussion about wether this option should be supported for String fields: #12366 (comment)
We should wait for a decision before proceeding with this PR.

@mP1
Copy link
Author

mP1 commented Apr 16, 2018

As mentioned already in #29522, there is quiet some discussion about wether this option should be supported for String fields: #12366 (comment)
We should wait for a decision before proceeding with this PR.

The answer to that imho has already been made, thats why Ignore_malformed was introduced, the real problem is it hasnt been implemented across all types. Its been 18 months and lots of people voting but no actual reason why this shoudlnt happen.

@cbuescher
Copy link
Member

The answer to that imho has already been made

This, unfortunately, is a misunderstanding. While there are a lot of upvotes on #12366, there is also disagreement and #12366 (comment) clearly states there are two camps and no decision made yet, asking for more feedback. Since then there has been ongoing discussions, leading among other things to the proposal in #29494. Currently I think the idea is to implement that first and then decide how to procede in adding the ignore_malformed support to more field types.

@mP1
Copy link
Author

mP1 commented Apr 16, 2018

Since then there has been ongoing discussions, leading among other things to the proposal in #29494.

Its very annoying for many users that try ignore_malformed and it doesnt actually work. The documentation says you the feature is available and the problem is it doesnt work for nearly all types, which means a lot of people are just wasting time.

If the intention is to disable until #29494, then the doco should add a line that says this feature is not yet supported for all types and should not be used or something like that. I know i wasted time trying to use it and then finding out it works a bit for some cases and not others, and i had no idea why when using elastic as a black box.

The first step to doing #29494 is to implement ignore_malformed for all types. How these failures are reported is the next step. If you want to delay implementation for all types until #29494 is done, then it should say so, and all the other tickets should be closed and say "wont be done until #29494"

@mP1
Copy link
Author

mP1 commented Apr 17, 2018

Forgot to mention but i can use feature flags to disable my new new ignore_malformed changes to this (TextFieldMapper) and (BooleanFieldMapper) in production and only turn this new stuff on in my tests. I would also include a test to ensure that the old behaviour happens when this feature flag is "off", so users wont know the difference between and after this commit.

The idea would be i would do other mappers, all of them "off" via the feature flag.

#29494 is a easy to do, basically everywhere the mapper catches/handles a ignore_malformed, it will record the field name. I would add a method to the ParserContext. Something like

ParserContext{
  void addIgnoreMalformed(String field)
  List<String> ignoreMalformedFields();
}

internally it adds the field to a list.

I would then add a if feature-flag is true, create a "_ignore_malformed" field on the document. This feature-flag would only be turned on for this test. Again the exact details of the field name ,its location in the document and more can be finalised later but at least most of the work would be done.

This is what the feature flag thing would look like.

/*package private*/ class MapperFeatureFlags{
  // false disables all my changes here and other requests.
  // basically all my changes do a if MapperFeatureFlags.IGNORE_MALFORMED

   // this is only turned by Tests and reset to flag after the test finishes using @Before & @After.
   static boolean IGNORE_MALFORMED = false;

 // this would allow early adopters to use this if they really want. (optional)
  static {
     IGNORE_MALFORMED = Boolean.get("org.elasticsearch.index.mapper.MapperFeatureFlags.IGNORE_MALFORMED");
  }
}

Anyway id like to get this done, but i need help to make this happen. The feature flag thing solves the problem of not making it generally available and lets me push ahead and do everything necessary in the meantime.

Sorry for trying to push this along, im just trying to make things happen. But if we solve #29494 in the next fortnight the whole lot could be done and ready. When everything is done we delete MapperFeatureFlags and everything becomes enabled

@jpountz
Copy link
Contributor

jpountz commented Apr 17, 2018

I understand your frustration, but features are so much harder to remove than add, we want to make sure that they have a consistent contract and are not trappy before we commit to them. The text field never supported the ignore_malformed option. Until now this option has only been about malformed strings like dates that do not match the configured format or ip addresses that do neither match the v4 nor v6 format. Since there are existing concerns about the behavior of this option, we would like to address them before expanding its scope.

@ernesto-jimenez
Copy link

@jpountz quick question, based on your comment on #12366, does that mean we could merge this PR?

/cc @cbuescher

@pbarker
Copy link

pbarker commented Jul 5, 2018

PR looks good, anything still holding this up?

@EricMCornelius
Copy link

@jpountz - Would love to see this get picked up for the 6.4.0 release if at all possible - huge boon to those of us using ElasticSearch for semi-structured/messy log analysis.

@jpountz jpountz self-requested a review July 30, 2018 15:10
@rjernst rjernst removed the review label Oct 10, 2018
@EricMCornelius
Copy link

@rjernst - any status updates on this, since I see tags are being changed on it?

Will it make it into 7.x?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Search Foundations/Mapping Index mappings, including merging and defining field types stalled
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ignore_malformed to support ignoring JSON objects ingested into fields of the wrong type
9 participants