Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not fail on duplicated content field filters #85382

Merged

Conversation

arteam
Copy link
Contributor

@arteam arteam commented Mar 28, 2022

Source includes and excludes, both configured in the mappings as well as provided as part of a search and get request, may include duplicate entries. This caused no problems until #83152 which makes Elasticsearch throw an error when we are processing the request. While users can update their request for get and search, this prevents users from upgrading to 8.1 if they have duplicate source includes/excludes configured in their existing mapping.

Set.of() throws a runtime exception if it encounters a duplicated element, Set.copyOf, on the contrary, doesn't do that.
We shouldn't fail if the user configure multiple include or exclude filters for _source fields.

@arteam arteam added :Search Foundations/Mapping Index mappings, including merging and defining field types auto-backport Automatically create backport pull requests when merged v8.0.2 v8.1.2 >bug labels Mar 28, 2022
Set.of() throws a runtime exception if it encounters duplicated elements.
@arteam arteam force-pushed the do-not-fail-on-duplicated-include-exclude-filters branch from 61f3bc8 to 852bcfc Compare March 28, 2022 10:35
@elastic elastic deleted a comment from elasticsearchmachine Mar 28, 2022
@arteam arteam changed the title Don't fail on duplicated content field filters Do not fail on duplicated content field filters Mar 28, 2022
@elasticsearchmachine
Copy link
Collaborator

Hi @arteam, I've created a changelog YAML for you.

@arteam arteam force-pushed the do-not-fail-on-duplicated-include-exclude-filters branch from ba41f42 to a6bb728 Compare March 28, 2022 12:12
@arteam arteam added v8.1.3 and removed v8.1.2 labels Mar 28, 2022
@arteam arteam marked this pull request as ready for review March 28, 2022 16:09
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Mar 28, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@javanna
Copy link
Member

javanna commented Mar 28, 2022

hey @arteam do you have any background on what bug this PR is fixing? Is it something that you encountered, or is there an issue filed for it? Would it be possible to add a description around what it is addressing from a user perspective? Thanks!

@javanna
Copy link
Member

javanna commented Mar 29, 2022

This seems to have been introduced with #83152 , @nik9000 @romseygeek would you mind having a look as you were involved in getting that one change in?

Copy link
Contributor

@romseygeek romseygeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, TIL we shouldn't use Set.of() with user-provided input. LGTM!

@javanna
Copy link
Member

javanna commented Mar 29, 2022

Would it make sense to add a test to SourceFieldMapperTests that is affected by this change? The other caller that is affected is the fetch phase, but I haven't found a great place to add a contained test. Maybe a yaml test would do too.

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh. Today I learned.

@nik9000
Copy link
Member

nik9000 commented Mar 29, 2022

Would it make sense to add a test to SourceFieldMapperTests that is affected by this change? The other caller that is affected is the fetch phase, but I haven't found a great place to add a contained test. Maybe a yaml test would do too.

I probably would have added a yaml test myself but I'm not sure we super need one for this. OTOH, I'm paranoid and paranoia is good.

@arteam
Copy link
Contributor Author

arteam commented Mar 29, 2022

I've added a couple of test to SourceFieldMapperTests 👍

@arteam
Copy link
Contributor Author

arteam commented Mar 29, 2022

@elasticmachine update branch

@arteam arteam merged commit 50d1b4f into elastic:master Mar 29, 2022
arteam added a commit to arteam/elasticsearch that referenced this pull request Mar 29, 2022
`Set.of()` throws a runtime exception if it encounters a duplicated element, `Set.copyOf`, on the contrary, doesn't do that.
We shouldn't fail if the user configure multiple include or exclude filters for _source fields.

See https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-source-field.html#include-exclude
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.1

@arteam arteam deleted the do-not-fail-on-duplicated-include-exclude-filters branch March 29, 2022 17:42
@arteam
Copy link
Contributor Author

arteam commented Mar 29, 2022

Thanks Alan, Luca, and Nik!

arteam added a commit that referenced this pull request Mar 29, 2022
`Set.of()` throws a runtime exception if it encounters a duplicated element, `Set.copyOf`, on the contrary, doesn't do that.
We shouldn't fail if the user configure multiple include or exclude filters for _source fields.

See https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-source-field.html#include-exclude
Set.of(includes),
Set.of(excludes),
Set.copyOf(Arrays.asList(includes)),
Set.copyOf(Arrays.asList(excludes)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a chance to review it only after the pr was merged, may be we could consider this for a followup or a future change:

May be we should have a dedicated method for this in CollectionUtils so that intention is clear and nobody revert/simplify code to the shorter Set.of version. May be something like CollectionUtils.setOfCollectionWithPossibleDuplicates(Collection<T> c)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not advocating against your suggestion, but we do have tests for this now so reverting the change would cause test failures

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR introduced a test for duplicate filters. I think that's what you want, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, the test would catch the revert. I was suggesting to have a dedicated set factory that would display the intention (create set from collection that may contain duplicates)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged >bug :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team v8.1.2 v8.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants