Prevent fielddata loading for _id #43599

Closed
jpcarey opened this issue Jun 25, 2019 · 4 comments · Fixed by #49166
Labels
>enhancement :Search/Search Search-related issues that do not fall into other categories

Comments

@jpcarey
Contributor

jpcarey commented Jun 25, 2019

Describe the feature: Need a way to prevent fielddata memory issues due to aggregating / sorting on _id

@dnhatn added the :Search/Search label Jun 26, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-search

@martijnvg
Member

I ran into this today too. There is no way to disable loading of field data into the JVM heap. Because the _id field is not configurable and doesn't have doc values enabled, a large part of the JVM heap can be taken up if someone sorts or aggregates on the _id field.

Either it should be possible to disable field data via configuration, or field data should never be loaded for the _id field. The latter may be the better choice once #46523 has been completed, because then there is no need to sort by _id with search after in order to get a consistent sort order.
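
For readers less familiar with the problem, here is a minimal sketch of the two request shapes being discussed: a search that sorts on _id and one that runs a terms aggregation on it. The aggregation name `by_id` and the helper `touches_id_fielddata` are made up for illustration; the helper is only a rough client-side check, not something Elasticsearch provides.

```python
# Hypothetical search bodies (for POST /<index>/_search) that would load
# fielddata for _id. The aggregation name "by_id" is made up.
sort_on_id = {"query": {"match_all": {}}, "sort": [{"_id": "asc"}]}
terms_agg_on_id = {"size": 0, "aggs": {"by_id": {"terms": {"field": "_id"}}}}


def _sort_fields(sort):
    """Yield the field names used in a sort clause (string or object form)."""
    entries = sort if isinstance(sort, list) else [sort]
    for entry in entries:
        if isinstance(entry, str):
            yield entry
        elif isinstance(entry, dict):
            yield from entry.keys()


def _agg_fields(aggs):
    """Yield the `field` of every aggregation, descending into sub-aggregations."""
    for definition in aggs.values():
        for key, value in definition.items():
            if key in ("aggs", "aggregations"):
                yield from _agg_fields(value)
            elif isinstance(value, dict) and "field" in value:
                yield value["field"]


def touches_id_fielddata(body):
    """Rough check: does this search body sort or aggregate on _id?"""
    fields = list(_sort_fields(body.get("sort", [])))
    aggs = body.get("aggs") or body.get("aggregations") or {}
    fields.extend(_agg_fields(aggs))
    return "_id" in fields
```

A check like this could sit in a proxy or client wrapper to flag such requests before they reach the cluster, assuming all search traffic passes through it.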

@AndrewMcQuerry

Thanks to @jasontedor and @SpeechlessWick (and others!) for discussing this with us yesterday at the AMA booth.

Just adding this note here to echo the need for some sort of protection against the _id field causing fielddata Heap consumption. There have been multiple instances in both Test and Prod clusters where some user has performed some action (we assume this to be a sort or aggregation) which has caused all of our Data nodes in the cluster to consume large amounts of Heap for fielddata. Our only solution at this time is to simply check or audit the fielddata usage for "_id" and then manually clear the cache when/if we see it consuming large amounts.
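
The audit described above could be sketched roughly as follows. It assumes the stats come from `GET /_nodes/stats/indices/fielddata?fields=_id` and that each node entry reports `indices.fielddata.fields._id.memory_size_in_bytes`; the function name, the threshold, and the sample response are hypothetical.

```python
def nodes_with_heavy_id_fielddata(stats, threshold_bytes):
    """Return {node name: bytes} for nodes whose _id fielddata exceeds the threshold.

    `stats` is the parsed JSON of a node stats response (assumed shape, see above).
    """
    heavy = {}
    for node_id, node in stats.get("nodes", {}).items():
        fields = node.get("indices", {}).get("fielddata", {}).get("fields", {})
        used = fields.get("_id", {}).get("memory_size_in_bytes", 0)
        if used > threshold_bytes:
            # Fall back to the node id when the response carries no name.
            heavy[node.get("name", node_id)] = used
    return heavy


# Hypothetical sample response, trimmed to the parts the function reads.
sample_stats = {
    "nodes": {
        "abc123": {
            "name": "data-1",
            "indices": {"fielddata": {"fields": {"_id": {"memory_size_in_bytes": 5_000_000_000}}}},
        },
        "def456": {
            "name": "data-2",
            "indices": {"fielddata": {"fields": {}}},
        },
    }
}

offenders = nodes_with_heavy_id_fielddata(sample_stats, threshold_bytes=1_000_000_000)
```

When `offenders` is non-empty, the manual step mentioned above would be a clear-cache call such as `POST /_cache/clear?fielddata=true&fields=_id`.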

@jimczi
Contributor

jimczi commented Nov 15, 2019

I opened #49166 to disallow the loading of fielddata for the _id field through a cluster setting. The goal is to get rid of the fielddata entirely, but we're not there yet, so the setting could be useful to prevent loading in 7x.

jimczi added a commit to jimczi/elasticsearch that referenced this issue Nov 22, 2019
This change adds a dynamic cluster setting named `indices.id_field_data.enabled`.
When set to `false` any attempt to load the fielddata for the `_id` field will fail
with an exception. The default value in this change is set to `false` in order to prevent
fielddata usage on this field for future versions but it will be set to `true` when backporting
to 7x. When the setting is set to true (manually or by default in 7x) the loading will also issue
a deprecation warning since we want to disallow fielddata entirely when elastic#26472
is implemented.

Closes elastic#43599
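
As a rough illustration of how the setting described in this commit message might be toggled, the sketch below builds a `PUT /_cluster/settings` request with Python's standard library. The setting name comes from the commit message; the base URL, the function name, and the choice of a persistent scope are assumptions.

```python
import json
import urllib.request

# Setting name as given in the commit message above.
SETTING = "indices.id_field_data.enabled"


def build_settings_request(base_url, enabled, persistent=True):
    """Build (but do not send) a cluster settings update for the _id fielddata switch."""
    scope = "persistent" if persistent else "transient"
    body = json.dumps({scope: {SETTING: enabled}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_cluster/settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical local cluster address; adjust for a real deployment.
request = build_settings_request("http://localhost:9200/", enabled=False)
```

Sending it would be a matter of `urllib.request.urlopen(request)` against a reachable cluster, with whatever authentication that cluster requires. Per the commit message, sorting or aggregating on `_id` should then fail with an exception instead of loading fielddata.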
jimczi added a commit that referenced this issue Nov 27, 2019
jimczi added a commit that referenced this issue Nov 28, 2019

6 participants