TemplateUpgradeService gets stuck repeatedly upgrading templates after upgrade to 5.6.0 #26673

Closed
imotov opened this issue Sep 15, 2017 · 2 comments

imotov commented Sep 15, 2017

It shows up as the following messages, which might linger in the log files for a while after the upgrade if X-Pack is installed.

[2017-09-13T13:24:57,961][INFO ][o.e.c.m.TemplateUpgradeService] [node1] Starting template upgrade to version 5.6.0, 2 templates will be updated and 0 will be removed 
[2017-09-13T13:24:57,985][INFO ][o.e.c.m.TemplateUpgradeService] [node1] Finished upgrading templates to version 5.6.0 

The problem occurs because, while templates are being applied, the order of elements in the template mapping can get shuffled, which causes the follow-up check of whether an update is needed to fail. I am working on a fix.
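
To illustrate the failure mode, here is a minimal Ruby sketch. It is not the actual Elasticsearch code (which is Java), and it assumes the update check compares serialized forms of the template. The helper names `canonicalize`, `naive_needs_upgrade?` and `needs_upgrade?` and the sample mapping are hypothetical. Two mappings with the same content but a different key order serialize to different strings, so a naive comparison reports a change on every pass, while a sorted-key serialization does not.

```ruby
require 'json'

# Recursively sort hash keys so that serialization no longer depends on
# insertion order. Hypothetical helper, for illustration only.
def canonicalize(value)
  case value
  when Hash
    value.keys.sort_by(&:to_s).each_with_object({}) do |key, sorted|
      sorted[key] = canonicalize(value[key])
    end
  when Array
    value.map { |item| canonicalize(item) }
  else
    value
  end
end

# Naive check: compares raw serialized strings, so key order matters.
def naive_needs_upgrade?(installed, expected)
  JSON.generate(installed) != JSON.generate(expected)
end

# Order-independent check: compares canonical serializations.
def needs_upgrade?(installed, expected)
  JSON.generate(canonicalize(installed)) != JSON.generate(canonicalize(expected))
end

# Made-up sample template content.
expected = {
  'mappings' => { 'doc' => { 'properties' => {
    'a' => { 'type' => 'keyword' }, 'b' => { 'type' => 'date' }
  } } }
}
# Same content, but the properties come back in a different order.
installed = {
  'mappings' => { 'doc' => { 'properties' => {
    'b' => { 'type' => 'date' }, 'a' => { 'type' => 'keyword' }
  } } }
}

puts naive_needs_upgrade?(installed, expected) # true: would trigger another upgrade
puts needs_upgrade?(installed, expected)       # false: content is identical
```

With the naive check, every pass would conclude that the template still differs and schedule another upgrade, which would match the repeated log lines above.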

@imotov imotov self-assigned this Sep 15, 2017
imotov added a commit to imotov/elasticsearch that referenced this issue Sep 18, 2017
TemplateUpgradeService might get stuck in repeatedly upgrading templates after upgrade to 5.6.0. This is caused by shuffling mappings definition in the template during template serialization. This commit makes the template serialization consistent.

Closes elastic#26673
imotov added a commit that referenced this issue Sep 19, 2017
…26698)

TemplateUpgradeService might get stuck in repeatedly upgrading templates after upgrade to 5.6.0. This is caused by shuffling mappings definition in the template during template serialization. This commit makes the template serialization consistent.

Closes #26673
imotov added a commit that referenced this issue Sep 20, 2017
…26698)

TemplateUpgradeService might get stuck in repeatedly upgrading templates after upgrade to 5.6.0. This is caused by shuffling mappings definition in the template during template serialization. This commit makes the template serialization consistent.

Closes #26673

admlko commented Feb 1, 2018

I'm still getting this with 6.2.1:

[2018-02-01T10:09:09,146][INFO ][o.e.c.m.TemplateUpgradeService] [analyzer01] Starting template upgrade to version 6.1.2, 1 templates will be updated and 0 will be removed
[2018-02-01T10:09:09,260][INFO ][o.e.c.m.TemplateUpgradeService] [analyzer01] Finished upgrading templates to version 6.1.2
[2018-02-01T10:09:18,168][INFO ][o.e.c.m.TemplateUpgradeService] [analyzer01] Starting template upgrade to version 6.1.2, 1 templates will be updated and 0 will be removed
[2018-02-01T10:09:18,277][INFO ][o.e.c.m.TemplateUpgradeService] [analyzer01] Finished upgrading templates to version 6.1.2


estolfo commented Feb 13, 2019

I am encountering this issue when running the rest-api YAML tests with the Ruby client in Docker.

Observed behavior: the REST API tests passed when I ran Elasticsearch and the tests outside Docker. When I ran both Elasticsearch and the tests in Docker, Elasticsearch stopped responding halfway through the tests. At first I thought Docker or Elasticsearch might be running out of memory. Here is an example of the error on Jenkins. I've put the error in this gist as well, in case the Jenkins job is no longer available when this issue is investigated.

After inspecting the pending_tasks queue and correlating the tasks with parts of the Elasticsearch codebase, I found that the TemplateUpgradeService was running between tests, each time we deleted all index templates. The code run between tests was: $client.indices.delete_template(name: '*')
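
A rough sketch of this kind of inspection, under stated assumptions: it takes the parsed body of the cluster pending tasks API (GET /_cluster/pending_tasks), which lists queued tasks under "tasks", each with a "source" string, and counts tasks per source. The function name `pending_task_counts` and the sample source strings are hypothetical placeholders, not real Elasticsearch task names.

```ruby
# Count pending cluster tasks by their "source" field.
# `response` is the parsed JSON body of the pending tasks API.
def pending_task_counts(response)
  response.fetch('tasks', []).each_with_object(Hash.new(0)) do |task, counts|
    counts[task['source']] += 1
  end
end

# Made-up sample response; the source strings are placeholders.
sample = {
  'tasks' => [
    { 'insert_order' => 1, 'priority' => 'URGENT', 'source' => 'example-source-a' },
    { 'insert_order' => 2, 'priority' => 'URGENT', 'source' => 'example-source-b' },
    { 'insert_order' => 3, 'priority' => 'URGENT', 'source' => 'example-source-a' }
  ]
}

# Print the most frequent sources first.
pending_task_counts(sample).sort_by { |_source, count| -count }.each do |source, count|
  puts "#{count}\t#{source}"
end
```

With the Ruby client, the response would presumably come from $client.cluster.pending_tasks. A source that dominates the queue points at the part of the codebase to look at.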

I’m guessing the cause of the issue is that the TemplateUpgradeService can get itself into a deadlock if called too often. The only way I was able to resolve this was to not call delete_template with name: '*' and instead delete specific templates in between tests.
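
The workaround can be sketched roughly as follows. This is an illustration only: the helper name `delete_templates`, the template names, and the stand-in client classes are hypothetical. The only client call it relies on is indices.delete_template(name: ...), the same one quoted above.

```ruby
# Delete only the named templates, one request per template, instead of
# issuing a single wildcard delete. `client` must respond to
# indices.delete_template(name:).
def delete_templates(client, names)
  names.each do |name|
    client.indices.delete_template(name: name)
  end
end

# Stand-in for a real client so the sketch runs without a cluster.
class FakeIndices
  attr_reader :deleted

  def initialize
    @deleted = []
  end

  # Records the request instead of sending it anywhere.
  def delete_template(name:)
    @deleted << name
  end
end

FakeClient = Struct.new(:indices)

client = FakeClient.new(FakeIndices.new)
# Hypothetical names of templates created by the tests themselves.
delete_templates(client, %w[test_template_1 test_template_2])
puts client.indices.deleted.inspect
```

In the test suite, the list of names would be whatever templates the tests create, so that other templates are left untouched between tests.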

Please let me know if there's any other information you need to investigate or if I can help reproduce the issue.
