Problem Description
We have a single-node ES cluster hosting around 15 aliases, and it is working perfectly.
We have developed a backup and restore solution using Curator actions, which was working perfectly fine.
The solution works as follows:
Take a snapshot of the Elasticsearch indices, then use the delete action to keep only the latest 2 snapshots.
There is a workaround (WA) that gets the snapshot operation working again: deregister the repo, delete the data from the backup mount, then recreate the backup repo and register it. After this, things return to business as usual (BAU). But after a certain interval the same behavior recurs and the snapshot operation starts failing again.
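For readers following along, the workaround can be sketched as an ordered list of Elasticsearch snapshot REST calls. This is only an illustration, not the reporter's actual script: it assumes a shared-filesystem (`fs`) repository, and the repository name and mount path in the usage line are hypothetical. The function only builds the request descriptions; it does not contact a cluster.

```python
def repo_reset_requests(repo, location):
    """Build the REST calls for the workaround as (method, path, body) tuples.

    Assumes an `fs` repository; `repo` and `location` are supplied by the caller.
    """
    return [
        # Unregister the repository. This removes only the cluster's reference
        # to it; the files on the backup mount are left in place.
        ("DELETE", f"/_snapshot/{repo}", None),
        # Between these two calls the data on the backup mount is deleted
        # out of band, which destroys every existing snapshot.
        ("PUT", f"/_snapshot/{repo}",
         {"type": "fs", "settings": {"location": location}}),
        # Ask the cluster to verify that the repository is usable.
        ("POST", f"/_snapshot/{repo}/_verify", None),
    ]


# Hypothetical names, for illustration only.
for method, path, body in repo_reset_requests("backup_repo", "/mnt/backup"):
    print(method, path, body)
```

Because the second step wipes the mount, this sequence trades every earlier backup for a working repository, which is exactly the cost described in this report.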
Expected Behavior
The process described above, using the snapshot and delete actions, is expected to keep working.
We run snapshot creation and old-snapshot deletion every 6 hours, every day.
The expectation is that it keeps taking the latest snapshot and retains the latest two snapshots in the repository.
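The retention rule (keep the latest two snapshots) can be illustrated with a small sketch. This is not how Curator implements its delete action; it is a hypothetical stand-alone function that reads entries shaped like the `snapshots` array of a `GET /_snapshot/<repo>/_all` response and uses only the `snapshot`, `state` and `start_time_in_millis` fields. Counting only `SUCCESS` snapshots toward the retained set is an assumption made for this example.

```python
def snapshots_to_delete(snapshots, keep=2):
    """Return the names of snapshots to delete, keeping the `keep` newest.

    Assumption for this sketch: only snapshots in state SUCCESS count toward
    the retained set; snapshots in any other state are left untouched.
    """
    successful = [s for s in snapshots if s.get("state") == "SUCCESS"]
    newest_first = sorted(
        successful, key=lambda s: s["start_time_in_millis"], reverse=True
    )
    # Everything after the first `keep` entries is older than the retained set.
    return [s["snapshot"] for s in newest_first[keep:]]


# Hypothetical snapshot names and timestamps.
example = [
    {"snapshot": "snap-1", "state": "SUCCESS", "start_time_in_millis": 1000},
    {"snapshot": "snap-2", "state": "SUCCESS", "start_time_in_millis": 2000},
    {"snapshot": "snap-3", "state": "SUCCESS", "start_time_in_millis": 3000},
]
print(snapshots_to_delete(example))
```

With a policy like this, a run that ends in `PARTIAL` never pushes a good snapshot out of the retained set.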
Actual Behavior
The snapshot process described above works for some time, and then one day it starts failing with a partial snapshot error: Failed to complete action: snapshot. <class 'curator.exceptions.FailedExecution'>: Exception encountered. Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: Snapshot PARTIAL completed with state: PARTIAL
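To narrow down which indices caused a `PARTIAL` result, the snapshot's entry from the get-snapshot API can be summarized. The sketch below is a hypothetical helper, not part of Curator or Elasticsearch. It assumes the entry carries `state`, a `shards` object with a `failed` count, and a `failures` list whose items name an `index`. The sample data is made up for illustration.

```python
def summarize_snapshot(info):
    """Summarize one snapshot entry: overall state and which indices failed."""
    state = info.get("state")
    shards = info.get("shards", {})
    # Collect the distinct index names mentioned in the shard failures.
    failed_indices = sorted(
        {f["index"] for f in info.get("failures", []) if "index" in f}
    )
    return {
        "ok": state == "SUCCESS",
        "state": state,
        "failed_shards": shards.get("failed", 0),
        "failed_indices": failed_indices,
    }


# Made-up example entry; the index name and reason are placeholders.
sample = {
    "snapshot": "snap-42",
    "state": "PARTIAL",
    "shards": {"total": 15, "failed": 1, "successful": 14},
    "failures": [{"index": "example-index", "shard_id": 0, "reason": "placeholder"}],
}
print(summarize_snapshot(sample))
```

A summary like this, taken on the first failing run, shows whether the same index fails every time or the failures move around.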
Error in ES logs:
Per the above logs, it seems Elasticsearch was not able to find an index-related file. But this is a production system and the files were never removed manually, so I suspect something is going wrong here.
Specifications
ES version: 7.16.3
Curator version: 5.8.4
Context (Environment)
This is causing issues in production: we end up with a corrupted backup repository, and the only solution we have is to recreate it, losing all the backups. In the worst case the backups are of no use at all, because the repository corruption leaves no way to restore the data.
Thank you very much for your interest in Elasticsearch. Unfortunately the issue you have reported relates to Elasticsearch version 7.16.3 which is very old and has passed end-of-life. We will not investigate issues related to unsupported versions here on Github, so I am closing this to indicate that no action is needed from the Elasticsearch development team. It's possible that you will find a volunteer to help you with this issue on the community forums, but our strong recommendation would be to upgrade to a supported version of Elasticsearch as a matter of some urgency. If you can reproduce your issue on a supported version then please open a fresh bug report.
Quoting the bug report form:
Please also check your OS is supported, and that the version of Elasticsearch has not passed end-of-life. If you are using an unsupported OS or an unsupported version then the issue is likely to be closed.
Elasticsearch Version
7.16.3
Installed Plugins
No response
Java Version
bundled
OS Version
Linux elasticsearch-7c459c9bc5-l88zr 3.10.0-1160.76.1.el7.x86_64 #1 SMP Tue Jul 26 14:15:37 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Suddenly, the snapshot operation started failing, as shown below:
Checking the snapshot repo with the curl request below:
Link to the issue raised for Curator: elastic/curator#1697
Steps to Reproduce
All the details are mentioned above.
Logs (if relevant)
On checking further, the errors below were found in the ES logs: