
[event log] ILM policy still active for older indices no longer active #97755

Closed
pmuellr opened this issue Apr 20, 2021 · 10 comments
Labels
bug Fixes for quality problems that affect the customer experience Feature:EventLog Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

@pmuellr (Member) commented Apr 20, 2021

A report from the field describes numerous empty event log indices left behind by older Kibana versions that have since been migrated away from.

Example: a list of event log indices found in the wild.
health status index                           uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .kibana-event-log-7.8.0-000008  ZhqKaLlmRva6pb-TpAwQ2Q   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.8.0-000009  tizKmfvhQmGx2aEqXGAAtw   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.8.0-000010  o35M3JMxSVq10MtlWbQkRw   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.8.0-000011  vquHuIg8QlCMpmb3XFZN5w   1   1          0            0       416b           208b
green  open   .kibana-event-log-7.8.1-000006  TFk8QFyxTT2InB5AfD54lQ   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.8.1-000007  KQv-raUjSDWQK8i2mgGLIQ   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.8.1-000008  MV0uPYo0Tf2MtyRAMbtjMA   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.8.1-000009  kCVFEyZAQUWKBR2b6sDE4w   1   1          0            0       416b           208b
green  open   .kibana-event-log-7.9.0-000006  vCGLsYH5T6OHpwuFOxQmDA   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.9.0-000007  yfspPGh_SEeFCzmucPxDFw   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.9.0-000008  PL27SNPZSe6xrZCQIBRvZg   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.9.0-000009  UtqvbOGbRWW0taBfsi6-cg   1   1          0            0       416b           208b
green  open   .kibana-event-log-7.9.1-000005  lVU_gpHpSQyMhjC-7HAVtA   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.9.1-000006  WkLIg6gGRkmfjN9opMyIhg   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.9.1-000007  frCIBhogS4e-6l4swl4qjQ   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.9.1-000008  G4phinOxTaOjEvWyxmExZQ   1   1          0            0       416b           208b
green  open   .kibana-event-log-7.9.2-000004  6KXk1utaSXyueF1X-VYKDA   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.9.2-000005  dVLpro0NTkKZ7el2gFdQrQ   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.9.2-000006  0WguWNOWQ_aV0hj-Kxgz3Q   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.9.2-000007  A9UG4-7tQrmYmH8Wfs9tcA   1   1          0            0       416b           208b
green  open   .kibana-event-log-7.9.3-000003  pklvRQRmTIG0ftrhTnarog   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.9.3-000004  YKauTjrYSpGmjIo1Vp-nqw   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.9.3-000005  skFPj-89SJu33Rx7Et9saA   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.9.3-000006  oah2FYQ8QqemSxqFUE0oBg   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.10.0-000003 HyVb-VXhTWyY3ypH7uXEqA   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.10.0-000004 VTO_rBRLRnWKHd53q2f09w   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.10.0-000005 qkwdTPpoSz290pnY1YUN6Q   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.10.0-000006 pgaP4nzjRSeeKHjpaazBrg   1   1          0            0       416b           208b
green  open   .kibana-event-log-7.10.1-000002 DT7J0pJUTuu_av72x_FRPQ   1   1     183332            0     35.6mb         17.8mb
green  open   .kibana-event-log-7.10.1-000003 8-n_Wah5RlaUbnq26Woqzw   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.10.1-000004 8W5BcxfxTSmFw801SnPITQ   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.10.1-000005 xQFt0BPGSyW0N8gRDU2zrw   1   1          0            0       416b           208b
green  open   .kibana-event-log-7.10.2-000001 bHqSATqJTbKJ_haYfb8zGA   1   1     822150            0    158.8mb         79.4mb
green  open   .kibana-event-log-7.10.2-000002 tNMezFVVSRO5pFu3GvVZRg   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.10.2-000003 4ihfskrXSOW6c7oDdjewJw   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.10.2-000004 n257wQHESmW2fc_lxgd_TA   1   1          0            0       416b           208b
green  open   .kibana-event-log-7.11.0-000001 jpf9Qd2MQEOtkGyXKfPfWQ   1   1     565334            0      101mb         50.5mb
green  open   .kibana-event-log-7.11.0-000002 jzXxOq13QEqsXaFQyNx8zw   1   1          0            0       522b           261b
green  open   .kibana-event-log-7.11.0-000003 25tFNxQSQ8CNYL4dMY_Q5g   1   1          0            0       416b           208b
green  open   .kibana-event-log-7.11.1-000001 oVhjsT_3QpqOsezqoTTD6Q   1   1     308808            0     55.4mb         27.7mb
green  open   .kibana-event-log-7.11.1-000002 zUZsUJ8bQNyBXQaK5Fh0RA   1   1          0            0       416b           208b
green  open   .kibana-event-log-7.11.2-000001 OpcQ8DlFRiiEPtA9pZ3sIw   1   1     367730            0       65mb         32.5mb
green  open   .kibana-event-log-7.11.2-000002 vhedK1QfT2GXdpTSnmd-7g   1   1          0            0       416b           208b
green  open   .kibana-event-log-7.12.0-000001 QGephkqTSBe7r3UwFqdtPw   1   1     850850            0    166.7mb         75.6mb

Apparently the indices for the old versions are still "live" as far as ILM is concerned, so they keep rolling over and deleting themselves, leaving behind a roughly fixed set of the old ones; it appears to be 4 per version.

We should look into how to fix this, so that once none of the indices for a given version contain any docs, we turn off ILM for them. Hopefully there's an ILM setting for this; otherwise we may need to do it "by hand".

@pmuellr pmuellr added bug Fixes for quality problems that affect the customer experience Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) Feature:EventLog labels Apr 20, 2021
@elasticmachine (Contributor) commented:

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@pmuellr (Member, Author) commented Apr 20, 2021

It would also be nice to have a manual workaround for customers who do want this cleaned up. For instance, can they simply delete these indices, and then the rollovers will stop? I think so ...
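As a sketch of what such a manual workaround might look like, using the standard cat indices and delete index APIs in Dev Tools Console (the index name below is one from the listing above, used purely as an example; always confirm docs.count is 0 before deleting):

```
# list event log indices with their doc counts, to confirm which are empty
GET _cat/indices/.kibana-event-log-*?v&h=index,docs.count

# delete one empty index from an old version (repeat for each empty old index)
DELETE .kibana-event-log-7.8.0-000008
```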

@pmuellr (Member, Author) commented Apr 26, 2021

I was thinking we could do something like the following, as a recurring "clean up the old empty event log indices" task:

  • get the current version of Kibana
  • find all the empty event log indices that don't match the current version
  • delete them
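The selection logic for those steps could be sketched roughly like this, in Python for illustration (the actual implementation would live in Kibana's TypeScript code; the function name and input shape here are assumptions, with input modeled on `_cat/indices` output):

```python
import re

# Matches event log index names like .kibana-event-log-7.10.1-000002
EVENT_LOG_INDEX = re.compile(r"^\.kibana-event-log-(\d+\.\d+\.\d+)-\d+$")

def indices_to_delete(cat_rows, current_version):
    """Given (index_name, docs_count) pairs from _cat/indices, return the
    empty event log indices belonging to versions other than the current one."""
    doomed = []
    for name, docs_count in cat_rows:
        m = EVENT_LOG_INDEX.match(name)
        if m and m.group(1) != current_version and docs_count == 0:
            doomed.append(name)
    return doomed

rows = [
    (".kibana-event-log-7.8.0-000008", 0),        # empty, old version: delete
    (".kibana-event-log-7.10.1-000002", 183332),  # non-empty: keep
    (".kibana-event-log-7.12.0-000001", 850850),  # current version: keep
]
print(indices_to_delete(rows, "7.12.0"))  # → ['.kibana-event-log-7.8.0-000008']
```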

One potential problem with this: if a customer migrated to a new Kibana version, the task deleted all the event log indices for the version they were previously on, and then for some reason the customer decided to migrate DOWN to that old Kibana version, we'd be in trouble. We'd then be using an alias that points to a non-existent index, and I'm guessing something would go wrong.

I know that migrating down is not really supported. And it's likely that, after a migration, there would be non-empty event log indices that survive the cleanup task. So it seems to me that even if we did nothing special about this, things would likely work out fine. But you can also imagine a case where it would do something bad.

Perhaps the cleanup task could recognize when there are event log resources at a version newer than the current stack version, and log a warning. That might be difficult to future-proof, since we don't know how we might change these resource names in the future. Maybe we should include version info in the resource metadata, where possible?

This also makes me wonder about the index template and aliases. We should probably delete those as well during cleanup, if we find there are no indices left after deleting the indices themselves. That would likely work out fine: Kibana would start up, see that the alias, template, and initial index are missing, and re-create them.

@mikecote (Contributor) commented:

I wonder if there's a way for these indices to be created lazily? That way, customers who don't use the event log don't have indices created. I'm not sure if ILM is the blocker for such capability.

@ymao1 (Contributor) commented Jul 1, 2021

Submitted a PR to remove older event log indices from ILM management, but @chrisronline correctly pointed out that while this would stop older indices from rolling over needlessly and creating empty indices, they would also never age off after 90 days and would stick around forever, which is not the behavior we want either.

Will investigate creating a cleanup task for this next.

@chrisronline (Contributor) commented:

What if we applied a separate ILM policy to these indices? Something that automatically deletes them after the same period, but doesn't do any rollover?
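A delete-only policy along those lines might look like the following sketch (the policy name is hypothetical; the 90-day retention matches the period mentioned above):

```
PUT _ilm/policy/kibana-event-log-old-version-cleanup
{
  "policy": {
    "phases": {
      "delete": {
        "min_age": "90d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
```

The key difference from the existing event log policy is that there is no hot phase with a rollover action, so the old indices would only age out and be deleted, never roll over.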

@mikecote (Contributor) commented Jul 9, 2021

After doing a bit of research, this sounds like a limitation of ILM at this time. However, I came across elastic/elasticsearch#73349, which requests that the rollover process also clean up empty indices. What are your thoughts on relying on that ES ticket and adding our +1, instead of developing a workaround?

@ymao1 (Contributor) commented Jul 12, 2021

@mikecote Nice find! The only issue I had managed to find looked like it would not be addressed. I agree that we should wait for that issue rather than develop a workaround.

@mikecote (Contributor) commented:

Great, I'm +1 if we want to close this issue in favour of the Elasticsearch ones and forward any feedback to those.

@ymao1 (Contributor) commented Jul 12, 2021

Closing in favor of waiting for elastic/elasticsearch#73349 to be resolved.

@ymao1 ymao1 closed this as completed Jul 12, 2021
@kobelb kobelb added the needs-team Issues missing a team label label Jan 31, 2022
@botelastic botelastic bot removed the needs-team Issues missing a team label label Jan 31, 2022