limit index autocreation to certain date range #9480

Closed
KlavsKlavsen opened this issue Jan 29, 2015 · 9 comments

Labels: :Core/Infra/Settings (Settings infrastructure and APIs), discuss, >enhancement

Comments

@KlavsKlavsen

I currently have developers from many teams feeding logs through Logstash into Redis, and from there into ES.

They often make a mistake that gives the logs a wrong date, or they feed in old logs that I do not want (our ES crashes if it holds too much data), so I run Curator nightly to remove old indices.

Today, for example, a developer made a mistake (telling Logstash that a field containing a timestamp in microseconds was in UNIX_MS format), which made Logstash feed in logs with dates like 46447.05.08. It created ~2400 indices with a 5-digit year and eventually made ES go red.

What I would suggest is a filter on index auto-creation: one could define the retention period and have ES refuse to auto-create indices that fall outside, for example, the range from now-3months to now.

It would be a safeguard that keeps developers' mistakes from hurting us all.
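The check being asked for can be sketched outside ES. The following is only an illustration of the idea, not an existing ES feature: the function names, the `logstash-` prefix, the `%Y.%m.%d` suffix format, and the 90-day window are all assumptions made for the example.

```python
from datetime import date, datetime, timedelta


def index_date(index_name, prefix="logstash-", fmt="%Y.%m.%d"):
    """Return the date encoded in an index name, or None if it does not parse.

    The prefix and format are assumptions based on the names in this thread.
    """
    if not index_name.startswith(prefix):
        return None
    try:
        return datetime.strptime(index_name[len(prefix):], fmt).date()
    except ValueError:
        # Malformed suffixes, including 5-digit years such as 46447.05.08.
        return None


def within_retention(index_name, today, retention_days=90):
    """True only if the index date lies between today - retention_days and today."""
    day = index_date(index_name)
    if day is None:
        return False
    return today - timedelta(days=retention_days) <= day <= today


if __name__ == "__main__":
    today = date(2015, 1, 29)
    for name in ("logstash-2015.01.20", "logstash-46447.05.08", "logstash-2014.01.01"):
        print(name, within_retention(name, today))
```

Note that this sketch rejects any name that does not carry a parseable date, which would be too strict for a cluster that also holds non-dated indices.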

@clintongormley
Contributor

@KlavsKlavsen we do support simple wildcard patterns on the auto_create_index setting, but these probably wouldn't be sufficient for your needs currently.

Perhaps being able to support real regexes would be enough though.

Also see #9359

clintongormley changed the title from "limit index autocreation to certain date range" to "Add regex support to auto_create_index" on Jan 29, 2015
clintongormley added the help wanted, adoptme, :Core/Infra/Settings, and >enhancement labels on Jan 29, 2015
@KlavsKlavsen
Author

Real regexes would not help in cases where valid (but out-of-range) dates are added to ES, such as old ones (long past retention) and ones from the future.
It would help to be able to specify it as a type of sorts: logstash-%daterange('start','end','format'), where start could be 'now -3 months', end could just be 'now', and format would be whatever date format the 'date' needs. You could then simply use date (the shell command, the C library, or the equivalent in whatever language one wants to use) to parse those into start and end dates.

$ date -d 'now - 3 months' +%Y.%m.%d
2014.10.29

This would be very useful for ensuring that only dates within the desired retention period are accepted. (Am I the only one with a lot of developers feeding stuff into ES and making mistakes? :)
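As a rough sketch of how such a daterange('start','end','format') rule might be resolved, the snippet below handles only the two spellings used in this comment, 'now' and 'now - N months'. The mini-syntax, the function names, and the choice to clamp the day to the end of a shorter month are assumptions for illustration; nothing like this exists in ES.

```python
import calendar
import re
from datetime import date


def resolve(spec, today):
    """Resolve 'now' or 'now - N months' to a date (hypothetical mini-syntax)."""
    spec = spec.strip()
    if spec == "now":
        return today
    match = re.fullmatch(r"now\s*-\s*(\d+)\s*months?", spec)
    if match is None:
        raise ValueError("unsupported date spec: %r" % spec)
    # Count months since year 0 so that subtracting crosses year boundaries cleanly.
    total = today.year * 12 + (today.month - 1) - int(match.group(1))
    year, month = divmod(total, 12)
    month += 1
    # Clamp the day so that, e.g., May 31 minus 3 months lands on Feb 28/29.
    day = min(today.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def in_range(day, start_spec, end_spec, today):
    """True if day lies between the resolved start and end, inclusive."""
    return resolve(start_spec, today) <= day <= resolve(end_spec, today)


if __name__ == "__main__":
    today = date(2015, 1, 29)
    print(resolve("now - 3 months", today).strftime("%Y.%m.%d"))
    print(in_range(date(2014, 12, 1), "now -3 months", "now", today))
```

With `today` set to 2015-01-29, the first line printed is 2014.10.29, the same value as the `date` command output shown above.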

@clintongormley
Contributor

(Am I the only one with a lot of developers feeding stuff into ES and making mistakes? :)

Yes, just you :)

To me, it seems like the wrong place to add a check like this. It's just too specific. Then we need to come up with some kind of syntax for specifying these rules in a single setting. I'd say a better place to put this would be in a logstash filter, which can be real code, no?

@KlavsKlavsen
Author

The problem is that Logstash is what the developers use to parse their logs (I can't do all their work), and that is where they make mistakes.

@KlavsKlavsen
Author

As an example, the last one which gave ES a heartbeat was a developer who used Logstash to parse a log with time in microseconds and used the date function with UNIX_MS, which simply reads the microseconds as milliseconds and produces a date about 44,000 years in the future ;)

It's very easy to make mistakes with Logstash, which it happily forwards to ES. I've done it myself very often as well. And not everyone uses Logstash: one developer who needs to feed in a lot wrote his own tool instead.

So in general I feel it would be very handy to be able to protect ES a bit more against unintentional mistakes like the ones mentioned.
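The size of the UNIX_MS mix-up described above can be checked with plain arithmetic. This sketch uses a sample timestamp (midnight UTC on 2015-01-29, chosen here for illustration) and an average Gregorian year length, so the years it prints are approximate.

```python
SECONDS_PER_YEAR = 365.2425 * 24 * 3600  # average Gregorian year


def approx_year(epoch_value, unit_seconds):
    """Approximate calendar year for an epoch value expressed in the given unit."""
    return 1970 + (epoch_value * unit_seconds) / SECONDS_PER_YEAR


# Sample value (an assumption): 2015-01-29T00:00:00Z in microseconds.
micros = 1_422_489_600 * 1_000_000

correct = approx_year(micros, 1e-6)  # read as microseconds
wrong = approx_year(micros, 1e-3)    # misread as milliseconds (UNIX_MS)

if __name__ == "__main__":
    print(round(correct), round(wrong))
```

Misreading the unit inflates the elapsed time since 1970 by a factor of 1000, which is how a 2015 timestamp ends up with a 5-digit year.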

clintongormley added the discuss label and removed the help wanted and adoptme labels on Jan 29, 2015
clintongormley changed the title from "Add regex support to auto_create_index" to "limit index autocreation to certain date range" on Jan 29, 2015
@bleskes
Contributor

bleskes commented Feb 6, 2015

The tricky part here is that ES doesn't really understand that the index name implies a date at the moment. Recently we added a beforeIndexAddedToCluster hook ( #9514 ) which can be used by a simple plugin to add any custom rejection logic. Also, I think it will be very helpful to make the action.auto_create_index dynamically updatable. Then this becomes a simple cron job that updates the setting based on the current time. Something that can also be added to Curator.
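A minimal sketch of the cron-job idea follows, assuming the wildcard list syntax of action.auto_create_index (`+` to allow, `-` to disallow) and a hypothetical `logstash-YYYY.MM.DD` naming scheme. It only builds the setting value; how that value gets applied depends on the setting being dynamically updatable, which this comment proposes rather than confirms.

```python
from datetime import date


def auto_create_patterns(today, months_back=3, prefix="logstash-"):
    """Build an allow-list for the current month and the previous months_back months.

    The prefix and the month granularity are assumptions for this example.
    """
    patterns = []
    year, month = today.year, today.month
    for _ in range(months_back + 1):
        patterns.append(f"+{prefix}{year:04d}.{month:02d}.*")
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    # Disallow auto-creation of everything not matched by an earlier pattern.
    patterns.append("-*")
    return ",".join(patterns)


if __name__ == "__main__":
    print(auto_create_patterns(date(2015, 2, 6)))
```

Because the patterns are per month, this is coarser than an exact now-3months window: it still admits later days in the current month, and the trailing `-*` blocks auto-creation for every other index name.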

@KlavsKlavsen
Author

sounds like a viable plan.

bleskes removed the discuss label on Feb 6, 2015
@clintongormley
Contributor

I wonder if we could support date math patterns (#12209)

@clintongormley
Contributor

Closing in favour of #20640
