Time filtering overhead #625

Closed
bobrik opened this issue Oct 29, 2013 · 4 comments

Comments

bobrik commented Oct 29, 2013

If the timespan covers a whole index, there's no need to add time filtering for that index. For example, you rotate indices daily and request data for 7 days. Then 8 requests happen, but 6 of them don't need a filter, because the time filter doesn't filter out anything.
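
A minimal sketch of that check, assuming daily indices that each hold exactly one calendar day of data. The function name `plan_requests` and the `logstash-%Y.%m.%d` naming pattern are illustrative assumptions, not something taken from the project's code.

```python
from datetime import datetime, timedelta


def plan_requests(start, end):
    """Yield (index_name, needs_time_filter) for each daily index touched by [start, end)."""
    # Begin at midnight of the day that contains `start`.
    day = start.replace(hour=0, minute=0, second=0, microsecond=0)
    while day < end:
        next_day = day + timedelta(days=1)
        # If the whole day lies inside the requested range, a time filter
        # cannot exclude any document from this index.
        fully_covered = start <= day and next_day <= end
        # Index naming pattern is an assumption made for this example.
        yield day.strftime("logstash-%Y.%m.%d"), not fully_covered
        day = next_day


end = datetime(2013, 10, 29, 15, 30)
plan = list(plan_requests(end - timedelta(days=7), end))
for name, needs_filter in plan:
    print(name, "filter" if needs_filter else "no filter")
```

For a 7-day window ending mid-day, this yields 8 indices, of which only the first and the last need the filter.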

This actually adds huge overhead if you have a lot of data.

Benchmarks! I took a real query that holds 2 date_histogram facets and fired requests against 1 day of real data (6 GB on disk).

No time filtering: 6474ms total

567,
561,
885,
750,
591,
557,
874,
555,
562,
572,

Time filtering: 10090ms total

952,
1315,
923,
924,
1282,
917,
927,
938,
971,
941,

That's about 56% more time! I think we could automatically remove the time filter when we know it doesn't filter anything out.
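
The totals and the relative overhead can be reproduced from the per-request timings listed above. This is only a quick arithmetic check; the variable names are made up for the example.

```python
# Per-request times in milliseconds, copied from the two runs above.
no_filter = [567, 561, 885, 750, 591, 557, 874, 555, 562, 572]
with_filter = [952, 1315, 923, 924, 1282, 917, 927, 938, 971, 941]

total_plain = sum(no_filter)
total_filtered = sum(with_filter)

# Relative overhead of the filtered run compared with the unfiltered one.
overhead = (total_filtered - total_plain) / total_plain

print(f"{total_plain} ms vs {total_filtered} ms: {overhead:.1%} slower")
# prints "6474 ms vs 10090 ms: 55.9% slower"
```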

otisg commented Oct 29, 2013

@bobrik Can you try this benchmark directly against ES, and run it N times in a row to allow caches to warm and be utilized? Another important factor here is index refresh rate.

bobrik commented Oct 29, 2013

@otisg I did exactly this: curl against ES in a loop, one request after another. What could change the index refresh rate for an index that does not receive updates anymore?

Also, the timespan keeps changing as time goes on, so every refresh generates a new time filter, which causes a miss in the filter cache. This makes no sense on read-only indices.
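
A toy illustration of the cache-miss point, assuming a cache keyed on the serialized filter body. This does not model how ES keys its filter cache internally; the helper names and the `@timestamp` field are assumptions made for the example.

```python
import json

WEEK_MS = 7 * 24 * 3600 * 1000


def range_filter(now_ms, span_ms=WEEK_MS):
    # Hypothetical "last 7 days" filter body; the field name is an assumption.
    return {"range": {"@timestamp": {"from": now_ms - span_ms, "to": now_ms}}}


def cache_key(filter_body):
    # Toy stand-in for a cache key: the serialized filter itself.
    return json.dumps(filter_body, sort_keys=True)


first = cache_key(range_filter(1383060000000))
second = cache_key(range_filter(1383060030000))  # same dashboard, refreshed 30 s later

# The two keys differ, so a cache keyed this way never gets a hit across refreshes.
print(first == second)  # False
```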

otisg commented Oct 29, 2013

Perfect then! Didn't realize indices you used were static. If they are, ignore the refresh piece.

bobrik commented Nov 13, 2013

As mentioned in #693, nginx can cache entire responses, so for read-only indices they could be served from cache without any computation.
