Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Receive: Add parameter to set out-of-order time window #5839

Merged
merged 7 commits into from
Nov 1, 2022

Conversation

matej-g
Copy link
Collaborator

@matej-g matej-g commented Oct 28, 2022

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Although we recently updated Prometheus version that experimentally allows to ingest out-of-order samples, our users have no way of setting this as of now. This change adds a parameter to set this time window. In addition it also handles a new type of error (sample too old) that can occur if this feature is enabled.

Verification

Manually tried 😶

Signed-off-by: Matej Gera <matejgera@gmail.com>
Signed-off-by: Matej Gera <matejgera@gmail.com>
Signed-off-by: Matej Gera <matejgera@gmail.com>
yeya24
yeya24 previously approved these changes Oct 28, 2022
Copy link
Contributor

@yeya24 yeya24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks good.

Copy link
Contributor

@yeya24 yeya24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also add flag for OutOfOrderCapMax?

saswatamcode
saswatamcode previously approved these changes Oct 29, 2022
Copy link
Member

@saswatamcode saswatamcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall!

@yeya24
Copy link
Contributor

yeya24 commented Oct 30, 2022

This should fix #2599 and #2114 🎉

@matej-g
Copy link
Collaborator Author

matej-g commented Oct 31, 2022

Should we also add flag for OutOfOrderCapMax?

I can add that as well, wasn't too sure if this is something end users will need to tweak ordinarily.

@matej-g matej-g dismissed stale reviews from saswatamcode and yeya24 via 3a8f5cf October 31, 2022 10:45
Signed-off-by: Matej Gera <matejgera@gmail.com>
yeya24
yeya24 previously approved these changes Oct 31, 2022
Signed-off-by: Matej Gera <matejgera@gmail.com>
Copy link
Contributor

@yeya24 yeya24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yeya24 yeya24 merged commit 835b695 into thanos-io:main Nov 1, 2022
@yeya24
Copy link
Contributor

yeya24 commented Nov 1, 2022

@matej-g I just realized that one more thing. On the shipper side (the component who is responsible for uploading blocks), there is also a overlapping checker.
Do you know if OOO samples produce blocks that is overlapped and also compaction level 2?
This will cause problems since the checker will fail.

@matej-g
Copy link
Collaborator Author

matej-g commented Nov 2, 2022

@matej-g I just realized that one more thing. On the shipper side (the component who is responsible for uploading blocks), there is also a overlapping checker. Do you know if OOO samples produce blocks that is overlapped and also compaction level 2? This will cause problems since the checker will fail.

So, I'm not super familiar with the compaction in Prometheus, but as far as I can understand the code:

  • When "normal" head is compacted, also the OOO head gets compacted - so these blocks at compaction level 1 will overlap (but this should be expected)
  • Once these normal and OOO blocks are compacted further, however, I think there should be no overlap at level 2

So it should not be an issue. Does this make sense @yeya24?

@matej-g
Copy link
Collaborator Author

matej-g commented Nov 2, 2022

@yeya24 if I now look better at the shipper code, I notice we don't ship blocks with > 1 compaction level, so the overlap check should not even kick in. But it makes me wonder why we even run TSDB compaction on receivers if we basically ignore blocks with level > 1 🤔. Scratch that, since compaction in TSDB also takes care of head compaction, I still wonder though why run block compaction.

@yeya24
Copy link
Contributor

yeya24 commented Nov 2, 2022

I notice we don't ship blocks with > 1 compaction level, so the overlap check should not even kick in

Yeah I didn't know how OOO really works so not sure if it produces a level 1 overlapped block compacted or it creates 2 level 1 blocks with a normal one and 1 OOO one.

If it creates 2 level 1 block with a normal one and 1 OOO. Do you know does Prometheus will compact the two if there is overlap?

@matej-g
Copy link
Collaborator Author

matej-g commented Nov 2, 2022

Got you now, I see we don't run compaction because of min == max block duration, yeah we should be safe, thanks for checking 👍

ngraham20 pushed a commit to ngraham20/thanos that referenced this pull request May 18, 2023
* Add flag to specify out-of-order time window

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Handle new sample error

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Adjust CHANGELOG

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Add out-of-order cap max parameter

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Update docs

Signed-off-by: Matej Gera <matejgera@gmail.com>

* Add warning about enabling vertical compaction

Signed-off-by: Matej Gera <matejgera@gmail.com>

Signed-off-by: Matej Gera <matejgera@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants