Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up old metrics using metrics timestamp #45

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

mrzacarias
Copy link

Issue: #44

There's no need to keep the metrics always there on PAG, as they are constantly scraped and stored on Prometheus or Cortex long living storage. The attempt to fix the MEM/CPU performance issues generated by this is:

  • Adding a Metric Timestamp support, registering the timestamp sent with the metric post or generating one when PAG receives the metric
  • Use that Metric TS to check which metrics have not been merged on the last X milliseconds (configurable, default is 1 hour) and remove them from the Metric Families struct

This was tested at EverQuote with excellent results:
image

@mrzacarias mrzacarias force-pushed the issue44_mz_cleanup_old_metrics branch from f3ad0ac to 25f0fb1 Compare August 6, 2020 15:39
@mrzacarias mrzacarias force-pushed the issue44_mz_cleanup_old_metrics branch from 25f0fb1 to fa3fefb Compare August 6, 2020 15:41
@bboreham
Copy link
Contributor

Thanks for this - sorry nobody responded before now.
I'd like to take more time to read through before merging.

Copy link
Contributor

@bboreham bboreham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main functionality you've added seems fine, but there are other changes in this PR that either need better justification, or removal.

Doing the cleanup, including copying all the metrics, on every scrape could get expensive, but probably not noticeable for modest usage.

cors := flag.String("cors", "*", "The 'Access-Control-Allow-Origin' value to be returned.")
pushPath := flag.String("push-path", "/metrics/", "HTTP path to accept pushed metrics.")
timeToLiveMs := flag.Int64("time-to-live-ms", 3600000, "How long unmerged metrics will live, in milliseconds (default 1h)")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little user-hostile; flag.Duration would allow specifying as 1h, 10m, etc.

@@ -259,12 +301,15 @@ func (a *aggate) handler(w http.ResponseWriter, r *http.Request) {
}

func main() {
listen := flag.String("listen", ":80", "Address and port to listen on.")
listen := flag.String("listen", ":8080", "Address and port to listen on.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a breaking change for users, and unrelated to the description. Please remove from this PR.

} else if c.err1.Error() != err.Error() {
t.Fatalf("Expected %s, got %s", c.err1, err)
t.Fatalf("Expected '%s', got '%s'", c.err1, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%q ?

a := newAggate()
a := newAggate(3600000)
if c.b != "" {
c.a, c.b, c.want = fmt.Sprintf(c.a, now), fmt.Sprintf(c.b, now), fmt.Sprintf(c.want, now)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this checked or relied on? I can't see that this test makes any use of the timestamp.

@nicolastakashi
Copy link

Hey @bboreham and @mrzacarias is there anything that I help you to merge this?

@bboreham
Copy link
Contributor

Do you want to take over the PR and resolve the comments I made?

@nicolastakashi
Copy link

Hi @bboreham, sure!
Let's give some time to check if the PR owner replies, if not I can handle it.

@nicolastakashi
Copy link

@bboreham a first try: #61
Can you take a look?

@nicolastakashi
Copy link

nicolastakashi commented Aug 15, 2022

Hi, @bboreham sorry for the bother.
HAve you ever had the opportunity to check this?

@nicolastakashi
Copy link

nicolastakashi commented Oct 5, 2022

Hi @bboreham friendly reminder, it would be fantastic to have this feature available.

@bboreham
Copy link
Contributor

bboreham commented Oct 9, 2022

I have commented on #61. I don't have write access to this repo so can't close this PR.

@sleminov
Copy link

Hello. Is there an ETA on when this feature might be available? Thank you

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants