-
Notifications
You must be signed in to change notification settings - Fork 743
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add metrics reporting for mutation #1435
Add metrics reporting for mutation #1435
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1435 +/- ##
==========================================
- Coverage 52.05% 52.00% -0.05%
==========================================
Files 80 83 +3
Lines 7429 7593 +164
==========================================
+ Hits 3867 3949 +82
- Misses 3195 3263 +68
- Partials 367 381 +14
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
pkg/controller/mutators/assignmetadata/assignmetadata_controller.go
Outdated
Show resolved
Hide resolved
pkg/mutation/stats_reporter.go
Outdated
|
||
// NewStatsReporter creaters a reporter for webhook metrics. | ||
func NewStatsReporter() (StatsReporter, error) { | ||
ctx, err := tag.New( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto above. Just use context.TODO().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sozercan Removing the use of tag.New()
removes the possibility of creating an error from the instantiation of a reporter
. tag.New()
can produce an error because it has the ability to mutate the key/value pairs stored inside of a context.Context
type. As we feed in an empty context.Context
and add no new information, there is no way for an error to be produced.
This really has two implications:
NewStatsReporter
is reduced to having essentially no logic. Perhaps we should just remove it? We could use thereporter
type (renamed toReporter
) and still specify theStatsReporter
interface in function signatures.reporter
no longer needs to contain acontext
. That context is useless and is essentially cruft. We could just as easily create a fresh context in each of the reporting functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
verifyTags(t, expectedTags, row.Tags) | ||
} | ||
|
||
func checkData(t *testing.T, name string, rowLength int) *view.Row { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Prefer returning error and calling t.Errorf/t.Fatalf from the test itself for clarity. (In general, avoid assertion methods)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Roger. I stole this from other stats_reporter files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly nits and making sure bases are covered. No changes here are required for approval.
pkg/mutation/stats_reporter.go
Outdated
Description: systemIterationsM.Description(), | ||
Measure: systemIterationsM, | ||
// JULIAN - We'll need to tune this. I'm not sure if these histogram sections are valid. | ||
Aggregation: view.Distribution(2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My guess is these are milliseconds? If so, I'd recommend:
1, 2, 5, 7, 10, 20, 50, 70, 100, 200, 500, 700, 1000
That results in more human-readable output that's easier to reason about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the number of times we go through the mutators when mutating an incoming object. The upper-bound on iterations is (number of mutators) + 1
.
As most systems will have something like 1-100 mutators, I figured that having more granularity in the lower buckets would be a good idea. That said, both of these bucket sizes are growing exponentially, which should represent the data well.
1000 mutators seems unlikely to me, but not impossible.
@maxsmythe can you add some color here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, then as-is looks good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately we're in "not enough information" territory, as we don't know how users will use mutations.
Forwards compatibility is a nice (if vague) target here.
I'd expect most values at the low numbers (more iterations would be due to multiple mutators affecting the same object).
We can start with 1...10, 20, 50, 100
and see where that lands us, with the idea that those buckets are the most likely to be reused in the future (assume most people will have less than 20 overlapping mutators with some granularity to identify larger cases, hit round numbers that will play well with logarithmic buckets).
Cannot reply to pre-existing comment (thanks github), so closing and adding as a convo thread:
Removing context SGTM, but let's keep the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replying to outstanding comment threads, havent dug into the current version of the code yet.
pkg/mutation/stats_reporter.go
Outdated
Description: systemIterationsM.Description(), | ||
Measure: systemIterationsM, | ||
// JULIAN - We'll need to tune this. I'm not sure if these histogram sections are valid. | ||
Aggregation: view.Distribution(2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately we're in "not enough information" territory, as we don't know how users will use mutations.
Forwards compatibility is a nice (if vague) target here.
I'd expect most values at the low numbers (more iterations would be due to multiple mutators affecting the same object).
We can start with 1...10, 20, 50, 100
and see where that lands us, with the idea that those buckets are the most likely to be reused in the future (assume most people will have less than 20 overlapping mutators with some granularity to identify larger cases, hit round numbers that will play well with logarithmic buckets).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, some nits, the biggest one being changing the fibbonaci-sized buckets to logarithmically scaled ones.
pkg/controller/mutators/assignmetadata/assignmetadata_controller.go
Outdated
Show resolved
Hide resolved
This PR incorporates frameworks commit 1dbe2618668df5f48b6a256bdcd724eeb2b1c130 Add ConstraintTemplate v1 (#121) An imperative install CRD install step was added to the Helm Upgrade GH Workflow, as Helm does not update CRDs. Contributes to open-policy-agent#550 Signed-off-by: juliankatz juliankatz@google.com Signed-off-by: juliankatz <juliankatz@google.com>
Signed-off-by: Heba Elayoty <hebaelayoty@gmail.com> Co-authored-by: Sertaç Özercan <852750+sozercan@users.noreply.github.com> Co-authored-by: Max Smythe <smythe@google.com> Signed-off-by: juliankatz <juliankatz@google.com>
…-policy-agent#1404) Add prefix-based glob matching support across Gatekeeper's various match sections. These include: - Constraints (via the regolib match) - Config resource (via the process excluder) - Mutations (via the match package) Fixes open-policy-agent#1009 Signed-off-by: juliankatz <juliankatz@google.com>
Signed-off-by: juliankatz <juliankatz@google.com>
Signed-off-by: juliankatz <juliankatz@google.com>
Signed-off-by: juliankatz <juliankatz@google.com>
Signed-off-by: juliankatz <juliankatz@google.com>
Signed-off-by: juliankatz <juliankatz@google.com>
Signed-off-by: juliankatz <juliankatz@google.com>
As g8r once had only a validation webhook, certain metrics (request_count, request_count_duration_seconds) were implemented without qualifiers. i.e., the fact that they applied to the validation webhook was _assumed_. As g8r now has both validation and mutation webhooks, separate metrics for both are required. These metrics have already been implemented, and have been released since v3.4.0. The original metrics were deprecated (but retained) at the time of that release. As we are approaching v3.6, it is finally time to remove these metrics. This PR removes the code and updates the metrics README. Fixes open-policy-agent#1010 Signed-off-by: juliankatz <juliankatz@google.com>
…1364) When multiple Pods are running mutators, it is possible that they will ingest mutators in different orders. Before this PR, we would essentially randomly pick which mutator to honor and discard future mutators which conflicted. Fixes: open-policy-agent#1216 Signed-off-by: Will Beason <willbeason@google.com> Signed-off-by: juliankatz <juliankatz@google.com>
Signed-off-by: juliankatz <juliankatz@google.com>
Signed-off-by: juliankatz <juliankatz@google.com>
… a test Signed-off-by: juliankatz <juliankatz@google.com>
Bumps [actions/setup-node](https://github.com/actions/setup-node) from 2.2.0 to 2.3.0. - [Release notes](https://github.com/actions/setup-node/releases) - [Commits](actions/setup-node@v2.2.0...v2.3.0) --- updated-dependencies: - dependency-name: actions/setup-node dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Signed-off-by: juliankatz <juliankatz@google.com>
Signed-off-by: juliankatz <juliankatz@google.com>
… a test Signed-off-by: juliankatz <juliankatz@google.com>
Signed-off-by: juliankatz <juliankatz@google.com>
This PR adds reporting functions to gatekeeper's mutator contollers (assign, assignmetadata) and the mutation System. These metrics track requests to the mutation webhook, the ingestion of mutator objects, the number of active mutators in the system, and the number of iterations required by the System to mutate objects. Fixes open-policy-agent#1214 Signed-off-by: juliankatz <juliankatz@google.com>
Signed-off-by: juliankatz <juliankatz@google.com>
Signed-off-by: juliankatz <juliankatz@google.com>
Signed-off-by: juliankatz <juliankatz@google.com>
* Benchmark System.Mutate Signed-off-by: Will Beason <willbeason@google.com> * Early exit Mutate if no mutations Signed-off-by: Will Beason <willbeason@google.com> Co-authored-by: Oren Shomron <shomron@gmail.com>
Signed-off-by: juliankatz <juliankatz@google.com>
…rror return Signed-off-by: juliankatz <juliankatz@google.com>
Signed-off-by: juliankatz <juliankatz@google.com>
Signed-off-by: juliankatz <juliankatz@google.com>
- Make mutation.System take an options struct instead of injection func - put defer anon func in assign_controller into named method - a bit of test cleanup Signed-off-by: juliankatz <juliankatz@google.com>
Signed-off-by: juliankatz <juliankatz@google.com>
Signed-off-by: juliankatz <juliankatz@google.com>
Signed-off-by: juliankatz <juliankatz@google.com>
Signed-off-by: juliankatz <juliankatz@google.com>
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 2.0.1 to 2.0.2. - [Release notes](https://github.com/codecov/codecov-action/releases) - [Changelog](https://github.com/codecov/codecov-action/blob/master/CHANGELOG.md) - [Commits](codecov/codecov-action@v2.0.1...v2.0.2) --- updated-dependencies: - dependency-name: codecov/codecov-action dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Will Beason <willbeason@google.com> Signed-off-by: juliankatz <juliankatz@google.com>
… mutation-metrics Signed-off-by: juliankatz <juliankatz@google.com>
91db583
to
7acf4a7
Compare
Signed-off-by: juliankatz <juliankatz@google.com>
… mutation-metrics Signed-off-by: juliankatz <juliankatz@google.com>
… mutation-metrics Signed-off-by: juliankatz <juliankatz@google.com>
ee4e956
to
75398e9
Compare
… mutation-metrics Signed-off-by: juliankatz <juliankatz@google.com>
This PR adds reporting functions to gatekeeper's mutator contollers
(assign, assignmetadata) and the mutation System. These metrics track
requests to the mutation webhook, the ingestion of mutator objects, the
number of active mutators in the system, and the number of iterations
required by the System to mutate objects.
Fixes #1214
Signed-off-by: juliankatz juliankatz@google.com