-
Notifications
You must be signed in to change notification settings - Fork 590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(metrics): Add failure_reason label to ingress_controller_configuration_push_count metric #2965
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this behave with these failures under database mode? Initial read suggests this will never mark a conflict with that, since it'll get back some sort of deck failure instead of any underlying go-kong failures.
We may not really be able to do anything yet given that the deck syncer doesn't always have the most useful errors, but a brief read indicates that we do get back an error array that we can walk to see if any are conflicts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please resolve the conflicts
Co-authored-by: Patryk Małek <patryk.malek@konghq.com>
7f4ae3f
to
2fbcb50
Compare
9fd3c63
to
740fec9
Compare
81e27be
to
619cb9b
Compare
619cb9b
to
7b86c89
Compare
I've tested that with Postgres mode and you were right, in case of I extended that with detecting network errors specifically as well as it was cheap enough. |
move changes back into 2.7, some rewrites.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved changelog back to 2.7 with some minor edits. It's still stuck as I try and wait for E2E tests to slowly reveal if they're no longer dying due to environment failures. We already have a minor bump and this is pretty innocuous, so it may as well go out.
deck/format string changes look addressed, so 👍
What this PR does / why we need it:
failure_reason
label to the already existingingress_controller_configuration_push_count
metric which is reported every time configuration is pushed to the data planeconflict
,network
,other
.Which issue this PR fixes:
Part of #2484
Special notes for your reviewer:
I was able to trigger configuration conflicts by adding consumers violating the unique username constraint. Admin API was responding with HTTP code
409
. I'm leveraging this, although it's not documented in Admin API's OpenAPI definitions.In general, it's hard to rely on deck to return any specific error therefore some error heuristics were needed.
PR Readiness Checklist:
Complete these before marking the PR as
ready to review
:CHANGELOG.md
release notes have been updated to reflect any significant (and particularly user-facing) changes introduced by this PR