Observe index gateway request count per tenant #9781
Closed
This change adds a new metric, `loki_index_gateway_client_requests_total`, which reports the total number of requests performed by clients against the index gateway.

Although the histogram metric `loki_index_gateway_request_duration_seconds` already counts requests, the new metric also reports the tenant ID and the status ("success" or "error"). The existing metric cannot report the tenant ID, because it is implemented as part of the generic gRPC client instrumentation.

The new metric makes it possible to observe the RPS per tenant, so you can draw conclusions about the required index gateway shard size per tenant.

Note that the new metric is only used when the index gateway is running in ring mode.

Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
DylanGuedes approved these changes on Jun 23, 2023
chaudum force-pushed the chaudum/per-tenant-index-gw-request-count branch from 5031d87 to 806ef0d on June 23, 2023 13:17
MichelHollands approved these changes on Jun 23, 2023
chaudum added a commit that referenced this pull request on Jun 27, 2023

This commit adds a counter metric `loki_index_gateway_requests_total` with the labels `operation`, `tenant`, and `status` for gRPC requests that are served by the index gateway.

**What for?** The per-tenant RPS on the index gateway is used to derive the per-tenant shard factor.

**Why track on the server?** Unlike tracking index gateway RPS on the client side, tracking on the server side does not yield that many series, even in multi-tenant installations with many tenants, because the number of index gateway instances is relatively small compared to the number of queriers and frontends.

**Special notes for your reviewer**: The previous approach of tracking requests on the client (#9781) has been abandoned.

Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
chaudum added a commit that referenced this pull request on Jun 27, 2023

…#9804)

**This is a backport of #9797 to k156**

---

This commit adds a counter metric `loki_index_gateway_requests_total` with the labels `operation`, `tenant`, and `status` for gRPC requests that are served by the index gateway.

**What for?** The per-tenant RPS on the index gateway is used to derive the per-tenant shard factor.

**Why track on the server?** Unlike tracking index gateway RPS on the client side, tracking on the server side does not yield that many series, even in multi-tenant installations with many tenants, because the number of index gateway instances is relatively small compared to the number of queriers and frontends.

**Special notes for your reviewer**: The previous approach of tracking requests on the client (#9781) has been abandoned.

Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Warning: THIS POTENTIALLY ADDS A LOT OF SERIES (`tenants * pods`)

What this PR does / why we need it:

This change adds a new metric, `loki_index_gateway_client_requests_total`, which reports the total number of requests performed by clients against the index gateway.

Although the histogram metric `loki_index_gateway_request_duration_seconds` already counts requests, the new metric also reports the tenant ID and the status ("success" or "error"). The existing metric cannot report the tenant ID, because it is implemented as part of the generic gRPC client instrumentation.

The new metric makes it possible to observe the RPS per tenant, so you can draw conclusions about the required index gateway shard size per tenant.

Note that the new metric is only used when the index gateway is running in ring mode.
Special notes for your reviewer:
Checklist
- `CONTRIBUTING.md` guide (required)
- `CHANGELOG.md` updated
- `add-to-release-notes` label
- `docs/sources/upgrading/_index.md`
- `production/helm/loki/Chart.yaml` and update `production/helm/loki/CHANGELOG.md` and `production/helm/loki/README.md`. Example PR