Reduce some unnecessary prometheus metrics. #5006

Merged on Jun 20, 2022 (14 commits)

mengxin9014 (Contributor) commented May 26, 2022

What problem does this PR solve?

Issue Number: close #5080

Problem Summary:

What is changed and how it works?

Reduce some unnecessary Prometheus metrics.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Reduce some unnecessary Prometheus metrics.

ti-chi-bot (Member) commented May 26, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • JaySon-Huang
  • windtalker

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

ti-chi-bot added the do-not-merge/needs-linked-issue, release-note-none, do-not-merge/work-in-progress, and size/XXL labels on May 26, 2022
Review threads (outdated, resolved):
  • dbms/src/Storages/MarkCache.h
  • dbms/src/Storages/BackgroundProcessingPool.cpp
  • dbms/src/Interpreters/AsynchronousMetrics.cpp
ti-chi-bot added the release-note label and removed the do-not-merge/needs-linked-issue and release-note-none labels on Jun 7, 2022
mengxin9014 changed the title from "WIP: delete useless metrics in TiFlash" to "Reduce some unnecessary prometheus metrics." on Jun 7, 2022
ti-chi-bot removed the do-not-merge/work-in-progress label on Jun 7, 2022
mengxin9014 self-assigned this on Jun 7, 2022
mengxin9014 (Contributor, Author) commented

/run-all-tests

sre-bot (Collaborator) commented Jun 7, 2022

Coverage for changed files

too many lines from llvm-cov, please refer to full report instead

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18277      9726             46.79%    204970  97453        52.45%

full coverage report (for internal network access only)

mengxin9014 (Contributor, Author) commented

/run-all-tests

sre-bot (Collaborator) commented Jun 7, 2022

Coverage for changed files

too many lines from llvm-cov, please refer to full report instead

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18277      9726             46.79%    204970  97447        52.46%

full coverage report (for internal network access only)

ti-chi-bot added the needs-rebase label on Jun 11, 2022
windtalker (Contributor) left a comment

LGTM

ti-chi-bot added the status/LGT1 label on Jun 20, 2022
mengxin9014 (Contributor, Author) commented

/merge

ti-chi-bot (Member) commented

@mengxin9014: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

ti-chi-bot (Member) commented

@mengxin9014: /merge in this pull request requires 2 approval(s).

In response to this:

/merge


sre-bot (Collaborator) commented Jun 20, 2022

Coverage for changed files

too many lines from llvm-cov, please refer to full report instead

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18331      9649             47.36%    205961  96534        53.13%

full coverage report (for internal network access only)

Comment on lines 99 to 105
M(tiflash_tmt_merge_count, "Total number of TMT engine merge", Counter) \
M(tiflash_tmt_merge_duration_seconds, "Bucketed histogram of TMT engine merge duration", Histogram, \
F(type_tmt_merge_duration, {{"type", "tmt_merge_duration"}}, ExpBuckets{0.0005, 2, 20})) \
F(type_tmt_merge_duration, {{"type", "tmt_merge_duration"}}, ExpBuckets{0.001, 2, 20})) \
M(tiflash_tmt_write_parts_count, "Total number of TMT engine write parts", Counter) \
M(tiflash_tmt_write_parts_duration_seconds, "Bucketed histogram of TMT engine write parts duration", Histogram, \
F(type_tmt_write_duration, {{"type", "tmt_write_parts_duration"}}, ExpBuckets{0.0005, 2, 20})) \
F(type_tmt_write_duration, {{"type", "tmt_write_parts_duration"}}, ExpBuckets{0.001, 2, 20})) \
M(tiflash_tmt_read_parts_count, "Total number of TMT engine read parts", Gauge) \
Contributor commented

We can remove these metrics of tmt engine

Contributor Author replied

> We can remove these metrics of tmt engine

done
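
For readers following this thread: the two `F(...)` variants quoted above differ only in the first `ExpBuckets` argument (`0.0005` versus `0.001`). As a rough illustration, and assuming the three values mean start, growth factor, and bucket count (the definition of `ExpBuckets` is not shown in this conversation), the boundaries could be expanded as in the sketch below. `expBucketBounds` is a hypothetical helper, not a TiFlash function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not TiFlash code. Assumption: an
// ExpBuckets{start, base, size} triple describes `size` upper bounds,
// where bound i equals start * base^i.
std::vector<double> expBucketBounds(double start, double base, std::size_t size)
{
    assert(start > 0 && base > 1);
    std::vector<double> bounds;
    bounds.reserve(size);
    double current = start;
    for (std::size_t i = 0; i < size; ++i)
    {
        bounds.push_back(current);
        current *= base; // each bucket is `base` times wider than the previous one
    }
    return bounds;
}
```

Under that assumption, `expBucketBounds(0.0005, 2, 20)` spans 0.0005 s to about 262 s, while `expBucketBounds(0.001, 2, 20)` spans 0.001 s to about 524 s.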

JaySon-Huang (Contributor) commented

The Grafana panels drawn as heatmaps look strange and misleading after we changed the bucket boundaries for the following metrics:

  • tiflash_storage_page_gc_duration_seconds
  • tiflash_raft_apply_write_command_duration_seconds
  • tiflash_raft_write_data_to_storage_duration_seconds
  • tiflash_raft_upstream_latency

Maybe leave the metrics that are drawn as heatmaps unchanged?

When the time range covers both before and after the upgrade:

[2 screenshots]

When the time range covers only the period before the upgrade:

[2 screenshots]

When the time range covers only the period after the upgrade:

[2 screenshots]
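
One plausible reading of the effect described above: Prometheus exports a histogram as one `_bucket` series per upper bound (the `le` label), so a change of boundaries ends some series and starts others, and a heatmap whose time range spans both layouts mixes two sets of rows. The sketch below compares two layouts and lists the bounds unique to each. The parameters are borrowed from the TMT lines quoted elsewhere in this conversation purely as an example; the actual parameters of the four metrics listed above are not shown here, and `expBounds` and `boundsOnlyIn` are hypothetical helpers.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: upper bounds start, start*base, ..., start*base^(size-1).
std::vector<double> expBounds(double start, double base, std::size_t size)
{
    assert(start > 0 && base > 1);
    std::vector<double> bounds;
    double current = start;
    for (std::size_t i = 0; i < size; ++i, current *= base)
        bounds.push_back(current);
    return bounds;
}

// Returns the bounds of `lhs` that have no approximately equal bound in `rhs`.
// Each such bound corresponds to an `le` series present in only one layout.
std::vector<double> boundsOnlyIn(const std::vector<double> & lhs, const std::vector<double> & rhs)
{
    std::vector<double> result;
    for (double l : lhs)
    {
        bool found = false;
        for (double r : rhs)
        {
            if (std::fabs(l - r) <= 1e-9 * std::fmax(l, r))
            {
                found = true;
                break;
            }
        }
        if (!found)
            result.push_back(l);
    }
    return result;
}
```

With the example layouts `expBounds(0.0005, 2, 20)` and `expBounds(0.001, 2, 20)`, the first has one bound the second lacks (0.0005) and the second has one bound the first lacks (524.288); all other bounds coincide.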

ti-chi-bot added the needs-cherry-pick-release-6.0 and needs-cherry-pick-release-6.1 labels on Jun 20, 2022
mengxin9014 (Contributor, Author) commented

done.

mengxin9014 (Contributor, Author) commented

/run-all-tests

JaySon-Huang (Contributor) left a comment

LGTM with minor comment

Comment on lines 99 to 104
M(tiflash_tmt_merge_count, "Total number of TMT engine merge", Counter) \
M(tiflash_tmt_merge_duration_seconds, "Bucketed histogram of TMT engine merge duration", Histogram, \
F(type_tmt_merge_duration, {{"type", "tmt_merge_duration"}}, ExpBuckets{0.0005, 2, 20})) \
F(type_tmt_merge_duration, {{"type", "tmt_merge_duration"}}, ExpBuckets{0.001, 2, 20})) \
M(tiflash_tmt_write_parts_count, "Total number of TMT engine write parts", Counter) \
M(tiflash_tmt_write_parts_duration_seconds, "Bucketed histogram of TMT engine write parts duration", Histogram, \
F(type_tmt_write_duration, {{"type", "tmt_write_parts_duration"}}, ExpBuckets{0.0005, 2, 20})) \
M(tiflash_tmt_read_parts_count, "Total number of TMT engine read parts", Gauge) \
F(type_tmt_write_duration, {{"type", "tmt_write_parts_duration"}}, ExpBuckets{0.001, 2, 20})) \
JaySon-Huang (Contributor) commented Jun 20, 2022

These metrics can also be removed:

  • tiflash_tmt_merge_count
  • tiflash_tmt_merge_duration_seconds
  • tiflash_tmt_write_parts_count
  • tiflash_tmt_write_parts_duration_seconds
  • tiflash_tmt_read_parts_count
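
The quoted `M(...)` lines look like rows of an X-macro list, where each row declares one metric. Assuming that reading, removing a metric amounts to deleting its row along with any code that references it. The sketch below shows the general pattern with a simplified, hypothetical list: the `demo_` names, `APPLY_FOR_DEMO_METRICS`, `MetricInfo`, and `registeredMetrics` are invented for illustration and are not taken from TiFlash.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified metric list in the style of the quoted M(...) rows.
// Deleting a row here removes that metric everywhere the list is expanded.
#define APPLY_FOR_DEMO_METRICS(M)                                         \
    M(demo_storage_read_count, "Total number of storage reads", Counter)  \
    M(demo_tmt_merge_count, "Total number of TMT engine merge", Counter)  \
    M(demo_tmt_read_parts_count, "Total number of TMT engine read parts", Gauge)

struct MetricInfo
{
    std::string name;
    std::string help;
    std::string kind;
};

// Expands the list once; every M(...) row becomes one registered entry.
std::vector<MetricInfo> registeredMetrics()
{
    std::vector<MetricInfo> metrics;
#define MAKE_ENTRY(name, help, kind) metrics.push_back(MetricInfo{#name, help, #kind});
    APPLY_FOR_DEMO_METRICS(MAKE_ENTRY)
#undef MAKE_ENTRY
    assert(!metrics.empty()); // the demo list always has at least one row
    return metrics;
}
```

In this sketch, dropping the two `demo_tmt_` rows would leave `registeredMetrics()` returning a single entry, with no other change needed in the function.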

ti-chi-bot added the status/LGT2 label and removed the status/LGT1 label on Jun 20, 2022
mengxin9014 (Contributor, Author) commented

/merge

ti-chi-bot (Member) commented

@mengxin9014: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.


ti-chi-bot (Member) commented

This pull request has been accepted and is ready to merge.

Commit hash: 3b46ee5

ti-chi-bot added the status/can-merge label on Jun 20, 2022
ti-chi-bot (Member) commented

@mengxin9014: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.


sre-bot (Collaborator) commented Jun 20, 2022

Coverage for changed files

too many lines from llvm-cov, please refer to full report instead

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18335      9649             47.37%    206177  96568        53.16%

full coverage report (for internal network access only)

ti-chi-bot merged commit 40baeca into pingcap:master on Jun 20, 2022
ti-chi-bot pushed a commit to ti-chi-bot/tiflash that referenced this pull request on Jun 20, 2022
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
ti-chi-bot (Member) commented

In response to a cherrypick label: new pull request created: #5173.

ti-chi-bot (Member) commented

In response to a cherrypick label: new pull request created: #5174.

Labels: affects-6.0, affects-6.1, needs-cherry-pick-release-6.0, needs-cherry-pick-release-6.1, release-note, size/XXL, status/can-merge, status/LGT2
Development

Successfully merging this pull request may close these issues.

Reduce unnecessary prometheus metrics
5 participants