
WIP: doc how to find out the bottleneck resources for troubleshooting high latency issues #1927

Draft
st1page wants to merge 3 commits into main

Conversation

st1page (Contributor) commented Mar 5, 2024

Info

  • Description

    • [ What's changed? Which parts of the docs are affected? ]
  • Notes

    • [ Include any supplementary context or references here. ]
  • Related code PR

    • [ Provide a link to the relevant code PR here, if applicable. ]
  • Related doc issue

    Resolves [ Provide a link to the relevant doc issue here, if applicable. ]

For reviewers

  • Preview

    • [ Paste the preview link to the updated page(s) here. Edit this item after the preview site is ready. To find the updated pages, scroll down to locate and open the Amplify preview link and select the dev version of the documentation. ]
  • Key points

    • [ Parts that may need revision or extra consideration. ]

Before merging

  • I have checked the doc site preview, and the updated parts look good.

  • I have acquired the approval from the owner (and optionally the reviewers) of the code PR and at least one tech writer (CharlieSYH, emile-00, & hengm3467).

This pull request is automatically being deployed by Amplify Hosting.

Access this pull request here: https://pr-1927.d2fbku9n2b6wde.amplifyapp.com

fuyufjh (Member) commented Mar 6, 2024

LGTM 👍. It would be even better if we could add some example screenshots to illustrate what the normal and abnormal cases look like.


**Grafana dashboard (dev)** > **Cluster Node** > **Node CPU** panel, and find the "cpu usage (avg per core) - compute" time series
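
For readers without Grafana access, a rough way to approximate the same "cpu usage (avg per core) - compute" number is to query Prometheus directly. This is only a sketch: the Prometheus URL, the `job` label matcher, and the per-node core count below are assumptions that may differ per deployment; the Grafana panel definition remains the authoritative query.

```python
import requests

PROMETHEUS_URL = "http://localhost:9500"   # assumed Prometheus endpoint
CORES_PER_COMPUTE_NODE = 8                 # assumed; set to your node size

# Total CPU seconds consumed per second by compute-node processes.
# process_cpu_seconds_total is the standard Prometheus process metric;
# the job label matcher is an assumption about how the scrape jobs are named.
QUERY = 'sum(rate(process_cpu_seconds_total{job=~".*compute.*"}[1m]))'

resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query", params={"query": QUERY})
resp.raise_for_status()
for sample in resp.json()["data"]["result"]:
    usage = float(sample["value"][1])
    # A per-core average persistently close to 1.0 means compute CPU is saturated.
    print(f"avg per core: {usage / CORES_PER_COMPUTE_NODE:.2f}")
```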

### State bottleneck (write & compaction)
A reviewer (Contributor) commented on the diff:

IMO the key metric should be the backpressure rate.
If I understand correctly, when it's "slow" and hits some bottleneck, there must be some backpressured streaming jobs.
So the first step should be identifying the jobs with the highest backpressure value.

st1page (Contributor, Author) replied on Mar 25, 2024:

Yes, it is covered in the previous chapter "## Diagnosis —— find out the bottleneck streaming job", which has already been published in our docs.
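
As a rough illustration of that step, one might rank fragments by their output-buffer blocking ratio straight from Prometheus. The metric name `stream_actor_output_buffer_blocking_duration_ns` and its labels are assumptions based on older RisingWave dashboards and may differ across versions; the Grafana backpressure panel described in the published chapter is the authoritative reference.

```python
import requests

PROMETHEUS_URL = "http://localhost:9500"  # assumed Prometheus endpoint

# Blocking nanoseconds per second == the fraction of time an actor's output is
# backpressured (1.0 means fully blocked). Metric and label names are
# assumptions and may differ by RisingWave version.
QUERY = (
    "topk(10, avg by (fragment_id, downstream_fragment_id) ("
    "rate(stream_actor_output_buffer_blocking_duration_ns[1m])) / 1e9)"
)

resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query", params={"query": QUERY})
resp.raise_for_status()
for sample in resp.json()["data"]["result"]:
    ratio = float(sample["value"][1])
    # The downstream of the most backpressured fragments is the likely bottleneck.
    print(sample["metric"], f"backpressure ratio: {ratio:.2f}")
```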

st1page changed the title from "doc how to find out the bottleneck resources for troubleshooting high latency issues" to "WIP: doc how to find out the bottleneck resources for troubleshooting high latency issues" on Mar 25, 2024
st1page marked this pull request as ready for review on March 25, 2024 at 06:19
st1page marked this pull request as draft on March 25, 2024 at 06:19