Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

carbonserver: fix a cache hit bug #494

Merged
merged 2 commits into from
Oct 6, 2022

Conversation

cxfcxf
Copy link

@cxfcxf cxfcxf commented Sep 8, 2022

original issue

I was originally writing issue on carbonapi github Issue

after some debug and code reading, i found out issue is with the carbonserver->render->cache function

the PR provided fix would expand cache key by 2 times, i dont like it, but i think you guys want to keep this key so its human readable?
the preferable way is run md5sum on strings.Join(targetKeys, "&") before line 318 so the key would be way less in length but it wont be human readable

let me describe the issue here again

when you generate two queries similarly with the same from&until in the same request

  1. metrics.FRP[PPP]-ADE*-IAD[23].*_*.impression_log_DEBUG_FIELD_VPS_COUNT
  2. metrics.FRP[PP]-ADE*-IAD[23].*_*.impression_log_DEBUG_FIELD_VPS_COUNT

curl would be like this

FROM=1661708057;curl -s "http://carbonapi/render?target=metrics.FRP\[PPP\]-ADE*-IAD\[23\].*_*.impression_log_DEBUG_FIELD_VPS_COUNT&target=metrics.FRP\[PP\]-ADE*-IAD\[23\].*_*.impression_log_DEBUG_FIELD_VPS_COUNT&from=${FROM}&until=-180s&format=json"

Both should return exactly same result (but in our test when cache enabled sometime it wouldnt)
and we have carbonapi->go-carbon through pbv3

the problem

so the problem is when you have carbonserver render cache enabled.
it would take the targets->targets map[timeRange][]target and generate a key based on names&from&until
so in my example, the first query would generate bunch of cache that has key->value which would be used (cache hit) in second query since they should be returning exactly same data (assuming they are not expired)

so when

First query caches the key, it stores the value of the key, which is response type which contains PathExpression of first query's PathExpression

then

Second query would hit some of the cache (First query sets), and it gets the response from cache with First querys PathExpression,

as a result

when this gets returned to carbonapi, carbonapi would treat it as wrong request target, the data would not be merged to second query's exprssion cause Second query has different PathExpression than Frist query

i added some debug code to carbonapi and generate this

wrong

Sep 05 14:49:56 carbonapi-server carbonapi[12688]: expr.go line: 73  873
Sep 05 14:49:56 carbonapi-server carbonapi[12688]: expr.go line: 84 key: {metrics.FRP[PPP]-ADE*-IAD[23].*_*.impression_log_DEBUG_FIELD_VPS_COUNT 1661708059 1662414416} value: 873
Sep 05 14:49:56 carbonapi-server carbonapi[12688]: expr.go line: 95 key: {metrics.FRP[PPP]-ADE*-IAD[23].*_*.impression_log_DEBUG_FIELD_VPS_COUNT 1661708059 1662414416} value: 873
Sep 05 14:49:56 carbonapi-server carbonapi[12688]: render_handler.go target: metrics.FRP[PPP]-ADE*-IAD[23].*_*.impression_log_DEBUG_FIELD_VPS_COUNT, count: 873
Sep 05 14:49:57 carbonapi-server carbonapi[12688]: expr.go line: 73  873
Sep 05 14:49:57 carbonapi-server carbonapi[12688]: expr.go line: 84 key: {metrics.FRP[PPP]-ADE*-IAD[23].*_*.impression_log_DEBUG_FIELD_VPS_COUNT 1661708059 1662414416} value: 873
Sep 05 14:49:57 carbonapi-server carbonapi[12688]: expr.go line: 84 key: {metrics.FRP[PP]-ADE*-IAD[23].*_*.impression_log_DEBUG_FIELD_VPS_COUNT 1661708059 1662414416} value: 773
Sep 05 14:49:57 carbonapi-server carbonapi[12688]: expr.go line: 84 key: { 1661708059 1662414416} value: 100
Sep 05 14:49:57 carbonapi-server carbonapi[12688]: expr.go line: 95 key: {metrics.FRP[PP]-ADE*-IAD[23].*_*.impression_log_DEBUG_FIELD_VPS_COUNT 1661708059 1662414416} value: 773
Sep 05 14:49:57 carbonapi-server carbonapi[12688]: render_handler.go target: metrics.FRP[PP]-ADE*-IAD[23].*_*.impression_log_DEBUG_FIELD_VPS_COUNT, count: 773
Sep 05 14:49:57 carbonapi-server carbonapi[12688]: render_handler.go number: 1646

right

Sep 05 14:56:26 carbonapi-server carbonapi[12688]: expr.go line: 73  873
Sep 05 14:56:26 carbonapi-server carbonapi[12688]: expr.go line: 84 key: {metrics.FRP[PPP]-ADE*-IAD[23].*_*.impression_log_DEBUG_FIELD_VPS_COUNT 1661708059 1662414805} value: 873
Sep 05 14:56:26 carbonapi-server carbonapi[12688]: expr.go line: 95 key: {metrics.FRP[PPP]-ADE*-IAD[23].*_*.impression_log_DEBUG_FIELD_VPS_COUNT 1661708059 1662414805} value: 873
Sep 05 14:56:26 carbonapi-server carbonapi[12688]: render_handler.go target: metrics.FRP[PPP]-ADE*-IAD[23].*_*.impression_log_DEBUG_FIELD_VPS_COUNT, count: 873
Sep 05 14:56:27 carbonapi-server carbonapi[12688]: expr.go line: 73  873
Sep 05 14:56:27 carbonapi-server carbonapi[12688]: expr.go line: 84 key: {metrics.FRP[PPP]-ADE*-IAD[23].*_*.impression_log_DEBUG_FIELD_VPS_COUNT 1661708059 1662414805} value: 873
Sep 05 14:56:27 carbonapi-server carbonapi[12688]: expr.go line: 84 key: {metrics.FRP[PP]-ADE*-IAD[23].*_*.impression_log_DEBUG_FIELD_VPS_COUNT 1661708059 1662414805} value: 873
Sep 05 14:56:27 carbonapi-server carbonapi[12688]: expr.go line: 95 key: {metrics.FRP[PP]-ADE*-IAD[23].*_*.impression_log_DEBUG_FIELD_VPS_COUNT 1661708059 1662414805} value: 873
Sep 05 14:56:27 carbonapi-server carbonapi[12688]: render_handler.go target: metrics.FRP[PP]-ADE*-IAD[23].*_*.impression_log_DEBUG_FIELD_VPS_COUNT, count: 873
Sep 05 14:56:27 carbonapi-server carbonapi[12688]: render_handler.go number: 1746

as you can see some of the data gets the key as { 1661708059 1662414416} because they contian first query's PathExpression due to cache hit

if you would not care about key being too long, the PR is enough to fix the problem
but if you want shorter key, i can change md5 before key generation so it would just be a md5sum + format

easy work around

an easy work around is turn off the render cache

btw i didnt really test the code, i just write it as my golang memory allow, i can add test case or if we wanna go other direction

@cxfcxf cxfcxf changed the title fix a cache hit bug carbonserver: fix a cache hit bug Sep 8, 2022
@@ -288,7 +295,8 @@ func (listener *CarbonserverListener) getRenderCacheKeyAndSize(targets map[timeR
for tr, ts := range targets {
names := make([]string, 0, len(ts))
for _, t := range ts {
names = append(names, t.Name)
pathExpressionMD5 = getMD5HashString(t.PathExpression)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:= or var missed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:= or var missed?

yup, needs :=, sorry

@deniszh deniszh merged commit 094c83a into go-graphite:master Oct 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants