Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix hash name bug (#479) #482

Merged
merged 6 commits into from
Jul 5, 2024
Merged

Conversation

Okabe-Rintarou-0
Copy link
Member

What type of PR is this?
bug

What this PR does / why we need it:

Which issue(s) this PR fixes:
Fixes #479

Special notes for your reviewer:

Does this PR introduce a user-facing change?:


Signed-off-by: 923048992@qq.com <923048992@qq.com>
Signed-off-by: 923048992@qq.com <923048992@qq.com>
// HashName converts a string to a uint32 integer as the key of bpf map
type HashName struct {
numToStr map[uint32]string
// records its tombstone number given a string
tombstones map[string]uint32
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of mem leak. Our requirement is:

  1. a str should be allocated with a consistent value
  2. after it deleted, and then at some time recreated, the value can be different.
  3. It's better to be consistent for the case that kmesh daemon restart, which seems difficult with this implement. Maybe we can persist the maps somewhere.

Copy link
Member Author

@Okabe-Rintarou-0 Okabe-Rintarou-0 Jul 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2. after it deleted, and then at some time recreated, the value can be different.

Requirement 1 means for a str like "foo", if we call StrToNum we will get number 123. Then at anytime we StrToNum("foo") should always get 123? Even if collision happened?
Can you give an example about requirement 2?
Thanks 😊

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the original implemention, there exists another problem.
If A collides with B.
If StrToNum(A) = 123, then StrToNum(B) = 124
Then we delete A, we will delete numToStr[123].
Then we delete B, we will also delete numToStr[123], cause Delete will call StrToNum, where StrToNum(B) will be 123, if A is deleted(since there is no collision any more). In this case, we will leave numToStr[124] = B, which is also a kind of mem leak.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after it deleted, and then at some time recreated, the value can be different.

say foo at t0 is allcated 1234, then at t1, it is deleted. In our case, all the info is deleted from bpf map.

Maybe t2, foo is recreated, then we can allocate it to 2345. It need to be consistent during its lifecycle.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got it. We need to persist the numtostr map in case kmesh restarts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the original implemention, there exists another problem. If A collides with B. If StrToNum(A) = 123, then StrToNum(B) = 124 Then we delete A, we will delete numToStr[123]. Then we delete B, we will also delete numToStr[123], cause Delete will call StrToNum, where StrToNum(B) will be 123, if A is deleted(since there is no collision any more). In this case, we will leave numToStr[124] = B, which is also a kind of mem leak.

to solve this problem, we will still need the tombstone marker

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only need to store string that cause collision.
For example, if A collides with B.
We call StrToNum(A), then StrToNum(B).
A got "123", B got "124"
We only need to persist (B, 124). Since after restart, we call StrToNum(B) we can get StrToNum(B) = 124 from persisted map. And StrToNum(A) = 123 according to hash value.

Signed-off-by: 923048992@qq.com <923048992@qq.com>
Signed-off-by: 923048992@qq.com <923048992@qq.com>
} else if h.numToStr[num] == str {
h.strToNum[str] = num
// Create a new item here, should flush
if err := h.flush(); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we write delta data? Flush all seems very expensive

Copy link
Member Author

@Okabe-Rintarou-0 Okabe-Rintarou-0 Jul 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. What about appending it?
Like:
Before:

{123: "foo"}

After(remove '}' and add new item, and add '}' again, instead of write all data):

{123: "foo", 124: "bar"}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe yaml is simpler

foo: 123
bar: 124

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe yaml is simpler

foo: 123
bar: 124

right, that's a better idea.

if num, exits := h.strToNum[str]; exits {
return num
}

// Using linear probing to solve hash conflicts
for num = hash.Sum32(); num < math.MaxUint32; num++ {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not related to your code。

If num overflows, should we iterate from 0

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's necessary. It should be a ring.

Signed-off-by: 923048992@qq.com <923048992@qq.com>
Signed-off-by: 923048992@qq.com <923048992@qq.com>
Copy link
Member

@hzxuzhonghu hzxuzhonghu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

we can use the pkg sigs.k8s.io/yaml, which is from k8s ecosystem

@kmesh-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hzxuzhonghu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kmesh-bot kmesh-bot merged commit 3079f87 into kmesh-net:main Jul 5, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HashName can output different hash value for same str
3 participants