Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: limit the length of key/value #14645

Merged
merged 15 commits into from
Mar 17, 2023
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -165,6 +165,7 @@ Ref: https://keepachangelog.com/en/1.0.0/
* (x/feegrant) [#14294](https://github.com/cosmos/cosmos-sdk/pull/14294) Moved the logic of rejecting duplicate grant from `msg_server` to `keeper` method.
* (store) [#14378](https://github.com/cosmos/cosmos-sdk/pull/14378) The `CacheKV` store is thread-safe again, which includes improved iteration and deletion logic. Iteration is on a strictly isolated view now, which is breaking from previous behavior.
* (x/staking) [#14590](https://github.com/cosmos/cosmos-sdk/pull/14590) `MsgUndelegateResponse` now includes undelegated amount. `x/staking` module's `keeper.Undelegate` now returns 3 values (completionTime,undelegateAmount,error) instead of 2.
* (store)[#14645](https://github.com/cosmos/cosmos-sdk/pull/14645) Add limit to the length of key and value.

### API Breaking Changes

Expand Down
19 changes: 16 additions & 3 deletions store/types/validity.go
Original file line number Diff line number Diff line change
@@ -1,15 +1,28 @@
package types

// AssertValidKey checks if the key is valid(key is not nil)
var (
// 128K - 1
MaxKeyLength = (1 << 17) - 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have preferred these two constants to be variables that can be modified at startup, I think the key-lengths are reasonable but we don't have full knowledge of what other protocols are doing ( I doubt even 'them' have no clear idea of the size of their own keys)

EG in cmd/appd/main.go:

func main() {
       storetypes.MaxKeyLength = my safe limit
       storetypes.MaxValueLength = my safe limit
       
       rootCmd().Execute ....
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable to me. It would allow new chains to be much more restrictive on key lengths for example. Existing chains can scan their keys in usage and put a limit based on their real world data.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good, done.

// 2G - 1
MaxValueLength = (1 << 31) - 1
Comment on lines +6 to +7
Copy link

@peterbourgon peterbourgon Jan 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default size of a LevelDB log file is 2MB and compaction occurs at 4MB. It is a core assumption of LevelDB operations, like compaction, that these files will contain many key/val pairs. Many promises of the database, including but not limited to performance, rely on that assumption.

The SDK uses the default settings (edit: specifically these default settings) when creating a GoLevelDB backend. If the SDK writes a single key/val pair of larger than 4MB, then it will occupy its own unique file. This may not be a problem if there are only a handful of key/val pairs that are that large, but if there are more than just a few, then it does become an issue.

Copy link
Collaborator Author

@yihuang yihuang Jan 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Large keys definitely hurt performance because they are usually part of the index structure to do binary search, and need to be loaded into ram to be performant. I guess no one is actually using large keys in practice.
Maybe a better thing to do is setting a smaller limit in sdk, but third-parties can enlarge it since it's a variable, it's safer than having a bigger limit in sdk while third-party setting a smaller one. But we'll have more communication to do in this way, WDYT? @webmaster128 @testinginprod

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CosmWasm never allowed values greater than 128 KiB. I don't know how much wrapping overhead there is in wasmd (probably none) and iavl. But I guess any limit >= 256 KiB would not hurt us.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Large keys

We are talking about values, right?

Copy link
Collaborator Author

@yihuang yihuang Jan 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keys ;D I think there's less worries about large values

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 2-4MB limits I discuss above are impacted by the total size of a key/val pair, not just the key size or val size in isolation. My main point is that if you want to increase either of those limits, or both of them, to something approaching those 2-4MB limits, then you would also need to increase the limits of LevelDB itself; increasing the former without changes to the latter would result in really bad outcomes.

)

// AssertValidKey checks if the key is valid(key is not nil, not empty and within length limit)
func AssertValidKey(key []byte) {
if len(key) == 0 {
panic("key is nil")
panic("key is nil or empty")
}
if len(key) > MaxKeyLength {
panic("key is too large")
}
}

// AssertValidValue checks if the value is valid(value is not nil)
// AssertValidValue checks if the value is valid(value is not nil and within length limit)
func AssertValidValue(value []byte) {
if value == nil {
panic("value is nil")
}
if len(value) > MaxValueLength {
panic("value is too large")
}
tac0turtle marked this conversation as resolved.
Show resolved Hide resolved
}
Comment on lines +10 to 28

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid functions like these need to return errors rather than panicking, so that callers can detect when the validation checks have failed. Otherwise, there's no reason to have these functions at all, because any access of invalid values would trigger panics, anyway.