Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle protocol upgrades in congestion control #11923

Open
Tracked by #77
wacban opened this issue Aug 12, 2024 · 6 comments
Open
Tracked by #77

Handle protocol upgrades in congestion control #11923

wacban opened this issue Aug 12, 2024 · 6 comments
Assignees

Comments

@wacban
Copy link
Contributor

wacban commented Aug 12, 2024

Congestion Control keeps track of the gas and size of the receipts in buffers by storing the current total in the headers and updating it with deltas as computed in the apply function.

The gas and size of a receipt is computed and added to the total when receipt is buffered and re-computed and subtracted from the total when the receipt is removed from the buffer. This sadly will break when the runtime parameters are changed for any receipt that is added to the buffer before protocol upgrade and removed after the protocol upgrade.

This is also a blocker for introducing a cost for the PromiseYield receipts which are currently unaccounted for as discovered in #11873.

@wacban wacban self-assigned this Aug 12, 2024
@wacban
Copy link
Contributor Author

wacban commented Aug 12, 2024

cc @jancionear and @jakmeier

I can think of a few potential solutions to this issue:

  • Store the receipts together with the gas and size when added and use that when removing the receipt from buffer.
  • Store the receipts together with the protocol version when added ...
  • Allow invalid gas and size. Reset to zero if it goes below zero or if the corresponding buffers / queues are empty.

I'm curious if you have any better ideas and if you have any preference for which solution is the best.

@jancionear
Copy link
Contributor

jancionear commented Aug 12, 2024

Ah that sucks :/

I'm curious if you have any better ideas and if you have any preference for which solution is the best.

Store the receipts together with the gas and size when added and use that when removing the receipt from buffer.

This sounds like the most proper and robust solution 👍. We can keep a few more bytes per receipt, it shouldn't matter to much. The only problem is that modifying the current code sounds like a huge pain :/ I think this would be the proper way to do it if we were doing it from scratch, but now with existing code I'm not sure if it's worth it...

Store the receipts together with the protocol version when added ...

In theory a bit more space efficient, but less robust. The headache doesn't feel worth it to me. We can keep 16 bytes more per receipt, it's not a big problem.

Allow invalid gas and size. Reset to zero if it goes below zero or if the corresponding buffers / queues are empty.

Sounds good to me 👍. It's not the prettiest solution, but it'd solve the problem and wouldn't require too many code changes. I think I'd be in favor of this one 👍
There might be weird situations where we have a lot of receipts in the buffer and zero congestion, but it shouldn't be too bad. All the new receipts would start increasing congestion again and the shard shouldn't be overran.

@jakmeier
Copy link
Contributor

Uhh yeah I completely missed this issue, thanks for uncovering it!

Store the receipts together with the gas and size when added and use that when removing the receipt from buffer.

I also think this is the most robust solution. Perhaps almost over-engineered but definitely a clean way to get it done.

Store the receipts together with the protocol version when added ...

Possible, yes. A bit more subtle to get right compared to solution 1 but still quite robust IMO. Having the protocol version per receipt would even be a bit more general and therefore perhaps advantageous for future changes.

I wonder if we could optimize this solution. Instead of storing the protocol version with every receipt, we could keep the meta data externally. We have long ranges of the same protocol version with monotonically increasing protocol versions, storing it all explicitly on every receipts seems a bit overkill. We can do it with constant overhead rather than liner in the number of receipts.

First idea that comes to mind: Augment the receipt buffers and queues with a map of protocol versions to first receipt index.

For example, assume the queue start is at index 100 and it holds 5 receipts with protocol versions [70, 70, 71, 71, 71], then we would store "first_receipt_of_version": { "70": 0, "71": 102 }. After we pop two more receipts from the front, we can remove the entry for 70 and are left with only "first_receipt_of_version": { "71": 102 } until the next protocol upgrade.

If we store this in the trie, we could look up the protocol version for the receipt every time we pop a receipt from it. And every time we push, we have to ensure the current protocol version is correct or otherwise add a new entry to the map.
This adds a bit of overhead but I think it should be rather small. The map should never grow above length 2 under normal circumstances, so we can represent it as a vectorized list and simply store it directly in the same trie node as the start/end index of the queue. (Could just be a field of TrieQueueIndices)
This way it would not increase the number of trie accesses. And code wise this logic can be contained almost entirely inside the existing trait TrieQueue.

Maybe you can also come up with a better design, that's just what I thought about.

But again, solution 1 is probably easier to implement and more robust for now. I would probably tend towards solution 1, if we can bare the extra bytes per receipt. But this second solution seems viable, too, and would be more efficient.

Allow invalid gas and size. Reset to zero if it goes below zero or if the corresponding buffers / queues are empty.

Hm,I don't like it. In many other projects I would probably praise the pragmatic approach but not in this context where we are talking about data structures at the core of the blockchain state transitions. If something doesn't add up, I don't want to have it ignored silently, unless we can be 100% certain it was due to a protocol upgrade.

Also, imagine you join a company and see this sort of solution in the code. Personally, I would call this technical debt left by my predecessors. I don't think we are under particular time pressure to get this done, so let's maybe avoid adding technical debt.

@wacban
Copy link
Contributor Author

wacban commented Aug 13, 2024

Also, imagine you join a company and see this sort of solution in the code. Personally, I would call this technical debt left by my predecessors. I don't think we are under particular time pressure to get this done, so let's maybe avoid adding technical debt.

Hehe thanks that made me laugh :) Nothing like some fresh legacy code :D

@wacban
Copy link
Contributor Author

wacban commented Aug 13, 2024

Thanks both, personally I'm leaning towards solution 1. I will attempt it first and get a feel for how much of a pain adding it would be. If it's too crazy I will come back and reconsider some of the other options.

@jancionear
Copy link
Contributor

I was worried that modifying the runtime to handle all these cases would introduce so much complexity that it becomes worse than a little bit of hacky code. Complexity in runtime makes it harder to develop code for everyone, while the hacky bit matters only to people working on congestion control. For correctness we could still check that the gas drops to zero in the tests that don't do protocol changes.
But it's worth a try, I don't have a strong opinion.

github-merge-queue bot pushed a commit that referenced this issue Sep 23, 2024
…n protocol upgrades (#12031)

For now just a request for comments. There are still a few missing
pieces like adding tests and gating this feature by a protocol feature.
Please let me know if this approach makes sense on a high level.

This PR addresses the issue described in #11923. Please have a look at
that first for context.

This is a hacky implementation of option 1:
"Store the receipts together with the gas and size when added and use
that when removing the receipt from buffer."

The basic idea is that instead of storing `Receipt` directly in state,
it can be wrapped in a `StateStoredReceipt` together with the metadata
that is needed to correctly handle protocol upgrades.

In order to migrate from the old way of storing receipts I'm using a
rather hacky solution. I implemented a custom serialization and
deserialization for the `StateStoredReceipt`. Using that customization
it's possible to discriminate between serialized Receipt and serialized
StateStoredReceipt. I added a helper struct
`ReceiptOrStateStoredReceipt` and I made it so that both Receipt and
StateStoredReceipt can be deserialized directly to it.

How to differentiate between `Receipt` and `StateStoredReceipt` ?
* Receipt::V0 has first two bytes in the form of [x, 0] - the second
byte is zero.
* Receipt::V1 has first two bytes in the form of [1, x] - where x != 0 -
the first byte is one and second is non-zero.
* Receipt::Vn (future) has first two bytes in the form of [n, x] - where
x != 0
* StateStoredReceipt has first two bytes in the for of [T, T] where T is
a magic tag == u8::MAX.

The `StateStoredReceipt` can be told apart from Receipt::V0 by the
second byte.
The `StateStoredReceipt` can be told apart from Receipt::Vn by the first
byte.
Receipt::V0 and Receipt::V1 can be told apart by the second byte as
described in

https://github.com/near/nearcore/blob/0e5e7c7234952d1db8cbbafc984b7c8e6a1ac0ba/core/primitives/src/receipt.rs#L105-L118

How will the migration from `Receipt` to `StateStoredReceipt` happen? 
* In the old protocol version receipts will be stored as `Receipt`
* In the new protocol version receipts will be stored as
`StateStoredReceipt`
* In both version receipts will be read as
`ReceiptOrStateStoredReceipt`. This covers the fun case where receipts
are stored in the old version and read in the new version.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants