
feat(op-node): pre-fetch receipts concurrently round 2 #104

Merged
15 commits merged into bnb-chain:develop on Jan 2, 2024

Conversation

welkin22
Contributor

@welkin22 welkin22 commented Dec 26, 2023

Description

I implemented the first version of concurrent code to pre-fetch receipts in #100, but it has some problems that need to be fixed. Our receipts cache is an LRU cache with a maximum capacity of 1000 items. Once the pre-fetch goroutine runs concurrently, that capacity is reached quickly, which leads to the following problems:

  1. Because the cache is an LRU, Get moves old blocks that were about to be evicted back to the front of the queue, so a large amount of useless data accumulates in the cache while useful data is pushed out (illustrated by the toy example below).
  2. Concurrency makes it very easy for the cache to fill up, so older block data is evicted quickly and the cache becomes ineffective.
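
A toy demonstration of problem 1 (hashicorp/golang-lru is used here purely for illustration; whether that exact library backs the receiptsCache, as well as the tiny capacity and keys, are assumptions of this sketch):

```go
package main

import (
	"fmt"

	lru "github.com/hashicorp/golang-lru"
)

func main() {
	// Tiny capacity so eviction is visible; the real cache holds 1000 items.
	cache, _ := lru.New(3)

	// Pre-fetch fills the cache with receipts for blocks 1..3.
	for b := 1; b <= 3; b++ {
		cache.Add(b, fmt.Sprintf("receipts-%d", b))
	}

	// The derivation pipeline reads the oldest block. Get promotes it to the
	// most-recently-used slot even though it is about to become useless.
	cache.Get(1)

	// The next pre-fetched block now evicts block 2, which is still needed,
	// instead of block 1, which has already been consumed.
	cache.Add(4, "receipts-4")

	fmt.Println(cache.Keys()) // [3 1 4]: block 2 was pushed out
}
```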

Rationale

I made the following changes. In the pre-fetch goroutine, I removed the rateLimit and use a waitGroup instead. I also added a PreFetchReceipts method, which reports whether the cache is full; if it is, the pre-fetch goroutine waits and retries.
At the same time, I abandoned the LRU cache data structure because it does not fit the requirements of pre-fetching: we need to evict receipts from memory once the corresponding L1 block height has been consumed, so a priority queue ordered by block height from smallest to largest is a better choice. I created a new data structure called preFetchCache to hold the receipts. It is composed of a map and a priority queue, and it lets the pre-fetcher delete receipts for old block heights at the appropriate time (a sketch of the idea follows below).
So when is the appropriate time to delete receipt data? I call ClearReceiptsCacheBefore from postProcessSafeL2, because once processing of a safe block height has completed, the receipts for the corresponding L1 origin block height are no longer needed. To be cautious, we only delete data strictly below that block height, not including it.
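
A minimal sketch of that structure, assuming a cache keyed by L1 block number (the names, the generic value type, and the capacity handling are illustrative, not the PR's actual code):

```go
package prefetch

import (
	"container/heap"
	"sync"
)

// uint64Heap is a min-heap of block numbers.
type uint64Heap []uint64

func (h uint64Heap) Len() int            { return len(h) }
func (h uint64Heap) Less(i, j int) bool  { return h[i] < h[j] }
func (h uint64Heap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *uint64Heap) Push(x interface{}) { *h = append(*h, x.(uint64)) }
func (h *uint64Heap) Pop() interface{} {
	old := *h
	n := len(old)
	v := old[n-1]
	*h = old[:n-1]
	return v
}

// preFetchCache pairs a map with a priority queue ordered by block number,
// so receipts below a consumed L1 origin height can be evicted eagerly.
type preFetchCache[V any] struct {
	mu      sync.Mutex
	items   map[uint64]V
	heights uint64Heap
	maxSize int
}

func newPreFetchCache[V any](maxSize int) *preFetchCache[V] {
	return &preFetchCache[V]{items: make(map[uint64]V), maxSize: maxSize}
}

// AddIfNotFull stores v under blockNumber unless the cache is at capacity.
// The second result reports fullness so the pre-fetch goroutine can back off.
func (c *preFetchCache[V]) AddIfNotFull(blockNumber uint64, v V) (added bool, isFull bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, ok := c.items[blockNumber]; ok {
		return false, false
	}
	if len(c.items) >= c.maxSize {
		return false, true
	}
	c.items[blockNumber] = v
	heap.Push(&c.heights, blockNumber)
	return true, false
}

// RemoveLessThan evicts every entry strictly below blockNumber; this is what
// ClearReceiptsCacheBefore would presumably call when postProcessSafeL2 runs.
func (c *preFetchCache[V]) RemoveLessThan(blockNumber uint64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for c.heights.Len() > 0 && c.heights[0] < blockNumber {
		delete(c.items, heap.Pop(&c.heights).(uint64))
	}
}
```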

Example

none

Changes

Notable changes:

  • Changed the logic in GoOrUpdatePreFetchReceipts to use a waitGroup instead of rateLimit (see the sketch after this list).
  • Added a PreFetchReceipts method, which replaces FetchReceipts inside GoOrUpdatePreFetchReceipts.
  • Added the preFetchCache structure, which replaces the original LRU and makes it easier to evict old data.
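
A sketch of how such a loop could wait on the waitGroup and back off when PreFetchReceipts reports a full cache (the batch size, timings, and everything except the method names called out above are assumptions, not the actual op-node code):

```go
package prefetch

import (
	"context"
	"sync"
	"time"
)

const parallel = 4 // assumed number of blocks fetched concurrently per round

// receiptsFetcher stands in for the L1 client; PreFetchReceipts is assumed
// to report whether the receipts cache is currently full.
type receiptsFetcher interface {
	PreFetchReceipts(ctx context.Context, blockNumber uint64) (isFull bool, err error)
}

func goPreFetchReceipts(ctx context.Context, f receiptsFetcher, start uint64) {
	currentBlock := start
	for ctx.Err() == nil {
		var wg sync.WaitGroup
		for i := 0; i < parallel; i++ {
			blockNumber := currentBlock + uint64(i)
			wg.Add(1)
			go func() {
				defer wg.Done()
				for {
					isFull, err := f.PreFetchReceipts(ctx, blockNumber)
					if err != nil || !isFull {
						return
					}
					// The cache is full: wait briefly, then retry this block.
					select {
					case <-ctx.Done():
						return
					case <-time.After(time.Second):
					}
				}
			}()
		}
		// In place of the old rate limiter, wait for the whole window to
		// finish before advancing to the next batch of block numbers.
		wg.Wait()
		currentBlock += uint64(parallel)
	}
}
```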

@welkin22 welkin22 marked this pull request as draft December 26, 2023 06:44
@welkin22 welkin22 marked this pull request as ready for review December 27, 2023 08:43
@github-actions github-actions bot requested a review from krish-nr December 27, 2023 08:43
@welkin22 welkin22 removed the request for review from krish-nr December 27, 2023 08:46
Contributor

@bendanzhentan bendanzhentan left a comment

When L2UnSafe.L1Origin - L2Safe.L1Origin > 1000, this receiptsCache will cache block receipts in the range [L2Safe.L1Origin, L2Safe.L1Origin+1000), but the receipts of L2UnSafe.L1Origin will not be cached. Could this result in the sequencer producing blocks more slowly? If so, is this expected?

@welkin22
Contributor Author

welkin22 commented Jan 2, 2024

When L2UnSafe.L1Origin - L2Safe.L1Origin > 1000, this receiptsCache will cache block receipts in the range [L2Safe.L1Origin, L2Safe.L1Origin+1000), but the receipts of L2UnSafe.L1Origin will not be cached. Could this result in the sequencer producing blocks more slowly? If so, is this expected?

Yes, if the distance between unsafe and safe is too large, the cache cannot take care of both. This is not the intended behavior, but 1000 blocks in our use case corresponds to about 50 minutes (at a roughly 3-second L1 block time), and scenarios where the gap reaches 50 minutes are rare. Unless there is a better solution, I am inclined to ignore it for now.

@owen-reorg owen-reorg merged commit e8d466b into bnb-chain:develop Jan 2, 2024
8 of 9 checks passed