fork-aware transaction pool added #4639

Open
wants to merge 372 commits into base: master
Conversation

@michalkucharczyk (Contributor) commented May 29, 2024

Fork-Aware Transaction Pool Implementation

This PR introduces a fork-aware transaction pool (fatxpool), enhancing transaction management by maintaining a valid txpool state for each fork.

High-level overview

A high-level overview was added to the sc_transaction_pool::fork_aware_txpool module. Use:

cargo doc --document-private-items -p sc-transaction-pool --open

to build the documentation, which should give a good overview and a nice entry point into the new pool's mechanics.

Quick overview (documentation excerpt)

View

For every fork, a view is created. The view is a persisted state of the transaction pool computed and updated at the tip of the fork. The view is built around the existing ValidatedPool structure.

A view is created on every new best block notification. To create a view, one of the existing views is chosen and cloned.

When the chain progresses, the view is kept in the cache (retracted_views) to allow building blocks upon intermediary blocks in the fork.

The views are deleted on finalization: views lower than the finalized block are removed.

The views are updated with the transactions from the mempool: all transactions are sent to the newly created views.
A maintain process is also executed for the newly created views, which basically resubmits and prunes transactions along the appropriate tree route.
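To make the lifecycle concrete, here is a minimal, self-contained sketch of the mechanics described above. All type and method names here are hypothetical stand-ins, not the actual sc-transaction-pool API:

use std::collections::HashMap;

type BlockHash = u64;   // hypothetical stand-in for the real block hash type
type BlockNumber = u64;

/// A simplified "view": the pool state computed at the tip of one fork.
#[derive(Clone, Default)]
struct View {
    number: BlockNumber,
    ready: Vec<String>, // transactions ready at this tip
}

#[derive(Default)]
struct ViewStore {
    views: HashMap<BlockHash, View>,           // views at fork tips
    retracted_views: HashMap<BlockHash, View>, // kept for building on intermediary blocks
}

impl ViewStore {
    /// On every new best block: clone an existing view and re-anchor it at the
    /// new tip. (The real pool then runs a maintain step, resubmitting and
    /// pruning transactions along the tree route.)
    fn create_view(&mut self, parent: BlockHash, at: BlockHash, number: BlockNumber) {
        let mut view = self.views.get(&parent).cloned().unwrap_or_default();
        view.number = number;
        // As the chain progresses, the parent view moves into the cache so that
        // blocks can still be built on intermediary blocks in the fork.
        if let Some(old) = self.views.remove(&parent) {
            self.retracted_views.insert(parent, old);
        }
        self.views.insert(at, view);
    }

    /// On finalization: drop all views lower than the finalized block.
    fn finalize(&mut self, finalized: BlockNumber) {
        self.views.retain(|_, v| v.number >= finalized);
        self.retracted_views.retain(|_, v| v.number >= finalized);
    }
}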

View store

View store is a helper structure that acts as a container for all the views. It provides convenience methods that operate across all of them, e.g. submitting a transaction to every view.

Submitting transactions

Every transaction is submitted to every view at the tips of the forks. Retracted views are not updated.
Every transaction also goes into the mempool.
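A minimal sketch of this fan-out, again with hypothetical names (the real pool submits asynchronously and collects per-view results):

use std::collections::{HashMap, HashSet};

type TxHash = u64; // hypothetical stand-in

#[derive(Default)]
struct View {
    txs: HashSet<TxHash>,
}

#[derive(Default)]
struct Pool {
    views_at_tips: HashMap<u64, View>, // active views, keyed by fork-tip hash
    mempool: HashSet<TxHash>,          // internal mempool
}

impl Pool {
    fn submit(&mut self, tx: TxHash) {
        // Fan the transaction out to every view at the tip of a fork;
        // retracted views are intentionally not updated.
        for view in self.views_at_tips.values_mut() {
            view.txs.insert(tx);
        }
        // Every transaction also goes into the internal mempool.
        self.mempool.insert(tx);
    }
}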

Internal mempool

In short, the main purpose of the internal mempool is to prevent a transaction from being lost, which could happen when a transaction is invalid on one fork but valid on another. It also allows the txpool to accept transactions before any blocks have been reported.

The mempool removes transactions once they are finalized. Transactions are also periodically verified (on every finalized event) and removed from the mempool if no longer valid.
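A sketch of this pruning, with hypothetical names; still_valid stands in for the revalidation check:

use std::collections::HashSet;

type TxHash = u64; // hypothetical stand-in

/// On every finalized event: drop finalized transactions and any
/// transactions that no longer pass validation.
fn prune_mempool(
    mempool: &mut HashSet<TxHash>,
    finalized: &HashSet<TxHash>,
    still_valid: impl Fn(TxHash) -> bool,
) {
    mempool.retain(|tx| !finalized.contains(tx) && still_valid(*tx));
}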

Events

Transaction events from multiple views are merged and filtered to avoid duplicate events.
Ready / Future / InBlock events originate in the views and are de-duplicated and forwarded to external listeners.
Finalized events originate in the fork-aware-txpool logic.
Invalid events require special care and can originate in both the view and the fork-aware-txpool logic.
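A sketch of the de-duplication step, assuming hypothetical event and hash types; each (transaction, event) pair is forwarded to external listeners at most once:

use std::collections::HashSet;

type TxHash = u64; // hypothetical stand-in

#[derive(Clone, PartialEq, Eq, Hash)]
enum TxEvent {
    Ready,
    Future,
    InBlock(u64), // block number
}

#[derive(Default)]
struct EventAggregator {
    seen: HashSet<(TxHash, TxEvent)>,
}

impl EventAggregator {
    /// Called for events coming from any view; returns Some(event) only the
    /// first time this (tx, event) pair is observed, None for duplicates.
    fn on_view_event(&mut self, tx: TxHash, event: TxEvent) -> Option<TxEvent> {
        self.seen.insert((tx, event.clone())).then_some(event)
    }
}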

Light maintain

Sometimes the transaction pool does not have enough time to prepare a fully maintained view with all retracted transactions revalidated. To avoid providing an empty ready-transaction set to the block builder (which would result in an empty block), light maintain was implemented. It simply removes the already-imported transactions from the ready iterator.
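In essence, light maintain is just a filter over an existing view's ready set; a minimal sketch with hypothetical types:

use std::collections::HashSet;

type TxHash = u64; // hypothetical stand-in

/// Reuse the ready iterator of an existing view, but skip the transactions
/// that were already imported in the enacted blocks. No revalidation happens
/// here, which is what makes this variant "light" and fast.
fn ready_at_light<'a>(
    ready: impl Iterator<Item = TxHash> + 'a,
    imported: &'a HashSet<TxHash>,
) -> impl Iterator<Item = TxHash> + 'a {
    ready.filter(move |tx| !imported.contains(tx))
}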

Revalidation

Revalidation is performed for every view. The revalidation process is started after a trigger is executed, and the revalidation work is terminated as soon as a new best block / finalized event is notified to the transaction pool.
The revalidation result is applied to the newly created view, which is built upon the revalidated view.

Additionally, parts of the mempool are also revalidated to make sure that no transactions are stuck in the mempool.
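A sketch of the cooperative termination described above, using a shared flag as a stand-in for the real trigger/notification mechanism:

use std::sync::{
    atomic::{AtomicBool, Ordering},
    Arc,
};

type TxHash = u64; // hypothetical stand-in

/// Revalidate transactions in batches; stop as soon as a new best block /
/// finalized notification sets the flag. The partial result gathered so far
/// is then applied to the view created for the new block.
fn revalidate(txs: &[TxHash], terminate: &Arc<AtomicBool>) -> Vec<TxHash> {
    let mut revalidated = Vec::new();
    for batch in txs.chunks(16) {
        if terminate.load(Ordering::Relaxed) {
            break;
        }
        // Real code would call into the runtime's transaction validation here.
        revalidated.extend_from_slice(batch);
    }
    revalidated
}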

Logs

The most important log line for understanding the state of the txpool is:

              maintain: txs:(0, 92) views:[2;[(327, 76, 0), (326, 68, 0)]] event:Finalized { hash: 0x8...f, tree_route: [] }  took:3.463522ms
                             ^   ^         ^     ^   ^  ^      ^   ^  ^        ^                                                   ^
unwatched txs in mempool ────┘   │         │     │   │  │      │   │  │        │                                                   │
   watched txs in mempool ───────┘         │     │   │  │      │   │  │        │                                                   │
                     views  ───────────────┘     │   │  │      │   │  │        │                                                   │
                      1st view block # ──────────┘   │  │      │   │  │        │                                                   │
                           number of ready tx ───────┘  │      │   │  │        │                                                   │
                               number of future tx ─────┘      │   │  │        │                                                   │
                                        2nd view block # ──────┘   │  │        │                                                   │
                                      number of ready tx ──────────┘  │        │                                                   │
                                           number of future tx ───────┘        │                                                   │
                                                                 event ────────┘                                                   │
                                                                       duration  ──────────────────────────────────────────────────┘

It is logged after the maintenance is done.

The debug level enables per-transaction logging, allowing one to keep track of all transaction-related actions that happen in the txpool.

Integration notes

For teams having a custom node, the new txpool needs to be instantiated, typically in the service.rs file. Here is an example:

let transaction_pool = Arc::from(
    sc_transaction_pool::Builder::new(
        task_manager.spawn_essential_handle(),
        client.clone(),
        config.role.is_authority().into(),
    )
    .with_options(config.transaction_pool.clone())
    .with_prometheus(config.prometheus_registry())
    .build(),
);

To enable the new transaction pool, the following CLI argument shall be specified: --pool-type=fork-aware. If it works, the following information will be printed in the log:

2024-09-20 21:28:17.528  INFO main txpool: [Parachain]  creating ForkAware txpool.

For debugging, the following log targets shall be enabled:

      "-lbasic-authorship=debug",
      "-ltxpool=debug",

Note: trace level for txpool enables per-transaction logging.

Future work

The current implementation seems to be stable; however, further improvements are required.
Here is the umbrella issue for future work:

Partially fixes: #1202

@michalkucharczyk michalkucharczyk added the R0-silent Changes should not be mentioned in any release notes label Jun 3, 2024
@brenzi (Contributor) commented Jun 12, 2024

I can offer field-testing service on Integritee-Paseo. We suffer from #1202 and need to restart one Paseo collator authority every 5 min to get all extrinsics in, which originate from a single wallet.
integritee-network/worker#1594

@michalkucharczyk (Contributor, Author) commented

> I can offer field-testing service on Integritee-Paseo. We suffer from #1202 and need to restart one Paseo collator authority every 5 min to get all extrinsics in, which originate from a single wallet. integritee-network/worker#1594

@brenzi: The code still needs polishing, but it is ready to be tested in a testnet. So if you can give it a try, that would be great; I can assist with debugging.

The following option shall be provided to enable the new txpool: --pool-type=fork-aware
For a better understanding of how the new implementation behaves, and to help me with debugging, please enable the following logs:

      "-lbasic-authorship=debug",
      "-ltxpool=debug",
      "-lsync=debug",

@girazoki (Contributor) commented

@michalkucharczyk just cherry-picking this PR would suffice to test it? I can also try to test it in our testnet

@michalkucharczyk (Contributor, Author) commented

> @michalkucharczyk just cherry-picking this PR would suffice to test it? I can also try to test it in our testnet

Yes. But if you have a custom node, you will also need to instantiate the new txpool. Here is an example:

let transaction_pool = sc_transaction_pool::Builder::new()
    .with_options(config.transaction_pool.clone())
    .build(
        config.role.is_authority().into(),
        config.prometheus_registry(),
        task_manager.spawn_essential_handle(),
        client.clone(),
    );

@girazoki (Contributor) commented

> Yes. But if you have a custom node, you will also need to instantiate the new txpool. Here is an example:
>
> let transaction_pool = sc_transaction_pool::Builder::new()
>     .with_options(config.transaction_pool.clone())
>     .build(
>         config.role.is_authority().into(),
>         config.prometheus_registry(),
>         task_manager.spawn_essential_handle(),
>         client.clone(),
>     );

thank you, we will try it

girazoki added a commit to moondance-labs/polkadot-sdk that referenced this pull request Jun 19, 2024
@bkchr (Member) left a comment


Sorry 🙈

Also did not yet review everything.

prdoc/pr_4639.prdoc (outdated; resolved)
prdoc/pr_4639.prdoc (outdated; resolved)
substrate/client/basic-authorship/src/basic_authorship.rs (outdated; resolved)
substrate/client/basic-authorship/src/basic_authorship.rs (outdated; resolved)
@bkchr (Member) left a comment


Good work 👍

Still some stuff to do, but also a lot of stuff can be done in a follow up to finally have this tested in some real scenarios.

Thank you!

Comment on lines +236 to +242
let r = self.to_handler.unbounded_send(ToHandler::PropagateTransaction(hash.clone()));
if r.is_err() {
    log::debug!(
        "[{hash:?} ] import_notification_stream: sending failed {:?}",
        r.unwrap_err()
    );
}

Suggested change:

-let r = self.to_handler.unbounded_send(ToHandler::PropagateTransaction(hash.clone()));
-if r.is_err() {
-    log::debug!(
-        "[{hash:?} ] import_notification_stream: sending failed {:?}",
-        r.unwrap_err()
-    );
-}
+let _ = self.to_handler.unbounded_send(ToHandler::PropagateTransaction(hash));

///
/// Pruning is just rebuilding the underlying transactions graph, no validations are executed,
/// so this process shall be fast.
pub fn ready_at_light(&self, at: Block::Hash) -> PolledIterator<ChainApi> {

Still don't see any reason why this function is async. It is not doing anything async.

Comment on lines +374 to +388
let best_result = {
api.tree_route(self.enactment_state.lock().recent_finalized_block(), at).map(
|tree_route| {
if let Some((index, view)) =
tree_route.enacted().iter().enumerate().rev().skip(1).find_map(|(i, b)| {
self.view_store.get_view_at(b.hash, true).map(|(view, _)| (i, view))
}) {
let e = tree_route.enacted()[index..].to_vec();
(TreeRoute::new(e, 0).ok(), Some(view))
} else {
(None, None)
}
},
)
};

I still think that the following should be faster:

let mut to_process = Vec::new();
while at.number() > last_finalized.number() {
    if self.view_exists(at.hash()) { break }
    to_process.push(at.hash());
    at = at.parent();
}

No need to get a tree route down to the finalized block.

}
};

let fall_back_ready = self.ready_at_light(at);

ready_at_light is not async :P

) -> Result<Self::Hash, Self::Error> {
    //todo [#5493]
    //looks like view_store / view needs non async submit_local method ?.
    unimplemented!();

Just put a debug_assert!

// local peer, and then imported in a fork from the other peer). Returning status stream is
// better than reporting error in this case.
Error::AlreadyImported(..) |
Error::TemporarilyBanned,

Same.

);
match error {
    Ok(
        Error::InvalidTransaction(InvalidTransaction::Stale) | Error::TemporarilyBanned

Some comment on why these are seen as acceptable would be nice.

michalkucharczyk and others added 2 commits October 8, 2024 15:25
Labels: R0-silent (Changes should not be mentioned in any release notes)
Successfully merging this pull request may close: Improve transaction handling for Parachains
7 participants