Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Do not call initialize_block before any runtime api #8953

Merged
8 commits merged into from
Jul 1, 2021

Conversation

bkchr
Copy link
Member

@bkchr bkchr commented May 31, 2021

Before this change we always called initialize_block before calling
into the runtime. There was already support with skip_initialize to skip
the initialization. Almost no runtime_api requires that
initialize_block is called before. Actually this only leads to higher
execution times most of the time, because all runtime modules are
initialized and this is especially expensive when the block contained a
runtime upgrade.

TLDR: Do not call initialize_block before calling a runtime api.

Fixes: paritytech/polkadot#3053

polkadot companion: paritytech/polkadot#3140

bkchr added 2 commits May 28, 2021 23:14
Before this change we always called `initialize_block` before calling
into the runtime. There was already support with `skip_initialize` to skip
the initialization. Almost no runtime_api requires that
`initialize_block` is called before. Actually this only leads to higher
execution times most of the time, because all runtime modules are
initialized and this is especially expensive when the block contained a
runtime upgrade.

TLDR: Do not call `initialize_block` before calling a runtime api.
@bkchr bkchr added A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. labels May 31, 2021
@bkchr bkchr requested a review from tomusdrw as a code owner May 31, 2021 07:34
@bkchr bkchr added D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit B5-clientnoteworthy and removed B0-silent Changes should not be mentioned in any release notes labels May 31, 2021
Copy link
Contributor

@tomusdrw tomusdrw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but I'm surprised there is no docs being updated. I think it's worth to document the behavior now.

primitives/api/test/tests/runtime_calls.rs Show resolved Hide resolved
client/light/src/call_executor.rs Show resolved Hide resolved
client/light/src/call_executor.rs Show resolved Hide resolved
/// Proof should include both environment preparation proof and method execution proof.
pub fn check_execution_proof_with_make_header<Header, E, H, MakeNextHeader>(
pub fn check_execution_proof_with_make_header<Header, E, H>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method name doesn't make sense now?

@@ -230,62 +211,41 @@ pub fn check_execution_proof<Header, E, H>(
H: Hasher,
H::Out: Ord + codec::Codec + 'static,
{
check_execution_proof_with_make_header::<Header, E, H, _>(
check_execution_proof_with_make_header::<Header, E, H>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this break compatibility between proof formats? Are there no consequences?

@bkchr bkchr requested a review from NikVolf as a code owner June 2, 2021 20:21
@bkchr bkchr requested a review from tomusdrw June 29, 2021 12:14
@bkchr
Copy link
Member Author

bkchr commented Jun 29, 2021

@pepyakin @tomusdrw burn in etc was all successful. So, now it is time to do a proper review.

Copy link
Contributor

@tomusdrw tomusdrw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work!

/// Validate the transaction.
///
/// This method is invoked by the transaction pool to learn details about given transaction.
/// The implementation should make sure to verify the correctness of the transaction
/// against current state.
/// against current state. The given `block_hash` corresponds to the hash of the block
/// that is used as current state.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we add info about the need to manually initialize the frame_system pallet?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For frame this is done automatically through Executive::validate_transaction. While this is also just required when you have a signed extension that checks for block hashes.

@@ -198,6 +198,11 @@ impl RuntimeVersion {
) -> bool {
self.apis.iter().any(|(s, v)| s == id && predicate(*v))
}

/// Returns the api version found for api with `id`.
pub fn api_version(&self, id: &ApiId) -> Option<u32> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙏

@pepyakin
Copy link
Contributor

pepyakin commented Jul 1, 2021

Prior this patch, when an runtime API is called we would call initialize_block. In turn, it would call on_initialize of all modules. pallet_session is of special interest, since it would invoke SessionHandler::on_new_session as part of its on_initialize. With this patch, these routines would not be executed.

Specifically, a difference in behavior will be observed if an runtime API accesses some state that can be changed either in on_initialize or on_new_session.

I went through all the runtime API impls and have marked some that do that. Getting into the list doesn't mean that they are wrong, but I think it would be a good idea to check for somebody deeply familiar with the respective code, or at least acknowledge that they are aware of this change.

Here is the list:

  • BabeApi @andresilva . Runtime APIs access state which is tainted by do_initialize and enact_epoch_change.
    This won't be a big concern, as far as I can tell. Runtime APIs are called with no digests and thus do_initialize does nothing (except randomness collection), should_epoch_change returns false and thus no enact_epoch_change.
  • Equivocation proof generation and verification @andresilva . AFAICT, this should be fine since we always generate a proper proof. That enables it to be verified in any case.
  • Beefy @tomusdrw @adoerr . This one seems can change authorities on new session and thus the validator_set API will return a different set at the last block. As I mentioned for BABE it may be fine since it appropriates ShouldEndSession.
  • AuthorityDiscoveryApi @ordian . On new session this module rotates the keys.

Copy link
Contributor

@pepyakin pepyakin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the patch itself, LGTM!

@tomusdrw
Copy link
Contributor

tomusdrw commented Jul 1, 2021

Thanks @pepyakin!
I'm crossing-out BEEFY from that list. We are planning to rely on the digest item anyway, currently afair we indeed do the state call and it might be an issue if we have BEEFY validators running two different versions (the ones running older version will think we already transitioned to a new set), but given it's currently deployed only on Rococo it should be easy to coordinate the upgrade, so I don't think any preventive measures are needed.

@bkchr
Copy link
Member Author

bkchr commented Jul 1, 2021

I checked again Babe and AuthorityDiscovery and couldn't find anything problematic.

@bkchr
Copy link
Member Author

bkchr commented Jul 1, 2021

bot merge

@ghost
Copy link

ghost commented Jul 1, 2021

Trying merge.

@ghost ghost merged commit baf3736 into master Jul 1, 2021
@ghost ghost deleted the bkchr-api-no-initialize branch July 1, 2021 15:50
@andresilva
Copy link
Contributor

Both BABE and GRANDPA should be unaffected, I wasn't sure about the equivocation report submission but I tested it now.

s3krit pushed a commit that referenced this pull request Jul 6, 2021
* Do not call `initialize_block` before any runtime api

Before this change we always called `initialize_block` before calling
into the runtime. There was already support with `skip_initialize` to skip
the initialization. Almost no runtime_api requires that
`initialize_block` is called before. Actually this only leads to higher
execution times most of the time, because all runtime modules are
initialized and this is especially expensive when the block contained a
runtime upgrade.

TLDR: Do not call `initialize_block` before calling a runtime api.

* Change `validate_transaction` interface

* Fix rpc test

* Fixes and comments

* Some docs
@JoshOrndorff
Copy link
Contributor

What is the recommended way to port runtime APIs that actually did depend on the initialization? Should each one manually initialize only what it needs to?

cc @tgmichel I believe this is the change that broke our EVM tracing.

@shawntabrizi
Copy link
Member

shawntabrizi commented Jul 8, 2021

@JoshOrndorff not super familiar with how one might initialize the block, but it may be that rather than trying to turn this back on, you actually want to see if you can adjust your code or tracing to work without needing this initialize.

I think in general, it is risky to your chain's performance if you have any of the runtime apis call initialize_block because some blocks can be very heavy, and it seems your chain may be attackable on those blocks

(afaik, not an expert on this)

@bkchr
Copy link
Member Author

bkchr commented Jul 8, 2021

Yeah, I would not advise to do this.

If possible, I would prevent this. Or if you only need to initialize some stuff like System, you can only initialize this.

@crystalin
Copy link
Contributor

@shawntabrizi Thanks for the feedback.
Just a bit of context, we understand those are a concern for resources (just the fact of fully tracing blocks is very expensive), and that is why we don't expose those features by default. Those are meant for projects to use internally to query the trace logs.

KalitaAlexey pushed a commit to KalitaAlexey/substrate that referenced this pull request Jul 9, 2021
* Do not call `initialize_block` before any runtime api

Before this change we always called `initialize_block` before calling
into the runtime. There was already support with `skip_initialize` to skip
the initialization. Almost no runtime_api requires that
`initialize_block` is called before. Actually this only leads to higher
execution times most of the time, because all runtime modules are
initialized and this is especially expensive when the block contained a
runtime upgrade.

TLDR: Do not call `initialize_block` before calling a runtime api.

* Change `validate_transaction` interface

* Fix rpc test

* Fixes and comments

* Some docs
@JoshOrndorff
Copy link
Contributor

Before this PR was merged, was each pallet's on_initialize hook called for each runtime API?

@bkchr
Copy link
Member Author

bkchr commented Aug 26, 2021

Yes

vmarkushin pushed a commit to vmarkushin/substrate that referenced this pull request Oct 5, 2021
* Do not call `initialize_block` before any runtime api

Before this change we always called `initialize_block` before calling
into the runtime. There was already support with `skip_initialize` to skip
the initialization. Almost no runtime_api requires that
`initialize_block` is called before. Actually this only leads to higher
execution times most of the time, because all runtime modules are
initialized and this is especially expensive when the block contained a
runtime upgrade.

TLDR: Do not call `initialize_block` before calling a runtime api.

* Change `validate_transaction` interface

* Fix rpc test

* Fixes and comments

* Some docs

Signed-off-by: Vladislav Markushin <negigic@gmail.com>
This pull request was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. C1-low PR touches the given topic and has a low impact on builders. D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit
Projects
None yet
Development

Successfully merging this pull request may close these issues.

getMetadata might stall the node's rpc
7 participants