-
Notifications
You must be signed in to change notification settings - Fork 257
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Static Metadata Validation #478
Conversation
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A good start! I've only skimmed through so far, and I'll have to take a more thorough look tomorrow, but I left some comments on the way!
I think ultimately I'd lean towards not having a MetadataHashable
struct at all, and instead just having a set of functions which take a reference to metadata (or just the type registry if they don't need anything from Metadata
) and return, at the end, a hash.
Most of the functions probably only need access to the type registry and some starting type ID I'd imagine, and from there they can recurse through the types in the registry and spit out a hash at the end.
Caching recursive IDs may give some weight to havign a struct at some point (although I wouldn't expect this struct to live longer than it takes to hash the pallet/metadata).
We may then choose to cache hashes (or not) as a separate concern (for example by having said cache inside a mutex in Metadata
or something), to avoid re-hashing things each time we call them.
I also wonder whether std::hash::{ Hash, Hasher}
might be useful at all (https://doc.rust-lang.org/std/hash/index.html) :)
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
I had a pass over this, and think we're good to go to try and get this merged into Subxt. The main features:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good, missing some top-level integration tests for the validation failed cases.
This is a good improvement for safety and usablility. However reading over the hashing code I wonder whether we could make it even better for calls/events/storages by comparing some kind of "intermediate representation" (IR) instead of hashes. This would allow providing information of what actually has changed in those entities instead of just the fact that it has changed. The code for transalating the metadata to the IR could then shared with the codegen, avoiding duplicate code of traversing the metadata. This is probably a bit more involved so am happy to accept the hashing solution, it was just an idea that came to mind as I was looking at this.
.get(name) | ||
.ok_or_else(|| MetadataError::PalletNotFound(name.to_string())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't this useful to know which pallet that was not found in the error? Or am I missing something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We know which pallet you're looking for when you call the function, so if an error happens you already have that information (and there was this weird thing where some errors took String
and some took &'static str
and actually just removing them, since the info is there, felt cleaner than adding Cow
s or something :)
) -> Result<&PalletConstantMetadata<PortableForm>, MetadataError> { | ||
self.constants | ||
.get(key) | ||
.ok_or(MetadataError::ConstantNotFound(key)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, isn't useful to know which constant that failed in the error? Or are we expecting the caller to figure this out anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above I think; we know what constant we're asking for when we call this funciton, so returning it in the error isn't necessary. (I have no objection to adding it back if we are consistent about String
vs &'static str
in our errors, but it felt easier to remove them and ignore that :))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks clean to me, I would prefer parking_lot::RwLock over StdRwLock though.
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
subxt/src/client.rs
Outdated
let metadata = rpc.metadata().await; | ||
metadata? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let metadata = rpc.metadata().await; | |
metadata? | |
rpc.metadata().await? |
/// Set the metadata. | ||
/// | ||
/// *Note:* Metadata will no longer be downloaded from the runtime node. | ||
pub fn set_metadata(mut self, metadata: Metadata) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pub fn set_metadata(mut self, metadata: Metadata) -> Self { | |
#[cfg(test)] | |
pub fn set_metadata(mut self, metadata: Metadata) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, cfg(test)
is not active for integration tests. I would place this under integration_tests
feature and follow up with #515 😄 .
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Thanks for your feedback @ascjones! Just to weigh in quickly; I think this IR approach is potentially very useful! A possible downside of it for this validation stuff is that it is likely much more allocation heavy to produce an IR; the nice thing about the hashing approach is that we can have almost no allocations. That said, we may end up wanting an IR for producing nice human readable diffs between metadata crates (or at least, I figure we'd either flatten the metadata types out into some enum that contained all of the relevant info itself and then compare those with eachother, or skip this and directly compare metadatas, depending on what works out to be simpler). We'll always have room to re-visit the underlying validation approach we use here though, if we find a better way to do it, so I think of this as a starting point more than anything :) |
Thanks a lot for your feedback @jsdw @ascjones and your help in getting the PR merged! 🎉 I do like the idea of intermediate representations for flattening out the metadata. |
Table of Contents
Static Metadata Validation
There are cases when a customer would generate the Runtime API from static metadata of one node and attempt interaction with another node.
This API proposal adds two validations points to capture any differences in the underlying metadata.
Call / Constant / Storage Validation
Nodes can register pallets in arbitrary order. This behavior will lead to different pallet representations inside the metadata but similar functionality.
Each pallet is composed of calls, constants, and storage.
Call / Constant / Storage is inspected recursively to obtain a unique deterministic representation. The representation is obtained via the
subxt-metadata
crate.The hashing information is embedded statically inside the generated pallet.
The static information is checked against the runtime information obtained from the
Client
object.API Changes
By default, the API will validate that a given cal / constant / storage is compatible between the static and dynamic metadata.
Full Metadata Validation
The metadata validation is implemented similarly to pallet validation.
Unlike the pallet validation, the metadata validation check is not performed by default. This is due to various small changes that would result in different metadata, as well as providing a simplified API to interact with.
The customer can call at any time the validation method for the full metadata check:
The
MetadataError::IncompatiblePalletMetadata
error is returned when trying to interact with a fully incompatible metadata.To be noted that the hashing is compared only for the subset of pallets that are present in the statically generated API.
Static Metadata Crate
The
subxt-metadata
crate exposes the core implementation of static metadata validation.The implementation follows recursively the metadata types towards flattening them out to a deterministic representation.
The resulted representation of the metadata is then hashed with
SHA 256
.A future implementation can lift this restriction, providing
String
representations that are human-readable.API
The following are exposed for customer usage:
get_metadata_hash()
get_metadata_per_pallet_hash()
get_pallet_hash()
get_call_hash()
get_constant_hash()
get_storage_hash()
The cache implementation is left entirely in the customers hands.
For examples see the
subxt/metadata/hash_cache.rs
.Considerations
At the moment, the crate does not provide inner caching but does provide the ability to skip pallet hashes when the metadata hash is constructed first.
However, the crate does not cache the inner metadata types, obtained from the internal
get_type_hash
.This behavior is intended for the time being for providing the ability to deterministically resolve recursive types.
When registering the previous types to the metadata, there are two possibilities, depending on which pallet registration order:
metadata_types: { { "id": 0, A } , { "id": 1, B } }
ormetadata_types: { { "id": 0, B } , { "id": 1, A } }
.If intermediate caching is provided, resolving ID 0 from first example would result into
0 = (A B A) , 1 = (B A)
, while resolving ID 0 from the second example:0 = (B A B), 1 = (A B)
.The internal caching of types should be considered carefully, part of a separate issue.
Subxt Metadata CLI
The subxt CLI is extended with
compatibility
subcommand for validating metadata compatibility between multiple nodes.The CLI uses the implementation exposed by the previous chapter to group compatible nodes, either by full metadata validation or by pallet validation.
Usage
Full Metadata Validation
Pallet Metadata Validation
Next Steps
skip_pallet_validation()
methods)MetadataHashable
MetadataHashable
to dedicated crateskip_validation
functionalityper call inside pallet
Arc
on client's metadataFuture Work
subxt-metadata
Closes #398