Protobuf tx UX is bad for multi-chain wallets/block explorers #6030
Comments
The proposal seems reasonable to me! I'm looking forward to seeing the revised ADR 👍 I'd like to clarify here two things:
|
I see three different potential approaches with different tradeoffs. I think they're pretty simply classified based on their use of I've tried to list out the pros and cons as I see them. 1)
|
As a js & eth developer with only minimal experience on the go side and no experience with protobuf, maybe it's possible to break down the 3 scenarios @aaronc outlined a little more? Is it correct that the out-of-band information being referenced is something like .proto files that describe the types included in custom Messages for each chain? This would be similar to the ABI file that describes the contract interface of EVM contracts, right? It's annoying with Ethereum to find the ABIs in order to interact with different contracts, but it's not a deal breaker. There are some repositories of ABIs, and it's possible to confirm that an ABI does in fact satisfy the application you're trying to interact with once you have it. I saw mention that these out-of-band files could be accessible by an RPC interface; this sounds like the proposal outlined here: #4322. That issue is maybe a good outline of what I'd like to have from the perspective of a client side developer (my user wants to interact with a new chain, I should be able to auto-generate an interface for that user with a minimal amount of contextual info on that chain) |
In all of these scenarios there will be .proto files that describe most of the ABI. In approach 1), you will have almost all of the information in the .proto files. In approach 3), you will need to know more about which chains support which types even though you'll have all the types sitting right in front of you. This could potentially be solved by a reflection interface that lets you query for the set of supporting types. In approach 2), you can get the ease of 3) (in terms of being able to support multiple chains by just knowing part of their interface), but you can also look at the full app-level .proto files to get the benefits of 1). Does that help @okwme? |
For approach 1 certain difficulties arise in that the Proto definitions are required to be compiled in and known to parties handling the message. A generic signature library becomes a problem as it must understand how to handle the Proto in order to process it. I believe that approach 2 is suboptimal as, in my opinion, a message should not be encoded in different ways for storage and for signature calculation. The signature process is used for integrity and non-repudiation. Maintaining the ability to verify a signature quickly without any further understanding of the message types used supports the point of using a signature in the first place. With this in mind, translating a message into an alternate form prior to signing should be avoided if possible. Using a hybrid approach to address shortcomings of Approach 3 appears to be the superior option. This approach allows the system components to evolve independently. Using the same solution everywhere streamlines the development process and reduces the developer friction from working with the same data in multiple areas of the system. |
Thanks for your input @iramiller. If we were going to consider approach 3, we'd need to assess a few things:
Would you agree that 2) is at least better than 1) if we choose not to rework our current encoding approach? |
Just wrote out a scenario to try to help me wrap my head around the question. After doing so and reading this for more background, I realize this is much more of a question for client library authors and not client library users. I don't feel qualified on that front to weigh in since a client library user should not actually participate in these steps and that's the only experience I can maybe add perspective on. I've included the text that I began in case parts of it are useful in the conversation but I don't expect any response. Thanks @aaronc this is very helpful! When you say most of the information, what type of info would be missing? As a client developer trying to support a user who wants to query something on a new chain and then also let them execute a new Msg type on that new chain I'd need the following:
Is that an approximation of the process we're trying to solve for? This doesn't include access to a REST server, we're trying to avoid that scenario right? The difference between |
Information about which concrete interface implementations are supported by a given chain. Ex. supported
You would generally get these .proto files from a github repository for the chain/modules you want to query. With approaches 2) and 3), a single module .proto file + the base sdk .proto files would encapsulate all the info needed to query and make transactions against a given module on any chain supporting that module.
Reflection on the structure of the state store in order to request merkle proofs is not covered under any of the current work. It is something I do have in mind for later though. You could use the
You wouldn't send JSON to the tendermint node, you would send protobuf or you would send JSON to an RPC server that then encodes it and broadcasts it to the tendermint node. Signing could be done just with JSON with the proposed approach.
No, we aren't trying to avoid an RPC &/or REST server. Some of the approaches described above are very much enhanced by an RPC/REST server although much less dependent on one than the previous amino approach.
The |
Thank you @aaronc for this very good summary. (1) is what is done in weave-based blockchains, which works nicely if you target one chain. I see the drawbacks in this context. (2) With two different encodings, I think it should be worked out in detail which mapping happens where and what information is required. E.g. the
If I understand correctly, the con from (3) "out of band information is needed" applies for the One aspect that was not covered so far is that the (3) This is my preferred solution for a system with the given flexibility requirements, primarily due to the cons I see in 2. The simplicity seems much more resilient against transaction malleability bugs and is more ergonomic for those needing both signing and encoding representation. It uses a bit more storage in the transaction history, but I did not yet see any evidence that the overhead is relevant. If someone comes up with long URLs, they just hurt their own chain in terms of storage. |
Option 2 is an absolute improvement for those working on the client side. Alignment of |
Add these to the cons.
Well with 2) you could a) choose to use the app-level proto files or b) at least look at them. So I think it's at least mildly a pro. Although maybe a reflection service for 3) could accomplish something similar.
We're doing JSON encoding already for signature verification so it's expensive. But AFAIK we cache the result in CheckTx to avoid the performance hit.
Just to note that by default the URLs will be something like |
Overall what I'm thinking at this point is that we should proceed with #6031 (approach 2) and at that point, option 3) is also enabled for chains if they choose it. Basically with #6031 the SDK will have the infrastructure to allow either 2) or 3) based on configuration. Then maybe it should be a decision for the hub - possibly via governance - whether users want 2) or 3). Other chains would be free to make a different choice either way. |
I just got to catch up on this thread. Here are my thoughts: First, the framing of the issue is missing a bit. There are a few issues mixed together (a) Compatibility of JSON-to-Protobuf encodings AND I will focus just on (b). I think we also need a solid cross-platform JSON-to-Protobuf solution, or decide that we do not need to sign this custom "canonical JSON" representation. But that is another issue I think. First off, amino doesn't "just work". If my chain uses a different Account type, all wallets and explorers will fail. If I use the same concrete types for the interfaces, then the binary representations are compatible. That is true. Assuming people develop wallets and explorers for the gaia hub, we get compatibility when: (a) for wallets (sending tx) - if the target chain tx/msg types are a superset of the types in the hub They only work perfectly when you have the same type definitions as hub. You know the issue that the custom regen governance votes don't show up in the explorers (as they are not a subset of the gaia/hub types). This means, the two chains have slightly different type definitions for tx/msg (the oneof/interface types), but since there is significant overlap, much tooling (targeting that common ground) just works. Given the above, we get the exact same condition with protobuf Please take a look at the comment here: https://github.com/iov-one/weave/blob/master/cmd/bnsd/app/codec.proto#L25-L35 We have since removed Back in the day, we had to register go-wire types with a type byte, which is like this protobuf field. Now, this type prefix is autogenerated from a hash of the descriptor, so this overlap happens as long as you register them under the same name. This is fundamentally the same as using the same field numbers. If instead of I think this whole issue can be resolved very easily via (1) as long as we set a standard naming for all types exposed from the standard modules (cosmos-sdk and other common modules, like group account work). 
Custom chains can freely use fields from 250 or so to make sure there is no overlap, and there is some discussion to claim a field number below that (when it is meant to be reused between many apps and "standardized"), kind of like you need root to bind a port below 1024. It just takes a bit of discipline and a canonical registry of types here. We will need the same discipline (or more) to maintain any kind of backwards-compatibility guarantees as well. |
Another point, tied more to the "compiling protobuf" for clients ux, is that the main js implementation (protobuf-js) works great dynamically generating types. Assume there is a JS lib that handles everything for you, and wants to be extensible. The Dev building on it can just provide the chain-specific Anyway, this is another topic, but since "client side" is often another word for "JS", I want to say that dynamic languages can provide some very neat approaches to using protobuf. |
Thanks for your detailed comments @ethanfrey and for sharing another option for addressing this! A schema registry could work and as you know is something I've advocated in the past. Having a blockchain registry of names to field numbers would be an interesting thing to explore, but I also think this could be started with a simple git repository with a master I wonder what others who advocated for 3) think about this option? @webmaster128 @iramiller ?
I'm not sure quite what you mean here @ethanfrey. Are you saying that protobuf-js can extend protobuf functionality at runtime? Or just that a code generation step isn't needed?
Since using JSON at all for signing has been questioned, it would be good to vet that as well. Either in this thread or another thread. @webmaster128 did have an interesting comment on discord that having the JSON signing is advantageous for pure-JSON users that don't want to depend on proto files at all. I do want to note a couple other disadvantages to 3) that occur to me:
|
+1 Yes, I think having a git repository with the field numbers would be a good choice. I understand it's a bit of work for developers to follow these rules but having them would make things easier for clients. |
Great to see the comments here. We just had a long conversation about this during Regen's Cosmos SDK architecture review call, with a number of folks from this thread on the call. Links to the hackmd here. Rough alignment from that call (as noted with individual votes from participants in the bottom of the doc) was to move forward with option 2, and possibly revisit the state encoding at a later point after we are able to run some benchmarking tests. As I understood from the call, one of the biggest reasons folks were not in favor of @ethanfrey's registry proposal had to do with the governance overhead of having a registry of message types for a fully decentralized ecosystem. I'd still like to give this thread some opportunity to hear back from advocates for @ethanfrey's proposal, if there are clear client-UX reasons why this still should be heavily considered over Option 2. |
Thanks @clevinson. I just want to note that as reflected in your notes, there was almost a super-majority of people on the call (6/9) favoring or leaning towards option 3 ( |
I advocate against 2. It seems way too complex. Although I think 1 is the most efficient and organized version, and the idea of a registry is made to be much bigger than it is (just refer to the app protobuf definition in gaia as a basis), I have no desire to try to convince anyone. The other ones also need canonical names for the "any" variants, it just becomes less visible that a registry is useful until later on. 2 adds even more complexity - JSON plus 2 protobuf files. 🤯 With 3 we end up with self-describing protobuf, which may be larger than (1) but smaller than the JSON representation, which is also self-describing. In such a case, I would see no need at all for using JSON for signing. (I do see the argument that you want a self-describing format that can be presented to the user if we encode it in approach 1). I see 3 as meaningful if developers: a. want a similar ux to amino (dynamically registering types rather than define them in one file) and |
I'm broadly in favor of a custom type -> JSON encoder that we sign, and AMINO JSON is very much good enough. I've implemented this in Rust, JavaScript, Java and more and it's fine. Also, many more people are impacted by changes to Cosmos signing than are currently affected by bytestream changes, and I would really hesitate to break this. |
I am pretty sure the JSON format has already changed @zmanian I think they are using a canonicalized json representation of protobuf objects, not amino-json. But @alexanderbez or @aaronc can answer that better (just my understanding from a few conversations) The "amino JSON" encoder requires the custom type strings from go-amino, and unless they did some magic, those are likely gone in the protobuf world. |
Just noting that a custom JSON encoder is not something under discussion here although I guess it could be. A canonical encoding of protobuf to JSON is what we are currently planning to use for signing, but it is avoiding being "custom" as much as possible and is not compatible with amino JSON. That said, approaches 2) and 3) would result in a JSON format that is much more similar to amino JSON than approach 1). It would vary from amino JSON in the subtle way described in the first comment in this thread (#6030 (comment)). |
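To make "a canonical encoding of protobuf to JSON" concrete, here is a minimal sketch of deterministic sign bytes. It is not the SDK's actual canonical format: the sorted-keys and no-whitespace rules, the `sign_bytes` helper, and the `example.MsgSend` type are all assumptions for illustration. The `@type` key and the string-encoded 64-bit amount follow the standard proto3 JSON mapping.

```python
import hashlib
import json


def sign_bytes(doc: dict) -> bytes:
    """Serialize a document deterministically: sorted keys, no extra whitespace, UTF-8."""
    return json.dumps(
        doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


# Hypothetical message; the type URL and field names are illustrative only.
msg_a = {
    "@type": "type.googleapis.com/example.MsgSend",
    "amount": "100",  # proto3 JSON writes 64-bit integers as strings
    "to": "addr1",
}
# Same content, different key insertion order.
msg_b = {
    "to": "addr1",
    "amount": "100",
    "@type": "type.googleapis.com/example.MsgSend",
}
# Same fields, but a different type name.
msg_c = dict(msg_a, **{"@type": "type.googleapis.com/example.MsgBurn"})

# A signer would hash (or directly sign) these bytes.
digest_a = hashlib.sha256(sign_bytes(msg_a)).hexdigest()
digest_b = hashlib.sha256(sign_bytes(msg_b)).hexdigest()
digest_c = hashlib.sha256(sign_bytes(msg_c)).hexdigest()
```

Because the type name is part of the signed document, `msg_a` and `msg_c` produce different digests even though every other field is identical, which is the property that makes this format resemble amino JSON.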
I think we should move the JSON representation discussion to a different place and get clarity about what is signed first before it is discussed which representation is signed. (spoiler: I wonder to what degree the current proposal makes sense)
I was not aware of the solution that Ethan proposed for the multi-chain support. But he is right and I support both (1) and to a slightly lesser degree (3). My primary concern with (2) is that it assumes the client does not need and does not get full control of the bytes exchanged with Tendermint. It optimizes for a script kiddie style client development and assumes a fully powered external instance will take care of the rest. If the client wants to get all the features and not a minimal subset, they need to re-implement a massive document manipulation process that is hard to get right even for experienced low level developers. Instead, I think we should optimize for a client lib / client app developer split as Billy mentioned before. It is enough that there is one properly maintained core library per programming language. This is easy to get if the overhead in the spec is minimal. This also ensures we don't end up with a standard defined by messy Go internals (again). As a client lib developer, I want to be able to implement the full standard to interact with Tendermint (read/write) without knowing any Go. My secondary concern is that when the bytes sent and the bytes signed differ, you open the door for transaction malleability bugs. I found one of those in Lisk, which I could have used to destroy 300 million market cap at that time. This is probably more dangerous for JSON/protobuf mappings than protobuf/protobuf mappings, but still. And even if it was secure, you still need to convince a lot of people that it is. Every audit of every implementation gets much harder and more expensive when there are multiple representations of a document. Given those two concerns, I really would not mind a few percent of storage, if it ensures that the resulting chain is (a) used and (b) secure. Without the latter, storage optimizations don't matter.
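The malleability concern can be sketched in a few lines. This is a deliberately flawed, hypothetical signing scheme (it is not the Lisk bug and not anything the SDK does): the signed form is a projection that drops one field, so two different transmitted documents share the same sign bytes.

```python
import json


def wire_bytes(tx: dict) -> bytes:
    """Bytes that are actually transmitted and stored (stand-in encoding)."""
    return json.dumps(tx, sort_keys=True).encode("utf-8")


def lossy_sign_bytes(tx: dict) -> bytes:
    """Deliberately flawed: only some fields are projected into the signed form."""
    signed_view = {key: tx[key] for key in ("to", "amount")}
    return json.dumps(signed_view, sort_keys=True).encode("utf-8")


# Hypothetical transaction fields, for illustration only.
original = {"to": "addr1", "amount": "100", "memo": "rent"}
tampered = dict(original, memo="refund to addr2")

# The two documents differ on the wire but are indistinguishable to the signature.
same_signature_input = lossy_sign_bytes(original) == lossy_sign_bytes(tampered)
different_on_wire = wire_bytes(original) != wire_bytes(tampered)
```

A signature over `lossy_sign_bytes(original)` also validates `tampered`. With a single representation that is both sent and signed, this class of bug cannot occur by construction.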
There are about 530 million protobuf field numbers. I don't see any significant governance in defining something like: the first 1 million are reserved for SDK core modules in this repo, the first 10 million are open for commonly used external community supported modules (e.g. CosmWasm), and everything above can be used by chains as they want. A chain can always fool a multi-chain client by using a different implementation for a given common instruction that is binary compatible (e.g. replace send tokens with burn tokens). But then this is a shitty chain you should avoid, just like a chain that uses reserved field numbers for non-standard purposes.
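As a rough sketch of how little machinery such a range split would need, the function below classifies a field number. The range boundaries follow the comment above, but the exact limits and the names `SDK_CORE_LIMIT`, `COMMUNITY_LIMIT`, and `classify_field_number` are assumptions; no such registry exists in the thread. The maximum field number (2^29 - 1) and the 19000 to 19999 block reserved by protobuf itself are properties of protobuf.

```python
MAX_FIELD_NUMBER = 2**29 - 1              # largest valid protobuf field number
PROTOBUF_RESERVED = range(19000, 20000)   # reserved for the protobuf implementation

# Hypothetical registry boundaries, following the split suggested above.
SDK_CORE_LIMIT = 1_000_000
COMMUNITY_LIMIT = 10_000_000


def classify_field_number(number: int) -> str:
    """Return the hypothetical registry range that a oneof field number falls into."""
    if not 1 <= number <= MAX_FIELD_NUMBER:
        raise ValueError(f"{number} is not a valid protobuf field number")
    if number in PROTOBUF_RESERVED:
        raise ValueError(f"{number} is reserved by protobuf")
    if number <= SDK_CORE_LIMIT:
        return "sdk-core"
    if number <= COMMUNITY_LIMIT:
        return "community"
    return "chain-specific"
```

A chain's build could run a check like this over its app-level proto file to flag field numbers that stray into a range it does not own.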
You can inject custom .protos into protobuf-js at runtime and get dynamically generated classes with encoders and decoders in JavaScript. In contrast to .protos known at library build time, you don't have the code generator that gives you TypeScript interfaces. But still pretty neat. This could be wrapped in a Cosmos specific lib that makes the API a bit nicer for the context. |
For (1): Given the 29 bit field numbers, you can also map the amino style |
This is a fair ask and likely something we should consider more seriously.
I tend to think of protobuf to JSON as more secure than just signing with protobuf as the JSON includes more information. Nobody ever proposed mapping JSON to protobuf for signing, but that seems similar to what actually happened with the Lisk bug. A high resolution format was reduced to a lower resolution format for signing. With protobuf to JSON it is generally the opposite. The JSON representation includes more information than the protobuf representation - specifically field and sometimes type names - and I think this is the general argument for doing JSON signing. Now, theoretically there could be some types of data which are higher resolution in protobuf than their JSON representation, but it should be pretty straightforward to figure out which ones those are. Taking a quick look, it does appear that
This is the scenario I'm actually more concerned about. And not that a chain would intentionally fool clients but that simple user error could cause this to happen. Consider we have two messages

    oneof sum {
      ...
      MsgMint mint = 10315;
      MsgBurn burn = 10316;
    }

Now say a developer is copying this to the chain proto file (or alternatively to a client app proto file) and they write:

    oneof sum {
      ...
      MsgBurn burn = 10315;
      MsgMint mint = 10316;
    }

Maybe there's no malicious intent just user error and then the chain is mostly compatible with standard wallets with one important exception! Just signing the proto binary with approach 1) would not prevent this, but signing the JSON would (because

However, JSON wouldn't fix this sort of screw up:

    oneof sum {
      ...
      MsgBurn mint = 10315;
      MsgMint burn = 10316;
    }

But, approach 3) would because the actual type name

So, if security is our number 1 concern, so far approach 3) seems the most secure - the least amount of manual user coordination with the highest resolution information. |
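The swapped field numbers above can be reproduced with a hand-rolled encoder for protobuf's length-delimited wire format. This is only a sketch: the two lookup tables stand in for the wallet's and the chain's compiled oneof definitions, and the payload and the `example.MsgMint` type URL are hypothetical. The tag layout (field number shifted left by 3, wire type 2) and the `Any` layout (field 1 `type_url`, field 2 `value`) are standard protobuf.

```python
def encode_varint(number: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = number & 0x7F
        number >>= 7
        if number:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_len_delimited(field_number: int, payload: bytes) -> bytes:
    """Encode one length-delimited field (wire type 2)."""
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(payload)) + payload


def decode_field_number(data: bytes) -> int:
    """Read the leading tag varint and return its field number."""
    tag, shift = 0, 0
    for byte in data:
        tag |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return tag >> 3


def encode_any(type_url: str, value: bytes) -> bytes:
    """Encode a google.protobuf.Any: field 1 is type_url, field 2 is value."""
    return encode_len_delimited(1, type_url.encode("utf-8")) + encode_len_delimited(2, value)


# What the standard wallet compiled in vs. the chain's mistaken copy.
WALLET_ONEOF = {10315: "MsgMint", 10316: "MsgBurn"}
CHAIN_ONEOF = {10315: "MsgBurn", 10316: "MsgMint"}

payload = b"\x08\x64"  # hypothetical inner message: field 1 = 100

# Approach 1: the wallet encodes what it believes is a mint and signs these bytes.
wire = encode_len_delimited(10315, payload)
wallet_view = WALLET_ONEOF[decode_field_number(wire)]
chain_view = CHAIN_ONEOF[decode_field_number(wire)]

# Approach 3: the type name travels inside the signed bytes.
any_wire = encode_any("type.googleapis.com/example.MsgMint", payload)
```

Here `wallet_view` is `MsgMint` while `chain_view` is `MsgBurn` for the very same signed bytes. In `any_wire` the message name is part of what is signed, so the chain has no numbering table to get wrong.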
The reason we sign JSON encodings of transactions rather than the bytes sent over the wire is that we prioritize the ability of humans to inspect what they are signing over the ease of confusion attacks. The strongly typed nature of AMINO JSON by reflecting over the paths has been a pretty robust defense against type confusion/ malleability attacks. As long as we retain this property in what we sign we should be fine. |
Approach 3) would provide identical or almost identical guarantees to amino JSON Approach 1) (our current approach) + JSON, would provide almost the same but I think slightly weaker guarantees, as per my comment above about messing up field ordering/naming. |
A streamlined implementation focused on signing over the bytes of the message as constructed by the client precisely and the routing to the appropriate handler is the ideal goal for the SDK. Approach 3 aligns very well with the separation of concerns outlined in the system architecture where the SDK is described as a router that takes messages and directs them to the module registered as capable of processing them. The type identifier in the The use of JSON encoding as a projection/interpretation of the message contents is not an action the SDK should concern itself with. Indeed this extra interpretation step is an attack surface with grave consequences in terms of transaction malleability as @webmaster128 points out. The client is responsible for determining the best approach for conveying the contents of the message to the end user in such a way that they can certify their intent with a signature. The SDK should not attempt to deem what is correct or incorrect in this process. At the highest levels JSON is never presented to non-technical users for review when they make decisions. A typical application provides a user friendly representation via graphical interface and subsequently transforms this into some form of optimized sequence of bytes for programs to operate over. I believe that option 3 + raw signing alone is aligned with the pure intent of the SDK. Option one and two both cause compromises and constraints to be internalized into the SDK itself. It is better to leave the type registration and mapping to users of the SDK (vs option 1) and avoid bloat and duplication of implementation (technical debt) as option 2 advocates. |
Does anyone have concerns about approach 3 besides message size/performance? I have run some contrived benchmarks to assess the impact of So all that's to say that if performance is the main concern, a) I'm not too alarmed by my preliminary benchmarks and b) there are optimizations we can introduce later to address size concerns. With all of that, I'm going to agree with @iramiller and @jackzampolin in advocating approach 3). I do also want to note that even if @ethanfrey and @webmaster128's approaches for assigning field numbers are feasible, there is a non-trivial developer impact in having to deal with master proto files that all have slightly different combinations of supported msg's and interface |
I don't want to say we need to fear a similar issue (which was specific to their address format). The only point is: as soon as there are multiple representations, you make this class of problems more likely. This is not black or white: 2 representations are safer than 3. And 1 representation is safer than 2. It also matters how close the representations are. I'm glad the 3 is off the table now and only 2 (proto/JSON) and 1 (proto) are left. As @iramiller explained very well, this should be open for debate now.
Very good point. I thought about this exact binary compatible signature for a different action as well, but then put it into the shitty chains class. But you are right, this is easy to get wrong accidentally. I support both (1) and (3) and am happy to see the signaling towards (3) from many different people. Both (1) and (3) make it simple to generate the real stuff for Tendermint locally, using an existing protobuf lib plus a few documented normalizations. This means every client can submit the protobuf serialization bit by bit and a JSON->proto decoding is never needed. Next up: how to sign? I think this question deserves a new ticket. |
As a result of this discussion, the SDK team is proceeding with approach (3) (using As requested, here is an issue for discussing transaction signing alternatives: #6078 |
@alexanderbez just double-checking, are we in agreement on migrating state encoding to |
Yes, I'm in favor of keeping the codec interfaces where we opt for I'd also remove all the types and other unnecessary boilerplate that we needed due to |
This is how I think we should proceed as well. It seemed like on our call last week most folks were in favor for using |
Would appreciate a review of #6081, which describes our proposed usage of |
Problem Definition
When I first ran a testnet for Regen Network, my team and I were able to customize the Lunie wallet and Big Dipper block explorer to support our testnet by just modifying a few parameters in configuration files. In spite of the UX challenges of the amino encoding, this was a fairly good experience.
The current protobuf tx design would make this much harder. Every app will define its own Transaction, Message, and SignDoc types and client applications will need to run code generation for every new chain they intend to support and write custom code to:
Msgs into the app Message
SignDoc
Msgs like MsgSubmitProposal
After having spent just a bit of time actually dogfooding our work, I find these tradeoffs unacceptable.
Proposal
I propose we leverage google.protobuf.Any to solve these issues. This can be done in a way that improves client developer UX without increasing encoding payload size, by separating signing messages from encoding messages.
google.protobuf.Any provides essentially the same UX as amino JSON for these use cases:
One of the biggest concerns around using Any initially was that it increases stored message size by encoding the long URL. If, however, we just use Any for signing and a simplified client UX and still use oneofs for encoding, we don't have to accept a poor compromise between developer UX and performance.
A more detailed proposal will be submitted in the form of a PR.
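The size concern can be estimated with a small sketch that encodes the same payload once as a plain oneof member and once wrapped in Any. The encoder is hand-rolled, and the field number 1, the 40-byte stand-in payload, and the `example.MsgSend` type URL are assumptions for illustration; only the Any layout (field 1 `type_url`, field 2 `value`) comes from protobuf itself.

```python
def encode_varint(number: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = number & 0x7F
        number >>= 7
        if number:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_len_delimited(field_number: int, payload: bytes) -> bytes:
    """Encode one length-delimited field (wire type 2)."""
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(payload)) + payload


type_url = "type.googleapis.com/example.MsgSend"  # hypothetical type URL
value = bytes(40)                                  # stand-in for a serialized Msg

# oneof encoding: the message sits directly in field 1.
oneof_bytes = encode_len_delimited(1, value)

# Any encoding: field 1 holds an Any, which holds type_url (1) and value (2).
any_inner = encode_len_delimited(1, type_url.encode("utf-8")) + encode_len_delimited(2, value)
any_bytes = encode_len_delimited(1, any_inner)

overhead = len(any_bytes) - len(oneof_bytes)
```

For payloads this small the overhead is the URL length plus four framing bytes, so the cost per message is set almost entirely by how long the chain's type names are.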