Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove on-chain validation of schema from as many places as possible #1131

Closed
Tracked by #760
alvrs opened this issue Jul 10, 2023 · 4 comments · Fixed by #1174
Closed
Tracked by #760

Remove on-chain validation of schema from as many places as possible #1131

alvrs opened this issue Jul 10, 2023 · 4 comments · Fixed by #1174
Assignees

Comments

@alvrs
Copy link
Member

alvrs commented Jul 10, 2023

We currently have some checks in Store to confirm that the input data has the correct length etc. To save gas and "not protect the user from themselves" we should remove these checks and expect users to use the codegen table libraries (which guarantee correct input data) or perform checks themselves. This might also involve passing schema from codegen libraries instead of loading from storage in StoreCore.

@alvrs alvrs added this to the Contracts stable milestone Jul 10, 2023
@alvrs alvrs mentioned this issue Jul 10, 2023
17 tasks
@dk1a
Copy link
Member

dk1a commented Jul 10, 2023

Agree, but we should keep in mind that in some cases (like updating an item in an array), no validation at all would allow someone to calculate a huge index to change someone else's data

@alvrs
Copy link
Member Author

alvrs commented Jul 18, 2023

Let's list the considerations to make a decision on whether to keep or remove schema validation

  • we still need a schema in all methods that modify data to determine where to write/read the data (eg dynamic vs static), but we could add an overload to setRecord and setField that takes an additional schema argument to reading it from storage
  • pro: might save some gas (tablegen libraries can pass in the known schema and we save the storage access cost)
  • pro: might remove some bootstraping issues - eg. maybe we could use a proper tablegen library for the Schema table instead of custom handling of this table
  • con: it would be possible to pass in an "invalid" schema and thereby write data that is not expected for the schema that is registered for the table. Indexers and clients would not know how to handle this update and the indexed data would diverge from the data stored on-chain
    • this might be a case of "don't protect users of themselves": indexers can just ignore invalid data and mark the table as invalid, so users only benefit from our network sync architecture if they use the protocol in the intended way (eg via tablegen libraries, which always use the expected schemas)
  • con: adding more methods to Store might be an issue in terms of contract size limit, tbd (one possibility could be to not provide methods that load the schema from store but always require it to be passed in, but if possible i'd like to keep the method versions that can be called without passing in a schema for better dev ex)

Agree, but we should keep in mind that in some cases (like updating an item in an array), no validation at all would allow someone to calculate a huge index to change someone else's data

I believe we could circumvent that by still requiring a schema to be passed in to the method - the schema still must be a valid schema, so array length is limited by the PackedCounter limitations

@holic
Copy link
Member

holic commented Jul 18, 2023

con: it would be possible to pass in an "invalid" schema and thereby write data that is not expected for the schema that is registered for the table. Indexers and clients would not know how to handle this update and the indexed data would diverge from the data stored on-chain

  • this might be a case of "don't protect users of themselves": indexers can just ignore invalid data and mark the table as invalid, so users only benefit from our network sync architecture if they use the protocol in the intended way (eg via tablegen libraries, which always use the expected schemas)

I think the indexer will need to be defensive about this anyway, because we can't guarantee someone will misuse the protocol outside of our tooling. Or maybe someone forks our protocol but uses the same event names and changes how it's encoded. So I kinda see this as not-really-a-con.

  • con: adding more methods to Store might be an issue in terms of contract size limit, tbd (one possibility could be to not provide methods that load the schema from store but always require it to be passed in, but if possible i'd like to keep the method versions that can be called without passing in a schema for better dev ex)

I agree on the dev ex side of things when using the World/Store direction, but I am curious if our codegen libraries could already automate around this by hardcoding the schema into the libraries and pass them into the necessary functions, avoiding the schema load step.

Overall, this direction sounds like a win!

@alvrs
Copy link
Member Author

alvrs commented Jul 18, 2023

I agree on the dev ex side of things when using the World/Store direction, but I am curious if our codegen libraries could already automate around this by hardcoding the schema into the libraries and pass them into the necessary functions, avoiding the schema load step.

On the contract side it's probably fine because the codegen libs would handle it, but it would make the devex slightly worse when interacting with tables directly from the client or scripts. Probably worth the tradeoff for smaller bytecode since it's not that big of a deal in terms of devex.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

3 participants