Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EPIC: Isolated Errors #12985

Open
tac0turtle opened this issue Aug 22, 2022 · 6 comments
Open

EPIC: Isolated Errors #12985

tac0turtle opened this issue Aug 22, 2022 · 6 comments

Comments

@tac0turtle
Copy link
Member

Summary

Currently when looking at a stack trace or error in the state machine, it is fairly difficult to identify where the error came from and/or in which stage of tx execution the error occurred.

Problem Definition

Errors and their origination are not easily legible.

Proposal

Isolate errors in a way that I easy to debug and identify at which stage of the process a tx or the state machine is at.

Would love to hear from @zmanian, @ValarDragon and others on needs for what they would like to see

@ValarDragon
Copy link
Contributor

ValarDragon commented Aug 22, 2022

This isn't the issue I have with error isolation, but this is a second high impact, easy fix dev UX problem of the SDK. The problem this issue describes is mostly solved, by wrapping all these state machine errors with errors.Wrap (which should be done, and really ought to be some extremely minimal work at the module manager work)

The systemic complexity right now, is that errors & panics in begin block / end block have no fault isolation. If there is a panic in begin block or end block, the entire chain halts. This is incredibly hard to reason about, as panics can occur from all sorts of crazy locations, and in most cases, the desired effect isn't chain halt. The chain halt should be chain level opt-in.

What should happen is there to by some module level specification that says:

  • module-level error, opt-in to all panics halt the chain, otherwise default panics revert state change for that module. (and eventually we can start defining cross-module panic catching scopes, so e.g. Staking/Distribution/Slashing are atomic together, but epochs can otherwise still proceed/)
  • We make a core function, where it emits some "signal" that the SDK will use to say "this beginblock / endblock panic should halt the chain".

cref: osmosis-labs/osmosis#1305 where we do this for epochs

@alexanderbez
Copy link
Contributor

I think we can have this be really simple. Any panic that results due to an error in BeginBlock or EndBlock, the app should have the ability to state "I want this to halt" and we can do that by providing a simple API that handles recovery which app devs can call in their begin/end block logic.

@tac0turtle tac0turtle moved this to 📝 Todo in Cosmos-SDK Aug 31, 2022
@robert-zaremba
Copy link
Collaborator

robert-zaremba commented Sep 2, 2022

We make a core function, where it emits some "signal" that the SDK will use to say "this beginblock / endblock panic should halt the chain".

This is easy to be done:

  1. Never panic in the Begin /End Blocker . Instead return errors (we have been talking a lot about it and unfortunately there was no decision made).
  2. App will panic on errors form Begin / End Blocker
  3. If someone want's to customize it then he can wrap the module and handle the errors or propagate.

@alexanderbez
Copy link
Contributor

Never panic in the Begin /End Blocker . Instead return errors (we have been talking a lot about it and unfortunately there was no decision made).

This isn't practical IMO

App can panic on error

What does this even mean?

The proposal, #12985 (comment), should be sufficient. This is what Osmosis does and provides a clean API.

@robert-zaremba
Copy link
Collaborator

App can panic on error

What does this even mean?

If we return errors in Begin/End Blockers then it's easier to handle the and manage them, rather than adding more complex catches in the app.
Basically, the app will panic if Begin/End blocker returned an error. Developer can easily mange errors (handle them or propagate) by wrapping a module basic of any x/module.

@alexanderbez
Copy link
Contributor

I don't think that makes much sense personally. Having a single API call that devs can call when executing functions is much practical and cleaner UX.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: 📋 Backlog
Development

No branches or pull requests

4 participants