Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle errors when sending tokens within Aurora to NEAR #284

Closed

Conversation

mrLSD
Copy link
Member

@mrLSD mrLSD commented Sep 23, 2021

Details: #269

@mrLSD mrLSD self-assigned this Sep 23, 2021
@mrLSD mrLSD marked this pull request as ready for review September 24, 2021 18:07
@birchmd birchmd force-pushed the handle-errors-when-sending-tokens-within-aurora-to-near branch from a325c3b to a53df9a Compare September 24, 2021 20:37
Comment on lines +184 to +191
let refund_promise: Vec<u8> = PromiseCreateArgs {
target_account_id: String::from_utf8(sdk::current_account_id()).unwrap(),
method: "ft_transfer".to_string(),
args: args.as_bytes().to_vec(),
attached_balance: 1,
attached_gas: costs::FT_TRANSFER_GAS,
parent_promise: None,
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on my understanding of the problem, this is incorrect.

The error we are trying to catch is the case where the ft_transfer call (sending the tokens from Aurora to the user) failed. So scheduling another ft_transfer is not the solution.

The transaction that needs to be reverted is the Aurora one which triggered this exit call. Now that I am thinking about this though, maybe this is a harder problem than I thought. I was only thinking about the case where the exit resulted from a direct transfer from a user's address to the exit precompile address, which is easy to revert manually, but in the case that this was called because of a more complex transaction, say involving a smart contract, then it is not obvious how to do the revert.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My idea was simpler:

When making a cross contract call from this precompile we assume it won't fail. This means that from the point of view of the engine the tx that initiated the contract call is not reverted or cancelled. However, we create a promise that can handle the failure. This failure has no impact on the EVM tx.

To handle the failure we concatenate both promises the initial transfer, and the tx that catch the result in case of failure, the latter should make a "private" call to self (aurora-engine) to re-mint the tokens for the user.

This has a UX problem, where the user will see the engine tx succeeded when it could actually fail, but it is a trade-off we can make and fix on the frontend.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To handle the failure we concatenate both promises the initial transfer, and the tx that catch the result in case of failure, the latter should make a "private" call to self (aurora-engine)

Yes, this I am totally on-board with.

to re-mint the tokens for the user.

This is the part I worry about. The implicit assumption here is that the externally owned account (EOA) that initiated the transaction is the one where the tokens should end up in the case of failure. But I'm unsure if that is always true. Suppose the tokens were held by some contract, and it is this contract that was trying to move the tokens for some reason; it's possible sending the tokens back to the EOA creates an exploit of this contract where you can siphon off funds you should not be able to.

Maybe I'm being too paranoid, and it is reasonable to assume that any EOA which can initiate a transaction resulting in an exit precompile being called should be considered the "owner" of those tokens, but I at least want to be explicit about this if we are making such an assumption.

If we do not make this assumption then sending the tokens back to the EOA is not good enough and the correct behaviour would be to revert the whole transaction which caused the exit call. I think this is still doable, but it is more complex because we would need to have some sort of "prepare/commit" kind of scheme. By which I mean, in the case we schedule an exit promise, we do not call apply on the list of state changes returned by the SputnikVM, instead, we store those changes. As with your solution, we schedule a second promise to check on the result of the exit, but now we must handle both the success and failure cases. In the case of success we apply the state changes and return a successful SubmitResult as we normally would. In the case of failure we do not apply the state changes (simply delete them from storage), and return a SubmitResult containing the revert status. Note: in the failure case we would still need to make sure the EOA's nonce was incremented and gas was changed, just like when the EVM reverts normally. Also note that this means the outcome of the engine call is not known by the front-end until after the final promise is executed.

What do you think @mfornet ? Is the simple solution with the big assumption ok, or do we need to implement the complex solution?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to re-mint the tokens for the user.

This is the part I worry about. The implicit assumption here is that the externally owned account (EOA) that initiated the transaction is the one where the tokens should end up in the case of failure. But I'm unsure if that is always true. Suppose the tokens were held by some contract, and it is this contract that was trying to move the tokens for some reason; it's possible sending the tokens back to the EOA creates an exploit of this contract where you can siphon off funds you should not be able to.

I agree with @birchmd, it would be too dangerous to do that. If we really want to go this way, we first need to check all possible scenarios to make sure it's impossible to abuse it. But I think it's possible.

The issue and its solution are not trivial at all. I also think the right way to handle this is to revert EVM changes as @birchmd suggested. This solution is more complex from the engineering point of view but much easier from the perspective of security.

"Re-minting the tokens for the user" solution is simple from the engineering point of view but much more complex from the perspective of the exploits that could be implemented using it as a consequence.

If we do not make this assumption then sending the tokens back to the EOA is not good enough and the correct behaviour would be to revert the whole transaction which caused the exit call. I think this is still doable, but it is more complex because we would need to have some sort of "prepare/commit" kind of scheme. By which I mean, in the case we schedule an exit promise, we do not call apply on the list of state changes returned by the SputnikVM, instead, we store those changes. As with your solution, we schedule a second promise to check on the result of the exit, but now we must handle both the success and failure cases. In the case of success we apply the state changes and return a successful SubmitResult as we normally would. In the case of failure we do not apply the state changes (simply delete them from storage), and return a SubmitResult containing the revert status. Note: in the failure case we would still need to make sure the EOA's nonce was incremented and gas was changed, just like when the EVM reverts normally. Also note that this means the outcome of the engine call is not known by the front-end until after the final promise is executed.

I think this is the correct approach, the important thing here is to ensure that the first promise failed indeed and there was only one promise before. Because if we have a chain of promises and then check only for the success/failure of the last one, it would be possible that some previous promise was successful means applied and thus we can't revert the EVM state in this case. This could be another obstacle for the future implementation of general NEAR contract calls from the EVM, but again, if we use it only for exit precompiles and ensure that only one promised is being called.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach of running the tx and only storing the changes won't work AFAICT. Reason is, suppose I run a tx, which is swapping tokens in some DEX for a price, if it doesn't execute immediately, then other can change the price by interacting with the DEX, and applying the older state will result in an incompatible state.

One potential solution, albeit not great, is to include in the exit precompile the address of the receiver in case of failure. This way we move this generic problem to the caller of the pre-compile, and I guess in most cases, figuring out the receiver of the tokens at the app can be easily determined without conflicts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This issue is related to the proposal of allowing cross-contracts calls from aurora: #291

I think we should be trying to solve the generic problem in the context of the engine, and let applications handle the details. Ideally, we should only provide a safe way to make cross contract calls, and remove the exit to NEAR/Ethereum precompile.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

applying the older state will result in an incompatible state

Yes, I agree. We would need a queue of some kind so that new transactions would not be executed until all the promises of the on-going call were complete. And this does lean heavily into the asynchronous aspects of NEAR.

include in the exit precompile the address of the receiver in case of failure

This seems like an ok solution in terms of solving the problem at hand. Though I think changing the exit precompile API would be a lot of work; we would need to migrate all existing bridged token contracts.

I think we should be trying to solve the generic problem in the context of the engine

I agree this would be elegant. It then becomes a question of priority whether we can afford to wait for this generic solution before the #269 bug is resolved.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be trying to solve the generic problem in the context of the engine, and let applications handle the details.
I agree, we should implement a generic approach.

@birchmd
Copy link
Member

birchmd commented Oct 4, 2021

@mfornet @sept-en and @mrLSD all agree we should implement a generic solution for Aurora <-> NEAR interaction (which of course should include error handling), then use this solution to fix the issue here. I'll do design work on the discussion started by @mfornet and then proceed with implementation. In the meantime let's close this PR.

@birchmd birchmd closed this Oct 4, 2021
@joshuajbouw joshuajbouw deleted the handle-errors-when-sending-tokens-within-aurora-to-near branch July 11, 2022 11:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants