State Recovery Swapd Implementation #485

TheCharlatan · 2022-06-09T13:11:48Z

This adds swapd state serialization, checkpointing and recovery.

The current workflow for restoration is only for cases where farcasterd is restarted. Peer reconnection is also not handled.

For now, I left checking whether recovery overwrites a later state out of scope. It does not add much value for the current workflow, and I don't want this pull request's scope to grow even larger.
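For readers wondering what such an overwrite check could look like, here is a minimal sketch. It is not part of this pull request: the `Checkpoint` struct, its `seq` counter and the `restore` function are all hypothetical names, and the sketch assumes every state transition bumps a monotonically increasing sequence number.

```rust
/// Hypothetical checkpoint record. `seq` is an assumed counter that is
/// incremented on every state transition; it does not exist in this PR.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Checkpoint {
    pub seq: u64,
    pub state: Vec<u8>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum RestoreError {
    /// The state we already hold is newer than the checkpoint.
    WouldOverwriteLaterState { current: u64, checkpoint: u64 },
}

/// Accept the candidate checkpoint unless the current state is strictly newer.
pub fn restore(
    current: Option<&Checkpoint>,
    candidate: Checkpoint,
) -> Result<Checkpoint, RestoreError> {
    match current {
        Some(cur) if cur.seq > candidate.seq => {
            Err(RestoreError::WouldOverwriteLaterState {
                current: cur.seq,
                checkpoint: candidate.seq,
            })
        }
        // No current state, or the checkpoint is at least as recent.
        _ => Ok(candidate),
    }
}
```

With no current state, or with an older one, `restore` hands back the candidate; it only refuses when the candidate would roll the swap back.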

codecov-commenter · 2022-06-09T13:29:05Z

Codecov Report

Merging #485 (18a9d92) into main (a2f8071) will decrease coverage by 0.8%.
The diff coverage is 0.5%.

@@           Coverage Diff           @@
##            main    #485     +/-   ##
=======================================
- Coverage   12.3%   11.5%   -0.8%     
=======================================
  Files         34      34             
  Lines       9033    9631    +598     
=======================================
+ Hits        1111    1112      +1     
- Misses      7922    8519    +597

Impacted Files	Coverage Δ
src/farcasterd/runtime.rs	`0.0% <0.0%> (ø)`
src/rpc/request.rs	`15.2% <0.0%> (-0.1%)`	⬇️
src/swapd/runtime.rs	`0.0% <0.0%> (ø)`
src/swapd/swap_state.rs	`0.0% <0.0%> (ø)`
src/swapd/syncer_client.rs	`0.0% <0.0%> (ø)`
src/swapd/temporal_safety.rs	`0.0% <0.0%> (ø)`
src/walletd/runtime.rs	`0.0% <0.0%> (ø)`
src/databased/runtime.rs	`26.9% <3.7%> (-3.4%)`	⬇️
src/rpc/mod.rs	`22.2% <100.0%> (-2.8%)`	⬇️
... and 9 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update a2f8071...18a9d92.

TheCharlatan · 2022-06-10T22:23:51Z

I had a successful state restoration with the latest commit. I made and took an offer, funded it, terminated farcasterd, started farcasterd, restored the checkpoint and eventually got the amount refunded again. I'm going to keep polishing and testing this pull request while I wait for review and merge of the others.

zkao

Since there are proper tests written already testing the added functionality, I would like to explore and test more on a fork of this branch.

I guess I understood enough of the overall architecture, and most of the intent of the code.

zkao · 2022-06-27T14:40:12Z

src/farcasterd/runtime.rs

+                        ServiceBus::Msg,
+                        ServiceId::Farcasterd,
+                        ServiceId::Swap(swap_id.clone()),
+                        Request::Hello,


I'm not convinced that reusing Request::Hello is better than creating a new Request::NewVariant that inherits no further semantics.

farcasterd, being the broker, will send (or only receive?) this Hello request in the background as well.

I guess we can add a ConnectionTest?
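To make the suggestion concrete, here is a rough sketch of a dedicated variant. The `Request` enum below is a simplified stand-in, not the one in src/rpc/request.rs; `ConnectionTest`, `Action` and `handle` are hypothetical, and the sketch assumes `Hello` means "a service announces itself to the broker".

```rust
/// Simplified stand-in for the request enum. Only `Hello` appears in the
/// diff above; `ConnectionTest` is the hypothetical new variant.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Request {
    Hello,
    ConnectionTest,
}

/// Hypothetical outcomes, used only to show that the two requests differ.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    RegisterService,
    ReportAlive,
}

pub fn handle(request: &Request) -> Action {
    match request {
        // Assumed existing meaning: a service announcing itself.
        Request::Hello => Action::RegisterService,
        // The new variant carries no registration semantics at all.
        Request::ConnectionTest => Action::ReportAlive,
    }
}
```

The point of the separate variant is that a liveness probe can never be mistaken for a service registration, whoever ends up sending or receiving `Hello` in the background.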

zkao · 2022-06-27T14:45:14Z

src/farcasterd/runtime.rs

+                let CheckpointEntry {
+                    public_offer,
+                    trade_role,
+                    ..


I'm not sure the way we launch swapd is appropriate in the context of state recovery.

Passing public_offer and trade_role made sense previously, but should probably go.

It's nice to have these, so you can retrieve them when you do GetInfo on the swap.

My point is not about not having them. It's about passing them through the Ctl bus instead of as command line args.
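A sketch of what "through the Ctl bus" could mean in practice, under loud assumptions: `SwapInit`, `launch_args` and `send_init` are hypothetical names, the offer is kept as a plain string, and a std `mpsc` channel stands in for the real Ctl bus.

```rust
use std::sync::mpsc;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TradeRole {
    Maker,
    Taker,
}

/// Hypothetical control message carrying what is currently passed on the
/// command line.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SwapInit {
    pub swap_id: String,
    pub public_offer: String,
    pub trade_role: TradeRole,
}

/// In this sketch the process only needs the swap id to start.
pub fn launch_args(swap_id: &str) -> Vec<String> {
    vec![swap_id.to_string()]
}

/// Stand-in for a send over the Ctl bus; here it is just a channel send.
pub fn send_init(
    bus: &mpsc::Sender<SwapInit>,
    init: SwapInit,
) -> Result<(), mpsc::SendError<SwapInit>> {
    bus.send(init)
}
```

swapd would still hold both values, so GetInfo keeps working; they just arrive as a message after launch rather than as arguments.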

zkao · 2022-06-27T14:47:08Z

src/farcasterd/runtime.rs

+                let _child = launch(
+                    "swapd",
+                    &[
+                        swap_id.to_hex(),
+                        public_offer.to_string(),
+                        trade_role.to_string(),
+                    ],


Unconvinced whether public_offer and trade_role should be command line args here.

zkao · 2022-06-27T16:11:02Z

src/swapd/runtime.rs

                                self.txs.remove(&TxLabel::Buy);
                                self.txs.remove(&TxLabel::Cancel);
                                self.txs.remove(&TxLabel::Punish);
                            }
                            TxLabel::Buy
                                if self.temporal_safety.final_tx(*confirmations, Coin::Bitcoin)
-                                    && self.state.a_refundsig()
-                                    && self.state.a_buy_published() =>


Why is this being removed? Is it related to recovering state and not yet knowing that you published buy? Alice must checkpoint before and after publishing buy, so if a_buy_published() returns false after recovery, there is a bug.

I think I get it. Checkpoint is not updated after checkpoint, but please confirm!

Not sure that is a bug. I was trying to be super strict on state transitions, but state recovery may justify relaxing it.

If that is the case, I find it more correct to set buy_published to true before checkpointing.

I think I get it. Checkpoint is not updated after checkpoint, but please confirm!

Yeah, that is it.

I need to understand what we are protecting against by checking a_buy_published. How can this case be hit in the first place without it being published?

It does not look like a_buy_published is safety critical.

It especially prevents us from publishing buy. The a_buy_published() == false check protects us from doing stuff out of order.

How can this case be hit in the first place without it being published?

For me, if we find the Buy tx while in a state where a_buy_published == false, we have a bug that must be addressed, because the protocol state and swapd are out of sync. This is why I enjoy swapd being very strict on its state evolution. But I agree this one specifically can go.
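The ordering suggested in this thread (set buy_published before checkpointing) can be sketched as follows. This is a toy model, not swapd's state machine: `AliceState`, `Swap` and their methods are hypothetical, and the checkpoint is just an in-memory clone.

```rust
/// Toy stand-in for the relevant bits of Alice's state.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct AliceState {
    pub refundsig: bool,
    pub buy_published: bool,
}

#[derive(Default)]
pub struct Swap {
    pub state: AliceState,
    pub checkpoint: Option<AliceState>,
    pub broadcast: Vec<&'static str>,
}

impl Swap {
    /// Mark buy as published first, then checkpoint, then broadcast, so a
    /// restored state never lags behind what may be on chain.
    pub fn publish_buy(&mut self) {
        self.state.buy_published = true;
        self.checkpoint = Some(self.state.clone());
        self.broadcast.push("buy");
    }

    /// Simulated restart: reload whatever the last checkpoint recorded.
    pub fn recover(&mut self) {
        self.state = self.checkpoint.clone().unwrap_or_default();
    }

    /// Simplified version of the strict guard removed in the diff above.
    pub fn accepts_final_buy(&self) -> bool {
        self.state.refundsig && self.state.buy_published
    }
}
```

With this ordering the strict guard would still hold after a restart. If the checkpoint were written before the flag is set, the recovered state would report buy_published == false while the Buy tx could already be confirmed.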

TheCharlatan · 2022-06-28T21:10:48Z

Thank you for the great review, zkao. I will attempt to address your comments in writing first and will follow up with some code later.

zkao

LGTM, continue deving in other PRs

zkao · 2022-06-29T11:54:53Z

src/swapd/runtime.rs

+        let res: Result<usize, strict_encoding::Error> =
+            self.txs.iter().try_fold(len, |mut acc, (key, val)| {
+                acc += key.strict_encode(&mut e).map_err(|err| {
+                    strict_encoding::Error::DataIntegrityError(format!("{}", err))
+                })?;
+                acc += val.strict_encode(&mut e).map_err(|err| {
+                    strict_encoding::Error::DataIntegrityError(format!("{}", err))
+                })?;
+                Ok(acc)
+            });
+        len = match res {
+            Ok(val) => Ok(val),
+            Err(err) => Err(strict_encoding::Error::DataIntegrityError(format!(
+                "{}",
+                err
+            ))),
+        }?;


It was my understanding that we would not be doing the (high-level) data structure encoding manually: just encode the low-level types, and derive the high-level ones using the derive macros.

High level here being the HashMap, and low level the keys and values.

I guess the data type of interest is List<T> from request.rs, which is just a wrapper over vectors.

I did not find a derive for HashMap, so I rolled this myself. I guess we can create a wrapper type for a HashMap, implement the strict encoding for it and then use it here.
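As a rough illustration of the wrapper idea, here is a std-only sketch. The `Encode` trait is a stand-in that only mimics the "write bytes, return the count" shape; it is NOT the strict_encoding crate's API. `EncodableMap`, the u16 length prefix and the sort-by-key step are all assumptions made for this example.

```rust
use std::collections::HashMap;
use std::convert::TryFrom;
use std::io::{self, Write};

/// Stand-in trait: write the value, return the number of bytes written.
pub trait Encode {
    fn encode<W: Write>(&self, w: &mut W) -> io::Result<usize>;
}

impl Encode for u8 {
    fn encode<W: Write>(&self, w: &mut W) -> io::Result<usize> {
        w.write_all(&[*self])?;
        Ok(1)
    }
}

impl Encode for Vec<u8> {
    fn encode<W: Write>(&self, w: &mut W) -> io::Result<usize> {
        // Assumed format: little-endian u16 length prefix, then the bytes.
        let len = u16::try_from(self.len())
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidInput, e))?;
        w.write_all(&len.to_le_bytes())?;
        w.write_all(self)?;
        Ok(2 + self.len())
    }
}

/// Hypothetical wrapper so the map encoding lives in one place.
pub struct EncodableMap<K, V>(pub HashMap<K, V>);

impl<K: Encode + Ord, V: Encode> Encode for EncodableMap<K, V> {
    fn encode<W: Write>(&self, w: &mut W) -> io::Result<usize> {
        let count = u16::try_from(self.0.len())
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidInput, e))?;
        w.write_all(&count.to_le_bytes())?;
        // Sort by key so the output does not depend on HashMap iteration order.
        let mut entries: Vec<(&K, &V)> = self.0.iter().collect();
        entries.sort_by(|a, b| a.0.cmp(b.0));
        let mut written = 2usize;
        for (key, val) in entries {
            written += key.encode(w)?;
            written += val.encode(w)?;
        }
        Ok(written)
    }
}
```

A struct holding such a wrapper could then rely on the derive macros for the outer type, leaving only this one manual impl. Sorting the entries is a design choice of the sketch: it keeps checkpoints byte-for-byte reproducible, which the hand-rolled fold over the HashMap does not guarantee.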

TheCharlatan marked this pull request as draft June 9, 2022 13:11

TheCharlatan force-pushed the state_recovery3 branch 2 times, most recently from 4fecc22 to 61b6efb June 9, 2022 21:16

TheCharlatan force-pushed the state_recovery3 branch 2 times, most recently from a881cba to 344a9ec June 15, 2022 09:25

TheCharlatan added the mainnet label Jun 16, 2022

TheCharlatan force-pushed the state_recovery3 branch 6 times, most recently from b7cae58 to 21046fe June 22, 2022 08:52

This was referenced Jun 23, 2022

Cancel swap #504

Merged

Persist funding address secret key #506

Merged

TheCharlatan commented Jun 24, 2022

src/farcasterd/runtime.rs (resolved)

TheCharlatan force-pushed the state_recovery3 branch from 93610cd to 31adf58 June 26, 2022 11:52

TheCharlatan added 12 commits June 26, 2022 16:07

Swapd: Serialize state for checkpointing

f20ea8b

Swapd: Add state checkpointing

6001e84

Swapd: Handle checkpoint state restore

c55c80e

Checkpoint: Start syncers on restore

76fc897

Checkpoint: Restore watch height and watch tx

d5b9ef2

Checkpoint: Patch restore workflow

bbeb0db

Checkpoint: Report failure if no checkpoint was found to restore

63b1bdb

Swapd: Add progress message on successful restore

e09379b

Swapd: Improve restore checkpoint data

f384089

Walletd: Send Tx's before BuyProcedureSig

a6bed90

Sequence Diagram: Reflect reality of swap cancel and refund tx retrieval

e89d38e

Walletd: Use the same fee amount as swapd

6c22c37

TheCharlatan added 6 commits June 26, 2022 16:15

Swapd: Add xmr address to state and checkpoint

437f29c

Request: improve display for logs

0672d28

Walletd: Correct bob pre buy checkpoint

fb37022

Checkpointing: Remove debugging sleeps and logs

5fadcae

Swapd: Use exported functions to handle multipart messages

a71195a

Swapd: Correct checkpoint data after rebase

64c90b9

TheCharlatan force-pushed the state_recovery3 branch from 31adf58 to eaebb4f June 26, 2022 14:17

TheCharlatan added 2 commits June 26, 2022 16:22

Doc: Update state recovery sequencediagram

0fd7b72

Farcasterd: Check if swapd is not running before restoring checkpoint

6304670

TheCharlatan force-pushed the state_recovery3 branch from eaebb4f to 6304670 June 26, 2022 14:23

TheCharlatan marked this pull request as ready for review June 26, 2022 14:26

TheCharlatan linked an issue Jun 26, 2022 that may be closed by this pull request

state management implementation #408

Closed

Farcasterd: Incr initiated swaps on checkpoint restore

f2ec729

zkao reviewed Jun 27, 2022

doc/staterecovery_sequencediagram.txt (resolved)

TheCharlatan commented Jun 27, 2022

src/walletd/runtime.rs (outdated, resolved)

zkao reviewed Jun 27, 2022

src/rpc/request.rs (outdated, resolved)

TheCharlatan added 2 commits June 27, 2022 20:22

Walletd: Log reason for failed tx refund retrieval

b711b76

Rpc: Correct swapd checkpoint display

ce66697

zkao reviewed Jun 28, 2022

Swapd: Revert changes to tx removal to invalidate prior states

18a9d92

TheCharlatan force-pushed the state_recovery3 branch from 307d529 to 18a9d92 June 28, 2022 22:00

Swapd: Revert changes to tx retrieval to ensure tx's are not re-broadcast

8660617

TheCharlatan force-pushed the state_recovery3 branch from 8a89085 to 8660617 June 29, 2022 08:42

TheCharlatan and others added 3 commits June 29, 2022 10:54

Swapd: On Checkpoint restore, do not re-watch tx's, only previously watched txids.

70dfa6c

Walletd: Log warn if wallet already exists on checkpoint restore

b00f632

syncer_client: homogenize the way tasks are produced

c9c7d62

zkao force-pushed the state_recovery3 branch from c47b0e4 to c9c7d62 June 29, 2022 14:46

zkao approved these changes Jun 29, 2022

zkao merged commit dcc9b06 into farcaster-project:main Jun 30, 2022
