Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Less dumb heartbeat #33

Merged
merged 18 commits into from
Jun 28, 2021
Merged

Less dumb heartbeat #33

merged 18 commits into from
Jun 28, 2021

Conversation

abailly-iohk
Copy link
Contributor

This is a possibly more useful implementation of a Heartbeat or rather a Failure detector.

@abailly-iohk abailly-iohk requested review from ch1bo and KtorZ June 28, 2021 10:19
Copy link
Collaborator

@ch1bo ch1bo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Liveness should be determined using something >> than the hearbeat delay. Maybe double?

hydra-node/src/Hydra/Network/Heartbeat.hs Outdated Show resolved Hide resolved
hydra-node/src/Hydra/Network/Heartbeat.hs Outdated Show resolved Hide resolved
@abailly-iohk abailly-iohk requested a review from ch1bo June 28, 2021 14:33
@abailly-iohk
Copy link
Contributor Author

I have addressed comments from the review, and also tried to make tests and code clearer, the latter by separating threads for sending heartbeats and checking peers.

@abailly-iohk abailly-iohk merged commit 9a87116 into master Jun 28, 2021
@abailly-iohk abailly-iohk deleted the abailly-iohk/less-dumb-heartbeat branch June 28, 2021 15:35
@@ -70,7 +69,8 @@ instance Tx tx => SignableRepresentation (Snapshot tx) where
getSignableRepresentation = encodeUtf8 . show @Text

data ClientResponse tx
= PeerConnected Host
= PeerConnected Party
| PeerDisconnected Party
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would say PartyConnected and PartyDisconnected is more correct as this is really based on the hydra-node (with a configured Party) sending us pings?


shouldSendHeartbeat :: Time -> HeartbeatState -> Bool
shouldSendHeartbeat now HeartbeatState{lastSent} =
maybe True (checkTimeout id now) lastSent
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be quite a bit clearer to me if this would be

Suggested change
maybe True (checkTimeout id now) lastSent
now > lastSent + livenessDelay

updateSuspected heartbeatState now =
atomically $ do
aliveParties <- alive <$> readTVar heartbeatState
let timedOutParties = Map.filter (checkTimeout (2 *) now) aliveParties
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Analogous to shouldSendHeartbeat, a function with something like

checkTimedOut now lastSeen = now > lastSeen + 2 * livenesDelay

Would be easier to read and understand IMO (instead of passing a modifier to a re-used checkTimeout function.

Comment on lines 24 to +32
let sentHeartbeats = runSimOrThrow $ do
sentMessages <- newTVarIO ([] :: [HydraMessage SimpleTx])
sentMessages <- newTVarIO ([] :: [Heartbeat (HydraMessage Integer)])

withHeartbeat testHost (dummyNetwork sentMessages) noop $ \_ -> do
withHeartbeat 1 (captureOutgoing sentMessages) noop $ \_ ->
threadDelay 1.1

readTVarIO sentMessages

sentHeartbeats `shouldBe` replicate 2 (Ping testHost)
sentHeartbeats `shouldBe` [Ping 1]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the assertions in the runSimOrThrow would be quite a bit easier to read, i.e

    runSimOrThrow $ do
          sentMessages <- newTVarIO ([] :: [Heartbeat (HydraMessage Integer)])

          withHeartbeat 1 (captureOutgoing sentMessages) noop $ \_ ->
            threadDelay 1.1

          sentHeartbeats <- readTVarIO sentMessages
          sentHeartbeats `shouldBe` [Ping 1]

Where the newTVarIO and readTVarIO could be also moved into (typed) boilerplate functions with drafted usage:

    runSimOrThrow $ do
          (capture, getSentHeartbeats) <- captureOutgoing

          withHeartbeat 1 capture noop $ \_ ->
            threadDelay 1.1

          getSentHeartbeats `shouldReturn` [Ping 1]

let component incoming action =
race_
(action (Network noop))
(incoming (Ping 2) >> threadDelay 4 >> incoming (Ping 2) >> threadDelay 7)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having some interpretations to these numbers would've been great, e.g. slightlyMore and moreThanDouble, where these variables would also be parameterized by the actual livenessDelay

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants