Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
new node manager decoupled from transport layer (#4728)
Refactor: New Node Manager Implementation

This PR introduces a new node manager implementation that decouples node lifecycle management from the transport layer. The existing implementation was tightly coupled with bprotocol specifics, making it difficult to evolve our networking stack. It also lacked graceful handling of node re-connection. This refactoring is a key step towards our new transport layer architecture.

Key Changes:
- Separates node management logic into a clean, transport-agnostic interface
- Implements reliable connection state tracking with configurable heartbeat monitoring
- Introduces sequence number tracking for both compute and orchestrator messages to support ordered delivery
- Maintains fast in-memory state with periodic persistence for durability
- Provides event notifications for node connection state changes
- Implements proper thread safety for concurrent operations

Visible Changes:
- Node state APIs now include additional fields for connection state and sequence tracking

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

- **New Features**
  - Introduced a custom dictionary for spell checking with new entries.
  - Added a `HeartbeatClient` for managing heartbeat messages.
  - Implemented a new `Server` structure for orchestrating node management.
  - Added new models for connection state management in the Swagger API.
- **Bug Fixes**
  - Updated logic for filtering nodes based on connection status.
  - Enhanced error handling in command execution.
- **Documentation**
  - Enhanced Swagger API documentation with new models and updated descriptions.
- **Tests**
  - Added comprehensive test suites for heartbeat functionality and node management.
  - Updated existing tests to reflect changes in data structures and types, particularly for node approval and rejection scenarios.
- **Chores**
  - Removed obsolete files related to previous node management implementations.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
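The connection state tracking, heartbeat monitoring, event notifications, and thread safety listed under Key Changes can be illustrated with a minimal sketch. This is not the code from this commit: `Manager`, `ManualClock`, `ConnectionState`, and every method name below are hypothetical, and the two-state model is an assumption made to keep the example short. The sketch only shows the general idea of a manager that knows nothing about how heartbeats are transported.

```go
package main

import (
	"sync"
	"time"
)

// ConnectionState is a hypothetical two-state model of node connectivity.
type ConnectionState string

const (
	Connected    ConnectionState = "CONNECTED"
	Disconnected ConnectionState = "DISCONNECTED"
)

// ManualClock is a hypothetical clock that only moves when Advance is called.
type ManualClock struct {
	mu  sync.Mutex
	now time.Time
}

func NewManualClock() *ManualClock { return &ManualClock{now: time.Unix(0, 0)} }

func (c *ManualClock) Now() time.Time {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.now
}

func (c *ManualClock) Advance(seconds int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.now = c.now.Add(time.Duration(seconds) * time.Second)
}

type nodeEntry struct {
	state         ConnectionState
	lastHeartbeat time.Time
}

// Manager tracks node liveness without knowing how heartbeats arrive.
type Manager struct {
	mu       sync.Mutex
	nodes    map[string]*nodeEntry
	timeout  time.Duration
	clock    *ManualClock
	onChange func(nodeID string, from, to ConnectionState)
}

// NewManager expects a non-nil onChange callback.
func NewManager(timeoutSeconds int, clock *ManualClock,
	onChange func(nodeID string, from, to ConnectionState)) *Manager {
	return &Manager{
		nodes:    make(map[string]*nodeEntry),
		timeout:  time.Duration(timeoutSeconds) * time.Second,
		clock:    clock,
		onChange: onChange,
	}
}

// Heartbeat records liveness using the manager's own clock and
// reconnects the node if it was unknown or disconnected.
func (m *Manager) Heartbeat(nodeID string) {
	m.mu.Lock()
	entry, ok := m.nodes[nodeID]
	if !ok {
		entry = &nodeEntry{state: Disconnected}
		m.nodes[nodeID] = entry
	}
	from := entry.state
	entry.state = Connected
	entry.lastHeartbeat = m.clock.Now()
	m.mu.Unlock()
	if from != Connected {
		m.onChange(nodeID, from, Connected) // notify outside the lock
	}
}

// CheckTimeouts marks nodes disconnected when their last heartbeat is too old.
func (m *Manager) CheckTimeouts() {
	now := m.clock.Now()
	var expired []string
	m.mu.Lock()
	for id, entry := range m.nodes {
		if entry.state == Connected && now.Sub(entry.lastHeartbeat) > m.timeout {
			entry.state = Disconnected
			expired = append(expired, id)
		}
	}
	m.mu.Unlock()
	for _, id := range expired {
		m.onChange(id, Connected, Disconnected)
	}
}

// State reports the current state; unknown nodes count as Disconnected.
func (m *Manager) State(nodeID string) ConnectionState {
	m.mu.Lock()
	defer m.mu.Unlock()
	if entry, ok := m.nodes[nodeID]; ok {
		return entry.state
	}
	return Disconnected
}
```

In this sketch a transport adapter would call `Heartbeat` whenever a message arrives and a background loop would call `CheckTimeouts` periodically. Callbacks are fired after the mutex is released so that a listener can call back into the manager without deadlocking.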
Showing 76 changed files with 3,343 additions and 2,127 deletions.
@@ -440,3 +440,5 @@ IMDS
tlsca
Lenf
traefik
bprotocolcompute
bprotocolorchestrator
@@ -0,0 +1,43 @@
package legacy

import "github.com/bacalhau-project/bacalhau/pkg/models"

const (
	HeartbeatMessageType = "heartbeat"
)

// Heartbeat represents a heartbeat message from a specific node.
// It contains the node ID and the sequence number of the heartbeat
// which is monotonically increasing (reboots aside). We do not
// use timestamps on the client, we rely solely on the server-side
// time to avoid clock drift issues.
type Heartbeat struct {
	NodeID   string
	Sequence uint64
}

type RegisterRequest struct {
	Info models.NodeInfo
}

type RegisterResponse struct {
	Accepted bool
	Reason   string
}

type UpdateInfoRequest struct {
	Info models.NodeInfo
}

type UpdateInfoResponse struct {
	Accepted bool
	Reason   string
}

type UpdateResourcesRequest struct {
	NodeID            string
	AvailableCapacity models.Resources
	QueueUsedCapacity models.Resources
}

type UpdateResourcesResponse struct{}
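The comment on `Heartbeat` says the sequence number is monotonically increasing, reboots aside. A rough sketch of how a sender and a receiver might use that property follows. Only the `Heartbeat` struct is taken from the diff; `HeartbeatSender`, `SequenceTracker`, and their methods are hypothetical names invented for illustration, and resetting on re-registration is an assumed way to handle reboots, not something this commit is known to do.

```go
package main

import "sync"

// Heartbeat mirrors the struct shown in the diff.
type Heartbeat struct {
	NodeID   string
	Sequence uint64
}

// HeartbeatSender is a hypothetical client-side helper that stamps each
// heartbeat with the next sequence number. The counter lives in memory,
// so it restarts from 1 after a reboot.
type HeartbeatSender struct {
	mu     sync.Mutex
	nodeID string
	seq    uint64
}

func NewHeartbeatSender(nodeID string) *HeartbeatSender {
	return &HeartbeatSender{nodeID: nodeID}
}

// Next returns a heartbeat with a sequence one higher than the previous one.
func (s *HeartbeatSender) Next() Heartbeat {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.seq++
	return Heartbeat{NodeID: s.nodeID, Sequence: s.seq}
}

// SequenceTracker is a hypothetical server-side helper that remembers the
// highest sequence seen per node and drops stale or duplicate heartbeats.
type SequenceTracker struct {
	mu   sync.Mutex
	last map[string]uint64
}

func NewSequenceTracker() *SequenceTracker {
	return &SequenceTracker{last: make(map[string]uint64)}
}

// Accept reports whether hb is newer than anything seen from its node so far.
func (t *SequenceTracker) Accept(hb Heartbeat) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	if hb.Sequence <= t.last[hb.NodeID] {
		return false // duplicate or out-of-order heartbeat
	}
	t.last[hb.NodeID] = hb.Sequence
	return true
}

// Reset forgets a node's last sequence. Assumption: this would be called
// when a node re-registers, so a rebooted node can start counting again.
func (t *SequenceTracker) Reset(nodeID string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.last, nodeID)
}
```

Because the receiver compares sequence numbers only and never reads a client timestamp, liveness decisions can rest entirely on the time at which the server receives each heartbeat, which matches the clock drift rationale in the comment.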