Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: state sync #1026

Open
wants to merge 32 commits into
base: v1.5-dev
Choose a base branch
from
Open

feat!: state sync #1026

wants to merge 32 commits into from

Conversation

lklimek
Copy link
Collaborator

@lklimek lklimek commented Jan 15, 2025

Issue being fixed or feature implemented

What was done?

  • added new call, FinalizeSnapshot, that is sent to ABCI App once state sync is finished
  • fixed bug in QuorumVerify call
  • updated e2e tests to test state sync
  • removed unused config settings

How Has This Been Tested?

Updated e2e tests, updated unit test.

Breaking Changes

  • removed unused configuration settings from [statesync] section: trust-height, trust-hash, trust-period
  • added new ABCI method FinalizeSnapshot
  • bumped ABCI protocol version to 1.3.0

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation

For repository code-owners and collaborators only

  • I have assigned this pull request to a milestone

Summary by CodeRabbit

  • New Features

    • Added snapshot finalization functionality to the ABCI interface.
    • Enhanced state synchronization process with light block validation.
    • Introduced new methods for finalizing snapshots across multiple components.
  • Improvements

    • Simplified state sync configuration by removing trust parameters.
    • Updated ABCI protocol specification with new request and response types.
    • Improved error handling during state synchronization.
  • Version Update

    • ABCI library version upgraded from 1.2.0 to 1.3.0.
  • Dependency Updates

    • Updated several Go module dependencies to newer versions.

Copy link
Contributor

coderabbitai bot commented Jan 15, 2025

Walkthrough

This pull request introduces a comprehensive enhancement to the state synchronization process in the Tenderdash blockchain framework. The changes primarily focus on adding a new FinalizeSnapshot method across multiple components, removing trust-related configuration parameters, and updating the ABCI (Application Blockchain Interface) protocol. The modifications aim to streamline snapshot synchronization, provide more robust validation, and simplify the configuration process for state sync.

Changes

File Change Summary
abci/client/* Added FinalizeSnapshot method to various client implementations (gRPC, routed, socket, proxy)
abci/types/application.go Added FinalizeSnapshot to StateSyncer interface and BaseApplication
abci/types/messages.go Added ToRequestFinalizeSnapshot function for converting requests
abci/types/mocks/application.go Added mock implementation for FinalizeSnapshot
config/config.go Removed trust-related fields (TrustHeight, TrustHash, TrustPeriod)
config/toml.go Removed trust-related configuration parameters
internal/statesync/* Significant refactoring of state sync process, added LightBlock method, modified syncer logic
proto/tendermint/abci/types.proto Added RequestFinalizeSnapshot and ResponseFinalizeSnapshot messages
spec/abci++/api.md Documented new request and response types
version/version.go Updated ABCI semantic version from 1.2.0 to 1.3.0
go.mod Updated various dependencies
test/e2e/* Updated tests to reflect new state synchronization changes

Sequence Diagram

sequenceDiagram
    participant App as ABCI Application
    participant Syncer as State Syncer
    participant Provider as State Provider

    Syncer->>Provider: Retrieve LightBlock
    Provider-->>Syncer: Return LightBlock
    Syncer->>App: FinalizeSnapshot(snapshot_block, genesis_block)
    App->>App: Validate Snapshot Integrity
    alt Validation Successful
        App-->>Syncer: ResponseFinalizeSnapshot (Success)
    else Validation Failed
        App-->>Syncer: ResponseFinalizeSnapshot (Error)
        Syncer->>Syncer: Restart Process
    end
Loading

Poem

🐰 Snapshot Sync, a Rabbit's Tale
Through light blocks and genesis' trail,
Trust parameters fade away,
A new sync method comes to play!
Finalize with grace, my friend! 🌟

✨ Finishing Touches
  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@lklimek
Copy link
Collaborator Author

lklimek commented Jan 20, 2025

@coderabbitai review

Copy link
Contributor

coderabbitai bot commented Jan 20, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
abci/types/application.go (1)

102-104: Consider adding documentation for the base implementation.

While the empty implementation is appropriate for a base class, adding a comment explaining that this is a default no-op implementation would improve clarity.

+// FinalizeSnapshot provides a no-op implementation for the StateSyncer interface
 func (BaseApplication) FinalizeSnapshot(_ context.Context, _req *RequestFinalizeSnapshot) (*ResponseFinalizeSnapshot, error) {
   return &ResponseFinalizeSnapshot{}, nil
 }
proto/tendermint/abci/types.proto (1)

642-644: Consider adding status field to ResponseFinalizeSnapshot.

The empty response provides no feedback mechanism for the application to indicate success/failure of the finalization process.

Consider adding a status field:

message ResponseFinalizeSnapshot {
+  bool success = 1;  // Indicates if snapshot finalization was successful
+  string error = 2;  // Error message if finalization failed
}
abci/example/kvstore/kvstore.go (1)

583-598: Implementation looks good, but error messages could be more descriptive.

The FinalizeSnapshot implementation correctly verifies both height and app hash, with proper thread safety using mutex. However, the error messages could be more informative.

Consider enhancing the error messages with more details:

-    return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot height mismatch")
+    return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot height mismatch: expected %d, got %d", 
+        app.LastCommittedState.GetHeight(), req.LightBlock.SignedHeader.Header.Height)

-    return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot apphash mismatch")
+    return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot apphash mismatch: expected %x, got %x",
+        app.LastCommittedState.GetAppHash(), req.LightBlock.SignedHeader.Header.AppHash)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a6890a7 and e84e0be.

⛔ Files ignored due to path filters (2)
  • abci/types/types.pb.go is excluded by !**/*.pb.go
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (30)
  • abci/client/grpc_client.go (1 hunks)
  • abci/client/mocks/client.go (1 hunks)
  • abci/client/routed_client.go (1 hunks)
  • abci/client/socket_client.go (1 hunks)
  • abci/example/kvstore/kvstore.go (1 hunks)
  • abci/types/application.go (2 hunks)
  • abci/types/messages.go (1 hunks)
  • abci/types/mocks/application.go (1 hunks)
  • config/config.go (0 hunks)
  • config/toml.go (0 hunks)
  • go.mod (3 hunks)
  • internal/proxy/client.go (1 hunks)
  • internal/statesync/chunks.go (2 hunks)
  • internal/statesync/mocks/stateprovider.go (1 hunks)
  • internal/statesync/reactor.go (2 hunks)
  • internal/statesync/reactor_test.go (1 hunks)
  • internal/statesync/stateprovider.go (5 hunks)
  • internal/statesync/syncer.go (4 hunks)
  • internal/statesync/syncer_test.go (4 hunks)
  • light/client.go (1 hunks)
  • light/client_test.go (1 hunks)
  • proto/tendermint/abci/types.proto (5 hunks)
  • spec/abci++/api.md (7 hunks)
  • test/e2e/networks/rotate.toml (2 hunks)
  • test/e2e/pkg/mockcoreserver/core_server.go (1 hunks)
  • test/e2e/pkg/mockcoreserver/methods.go (1 hunks)
  • test/e2e/runner/setup.go (0 hunks)
  • test/e2e/runner/start.go (1 hunks)
  • types/quorum_sign_data.go (1 hunks)
  • version/version.go (1 hunks)
💤 Files with no reviewable changes (3)
  • test/e2e/runner/setup.go
  • config/toml.go
  • config/config.go
✅ Files skipped from review due to trivial changes (2)
  • light/client_test.go
  • light/client.go
🧰 Additional context used
🪛 Markdownlint (0.37.0)
spec/abci++/api.md

431-431: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)


439-439: null
Link fragments should be valid

(MD051, link-fragments)


945-945: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: tests (02)
  • GitHub Check: e2e-test (rotate)
  • GitHub Check: e2e-test (dashcore)
🔇 Additional comments (31)
version/version.go (1)

14-14: ABCI version bump aligns with new functionality.

The version update from 1.2.0 to 1.3.0 correctly follows semantic versioning for introducing the new FinalizeSnapshot method.

Let's verify this version is consistently used across the codebase:

✅ Verification successful

ABCI version update is consistently implemented.

The version bump from 1.2.0 to 1.3.0 is properly defined and consistently used across the codebase through the centralized version.ABCIVersion constant. No conflicting version strings were found.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any hardcoded ABCI version strings
rg -g '!version/version.go' '"1\.[0-9]\.[0-9]"' 

Length of output: 48492


Script:

#!/bin/bash
# Search for ABCI version references more specifically
rg -i "abci.*version|version.*abci"

# Also check for any proto definitions that might reference ABCI versions
fd -e proto | xargs rg -i "version|abci"

Length of output: 37694

test/e2e/runner/start.go (1)

91-91: Simplified error handling looks good.

The removal of intermediate block variables and direct error checking from waitForHeight aligns with the removal of trust parameters from state sync configuration.

test/e2e/pkg/mockcoreserver/methods.go (1)

68-68: Correctly includes QuorumHash in unmarshal parameters.

The addition of &cmd.QuorumHash ensures proper unmarshalling of the quorum verification request.

abci/types/messages.go (1)

90-94: New ToRequestFinalizeSnapshot function follows established patterns.

The implementation correctly converts RequestFinalizeSnapshot to the generic Request type, maintaining consistency with other message conversion functions in the file.

abci/types/application.go (1)

19-20: LGTM! Well-documented interface addition.

The new FinalizeSnapshot method in the StateSyncer interface is clearly documented and aligns with the state sync enhancement objectives.

test/e2e/pkg/mockcoreserver/core_server.go (1)

149-150: LGTM! Good simplification of return value.

The change simplifies the code by directly returning the boolean value instead of wrapping it in a struct, while maintaining the same functionality.

internal/statesync/mocks/stateprovider.go (1)

148-205: LGTM! Correctly generated mock implementation.

The new LightBlock mock method follows the established patterns for error handling and return values, properly supporting the state sync enhancement.

types/quorum_sign_data.go (1)

164-164: Verify the impact of changing the hash size constant.

The validation now uses crypto.QuorumHashSize instead of crypto.DefaultHashSize. Please ensure:

  1. The new constant exists and has the correct value
  2. All existing quorum hashes comply with the new size requirement
✅ Verification successful

QuorumHashSize constant is correctly defined and used consistently

The validation now uses the semantically correct crypto.QuorumHashSize constant which equals crypto.DefaultHashSize (32 bytes). The codebase consistently validates quorum hashes against this size through multiple validation checks, and test files demonstrate correct usage of the constant.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the existence and usage of QuorumHashSize

# Check if QuorumHashSize is defined
rg "QuorumHashSize.*=.*\d+" -A 2

# Find all quorum hash declarations to verify size compatibility
ast-grep --pattern 'QuorumHash []byte'

Length of output: 75


Script:

#!/bin/bash
# Broader search for hash size constants and usages

# Find QuorumHashSize constant definition and usage
rg "QuorumHashSize" -B 2 -A 2

# Find DefaultHashSize for comparison
rg "DefaultHashSize" -B 2 -A 2

# Find quorum hash related code with more context
rg "QuorumHash" -B 2 -A 2

# Look specifically in crypto package
fd -t f . "crypto" --exec cat {}

Length of output: 80387

internal/proxy/client.go (1)

215-218: LGTM! Implementation follows established patterns.

The FinalizeSnapshot method is well-implemented, following the same patterns as other methods in the file:

  • Consistent error handling
  • Proper timing metrics
  • Correct delegation to the underlying client
abci/client/grpc_client.go (1)

258-260: LGTM! Implementation follows established patterns.

The FinalizeSnapshot method is well-implemented, following the same patterns as other methods in the file:

  • Proper request conversion using ToRequestFinalizeSnapshot
  • Consistent use of grpc.WaitForReady(true) option
  • Correct error handling
internal/statesync/chunks.go (2)

136-144: LGTM! Improved handling of empty chunks.

The changes improve the robustness of chunk handling:

  • Maintains essential nil check for the chunk itself
  • Safely handles empty chunk content by replacing nil with empty byte slice
  • Prevents potential nil pointer dereference while allowing valid empty chunks

164-164: LGTM! Consistent use of modified data.

Correctly uses the potentially modified data variable for writing, ensuring consistency with the empty chunk handling changes.

abci/client/routed_client.go (1)

354-357: LGTM! Implementation follows established patterns.

The FinalizeSnapshot method is well-implemented, following the same patterns as other methods in the file:

  • Consistent use of delegate pattern
  • Proper type assertion for response
  • Correct error handling
internal/statesync/stateprovider.go (3)

43-44: LGTM: Well-structured interface addition

The new LightBlock method follows the existing interface pattern and is properly documented.


126-131: LGTM: Thread-safe implementation of LightBlock

The implementation correctly reuses the existing verifyLightBlockAtHeight method and maintains thread safety through proper mutex usage.


271-276: LGTM: Consistent implementation across providers

The P2P implementation maintains consistency with the RPC implementation, ensuring proper thread safety and error handling.

internal/statesync/syncer.go (2)

353-354: LGTM: Enhanced sync process with light blocks

The change improves the sync process by retrieving complete light blocks instead of just commits, providing more comprehensive block data for verification.


581-599: LGTM: Robust snapshot finalization implementation

The implementation properly handles snapshot finalization by sending light blocks to the ABCI app and includes appropriate logging.

abci/types/mocks/application.go (1)

Line range hint 261-422: LGTM! The new FinalizeSnapshot mock implementation follows established patterns.

The implementation correctly handles:

  • Error cases with appropriate panic messages
  • Type assertions and conversions
  • Method signatures matching the interface
  • Supporting mock types for testing
internal/statesync/reactor_test.go (1)

239-240: LGTM! Mock expectation for FinalizeSnapshot is correctly configured.

The mock is set up to be called exactly once and return an empty response, which is appropriate for this test scenario.

internal/statesync/reactor.go (1)

239-240: LGTM! Trust-related configuration has been cleanly removed.

The changes consistently remove trust-related parameters from both P2P and RPC state provider initialization while maintaining proper logging.

Also applies to: 252-253, 261-261

abci/client/mocks/client.go (1)

365-422: LGTM! The new FinalizeSnapshot mock implementation follows established patterns.

The implementation correctly handles:

  • Error cases with appropriate panic messages
  • Type assertions and conversions
  • Method signatures matching the interface
  • Supporting mock types for testing
test/e2e/networks/rotate.toml (1)

155-155: LGTM! State sync configuration changes look good.

The changes consistently configure p2p-based state sync for both validator06 and full01 nodes, aligning with the removal of trust-based parameters.

Also applies to: 195-195

abci/client/socket_client.go (1)

360-366: LGTM! Clean implementation of FinalizeSnapshot method.

The implementation follows the established pattern for ABCI methods in the socket client, with proper error handling and request/response flow.

go.mod (1)

12-13: Verify compatibility of updated dependencies.

Multiple dependency updates have been made. Please ensure:

  1. The new version of dashd-go (v0.26.1-0.20250113154619-015ef93ad36e) is stable and compatible
  2. The btcec/v2 v2.2.0 update doesn't introduce breaking changes
  3. The upgrade to stable versions (btclog v1.0.0) is intentional

Also applies to: 83-83, 111-111

✅ Verification successful

Dependencies update appears safe to proceed

The verification shows these are indirect dependencies with no direct usage in the codebase. The version updates follow semantic versioning principles, suggesting backward compatibility. While breaking changes exist in dashd-go, they don't appear to affect this codebase as there's no direct usage of the updated components.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for any known issues or breaking changes in the updated dependencies

# Check dashd-go changelog/releases
gh api repos/dashpay/dashd-go/releases | jq -r '.[].body' | grep -i "breaking change\|security"

# Check btcec changelog
gh api repos/dashpay/dashd-go/contents/btcec/v2/CHANGELOG.md | jq -r '.content' | base64 -d | grep -A 5 "## v2.2.0"

Length of output: 356


Script:

#!/bin/bash
# Search for import patterns and potential breaking changes
echo "=== Checking import patterns ==="
rg -l "github.com/dashpay/dashd-go/btcec/v2" --type go

echo -e "\n=== Checking documentation ==="
fd -e md -e txt . | xargs rg -l "dashd-go|btcec" || true

echo -e "\n=== Checking recent changes ==="
git log -n 5 --oneline -- go.mod

echo -e "\n=== Checking current usages ==="
rg "btcec\..*" --type go -A 2

Length of output: 454


Script:

#!/bin/bash
# Check for indirect dependencies and upgrade guides
echo "=== Checking go.mod for indirect dependencies ==="
cat go.mod | grep -A 10 "require"

echo -e "\n=== Looking for upgrade documentation ==="
fd -e md . | xargs rg -l "upgrade|migration|breaking|changelog" || true

echo -e "\n=== Checking dashd-go main repository ==="
gh api repos/dashpay/dashd-go/contents/docs | jq -r '.[].name' || true
gh api repos/dashpay/dashd-go/contents/UPGRADING.md | jq -r '.content' | base64 -d || true

Length of output: 4572

proto/tendermint/abci/types.proto (2)

187-193: LGTM! Well-defined RequestFinalizeSnapshot message.

The message includes the essential light_block field needed for state reconstruction after snapshot sync.


908-908: LGTM! Service definition properly updated.

The FinalizeSnapshot RPC method is correctly added to the ABCIApplication service.

internal/statesync/syncer_test.go (4)

103-106: LGTM: Commit structure properly includes Height field.

The commit structure now correctly includes the Height field, which is essential for proper block validation.


108-115: LGTM: Well-structured LightBlock implementation.

The LightBlock structure properly encapsulates both the Commit and Header, following the standard blockchain block structure pattern.


131-132: LGTM: StateProvider mock updated to use LightBlock.

The mock has been correctly updated to return a LightBlock instead of just the commit, aligning with the new state sync protocol.


268-270: LGTM: FinalizeSnapshot mock properly configured.

The FinalizeSnapshot mock is correctly set up with proper expectations and return values.

spec/abci++/api.md Outdated Show resolved Hide resolved
@lklimek lklimek added this to the v1.5 milestone Jan 20, 2025
@lklimek lklimek marked this pull request as ready for review January 20, 2025 08:55
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
proto/tendermint/abci/types.proto (2)

186-196: Fix documentation formatting.

The documentation has some minor formatting issues:

  1. Missing period after "e.g"
  2. Missing blank line before heading
 // RequestFinalizeSnapshot is called by Tenderdash after successfully applying all snapshot chunks, eg.
+// RequestFinalizeSnapshot is called by Tenderdash after successfully applying all snapshot chunks, e.g.,
 // when the ABCI application returned `ResponseApplySnapshotChunk` with `Result = ACCEPT`.
 // It includes the light block committed at the synced height, which Tenderdash uses to reconstruct its own state.
 // The application should validate the light block against its restored state.
+
 // If the application fails to validate the light block, it should return error in the response.
 // This is considered fatal, non-recoverable consensus failure and will cause Tenderdash to restart.

Line range hint 947-956: Fix documentation formatting.

Add a blank line before the heading.

 <a name="tendermint-abci-ResponseFinalizeSnapshot"></a>
+
 ### ResponseFinalizeSnapshot
 The response to a `RequestPrepareSnapshot` message.
abci/example/kvstore/kvstore.go (2)

583-600: Consider enhancing the logging for better observability.

The debug logging could be more detailed to help with troubleshooting. Consider adding logging for the validation steps and error cases.

 func (app *Application) FinalizeSnapshot(_ctx context.Context, req *abci.RequestFinalizeSnapshot) (*abci.ResponseFinalizeSnapshot, error) {
     app.mu.Lock()
     defer app.mu.Unlock()
 
+    app.logger.Debug("Starting snapshot finalization",
+        "expected_height", app.LastCommittedState.GetHeight(),
+        "expected_app_hash", app.LastCommittedState.GetAppHash(),
+        "snapshot_height", req.LightBlock.SignedHeader.Header.Height,
+        "snapshot_app_hash", req.LightBlock.SignedHeader.Header.AppHash)
+
     // we verify snapshot height and app hash, as there is no additional logic to be called here
     if app.LastCommittedState.GetHeight() != req.LightBlock.SignedHeader.Header.Height {
+        app.logger.Error("Snapshot height mismatch",
+            "expected", app.LastCommittedState.GetHeight(),
+            "got", req.LightBlock.SignedHeader.Header.Height)
         return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot height mismatch: expected %d, got %d",
             app.LastCommittedState.GetHeight(), req.LightBlock.SignedHeader.Header.Height)
     }
 
     if !app.LastCommittedState.GetAppHash().Equal(req.LightBlock.SignedHeader.Header.AppHash) {
+        app.logger.Error("Snapshot app hash mismatch",
+            "expected", app.LastCommittedState.GetAppHash(),
+            "got", req.LightBlock.SignedHeader.Header.AppHash)
         return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot apphash mismatch: expected %x, got %x",
             app.LastCommittedState.GetAppHash(), req.LightBlock.SignedHeader.Header.AppHash)
     }
 
     app.logger.Debug("FinalizeSnapshot finished successfully", "req", req)
     return &abci.ResponseFinalizeSnapshot{}, nil
 }

583-600: Well-integrated state sync implementation.

The FinalizeSnapshot method is well-integrated into the state sync workflow, following the same patterns as other ABCI methods (mutex locking, error handling, logging). It complements the existing snapshot-related methods (OfferSnapshot, ApplySnapshotChunk, etc.) to provide a complete state synchronization solution.

The implementation demonstrates good practices:

  1. Consistent error handling across all snapshot-related methods
  2. Thread-safety through mutex usage
  3. Clear separation of concerns between different snapshot operations
  4. Proper validation of snapshot metadata before finalization
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e84e0be and af79697.

⛔ Files ignored due to path filters (1)
  • abci/types/types.pb.go is excluded by !**/*.pb.go
📒 Files selected for processing (4)
  • abci/example/kvstore/kvstore.go (1 hunks)
  • abci/types/application.go (2 hunks)
  • proto/tendermint/abci/types.proto (5 hunks)
  • spec/abci++/api.md (7 hunks)
🧰 Additional context used
🪛 LanguageTool
spec/abci++/api.md

[duplication] ~431-~431: Possible typo: you repeated a word.
Context: ...abci-RequestFinalizeSnapshot"> ### RequestFinalizeSnapshot RequestFinalizeSnapshot is called by Tenderdash after successfu...

(ENGLISH_WORD_REPEAT_RULE)


[uncategorized] ~432-~432: The abbreviation “e.g.” (= for example) requires two periods.
Context: ...cessfully applying all snapshot chunks, eg. when the ABCI application returned `Res...

(E_G)

🪛 Markdownlint (0.37.0)
spec/abci++/api.md

431-431: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)


443-443: null
Link fragments should be valid

(MD051, link-fragments)


949-949: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)

🔇 Additional comments (5)
abci/types/application.go (2)

19-20: LGTM! Interface addition is well-documented.

The FinalizeSnapshot method is properly added to the StateSyncer interface with clear documentation.


102-105: LGTM! Base implementation provides good default behavior.

The no-op implementation in BaseApplication follows the pattern used for other methods.

proto/tendermint/abci/types.proto (1)

Line range hint 252-252: LGTM! Protocol changes are well-structured.

The new finalize_snapshot field is properly added to Request/Response messages and the ABCIApplication service.

Also applies to: 836-836, 1402-1402

spec/abci++/api.md (1)

431-444: Documentation needs enhancement.

The FinalizeSnapshot documentation should be expanded to include:

  1. When exactly Tenderdash calls this method in the state sync process
  2. What the application should do with the light block
  3. Error handling expectations
  4. Impact on the node's state

Consider adding a detailed section similar to other ABCI methods:

### RequestFinalizeSnapshot

#### Usage
- Called by Tenderdash after successfully applying all snapshot chunks
- The light block allows the application to verify the final state
- Applications should validate the light block against their restored state
- Any errors during finalization should trigger a state sync retry

#### When does Tenderdash call it?
Detailed sequence of events...
🧰 Tools
🪛 LanguageTool

[duplication] ~431-~431: Possible typo: you repeated a word.
Context: ...abci-RequestFinalizeSnapshot"> ### RequestFinalizeSnapshot RequestFinalizeSnapshot is called by Tenderdash after successfu...

(ENGLISH_WORD_REPEAT_RULE)


[uncategorized] ~432-~432: The abbreviation “e.g.” (= for example) requires two periods.
Context: ...cessfully applying all snapshot chunks, eg. when the ABCI application returned `Res...

(E_G)

🪛 Markdownlint (0.37.0)

431-431: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)


443-443: null
Link fragments should be valid

(MD051, link-fragments)

abci/example/kvstore/kvstore.go (1)

583-600: LGTM! The implementation looks solid.

The FinalizeSnapshot method correctly validates both the height and AppHash of the snapshot against the last committed state, with proper error handling and thread safety.

A few notable strengths of this implementation:

  1. Thread-safe with mutex locking
  2. Clear, descriptive error messages
  3. Proper debug logging
  4. Consistent with ABCI protocol requirements

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
proto/tendermint/abci/types.proto (1)

186-196: Documentation needs enhancement.

The documentation should be expanded to include:

  1. When exactly Tenderdash calls this method in the state sync process
  2. What the application should do with the light block
  3. Error handling expectations
  4. Impact on the node's state

Consider adding a detailed section similar to other ABCI methods:

### RequestFinalizeSnapshot

#### Usage
- Called by Tenderdash after successfully applying all snapshot chunks
- The light block allows the application to verify the final state
- Applications should validate the light block against their restored state
- Any errors during finalization should trigger a state sync retry

#### When does Tenderdash call it?
Detailed sequence of events...
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between af79697 and a9cf16f.

⛔ Files ignored due to path filters (1)
  • abci/types/types.pb.go is excluded by !**/*.pb.go
📒 Files selected for processing (2)
  • proto/tendermint/abci/types.proto (5 hunks)
  • spec/abci++/api.md (7 hunks)
🧰 Additional context used
🪛 Markdownlint (0.37.0)
spec/abci++/api.md

431-431: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)


443-443: null
Link fragments should be valid

(MD051, link-fragments)


949-949: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)

⏰ Context from checks skipped due to timeout of 90000ms (12)
  • GitHub Check: test_apps
  • GitHub Check: test_abci_cli
  • GitHub Check: tests (05)
  • GitHub Check: tests (04)
  • GitHub Check: tests (03)
  • GitHub Check: tests (02)
  • GitHub Check: e2e-test (rotate)
  • GitHub Check: tests (01)
  • GitHub Check: Super linter
  • GitHub Check: tests (00)
  • GitHub Check: e2e-test (dashcore)
  • GitHub Check: golangci-lint
🔇 Additional comments (5)
proto/tendermint/abci/types.proto (3)

35-35: LGTM! Field addition follows protobuf best practices.

The finalize_snapshot field is properly added to the Request message with an appropriate field number.


529-529: LGTM! Field addition follows protobuf best practices.

The finalize_snapshot field is properly added to the Response message with an appropriate field number.


Line range hint 1402-1402: LGTM! Service definition is properly updated.

The FinalizeSnapshot RPC is correctly added to the ABCIApplication service.

spec/abci++/api.md (2)

429-449: Documentation needs enhancement.

The documentation feedback from the proto file review applies here as well.

🧰 Tools
🪛 Markdownlint (0.37.0)

431-431: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)


443-443: null
Link fragments should be valid

(MD051, link-fragments)


21-21: LGTM! Table of contents is properly updated.

The table of contents correctly reflects the new FinalizeSnapshot messages.

Also applies to: 39-39

proto/tendermint/abci/types.proto Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (7)
spec/abci++/api.md (2)

429-450: Fix markdown formatting.

Add blank lines around headings to comply with markdown standards:

 <a name="tendermint-abci-RequestFinalizeSnapshot"></a>
+
 ### RequestFinalizeSnapshot
+
 RequestFinalizeSnapshot is called by Tenderdash...
🧰 Tools
🪛 LanguageTool

[duplication] ~431-~431: Possible typo: you repeated a word.
Context: ...abci-RequestFinalizeSnapshot"> ### RequestFinalizeSnapshot RequestFinalizeSnapshot is called by Tenderdash after successfu...

(ENGLISH_WORD_REPEAT_RULE)

🪛 Markdownlint (0.37.0)

431-431: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)


443-443: null
Link fragments should be valid

(MD051, link-fragments)


444-444: null
Link fragments should be valid

(MD051, link-fragments)


948-957: Fix markdown formatting and description.

  1. Add blank lines around headings
  2. Fix the response description:
 <a name="tendermint-abci-ResponseFinalizeSnapshot"></a>
+
 ### ResponseFinalizeSnapshot
+
-The response to a `RequestPrepareSnapshot` message.
+The response to a `RequestFinalizeSnapshot` message.
🧰 Tools
🪛 Markdownlint (0.37.0)

950-950: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)

light/example_test.go (3)

117-131: Consider making the test more configurable and reducing log verbosity.

  • The hardcoded chain ID "dashmate_local_37" makes the test environment-dependent. Consider making it configurable through environment variables or test parameters.
  • Using LogLevelTrace in tests can generate excessive output. Consider using LogLevelInfo unless detailed logging is specifically needed for debugging.
-const CHAINID = "dashmate_local_37"
+const defaultChainID = "dashmate_local_37"
+
+func getTestChainID() string {
+    if chainID := os.Getenv("TEST_CHAIN_ID"); chainID != "" {
+        return chainID
+    }
+    return defaultChainID
+}

-logger, err := log.NewDefaultLogger(log.LogFormatPlain, log.LogLevelTrace)
+logger, err := log.NewDefaultLogger(log.LogFormatPlain, log.LogLevelInfo)

133-136: Remove commented code.

If the memory app setup is no longer needed, the commented code should be removed rather than committed. If it's kept for reference, please add a comment explaining why the app setup is different from TestExampleClient.


159-201: Reduce code duplication and improve documentation.

The test shares significant verification logic with TestExampleClient. Consider:

  1. Extracting common verification logic into a helper function.
  2. Documenting why NewClientAtHeight is used with height 1 instead of NewClient.
+// verifyLightBlocks performs common light block verification steps
+func verifyLightBlocks(t *testing.T, ctx context.Context, c *light.Client) {
+    // verify the block at height 3
+    _, err := c.VerifyLightBlockAtHeight(ctx, 3, time.Now())
+    require.NoError(t, err)
+
+    // retrieve light block at height 3
+    _, err = c.TrustedLightBlock(3)
+    require.NoError(t, err)
+
+    // update to the latest height
+    lb, err := c.Update(ctx, time.Now())
+    require.NoError(t, err)
+    
+    return lb
+}

// In both test functions:
-// verify the block at height 3
-_, err = c.VerifyLightBlockAtHeight(ctx, 3, time.Now())
-if err != nil {
-    t.Fatal(err)
-}
-// ... rest of the verification logic
+lb := verifyLightBlocks(t, ctx, c)
internal/statesync/syncer.go (2)

375-382: Enhance error message for initial height validation.

While the validation logic is correct, the error message could be more descriptive to help with debugging.

Consider this improvement:

-       return sm.State{}, nil, fmt.Errorf("initial genesis height %d is invalid", state.InitialHeight)
+       return sm.State{}, nil, fmt.Errorf("invalid initial genesis height %d: must be greater than 0", state.InitialHeight)

590-614: Consider adding debug logging and nil checks.

The implementation is solid, but could benefit from additional safeguards and debugging capabilities.

Consider these improvements:

  1. Add nil checks for block parameters:
 func (s *syncer) finalizeSnapshot(ctx context.Context, snapshot *snapshot, genesisBlock *types.LightBlock, snapshotBlock *types.LightBlock) error {
+       if genesisBlock == nil || snapshotBlock == nil {
+           return fmt.Errorf("nil block provided: genesis=%v snapshot=%v", genesisBlock == nil, snapshotBlock == nil)
+       }
  1. Add debug logging for proto conversion steps:
+       s.logger.Debug("Converting snapshot block to proto", "height", snapshotBlock.Height)
        snapshotBlockProto, err := snapshotBlock.ToProto()
        if err != nil {
                return fmt.Errorf("failed to convert snapshot block %s to proto: %w", snapshotBlock.String(), err)
        }
+       s.logger.Debug("Converting genesis block to proto", "height", genesisBlock.Height)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bc399c8 and 9d3dae4.

⛔ Files ignored due to path filters (2)
  • abci/types/types.pb.go is excluded by !**/*.pb.go
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (6)
  • go.mod (3 hunks)
  • internal/statesync/syncer.go (4 hunks)
  • light/example_test.go (1 hunks)
  • proto/tendermint/abci/types.proto (5 hunks)
  • spec/abci++/api.md (7 hunks)
  • test/e2e/pkg/mockcoreserver/core_server.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/e2e/pkg/mockcoreserver/core_server.go
  • go.mod
🧰 Additional context used
📓 Learnings (1)
proto/tendermint/abci/types.proto (1)
Learnt from: lklimek
PR: dashpay/tenderdash#1026
File: proto/tendermint/abci/types.proto:645-648
Timestamp: 2025-01-20T11:38:49.099Z
Learning: In ABCI protocol, FinalizeSnapshot call should return error directly for validation failures rather than using error fields in ResponseFinalizeSnapshot message, as these failures are considered fatal consensus failures that should cause Tenderdash to restart.
🪛 LanguageTool
spec/abci++/api.md

[duplication] ~431-~431: Possible typo: you repeated a word.
Context: ...abci-RequestFinalizeSnapshot"> ### RequestFinalizeSnapshot RequestFinalizeSnapshot is called by Tenderdash after successfu...

(ENGLISH_WORD_REPEAT_RULE)

🪛 Markdownlint (0.37.0)
spec/abci++/api.md

431-431: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)


443-443: null
Link fragments should be valid

(MD051, link-fragments)


444-444: null
Link fragments should be valid

(MD051, link-fragments)


950-950: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)

🔇 Additional comments (8)
proto/tendermint/abci/types.proto (4)

35-35: LGTM!

The field addition to the Request message is correctly implemented with proper field numbering and placement.


186-198: LGTM! Well-documented message definition.

The RequestFinalizeSnapshot message is well-structured with:

  • Clear documentation of when it's called
  • Proper error handling guidance
  • Essential fields for validation

531-531: LGTM!

The field addition to the Response message is correctly implemented with proper field numbering and placement.


647-650: LGTM! Empty response is appropriate.

The empty ResponseFinalizeSnapshot message aligns with the protocol design where validation failures are returned as errors directly from the ABCI call, as these are considered fatal consensus failures.

spec/abci++/api.md (1)

21-21: LGTM!

The table of contents is properly updated with the new message types.

Also applies to: 39-39

internal/statesync/syncer.go (3)

353-354: LGTM! Improved error handling and context management.

The transition from using commit to block provides more context and maintains better error handling patterns.

Also applies to: 400-400


384-387: LGTM! Well-structured finalization step.

The finalization step is properly implemented with good error handling practices.


590-614: Verify potential deadlock scenarios in finalization.

Let's verify there are no potential deadlock scenarios with the new finalization step, particularly around mutex usage.

✅ Verification successful

No deadlock risks identified in snapshot finalization

The finalization step is executed without holding any locks and the overall locking pattern in the state sync flow is correct. The code properly manages lock acquisition/release, uses deferred cleanup, and maintains clear lock boundaries during ABCI interactions.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for potential deadlock scenarios in mutex usage patterns
ast-grep --pattern 'func \($_\) finalizeSnapshot\($_\) $_ {
  $$$
  s.mtx.$_($$$)
  $$$
}'

# Search for other mutex operations that might interact with finalization
rg -A 5 's\.mtx\.' | grep -v 'RLock\|RUnlock'

Length of output: 36204


Script:

#!/bin/bash
# Find where finalizeSnapshot is called from
ast-grep --pattern 'finalizeSnapshot($$$)'

# Look for the full context of the call
rg -B 10 'finalizeSnapshot\(' internal/statesync/syncer.go

Length of output: 928


Script:

#!/bin/bash
# Get the context of state sync operation and mutex usage
rg -B 5 -A 5 's\.mtx\.' internal/statesync/syncer.go

# Get the broader context of where finalization happens
rg -B 20 -A 10 'finalizeSnapshot' internal/statesync/syncer.go

Length of output: 3335

light/example_test.go Outdated Show resolved Hide resolved
light/example_test.go Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
abci/example/kvstore/kvstore.go (2)

583-606: LGTM! Consider additional hash verification.

The implementation correctly verifies height and app hash consistency between the restored state and provided blocks. However, consider adding verification of the genesis block's app hash as well.

 func (app *Application) FinalizeSnapshot(_ctx context.Context, req *abci.RequestFinalizeSnapshot) (*abci.ResponseFinalizeSnapshot, error) {
     app.mu.Lock()
     defer app.mu.Unlock()
 
     // we only run some verifications here
 
     if app.LastCommittedState.GetHeight() != req.SnapshotBlock.SignedHeader.Header.Height {
         return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot height mismatch: expected %d, got %d",
             app.LastCommittedState.GetHeight(), req.SnapshotBlock.SignedHeader.Header.Height)
     }
 
     if !app.LastCommittedState.GetAppHash().Equal(req.SnapshotBlock.SignedHeader.Header.AppHash) {
         return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot apphash mismatch: expected %x, got %x",
             app.LastCommittedState.GetAppHash(), req.SnapshotBlock.SignedHeader.Header.AppHash)
     }
 
     if app.initialHeight != req.GenesisBlock.SignedHeader.Header.Height {
         return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("genesis height mismatch: expected %d, got %d",
             app.initialHeight, req.GenesisBlock.SignedHeader.Header.Height)
     }
+
+    // Verify genesis block app hash
+    genesisAppHash := make([]byte, crypto.DefaultAppHashSize)
+    if !bytes.Equal(req.GenesisBlock.SignedHeader.Header.AppHash, genesisAppHash) {
+        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("genesis apphash mismatch: expected %x, got %x",
+            genesisAppHash, req.GenesisBlock.SignedHeader.Header.AppHash)
+    }
 
     app.logger.Debug("FinalizeSnapshot finished successfully", "req", req)
     return &abci.ResponseFinalizeSnapshot{}, nil
 }

589-602: Enhance error message format for better debugging.

The error messages could be more descriptive by including additional context.

-        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot height mismatch: expected %d, got %d",
+        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot height mismatch during state sync finalization: expected %d, got %d",
             app.LastCommittedState.GetHeight(), req.SnapshotBlock.SignedHeader.Header.Height)
     }
 
     if !app.LastCommittedState.GetAppHash().Equal(req.SnapshotBlock.SignedHeader.Header.AppHash) {
-        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot apphash mismatch: expected %x, got %x",
+        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot apphash mismatch during state sync finalization: expected %x, got %x",
             app.LastCommittedState.GetAppHash(), req.SnapshotBlock.SignedHeader.Header.AppHash)
     }
 
     if app.initialHeight != req.GenesisBlock.SignedHeader.Header.Height {
-        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("genesis height mismatch: expected %d, got %d",
+        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("genesis height mismatch during state sync finalization: expected %d, got %d",
             app.initialHeight, req.GenesisBlock.SignedHeader.Header.Height)
     }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9d3dae4 and 94c5f69.

📒 Files selected for processing (2)
  • abci/example/kvstore/kvstore.go (1 hunks)
  • internal/statesync/syncer_test.go (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: e2e-test (rotate)
  • GitHub Check: e2e-test (dashcore)
🔇 Additional comments (5)
internal/statesync/syncer_test.go (5)

86-86: LGTM: Proper initialization of InitialHeight.

The addition of InitialHeight: 1 ensures complete state initialization for testing.


103-115: LGTM: Well-structured block representation.

The changes properly implement the block structure with consistent height values across commit and header. The encapsulation of SignedHeader within LightBlock follows the correct blockchain data structure pattern.


131-132: LGTM: Proper mock setup for LightBlock.

The stateProvider mock is correctly configured to return the new LightBlock structure with matching heights.


268-270: LGTM: Proper mock setup for FinalizeSnapshot.

The new FinalizeSnapshot mock is correctly configured with appropriate expectations and return values, aligning with the state sync feature implementation.


633-634: LGTM: Proper chunk initialization and queueing.

The chunk handling code correctly initializes and enqueues the chunk ID for processing.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (5)
internal/statesync/syncer.go (3)

354-355: Improve error handling for light block retrieval.

The error handling for light block retrieval could be more specific about the failure reason.

-s.logger.Info("failed to get and verify light block. Dropping snapshot and trying again",
+s.logger.Error("failed to get and verify light block. Dropping snapshot and trying again",
     "err", err, "height", snapshot.Height)
+    "witnesses_remaining", s.stateProvider.WitnessesLeft())

Also applies to: 363-363, 365-365


380-384: Consider caching the genesis block.

The genesis block is unlikely to change during sync. Consider caching it to avoid repeated retrievals.

+// Add to syncer struct
+type syncer struct {
+    // ... existing fields ...
+    genesisBlock *types.LightBlock
+}

+// Add getter method
+func (s *syncer) getGenesisBlock(ctx context.Context, height uint64) (*types.LightBlock, error) {
+    if s.genesisBlock != nil {
+        return s.genesisBlock, nil
+    }
+    block, err := s.getStateProvider().LightBlock(ctx, height)
+    if err != nil {
+        return nil, err
+    }
+    s.genesisBlock = block
+    return block, nil
+}

592-616: Add metrics for snapshot finalization.

The new finalizeSnapshot function should include metrics to track finalization success/failure rates.

 func (s *syncer) finalizeSnapshot(ctx context.Context, snapshot *snapshot, genesisBlock *types.LightBlock, snapshotBlock *types.LightBlock) error {
+    start := time.Now()
+    defer func() {
+        s.metrics.SnapshotFinalizationTime.Set(time.Since(start).Seconds())
+    }()
     // ... existing code ...
     _, err = s.conn.FinalizeSnapshot(ctx, &abci.RequestFinalizeSnapshot{
         SnapshotBlock: snapshotBlockProto,
         GenesisBlock:  genesisBlockProto,
     })
+    if err != nil {
+        s.metrics.SnapshotFinalizationFailures.Inc()
+        return err
+    }
+    s.metrics.SnapshotFinalizationSuccess.Inc()
     return err
 }
libs/math/safemath.go (1)

261-308: Extend Max and Min functions to support all Integer types

The Max and Min functions may panic with "unsupported type" when used with custom integer types that satisfy the Integer interface but are not explicitly handled in the switch cases.

Consider modifying the switch statements to handle all types that satisfy the Integer interface, possibly using type assertions or reflection to generalize the implementation. Alternatively, document this limitation to inform users about the supported types.

libs/math/safemath_test.go (1)

88-191: Enhance test coverage for custom integer types in TestSafeConvert

The TestSafeConvert function provides thorough test cases for standard integer types. However, it may be beneficial to include test cases for custom types that satisfy the Integer interface to ensure that SafeConvert handles them correctly.

Consider adding test cases with custom integer types to validate the function's compatibility with all types that implement the Integer interface.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 94c5f69 and 81367da.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (4)
  • go.mod (3 hunks)
  • internal/statesync/syncer.go (6 hunks)
  • libs/math/safemath.go (3 hunks)
  • libs/math/safemath_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • go.mod
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: e2e-test (rotate)
  • GitHub Check: e2e-test (dashcore)
🔇 Additional comments (5)
internal/statesync/syncer.go (5)

19-19: LGTM: Import addition for math package.

The addition of the tmmath import is appropriate for the new mathematical operations used in the code.


376-378: LGTM: Initial height validation.

Good addition of initial height validation to prevent invalid genesis states.


386-389: LGTM: Snapshot finalization implementation.

Good addition of snapshot finalization with proper error handling and context propagation.


402-402: LGTM: Updated return value.

Correctly updated to return the commit from the light block.


219-219: Consider using type conversion function for metrics.

Using MustConvertInt64 for metrics could panic. Consider using a safer conversion method or handling potential errors.

-s.lastSyncedSnapshotHeight = tmmath.MustConvertInt64(snapshot.Height)
+height, err := tmmath.SafeConvertInt64(snapshot.Height)
+if err != nil {
+    return sm.State{}, nil, fmt.Errorf("invalid snapshot height: %w", err)
+}
+s.lastSyncedSnapshotHeight = height

libs/math/safemath.go Show resolved Hide resolved
libs/math/safemath.go Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (5)
abci/example/kvstore/kvstore.go (1)

589-602: Consider standardizing error handling.

To improve error handling consistency, consider:

  1. Defining error constants for each validation case
  2. Using a consistent error format across all validations

Example implementation:

+const (
+    errSnapshotHeightMismatch   = "snapshot height mismatch during state sync finalization: expected %d, got %d"
+    errSnapshotAppHashMismatch  = "snapshot apphash mismatch during state sync finalization: expected %x, got %x"
+    errGenesisHeightMismatch    = "genesis height mismatch during state sync finalization: expected %d, got %d"
+)

 func (app *Application) FinalizeSnapshot(_ctx context.Context, req *abci.RequestFinalizeSnapshot) (*abci.ResponseFinalizeSnapshot, error) {
     app.mu.Lock()
     defer app.mu.Unlock()

     if app.LastCommittedState.GetHeight() != req.SnapshotBlock.SignedHeader.Header.Height {
-        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot height mismatch during state sync finalization: expected %d, got %d",
+        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf(errSnapshotHeightMismatch,
             app.LastCommittedState.GetHeight(), req.SnapshotBlock.SignedHeader.Header.Height)
     }

     if !app.LastCommittedState.GetAppHash().Equal(req.SnapshotBlock.SignedHeader.Header.AppHash) {
-        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot apphash mismatch during state sync finalization: expected %x, got %x",
+        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf(errSnapshotAppHashMismatch,
             app.LastCommittedState.GetAppHash(), req.SnapshotBlock.SignedHeader.Header.AppHash)
     }

     if app.initialHeight != req.GenesisBlock.SignedHeader.Header.Height {
-        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("genesis height mismatch during state sync finalization: expected %d, got %d",
+        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf(errGenesisHeightMismatch,
             app.initialHeight, req.GenesisBlock.SignedHeader.Header.Height)
     }
libs/math/safemath.go (2)

99-118: Enhance function documentation.

The wrapper functions could benefit from more descriptive documentation. Consider adding:

  • Parameter descriptions
  • Return value descriptions
  • Example usage

Example improvement for SafeConvertUint64:

-// SafeConvertUint64 takes a int and checks if it overflows.
+// SafeConvertUint64 safely converts a value of type T to uint64.
+// Parameters:
+//   - a: The value to convert
+// Returns:
+//   - uint64: The converted value
+//   - error: ErrOverflow if the conversion would overflow
+// Example:
+//   val, err := SafeConvertUint64(int32(42))

262-309: Consider improving error handling in Max/Min functions.

While the implementation is thorough, the panic messages for unsupported types could be more informative.

Consider this improvement:

-		panic("unsupported type")
+		panic(fmt.Sprintf("unsupported integer type %T", max))
-		panic("unsupported type")
+		panic(fmt.Sprintf("unsupported integer type %T", T(0)))
internal/statesync/syncer.go (2)

219-219: Consider using type conversion function directly.

The MustConvertInt64 function will panic if the conversion fails. Consider using ConvertInt64 and handling the error explicitly for better error handling.

-s.lastSyncedSnapshotHeight = tmmath.MustConvertInt64(snapshot.Height)
+height, err := tmmath.ConvertInt64(snapshot.Height)
+if err != nil {
+    return sm.State{}, nil, fmt.Errorf("invalid snapshot height: %w", err)
+}
+s.lastSyncedSnapshotHeight = height

592-619: Consider adding timeout for finalization.

The finalizeSnapshot method should have a timeout to prevent hanging in case the ABCI app is unresponsive.

 func (s *syncer) finalizeSnapshot(ctx context.Context, snapshot *snapshot, genesisBlock *types.LightBlock, snapshotBlock *types.LightBlock) error {
+    // Add timeout for finalization
+    ctx, cancel := context.WithTimeout(ctx, 1*time.Minute)
+    defer cancel()
+
     s.logger.Info("Finalizing snapshot restoration",

Additionally, consider adding metrics for tracking finalization success/failure rates.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 81367da and 37f8b5f.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (4)
  • abci/example/kvstore/kvstore.go (1 hunks)
  • go.mod (3 hunks)
  • internal/statesync/syncer.go (6 hunks)
  • libs/math/safemath.go (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • go.mod
🧰 Additional context used
📓 Learnings (1)
libs/math/safemath.go (1)
Learnt from: lklimek
PR: dashpay/tenderdash#1026
File: libs/math/safemath.go:123-161
Timestamp: 2025-01-20T13:38:10.707Z
Learning: The `SafeConvert` function in `libs/math/safemath.go` correctly handles negative values during integer type conversion through two mechanisms: explicit checks for int64 to uint64 conversion, and the Min[U]() check that catches negative values when converting to unsigned types.
🔇 Additional comments (11)
abci/example/kvstore/kvstore.go (2)

583-606: LGTM! The FinalizeSnapshot implementation looks solid.

The method correctly implements state sync finalization with proper validation checks, error handling, and thread safety:

  • Uses mutex for thread safety
  • Validates snapshot height and app hash consistency
  • Validates genesis block height
  • Provides clear error messages
  • Includes debug logging

583-606: Add unit tests for FinalizeSnapshot.

To ensure the reliability of state sync finalization, we should add unit tests that cover:

  • Successful finalization case
  • Height mismatch scenarios
  • App hash mismatch scenarios
  • Genesis block height mismatch scenarios

Would you like me to help generate comprehensive unit tests for the FinalizeSnapshot method?

libs/math/safemath.go (3)

11-15: LGTM! Error variables are well-defined.

The new error variables follow the established naming pattern and provide clear, specific error messages.


163-198: LGTM! Must-convert functions are well-implemented.

The implementation:

  • Uses a consistent pattern across all functions
  • Properly handles error messages using zero values for type information
  • Provides a comprehensive set of conversion functions for common integer types

123-161: Implementation is robust and handles edge cases well.

The SafeConvert function correctly handles:

  • Special cases for int64/uint64 inputs
  • Platform-specific integer sizes
  • Negative value conversions
  • Overflow checks using Max/Min bounds

Let's verify the platform-specific behavior:

✅ Verification successful

Platform-specific integer handling verified as correct

The implementation correctly handles platform-specific integer sizes through:

  • Proper detection of platform integer sizes using math.MaxInt/MaxUint comparisons
  • Special handling for 64-bit integers
  • Comprehensive bounds checking using Max[T]()/Min[T]() functions
  • Test coverage validating the behavior
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that the code handles different platform integer sizes correctly
# Test: Search for platform-specific integer size handling
rg -A 2 'const (uintIsSmall|intIsSmall) ='

Length of output: 354


Script:

#!/bin/bash
# Check for test cases and related constants
rg -g '*.go' -A 5 'TestSafeConvert|func.*SafeConvert'

# Look for Max/Min function implementations
ast-grep --pattern 'func Max[$_]() $_'

Length of output: 4689

internal/statesync/syncer.go (6)

19-22: LGTM: Required imports for the new functionality.

The added imports support the new light client and math functionality needed for state sync improvements.


354-355: LGTM: Improved error handling for light block retrieval.

The transition from using commit to block improves the code by fetching the complete light block. The error handling is comprehensive, including specific handling for the case when no witnesses are available.

Also applies to: 363-365


376-378: LGTM: Essential validation of initial height.

Good addition of validation to ensure the initial height is valid (>= 1).


386-389: LGTM: Proper finalization handling.

The finalization step is well-integrated into the sync process with appropriate error handling and context propagation.


600-602: LGTM: Proper nil checks for blocks.

Good defensive programming with explicit nil checks for both genesis and snapshot blocks.


354-355: Verify the impact of LightBlock usage.

Let's verify that all callers of this method are prepared to handle the new return type.

Comment on lines +380 to +384
genesisHeight := tmmath.MustConvertUint64(state.InitialHeight)
genesisBlock, err := s.getStateProvider().LightBlock(ctx, genesisHeight)
if err != nil {
return sm.State{}, nil, fmt.Errorf("failed to get genesis block at height %d: %w", genesisHeight, err)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Verify error handling in genesis block retrieval.

The code uses MustConvertUint64 which can panic. Consider using the regular conversion function and handling the error explicitly.

-genesisHeight := tmmath.MustConvertUint64(state.InitialHeight)
+genesisHeight, err := tmmath.ConvertUint64(state.InitialHeight)
+if err != nil {
+    return sm.State{}, nil, fmt.Errorf("invalid initial height: %w", err)
+}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
genesisHeight := tmmath.MustConvertUint64(state.InitialHeight)
genesisBlock, err := s.getStateProvider().LightBlock(ctx, genesisHeight)
if err != nil {
return sm.State{}, nil, fmt.Errorf("failed to get genesis block at height %d: %w", genesisHeight, err)
}
genesisHeight, err := tmmath.ConvertUint64(state.InitialHeight)
if err != nil {
return sm.State{}, nil, fmt.Errorf("invalid initial height: %w", err)
}
genesisBlock, err := s.getStateProvider().LightBlock(ctx, genesisHeight)
if err != nil {
return sm.State{}, nil, fmt.Errorf("failed to get genesis block at height %d: %w", genesisHeight, err)
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

1 participant