Implement atomic append operation #843

Open
ijsong opened this issue Jul 21, 2024 · 0 comments

ijsong commented Jul 21, 2024

Current Situation

Currently, Varlog's Append API writes payloads to disk by dividing them into batchlets and processing each independently. This can result in partial success/failure scenarios where some batchlets are successfully written while others fail. This leads to two main issues:

  1. Users neither expect nor desire partial success/failure when appending their payloads.
  2. It's challenging for Varlog to manage and communicate these partial success/failure states.
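
For illustration, a minimal Go sketch of the behavior described above; the function names (`appendPayload`, `splitIntoBatchlets`, `appendBatchlet`) and the injected failure are hypothetical and only mimic the partial-failure scenario, they are not Varlog's actual internals:

```go
package main

import (
	"errors"
	"fmt"
)

// appendBatchlet stands in for the real write path; it fails on the second
// batchlet to make the partial success visible.
var batchletCalls int

func appendBatchlet(batchlet [][]byte) error {
	batchletCalls++
	if batchletCalls == 2 {
		return errors.New("simulated write failure")
	}
	return nil
}

// appendPayload mirrors the current, non-atomic path: the payload is split
// into batchlets and each batchlet is written independently, so a failure
// midway leaves the earlier batchlets durably written.
func appendPayload(payload [][]byte, batchletSize int) error {
	for _, batchlet := range splitIntoBatchlets(payload, batchletSize) {
		if err := appendBatchlet(batchlet); err != nil {
			// Batchlets written before this point have already succeeded,
			// so the caller observes a partial success/failure.
			return err
		}
	}
	return nil
}

func splitIntoBatchlets(payload [][]byte, size int) [][][]byte {
	var batchlets [][][]byte
	for len(payload) > size {
		batchlets = append(batchlets, payload[:size])
		payload = payload[size:]
	}
	return append(batchlets, payload)
}

func main() {
	payload := [][]byte{[]byte("a"), []byte("b"), []byte("c"), []byte("d")}
	fmt.Println("append error:", appendPayload(payload, 2))
}
```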

Proposed Solution

Implement an atomic append operation for the entire payload. Key changes include:

  1. Remove the concept of batchlets: Write all batches in the user's payload to disk at once.
  2. Utilize the existing atomic batch write functionality in Varlog's Storage layer.
  3. Introduce batch length limit settings:
    • Global setting (applied to all Varlog topics)
    • Topic-specific setting (if necessary)
  4. Remove the Error field from the AppendResult message type.
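
A rough sketch of the proposed flow under these assumptions; the `Storage` interface and `AtomicWriteBatch` method are placeholders for whatever atomic batch write the Storage layer actually exposes:

```go
package sketch

import "fmt"

// Storage abstracts the atomic batch write functionality already present in
// Varlog's storage layer; the method name here is illustrative.
type Storage interface {
	AtomicWriteBatch(entries [][]byte) error
}

// atomicAppend validates the payload against a batch length limit and then
// hands the whole payload to the storage layer as one atomic batch write, so
// the append either fully succeeds or fully fails.
func atomicAppend(st Storage, payload [][]byte, maxBatchLen int) error {
	if len(payload) > maxBatchLen {
		return fmt.Errorf("append: batch length %d exceeds limit %d", len(payload), maxBatchLen)
	}
	// No batchlets: a single write covers the entire payload.
	return st.AtomicWriteBatch(payload)
}
```

The key point is that the length check happens up front and the storage layer sees exactly one batch, so there is no intermediate state to report back to the user.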

Expected Benefits

  1. Simplified user experience: Users can rely on a clear success/failure status for their entire payload.
  2. Simplified system architecture: Removing the batchlet concept reduces system complexity.
  3. Simplified error handling and recovery: Atomic operations make error scenarios more straightforward to handle and recover from.
  4. Potential performance improvement: Eliminating the step of dividing into batchlets may reduce overall processing time.

Challenges and Next Steps

  1. Handling large payloads

    • Challenge: Potential increase in memory usage
    • Action: Analyze memory usage and research optimization strategies
  2. Batch length limit settings

    • Challenge: Changes in user experience and determining optimal values
    • Action: Research and decide on optimal values for batch length limits
    • Action: Implement global and topic-specific settings (see the configuration sketch after this list)
  3. Maintaining backward compatibility

    • Challenge: Compatibility issues with existing systems
    • Action: Develop migration strategy and plan for phased implementation
  4. Performance impact assessment

    • Challenge: Impact of atomic writes on huge batches
    • Action: Implement prototype and conduct performance tests under various conditions
  5. API and client library updates

    • Action: Modify API response structure (remove Error field)
    • Action: Update client libraries and plan new version release
  6. Documentation and communication

    • Action: Update API documentation
    • Action: Create and distribute user guide for the changes
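
Regarding challenge 2, one possible configuration shape, assuming a global default with optional per-topic overrides; all names here are illustrative, not the actual Varlog configuration surface:

```go
package sketch

// TopicID is a simplified stand-in for Varlog's topic identifier type.
type TopicID int32

// BatchLimitConfig is an assumed configuration shape for the batch length
// limit: a global default applied to every topic, plus optional per-topic
// overrides.
type BatchLimitConfig struct {
	GlobalMaxBatchLen int             // applied to all Varlog topics
	PerTopic          map[TopicID]int // topic-specific overrides, if needed
}

// MaxBatchLen returns the effective limit for a topic, preferring a
// topic-specific setting over the global one.
func (c BatchLimitConfig) MaxBatchLen(tpid TopicID) int {
	if limit, ok := c.PerTopic[tpid]; ok {
		return limit
	}
	return c.GlobalMaxBatchLen
}
```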

Discussion Points

  1. What should be the appropriate default value for the batch length limit?
  2. Are there specific use cases that require topic-specific settings?
  3. How can we minimize the impact of this change on systems currently using Varlog?
  4. Are there additional methods to optimize the performance of atomic batch writes?

Testing Plan

  • Develop unit tests for the new atomic append operation (a test sketch follows this list)
  • Conduct integration tests to ensure compatibility with existing Varlog components
  • Perform stress tests with various payload sizes to assess performance and stability
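
A minimal sketch of such a unit test, using an in-memory fake rather than a real log stream; the `fakeLogStream` type is an assumption for illustration only:

```go
package atomicappend

import (
	"errors"
	"testing"
)

// fakeLogStream is a minimal in-memory stand-in used only for this sketch;
// it either appends the whole batch or rejects it, never part of it.
type fakeLogStream struct {
	entries   [][]byte
	failWrite bool
}

func (ls *fakeLogStream) Append(batch [][]byte) error {
	if ls.failWrite {
		return errors.New("injected write failure")
	}
	ls.entries = append(ls.entries, batch...)
	return nil
}

// TestAppendIsAtomic checks the all-or-nothing property: a failed append
// must leave no entries behind.
func TestAppendIsAtomic(t *testing.T) {
	ls := &fakeLogStream{failWrite: true}
	if err := ls.Append([][]byte{[]byte("a"), []byte("b")}); err == nil {
		t.Fatal("expected the append to fail")
	}
	if len(ls.entries) != 0 {
		t.Fatalf("expected no entries after a failed append, got %d", len(ls.entries))
	}
}
```
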
@ijsong ijsong self-assigned this Jul 21, 2024
@ijsong ijsong changed the title Implement Atomic Append Operation for Varlog Implement Atomic Append Operation Aug 20, 2024
@ijsong ijsong changed the title Implement Atomic Append Operation Implement atomic append operation Oct 18, 2024
ijsong added a commit that referenced this issue Jan 21, 2025
This PR modifies replicate task pool implementation for future refactoring that
will resolve #843.

- Deprecated `newReplicateTask` and `release` functions in favor of new
  implementations.
- Added `replicateTaskPool` struct for simplified pool management.
- Updated tests to use the new functions and ensure backward compatibility.
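
A minimal sketch of what a `sync.Pool`-backed `replicateTaskPool` could look like; the `replicateTask` fields and the method names below are assumptions for illustration, not the code merged in this commit:

```go
package logstream

import "sync"

// replicateTask is a placeholder for the task handed to replicas; the real
// struct carries replication payloads and metadata.
type replicateTask struct {
	data [][]byte
}

// replicateTaskPool sketches simplified pool management on top of sync.Pool.
type replicateTaskPool struct {
	pool sync.Pool
}

func newReplicateTaskPool() *replicateTaskPool {
	return &replicateTaskPool{
		pool: sync.Pool{
			New: func() any { return &replicateTask{} },
		},
	}
}

func (p *replicateTaskPool) get() *replicateTask {
	return p.pool.Get().(*replicateTask)
}

func (p *replicateTaskPool) put(t *replicateTask) {
	t.data = t.data[:0] // reset before reuse
	p.pool.Put(t)
}
```
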
ijsong added a commit that referenced this issue Feb 7, 2025
This commit introduces a commit wait task that represents an entire append
batch, rather than individual log entries. This is a crucial step towards
implementing atomic append operations.

Previously, a separate commit wait task was created for each log entry in an
append batch. This approach made it difficult to handle the batch atomically, as
commit wait tasks were processed individually.

With this change, a single commit wait task is created for the entire batch.
This allows the committer to process the batch atomically, ensuring that either
all log entries in the batch are committed or none are.

This change also brings a slight performance improvement, as the committer now
needs to process fewer tasks. However, no specific benchmarks have been
performed to measure the exact gain.

The client API does not yet support atomic append operations, and partial
success/failure is still allowed. This will be addressed in a future update.

This change is a major step towards resolving #843, which aims to implement
atomic append operations.
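
A hedged sketch of a per-batch commit wait task along the lines this commit describes; the struct and its fields are illustrative, not the merged implementation:

```go
package logstream

// commitWaitTask, as sketched here, represents one entire append batch
// awaiting commit rather than one task per log entry.
type commitWaitTask struct {
	size int        // number of log entries in the batch
	done chan error // signalled once for the whole batch
}

// newCommitWaitTask creates a single wait task covering all size entries of a
// batch, replacing the previous one-task-per-entry scheme.
func newCommitWaitTask(size int) *commitWaitTask {
	return &commitWaitTask{size: size, done: make(chan error, 1)}
}

// complete reports the outcome for the whole batch at once: either every
// entry in the batch is committed or none is.
func (t *commitWaitTask) complete(err error) {
	t.done <- err
}
```
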
ijsong added a commit that referenced this issue Feb 7, 2025
This commit deprecates the error field in AppendResult as a step towards
implementing atomic append operations while maintaining backward compatibility.

Previously, the Append RPC could return partial success/failure results, with
some log entries in a batch being appended successfully and others failing. This
was indicated by the error field in AppendResult.

To support atomic append operations without breaking existing clients, the error
field is deprecated instead of being removed completely. This change allows
clients to continue using the error field for now, but they should be aware that
it will be removed in a future release. Clients should start migrating to the
new atomic append API as soon as possible.

The following changes were made to deprecate the error field:

- The error field in AppendResult is marked as deprecated in the protobuf
  definition.
- The Append RPC implementation no longer populates the error field.

The next step is to implement atomic append operations in the client API. This
will enable clients to append multiple log entries atomically, which will help
to resolve #843.
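
A sketch of how a client might handle results once the per-entry error field is gone; the types and the `Append` signature below are simplified stand-ins for illustration, not the exact Varlog client API:

```go
package example

import "context"

// Simplified stand-ins for the client-side types touched by this change.
type (
	TopicID int32

	AppendResult struct {
		Err error // top-level result for the whole batch
		// The deprecated per-entry Error field is intentionally omitted here.
	}

	Log interface {
		Append(ctx context.Context, tpid TopicID, payload [][]byte) AppendResult
	}
)

// appendAll treats the append as atomic: the caller checks only the
// result-level error instead of inspecting per-entry errors.
func appendAll(ctx context.Context, log Log, tpid TopicID, payload [][]byte) error {
	res := log.Append(ctx, tpid, payload)
	if res.Err != nil {
		return res.Err // the whole batch failed; nothing was appended
	}
	return nil // the whole batch succeeded
}
```
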