Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce VersionVector to detect the relationship between changes #800

Closed
wants to merge 11 commits into from

Conversation

hackerwins
Copy link
Member

@hackerwins hackerwins commented Feb 22, 2024

What this PR does / why we need it:

This PR introduces VersionVector for Change.ID to differentiate between Concurrent and Causal relationships among changes.

Existing Functionality of Change.ID:

  • Allows for determining the Total Order of Changes.
  • Generates TimeTicket using a provided Delimiter.
    • Used as an ID for Elements.
    • Utilized for Conflict Resolution of Operations.
    • Only establishes order in Total Order, unable to distinguish between Concurrent and Causal relationships.

New Features of Change.ID:

  • Enables differentiation between Concurrent and Causal relationships among Changes.

We still use LamportClock in conjunction with VersionVector to reduce the size of CRDT metadata in Documents.

Which issue(s) this PR fixes:

Related to #723, #786

Special notes for your reviewer:

Does this PR introduce a user-facing change?:


Additional documentation:


Checklist:

  • Added relevant tests or not required
  • Didn't break anything

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced version vector handling for improved document synchronization and state management.
    • Added new fields for version vectors in various structures, enhancing the tracking of document changes.
    • Enhanced API responses to include version vectors, improving data integrity in collaborative environments.
  • Bug Fixes

    • Improved error handling across multiple methods related to version vector operations.
  • Documentation

    • Updated OpenAPI specifications to reflect new version vector schemas and properties.
  • Tests

    • Expanded test coverage for version vector functionalities and garbage collection mechanisms.

Copy link

codecov bot commented Mar 11, 2024

Codecov Report

Attention: Patch coverage is 20.34188% with 466 lines in your changes missing coverage. Please review.

Project coverage is 46.44%. Comparing base (5e9d925) to head (5c98c41).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
pkg/document/time/version_vector.go 0.00% 114 Missing ⚠️
server/backend/database/memory/database.go 15.51% 98 Missing ⚠️
server/backend/database/mongo/client.go 20.56% 85 Missing ⚠️
pkg/document/change/id.go 13.88% 31 Missing ⚠️
server/backend/database/mongo/registry.go 40.00% 16 Missing and 8 partials ⚠️
pkg/document/internal_document.go 24.00% 19 Missing ⚠️
server/packs/serverpacks.go 0.00% 17 Missing ⚠️
server/packs/packs.go 0.00% 16 Missing ⚠️
api/converter/from_pb.go 57.14% 8 Missing and 4 partials ⚠️
api/converter/to_pb.go 62.50% 8 Missing and 4 partials ⚠️
... and 7 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #800      +/-   ##
==========================================
- Coverage   48.32%   46.44%   -1.89%     
==========================================
  Files          80       81       +1     
  Lines       11410    11887     +477     
==========================================
+ Hits         5514     5521       +7     
- Misses       5354     5806     +452     
- Partials      542      560      +18     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go Benchmark

Benchmark suite Current: 5c98c41 Previous: bb9c7bb Ratio
BenchmarkDocument/constructor_test 1526 ns/op 1337 B/op 24 allocs/op 1486 ns/op 1337 B/op 24 allocs/op 1.03
BenchmarkDocument/constructor_test - ns/op 1526 ns/op 1486 ns/op 1.03
BenchmarkDocument/constructor_test - B/op 1337 B/op 1337 B/op 1
BenchmarkDocument/constructor_test - allocs/op 24 allocs/op 24 allocs/op 1
BenchmarkDocument/status_test 969 ns/op 1305 B/op 22 allocs/op 964.4 ns/op 1305 B/op 22 allocs/op 1.00
BenchmarkDocument/status_test - ns/op 969 ns/op 964.4 ns/op 1.00
BenchmarkDocument/status_test - B/op 1305 B/op 1305 B/op 1
BenchmarkDocument/status_test - allocs/op 22 allocs/op 22 allocs/op 1
BenchmarkDocument/equals_test 8974 ns/op 7529 B/op 134 allocs/op 8175 ns/op 7529 B/op 134 allocs/op 1.10
BenchmarkDocument/equals_test - ns/op 8974 ns/op 8175 ns/op 1.10
BenchmarkDocument/equals_test - B/op 7529 B/op 7529 B/op 1
BenchmarkDocument/equals_test - allocs/op 134 allocs/op 134 allocs/op 1
BenchmarkDocument/nested_update_test 17167 ns/op 12394 B/op 264 allocs/op 19086 ns/op 12395 B/op 264 allocs/op 0.90
BenchmarkDocument/nested_update_test - ns/op 17167 ns/op 19086 ns/op 0.90
BenchmarkDocument/nested_update_test - B/op 12394 B/op 12395 B/op 1.00
BenchmarkDocument/nested_update_test - allocs/op 264 allocs/op 264 allocs/op 1
BenchmarkDocument/delete_test 23102 ns/op 15923 B/op 347 allocs/op 23341 ns/op 15924 B/op 347 allocs/op 0.99
BenchmarkDocument/delete_test - ns/op 23102 ns/op 23341 ns/op 0.99
BenchmarkDocument/delete_test - B/op 15923 B/op 15924 B/op 1.00
BenchmarkDocument/delete_test - allocs/op 347 allocs/op 347 allocs/op 1
BenchmarkDocument/object_test 8832 ns/op 7073 B/op 122 allocs/op 8813 ns/op 7073 B/op 122 allocs/op 1.00
BenchmarkDocument/object_test - ns/op 8832 ns/op 8813 ns/op 1.00
BenchmarkDocument/object_test - B/op 7073 B/op 7073 B/op 1
BenchmarkDocument/object_test - allocs/op 122 allocs/op 122 allocs/op 1
BenchmarkDocument/array_test 30073 ns/op 12203 B/op 278 allocs/op 30344 ns/op 12203 B/op 278 allocs/op 0.99
BenchmarkDocument/array_test - ns/op 30073 ns/op 30344 ns/op 0.99
BenchmarkDocument/array_test - B/op 12203 B/op 12203 B/op 1
BenchmarkDocument/array_test - allocs/op 278 allocs/op 278 allocs/op 1
BenchmarkDocument/text_test 32324 ns/op 15324 B/op 492 allocs/op 33100 ns/op 15323 B/op 492 allocs/op 0.98
BenchmarkDocument/text_test - ns/op 32324 ns/op 33100 ns/op 0.98
BenchmarkDocument/text_test - B/op 15324 B/op 15323 B/op 1.00
BenchmarkDocument/text_test - allocs/op 492 allocs/op 492 allocs/op 1
BenchmarkDocument/text_composition_test 30594 ns/op 18718 B/op 504 allocs/op 30839 ns/op 18718 B/op 504 allocs/op 0.99
BenchmarkDocument/text_composition_test - ns/op 30594 ns/op 30839 ns/op 0.99
BenchmarkDocument/text_composition_test - B/op 18718 B/op 18718 B/op 1
BenchmarkDocument/text_composition_test - allocs/op 504 allocs/op 504 allocs/op 1
BenchmarkDocument/rich_text_test 84911 ns/op 40181 B/op 1183 allocs/op 84197 ns/op 40181 B/op 1183 allocs/op 1.01
BenchmarkDocument/rich_text_test - ns/op 84911 ns/op 84197 ns/op 1.01
BenchmarkDocument/rich_text_test - B/op 40181 B/op 40181 B/op 1
BenchmarkDocument/rich_text_test - allocs/op 1183 allocs/op 1183 allocs/op 1
BenchmarkDocument/counter_test 18726 ns/op 11874 B/op 258 allocs/op 18918 ns/op 11875 B/op 258 allocs/op 0.99
BenchmarkDocument/counter_test - ns/op 18726 ns/op 18918 ns/op 0.99
BenchmarkDocument/counter_test - B/op 11874 B/op 11875 B/op 1.00
BenchmarkDocument/counter_test - allocs/op 258 allocs/op 258 allocs/op 1
BenchmarkDocument/text_edit_gc_100 1365185 ns/op 872567 B/op 17282 allocs/op 1323036 ns/op 872558 B/op 17281 allocs/op 1.03
BenchmarkDocument/text_edit_gc_100 - ns/op 1365185 ns/op 1323036 ns/op 1.03
BenchmarkDocument/text_edit_gc_100 - B/op 872567 B/op 872558 B/op 1.00
BenchmarkDocument/text_edit_gc_100 - allocs/op 17282 allocs/op 17281 allocs/op 1.00
BenchmarkDocument/text_edit_gc_1000 50007820 ns/op 50548287 B/op 186742 allocs/op 51217596 ns/op 50547265 B/op 186737 allocs/op 0.98
BenchmarkDocument/text_edit_gc_1000 - ns/op 50007820 ns/op 51217596 ns/op 0.98
BenchmarkDocument/text_edit_gc_1000 - B/op 50548287 B/op 50547265 B/op 1.00
BenchmarkDocument/text_edit_gc_1000 - allocs/op 186742 allocs/op 186737 allocs/op 1.00
BenchmarkDocument/text_split_gc_100 1935000 ns/op 1589060 B/op 15951 allocs/op 1948567 ns/op 1589113 B/op 15951 allocs/op 0.99
BenchmarkDocument/text_split_gc_100 - ns/op 1935000 ns/op 1948567 ns/op 0.99
BenchmarkDocument/text_split_gc_100 - B/op 1589060 B/op 1589113 B/op 1.00
BenchmarkDocument/text_split_gc_100 - allocs/op 15951 allocs/op 15951 allocs/op 1
BenchmarkDocument/text_split_gc_1000 117988905 ns/op 141482008 B/op 186145 allocs/op 117653420 ns/op 141482905 B/op 186143 allocs/op 1.00
BenchmarkDocument/text_split_gc_1000 - ns/op 117988905 ns/op 117653420 ns/op 1.00
BenchmarkDocument/text_split_gc_1000 - B/op 141482008 B/op 141482905 B/op 1.00
BenchmarkDocument/text_split_gc_1000 - allocs/op 186145 allocs/op 186143 allocs/op 1.00
BenchmarkDocument/text_delete_all_10000 17456864 ns/op 10213389 B/op 55684 allocs/op 16624633 ns/op 10213553 B/op 55687 allocs/op 1.05
BenchmarkDocument/text_delete_all_10000 - ns/op 17456864 ns/op 16624633 ns/op 1.05
BenchmarkDocument/text_delete_all_10000 - B/op 10213389 B/op 10213553 B/op 1.00
BenchmarkDocument/text_delete_all_10000 - allocs/op 55684 allocs/op 55687 allocs/op 1.00
BenchmarkDocument/text_delete_all_100000 308184194 ns/op 142974136 B/op 561678 allocs/op 299359812 ns/op 143002140 B/op 561786 allocs/op 1.03
BenchmarkDocument/text_delete_all_100000 - ns/op 308184194 ns/op 299359812 ns/op 1.03
BenchmarkDocument/text_delete_all_100000 - B/op 142974136 B/op 143002140 B/op 1.00
BenchmarkDocument/text_delete_all_100000 - allocs/op 561678 allocs/op 561786 allocs/op 1.00
BenchmarkDocument/text_100 221010 ns/op 120491 B/op 5182 allocs/op 218903 ns/op 120491 B/op 5182 allocs/op 1.01
BenchmarkDocument/text_100 - ns/op 221010 ns/op 218903 ns/op 1.01
BenchmarkDocument/text_100 - B/op 120491 B/op 120491 B/op 1
BenchmarkDocument/text_100 - allocs/op 5182 allocs/op 5182 allocs/op 1
BenchmarkDocument/text_1000 2381116 ns/op 1171277 B/op 51086 allocs/op 2364085 ns/op 1171277 B/op 51086 allocs/op 1.01
BenchmarkDocument/text_1000 - ns/op 2381116 ns/op 2364085 ns/op 1.01
BenchmarkDocument/text_1000 - B/op 1171277 B/op 1171277 B/op 1
BenchmarkDocument/text_1000 - allocs/op 51086 allocs/op 51086 allocs/op 1
BenchmarkDocument/array_1000 1226373 ns/op 1091562 B/op 11833 allocs/op 1205291 ns/op 1091580 B/op 11833 allocs/op 1.02
BenchmarkDocument/array_1000 - ns/op 1226373 ns/op 1205291 ns/op 1.02
BenchmarkDocument/array_1000 - B/op 1091562 B/op 1091580 B/op 1.00
BenchmarkDocument/array_1000 - allocs/op 11833 allocs/op 11833 allocs/op 1
BenchmarkDocument/array_10000 13603751 ns/op 9798995 B/op 120293 allocs/op 13131662 ns/op 9799025 B/op 120293 allocs/op 1.04
BenchmarkDocument/array_10000 - ns/op 13603751 ns/op 13131662 ns/op 1.04
BenchmarkDocument/array_10000 - B/op 9798995 B/op 9799025 B/op 1.00
BenchmarkDocument/array_10000 - allocs/op 120293 allocs/op 120293 allocs/op 1
BenchmarkDocument/array_gc_100 148063 ns/op 133271 B/op 1266 allocs/op 146215 ns/op 133269 B/op 1266 allocs/op 1.01
BenchmarkDocument/array_gc_100 - ns/op 148063 ns/op 146215 ns/op 1.01
BenchmarkDocument/array_gc_100 - B/op 133271 B/op 133269 B/op 1.00
BenchmarkDocument/array_gc_100 - allocs/op 1266 allocs/op 1266 allocs/op 1
BenchmarkDocument/array_gc_1000 1396462 ns/op 1159642 B/op 12882 allocs/op 1406256 ns/op 1159741 B/op 12883 allocs/op 0.99
BenchmarkDocument/array_gc_1000 - ns/op 1396462 ns/op 1406256 ns/op 0.99
BenchmarkDocument/array_gc_1000 - B/op 1159642 B/op 1159741 B/op 1.00
BenchmarkDocument/array_gc_1000 - allocs/op 12882 allocs/op 12883 allocs/op 1.00
BenchmarkDocument/counter_1000 201975 ns/op 193335 B/op 5773 allocs/op 200069 ns/op 193337 B/op 5773 allocs/op 1.01
BenchmarkDocument/counter_1000 - ns/op 201975 ns/op 200069 ns/op 1.01
BenchmarkDocument/counter_1000 - B/op 193335 B/op 193337 B/op 1.00
BenchmarkDocument/counter_1000 - allocs/op 5773 allocs/op 5773 allocs/op 1
BenchmarkDocument/counter_10000 2194340 ns/op 2088253 B/op 59780 allocs/op 2190063 ns/op 2088252 B/op 59780 allocs/op 1.00
BenchmarkDocument/counter_10000 - ns/op 2194340 ns/op 2190063 ns/op 1.00
BenchmarkDocument/counter_10000 - B/op 2088253 B/op 2088252 B/op 1.00
BenchmarkDocument/counter_10000 - allocs/op 59780 allocs/op 59780 allocs/op 1
BenchmarkDocument/object_1000 1385779 ns/op 1428509 B/op 9851 allocs/op 1384291 ns/op 1428190 B/op 9850 allocs/op 1.00
BenchmarkDocument/object_1000 - ns/op 1385779 ns/op 1384291 ns/op 1.00
BenchmarkDocument/object_1000 - B/op 1428509 B/op 1428190 B/op 1.00
BenchmarkDocument/object_1000 - allocs/op 9851 allocs/op 9850 allocs/op 1.00
BenchmarkDocument/object_10000 15466252 ns/op 12165847 B/op 100563 allocs/op 15198295 ns/op 12166628 B/op 100564 allocs/op 1.02
BenchmarkDocument/object_10000 - ns/op 15466252 ns/op 15198295 ns/op 1.02
BenchmarkDocument/object_10000 - B/op 12165847 B/op 12166628 B/op 1.00
BenchmarkDocument/object_10000 - allocs/op 100563 allocs/op 100564 allocs/op 1.00
BenchmarkDocument/tree_100 1073525 ns/op 943958 B/op 6103 allocs/op 1026882 ns/op 943964 B/op 6103 allocs/op 1.05
BenchmarkDocument/tree_100 - ns/op 1073525 ns/op 1026882 ns/op 1.05
BenchmarkDocument/tree_100 - B/op 943958 B/op 943964 B/op 1.00
BenchmarkDocument/tree_100 - allocs/op 6103 allocs/op 6103 allocs/op 1
BenchmarkDocument/tree_1000 78959382 ns/op 86460690 B/op 60116 allocs/op 73903187 ns/op 86460579 B/op 60117 allocs/op 1.07
BenchmarkDocument/tree_1000 - ns/op 78959382 ns/op 73903187 ns/op 1.07
BenchmarkDocument/tree_1000 - B/op 86460690 B/op 86460579 B/op 1.00
BenchmarkDocument/tree_1000 - allocs/op 60116 allocs/op 60117 allocs/op 1.00
BenchmarkDocument/tree_10000 9717389494 ns/op 8580660416 B/op 600217 allocs/op 9389978340 ns/op 8580660848 B/op 600215 allocs/op 1.03
BenchmarkDocument/tree_10000 - ns/op 9717389494 ns/op 9389978340 ns/op 1.03
BenchmarkDocument/tree_10000 - B/op 8580660416 B/op 8580660848 B/op 1.00
BenchmarkDocument/tree_10000 - allocs/op 600217 allocs/op 600215 allocs/op 1.00
BenchmarkDocument/tree_delete_all_1000 74323511 ns/op 87510496 B/op 75272 allocs/op 74599427 ns/op 87509420 B/op 75270 allocs/op 1.00
BenchmarkDocument/tree_delete_all_1000 - ns/op 74323511 ns/op 74599427 ns/op 1.00
BenchmarkDocument/tree_delete_all_1000 - B/op 87510496 B/op 87509420 B/op 1.00
BenchmarkDocument/tree_delete_all_1000 - allocs/op 75272 allocs/op 75270 allocs/op 1.00
BenchmarkDocument/tree_edit_gc_100 3796974 ns/op 4147229 B/op 15146 allocs/op 3698646 ns/op 4147389 B/op 15147 allocs/op 1.03
BenchmarkDocument/tree_edit_gc_100 - ns/op 3796974 ns/op 3698646 ns/op 1.03
BenchmarkDocument/tree_edit_gc_100 - B/op 4147229 B/op 4147389 B/op 1.00
BenchmarkDocument/tree_edit_gc_100 - allocs/op 15146 allocs/op 15147 allocs/op 1.00
BenchmarkDocument/tree_edit_gc_1000 301217152 ns/op 383743890 B/op 154847 allocs/op 295151553 ns/op 383747214 B/op 154863 allocs/op 1.02
BenchmarkDocument/tree_edit_gc_1000 - ns/op 301217152 ns/op 295151553 ns/op 1.02
BenchmarkDocument/tree_edit_gc_1000 - B/op 383743890 B/op 383747214 B/op 1.00
BenchmarkDocument/tree_edit_gc_1000 - allocs/op 154847 allocs/op 154863 allocs/op 1.00
BenchmarkDocument/tree_split_gc_100 2509993 ns/op 2412967 B/op 11131 allocs/op 2475349 ns/op 2413103 B/op 11131 allocs/op 1.01
BenchmarkDocument/tree_split_gc_100 - ns/op 2509993 ns/op 2475349 ns/op 1.01
BenchmarkDocument/tree_split_gc_100 - B/op 2412967 B/op 2413103 B/op 1.00
BenchmarkDocument/tree_split_gc_100 - allocs/op 11131 allocs/op 11131 allocs/op 1
BenchmarkDocument/tree_split_gc_1000 182217737 ns/op 222252112 B/op 122001 allocs/op 181291608 ns/op 222252000 B/op 121998 allocs/op 1.01
BenchmarkDocument/tree_split_gc_1000 - ns/op 182217737 ns/op 181291608 ns/op 1.01
BenchmarkDocument/tree_split_gc_1000 - B/op 222252112 B/op 222252000 B/op 1.00
BenchmarkDocument/tree_split_gc_1000 - allocs/op 122001 allocs/op 121998 allocs/op 1.00
BenchmarkRPC/client_to_server 428941946 ns/op 20269160 B/op 230306 allocs/op 427718189 ns/op 20275786 B/op 230322 allocs/op 1.00
BenchmarkRPC/client_to_server - ns/op 428941946 ns/op 427718189 ns/op 1.00
BenchmarkRPC/client_to_server - B/op 20269160 B/op 20275786 B/op 1.00
BenchmarkRPC/client_to_server - allocs/op 230306 allocs/op 230322 allocs/op 1.00
BenchmarkRPC/client_to_client_via_server 798395042 ns/op 42257800 B/op 483572 allocs/op 787354738 ns/op 42258348 B/op 483420 allocs/op 1.01
BenchmarkRPC/client_to_client_via_server - ns/op 798395042 ns/op 787354738 ns/op 1.01
BenchmarkRPC/client_to_client_via_server - B/op 42257800 B/op 42258348 B/op 1.00
BenchmarkRPC/client_to_client_via_server - allocs/op 483572 allocs/op 483420 allocs/op 1.00
BenchmarkRPC/attach_large_document 2057705314 ns/op 3023220176 B/op 15036 allocs/op 1930554148 ns/op 3046011768 B/op 14997 allocs/op 1.07
BenchmarkRPC/attach_large_document - ns/op 2057705314 ns/op 1930554148 ns/op 1.07
BenchmarkRPC/attach_large_document - B/op 3023220176 B/op 3046011768 B/op 0.99
BenchmarkRPC/attach_large_document - allocs/op 15036 allocs/op 14997 allocs/op 1.00
BenchmarkRPC/adminCli_to_server 540120742 ns/op 35960088 B/op 289558 allocs/op 532697142 ns/op 35962960 B/op 289571 allocs/op 1.01
BenchmarkRPC/adminCli_to_server - ns/op 540120742 ns/op 532697142 ns/op 1.01
BenchmarkRPC/adminCli_to_server - B/op 35960088 B/op 35962960 B/op 1.00
BenchmarkRPC/adminCli_to_server - allocs/op 289558 allocs/op 289571 allocs/op 1.00
BenchmarkLocker 66.2 ns/op 16 B/op 1 allocs/op 65.69 ns/op 16 B/op 1 allocs/op 1.01
BenchmarkLocker - ns/op 66.2 ns/op 65.69 ns/op 1.01
BenchmarkLocker - B/op 16 B/op 16 B/op 1
BenchmarkLocker - allocs/op 1 allocs/op 1 allocs/op 1
BenchmarkLockerParallel 39.44 ns/op 0 B/op 0 allocs/op 38.58 ns/op 0 B/op 0 allocs/op 1.02
BenchmarkLockerParallel - ns/op 39.44 ns/op 38.58 ns/op 1.02
BenchmarkLockerParallel - B/op 0 B/op 0 B/op 1
BenchmarkLockerParallel - allocs/op 0 allocs/op 0 allocs/op 1
BenchmarkLockerMoreKeys 157.7 ns/op 15 B/op 0 allocs/op 152.9 ns/op 15 B/op 0 allocs/op 1.03
BenchmarkLockerMoreKeys - ns/op 157.7 ns/op 152.9 ns/op 1.03
BenchmarkLockerMoreKeys - B/op 15 B/op 15 B/op 1
BenchmarkLockerMoreKeys - allocs/op 0 allocs/op 0 allocs/op 1
BenchmarkChange/Push_10_Changes 4428185 ns/op 143579 B/op 1572 allocs/op 4429838 ns/op 143278 B/op 1572 allocs/op 1.00
BenchmarkChange/Push_10_Changes - ns/op 4428185 ns/op 4429838 ns/op 1.00
BenchmarkChange/Push_10_Changes - B/op 143579 B/op 143278 B/op 1.00
BenchmarkChange/Push_10_Changes - allocs/op 1572 allocs/op 1572 allocs/op 1
BenchmarkChange/Push_100_Changes 16279584 ns/op 700667 B/op 8188 allocs/op 15741604 ns/op 699155 B/op 8189 allocs/op 1.03
BenchmarkChange/Push_100_Changes - ns/op 16279584 ns/op 15741604 ns/op 1.03
BenchmarkChange/Push_100_Changes - B/op 700667 B/op 699155 B/op 1.00
BenchmarkChange/Push_100_Changes - allocs/op 8188 allocs/op 8189 allocs/op 1.00
BenchmarkChange/Push_1000_Changes 129193463 ns/op 6748964 B/op 77159 allocs/op 124784510 ns/op 6741080 B/op 77158 allocs/op 1.04
BenchmarkChange/Push_1000_Changes - ns/op 129193463 ns/op 124784510 ns/op 1.04
BenchmarkChange/Push_1000_Changes - B/op 6748964 B/op 6741080 B/op 1.00
BenchmarkChange/Push_1000_Changes - allocs/op 77159 allocs/op 77158 allocs/op 1.00
BenchmarkChange/Pull_10_Changes 3653857 ns/op 123696 B/op 1405 allocs/op 3639335 ns/op 123758 B/op 1405 allocs/op 1.00
BenchmarkChange/Pull_10_Changes - ns/op 3653857 ns/op 3639335 ns/op 1.00
BenchmarkChange/Pull_10_Changes - B/op 123696 B/op 123758 B/op 1.00
BenchmarkChange/Pull_10_Changes - allocs/op 1405 allocs/op 1405 allocs/op 1
BenchmarkChange/Pull_100_Changes 5215990 ns/op 350620 B/op 4949 allocs/op 5113628 ns/op 350913 B/op 4949 allocs/op 1.02
BenchmarkChange/Pull_100_Changes - ns/op 5215990 ns/op 5113628 ns/op 1.02
BenchmarkChange/Pull_100_Changes - B/op 350620 B/op 350913 B/op 1.00
BenchmarkChange/Pull_100_Changes - allocs/op 4949 allocs/op 4949 allocs/op 1
BenchmarkChange/Pull_1000_Changes 10462337 ns/op 2219929 B/op 42667 allocs/op 10428674 ns/op 2219529 B/op 42667 allocs/op 1.00
BenchmarkChange/Pull_1000_Changes - ns/op 10462337 ns/op 10428674 ns/op 1.00
BenchmarkChange/Pull_1000_Changes - B/op 2219929 B/op 2219529 B/op 1.00
BenchmarkChange/Pull_1000_Changes - allocs/op 42667 allocs/op 42667 allocs/op 1
BenchmarkSnapshot/Push_3KB_snapshot 18424864 ns/op 814892 B/op 8190 allocs/op 17770717 ns/op 811447 B/op 8189 allocs/op 1.04
BenchmarkSnapshot/Push_3KB_snapshot - ns/op 18424864 ns/op 17770717 ns/op 1.04
BenchmarkSnapshot/Push_3KB_snapshot - B/op 814892 B/op 811447 B/op 1.00
BenchmarkSnapshot/Push_3KB_snapshot - allocs/op 8190 allocs/op 8189 allocs/op 1.00
BenchmarkSnapshot/Push_30KB_snapshot 131804629 ns/op 7297222 B/op 79063 allocs/op 126576326 ns/op 7150086 B/op 78539 allocs/op 1.04
BenchmarkSnapshot/Push_30KB_snapshot - ns/op 131804629 ns/op 126576326 ns/op 1.04
BenchmarkSnapshot/Push_30KB_snapshot - B/op 7297222 B/op 7150086 B/op 1.02
BenchmarkSnapshot/Push_30KB_snapshot - allocs/op 79063 allocs/op 78539 allocs/op 1.01
BenchmarkSnapshot/Pull_3KB_snapshot 7198007 ns/op 1137068 B/op 19416 allocs/op 7195073 ns/op 1137913 B/op 19417 allocs/op 1.00
BenchmarkSnapshot/Pull_3KB_snapshot - ns/op 7198007 ns/op 7195073 ns/op 1.00
BenchmarkSnapshot/Pull_3KB_snapshot - B/op 1137068 B/op 1137913 B/op 1.00
BenchmarkSnapshot/Pull_3KB_snapshot - allocs/op 19416 allocs/op 19417 allocs/op 1.00
BenchmarkSnapshot/Pull_30KB_snapshot 17724675 ns/op 9296474 B/op 187558 allocs/op 18054599 ns/op 9291002 B/op 187555 allocs/op 0.98
BenchmarkSnapshot/Pull_30KB_snapshot - ns/op 17724675 ns/op 18054599 ns/op 0.98
BenchmarkSnapshot/Pull_30KB_snapshot - B/op 9296474 B/op 9291002 B/op 1.00
BenchmarkSnapshot/Pull_30KB_snapshot - allocs/op 187558 allocs/op 187555 allocs/op 1.00
BenchmarkSplayTree/stress_test_100000 0.1902 ns/op 0 B/op 0 allocs/op 0.1919 ns/op 0 B/op 0 allocs/op 0.99
BenchmarkSplayTree/stress_test_100000 - ns/op 0.1902 ns/op 0.1919 ns/op 0.99
BenchmarkSplayTree/stress_test_100000 - B/op 0 B/op 0 B/op 1
BenchmarkSplayTree/stress_test_100000 - allocs/op 0 allocs/op 0 allocs/op 1
BenchmarkSplayTree/stress_test_200000 0.3834 ns/op 0 B/op 0 allocs/op 0.3723 ns/op 0 B/op 0 allocs/op 1.03
BenchmarkSplayTree/stress_test_200000 - ns/op 0.3834 ns/op 0.3723 ns/op 1.03
BenchmarkSplayTree/stress_test_200000 - B/op 0 B/op 0 B/op 1
BenchmarkSplayTree/stress_test_200000 - allocs/op 0 allocs/op 0 allocs/op 1
BenchmarkSplayTree/stress_test_300000 0.5959 ns/op 0 B/op 0 allocs/op 0.5978 ns/op 0 B/op 0 allocs/op 1.00
BenchmarkSplayTree/stress_test_300000 - ns/op 0.5959 ns/op 0.5978 ns/op 1.00
BenchmarkSplayTree/stress_test_300000 - B/op 0 B/op 0 B/op 1
BenchmarkSplayTree/stress_test_300000 - allocs/op 0 allocs/op 0 allocs/op 1
BenchmarkSplayTree/random_access_100000 0.01293 ns/op 0 B/op 0 allocs/op 0.01267 ns/op 0 B/op 0 allocs/op 1.02
BenchmarkSplayTree/random_access_100000 - ns/op 0.01293 ns/op 0.01267 ns/op 1.02
BenchmarkSplayTree/random_access_100000 - B/op 0 B/op 0 B/op 1
BenchmarkSplayTree/random_access_100000 - allocs/op 0 allocs/op 0 allocs/op 1
BenchmarkSplayTree/random_access_200000 0.02435 ns/op 0 B/op 0 allocs/op 0.02376 ns/op 0 B/op 0 allocs/op 1.02
BenchmarkSplayTree/random_access_200000 - ns/op 0.02435 ns/op 0.02376 ns/op 1.02
BenchmarkSplayTree/random_access_200000 - B/op 0 B/op 0 B/op 1
BenchmarkSplayTree/random_access_200000 - allocs/op 0 allocs/op 0 allocs/op 1
BenchmarkSplayTree/random_access_300000 0.03651 ns/op 0 B/op 0 allocs/op 0.0354 ns/op 0 B/op 0 allocs/op 1.03
BenchmarkSplayTree/random_access_300000 - ns/op 0.03651 ns/op 0.0354 ns/op 1.03
BenchmarkSplayTree/random_access_300000 - B/op 0 B/op 0 B/op 1
BenchmarkSplayTree/random_access_300000 - allocs/op 0 allocs/op 0 allocs/op 1
BenchmarkSplayTree/editing_trace_bench 0.002356 ns/op 0 B/op 0 allocs/op 0.001988 ns/op 0 B/op 0 allocs/op 1.19
BenchmarkSplayTree/editing_trace_bench - ns/op 0.002356 ns/op 0.001988 ns/op 1.19
BenchmarkSplayTree/editing_trace_bench - B/op 0 B/op 0 B/op 1
BenchmarkSplayTree/editing_trace_bench - allocs/op 0 allocs/op 0 allocs/op 1
BenchmarkSync/memory_sync_10_test 6986 ns/op 1286 B/op 38 allocs/op 6889 ns/op 1286 B/op 38 allocs/op 1.01
BenchmarkSync/memory_sync_10_test - ns/op 6986 ns/op 6889 ns/op 1.01
BenchmarkSync/memory_sync_10_test - B/op 1286 B/op 1286 B/op 1
BenchmarkSync/memory_sync_10_test - allocs/op 38 allocs/op 38 allocs/op 1
BenchmarkSync/memory_sync_100_test 51332 ns/op 8640 B/op 273 allocs/op 51089 ns/op 8700 B/op 277 allocs/op 1.00
BenchmarkSync/memory_sync_100_test - ns/op 51332 ns/op 51089 ns/op 1.00
BenchmarkSync/memory_sync_100_test - B/op 8640 B/op 8700 B/op 0.99
BenchmarkSync/memory_sync_100_test - allocs/op 273 allocs/op 277 allocs/op 0.99
BenchmarkSync/memory_sync_1000_test 588590 ns/op 73959 B/op 2097 allocs/op 572452 ns/op 74557 B/op 2133 allocs/op 1.03
BenchmarkSync/memory_sync_1000_test - ns/op 588590 ns/op 572452 ns/op 1.03
BenchmarkSync/memory_sync_1000_test - B/op 73959 B/op 74557 B/op 0.99
BenchmarkSync/memory_sync_1000_test - allocs/op 2097 allocs/op 2133 allocs/op 0.98
BenchmarkSync/memory_sync_10000_test 7498974 ns/op 734900 B/op 20204 allocs/op 7235109 ns/op 736742 B/op 20307 allocs/op 1.04
BenchmarkSync/memory_sync_10000_test - ns/op 7498974 ns/op 7235109 ns/op 1.04
BenchmarkSync/memory_sync_10000_test - B/op 734900 B/op 736742 B/op 1.00
BenchmarkSync/memory_sync_10000_test - allocs/op 20204 allocs/op 20307 allocs/op 0.99
BenchmarkTextEditing 5662678307 ns/op 3982595936 B/op 20647696 allocs/op 4834668753 ns/op 3982585728 B/op 20647680 allocs/op 1.17
BenchmarkTextEditing - ns/op 5662678307 ns/op 4834668753 ns/op 1.17
BenchmarkTextEditing - B/op 3982595936 B/op 3982585728 B/op 1.00
BenchmarkTextEditing - allocs/op 20647696 allocs/op 20647680 allocs/op 1.00
BenchmarkTree/10000_vertices_to_protobuf 3601778 ns/op 6263000 B/op 70025 allocs/op 3450040 ns/op 6262995 B/op 70025 allocs/op 1.04
BenchmarkTree/10000_vertices_to_protobuf - ns/op 3601778 ns/op 3450040 ns/op 1.04
BenchmarkTree/10000_vertices_to_protobuf - B/op 6263000 B/op 6262995 B/op 1.00
BenchmarkTree/10000_vertices_to_protobuf - allocs/op 70025 allocs/op 70025 allocs/op 1
BenchmarkTree/10000_vertices_from_protobuf 165384139 ns/op 442172597 B/op 290039 allocs/op 158770717 ns/op 442172737 B/op 290052 allocs/op 1.04
BenchmarkTree/10000_vertices_from_protobuf - ns/op 165384139 ns/op 158770717 ns/op 1.04
BenchmarkTree/10000_vertices_from_protobuf - B/op 442172597 B/op 442172737 B/op 1.00
BenchmarkTree/10000_vertices_from_protobuf - allocs/op 290039 allocs/op 290052 allocs/op 1.00
BenchmarkTree/20000_vertices_to_protobuf 7974975 ns/op 12716869 B/op 140028 allocs/op 7705317 ns/op 12716922 B/op 140028 allocs/op 1.03
BenchmarkTree/20000_vertices_to_protobuf - ns/op 7974975 ns/op 7705317 ns/op 1.03
BenchmarkTree/20000_vertices_to_protobuf - B/op 12716869 B/op 12716922 B/op 1.00
BenchmarkTree/20000_vertices_to_protobuf - allocs/op 140028 allocs/op 140028 allocs/op 1
BenchmarkTree/20000_vertices_from_protobuf 726014914 ns/op 1697268024 B/op 580044 allocs/op 693450670 ns/op 1697263896 B/op 580044 allocs/op 1.05
BenchmarkTree/20000_vertices_from_protobuf - ns/op 726014914 ns/op 693450670 ns/op 1.05
BenchmarkTree/20000_vertices_from_protobuf - B/op 1697268024 B/op 1697263896 B/op 1.00
BenchmarkTree/20000_vertices_from_protobuf - allocs/op 580044 allocs/op 580044 allocs/op 1
BenchmarkTree/30000_vertices_to_protobuf 12969203 ns/op 19318415 B/op 210030 allocs/op 12327261 ns/op 19318452 B/op 210030 allocs/op 1.05
BenchmarkTree/30000_vertices_to_protobuf - ns/op 12969203 ns/op 12327261 ns/op 1.05
BenchmarkTree/30000_vertices_to_protobuf - B/op 19318415 B/op 19318452 B/op 1.00
BenchmarkTree/30000_vertices_to_protobuf - allocs/op 210030 allocs/op 210030 allocs/op 1
BenchmarkTree/30000_vertices_from_protobuf 1702525182 ns/op 3752036872 B/op 870053 allocs/op 1682902416 ns/op 3752680264 B/op 870104 allocs/op 1.01
BenchmarkTree/30000_vertices_from_protobuf - ns/op 1702525182 ns/op 1682902416 ns/op 1.01
BenchmarkTree/30000_vertices_from_protobuf - B/op 3752036872 B/op 3752680264 B/op 1.00
BenchmarkTree/30000_vertices_from_protobuf - allocs/op 870053 allocs/op 870104 allocs/op 1.00

This comment was automatically generated by workflow using github-action-benchmark.

hackerwins added a commit that referenced this pull request May 8, 2024
We will use maxCreatedAtMapByActor until the introduction of
VersionVector. #800
Copy link

coderabbitai bot commented Jul 15, 2024

Walkthrough

The pull request introduces significant enhancements to the handling of version vectors across multiple components of the Yorkie application. Key changes include the addition of version vector fields in various data structures, modifications to conversion functions, and updates to the OpenAPI specifications. The migration scripts for MongoDB are also included to accommodate these changes. Overall, the modifications aim to improve the management of document states in collaborative editing scenarios by integrating version vectors into the document processing and synchronization logic.

Changes

File Path Change Summary
admin/client.go Modified ListChangeSummaries to handle version vectors using converter.FromVersionVector and update document creation.
api/converter/from_pb.go Enhanced conversion functions to include VersionVector handling and added FromVersionVector function.
api/converter/to_pb.go Updated conversion functions to include error handling and added ToVersionVector for Protobuf representation.
api/docs/yorkie/v1/admin.openapi.yaml Modified OpenAPI spec to include versionVector in ChangeID and GetSnapshotMetaResponse, and added new VersionVector schema.
api/docs/yorkie/v1/resources.openapi.yaml Added versionVector to ChangeID and ChangePack schemas, and introduced VersionVector schema.
api/docs/yorkie/v1/yorkie.openapi.yaml Updated ChangeID and ChangePack schemas to include versionVector and added new schemas for version vectors.
api/yorkie/v1/admin.proto Added version_vector field to GetSnapshotMetaResponse.
api/yorkie/v1/resources.proto Updated ChangePack and ChangeID messages to include version_vector, and added new VersionVector message.
cmd/yorkie/migration.go Implemented MongoDB migration logic for version vectors, including CLI for managing migrations.
design/garbage-collection.md Updated document to reflect version vector changes in garbage collection mechanisms.
design/version-vector.md Introduced detailed explanations of version vectors and their integration with logical clocks.
pkg/document/change/change.go Added AfterOrEqual method to Change struct for version comparison.
pkg/document/change/id.go Enhanced ID struct with versionVector field and updated related methods for synchronization.
pkg/document/change/pack.go Removed MinSyncedTicket and added VersionVector field to Pack struct.
pkg/document/crdt/root.go Updated GarbageCollect method to accept VersionVector instead of Ticket.
pkg/document/crdt/root_test.go Updated tests to use helper.MaxVersionVector() for garbage collection checks.
pkg/document/document.go Modified GarbageCollect method to accept VersionVector and added new VersionVector method.
pkg/document/document_test.go Added tests for version vector functionality and updated garbage collection tests.
pkg/document/gc_test.go Updated garbage collection tests to utilize version vectors.
pkg/document/internal_document.go Updated methods to handle version vectors in document processing.
pkg/document/time/actor_id.go Changed ActorID struct to use a new actorID type for internal representation.
pkg/document/time/ticket.go Updated documentation for Ticket struct to clarify its purpose.
pkg/document/time/version_vector.go Introduced new VersionVector type with methods for managing versioning in documents.
server/backend/database/change_info.go Added VersionVector field to ChangeInfo struct for BSON serialization.
server/backend/database/database.go Introduced new methods in the Database interface for handling version vectors.
server/backend/database/memory/database.go Enhanced methods for managing version vectors in memory database operations.
server/backend/database/memory/indexes.go Added new table and indexes for version vectors in the memory database schema.
server/backend/database/mongo/client.go Updated methods for managing version vectors in MongoDB client operations.
server/backend/database/mongo/indexes.go Added new collection for version vectors in MongoDB schema.
server/backend/database/mongo/registry.go Introduced BSON encoding/decoding for time.VersionVector.
server/backend/database/mongo/registry_test.go Enhanced tests for BSON encoding/decoding of version vectors.
server/backend/database/snapshot_info.go Added VersionVector field to SnapshotInfo struct and updated time handling.
server/backend/database/testcases/testcases.go Updated test cases to include version vector parameters in method calls.
server/backend/database/version_vector.go Introduced VersionVectorInfo struct for managing version vectors in the database.
server/clients/clients.go Updated Deactivate method to include version vector updates.
server/packs/packs.go Updated PushPull and BuildInternalDocForServerSeq methods to handle version vectors.
server/packs/pushpull.go Modified pullSnapshot to include VersionVector in the returned ServerPack.
server/packs/serverpacks.go Replaced MinSyncedTicket with VersionVector in ServerPack struct and updated Protobuf conversion methods.
server/packs/snapshots.go Updated storeSnapshot method to use minSyncedVersionVector.
server/rpc/admin_server.go Added VersionVector to the response of GetSnapshotMeta.
server/rpc/testcases/testcases.go Enhanced tests to validate version vector functionalities in various operations.
test/bench/document_bench_test.go Updated benchmarks to use helper.MaxVersionVector() for garbage collection.
test/helper/helper.go Introduced MaxVersionVector function to compute maximum version vectors for actors.
test/integration/array_test.go Updated assertions to use version vectors for garbage collection checks.
test/integration/gc_test.go Enhanced garbage collection tests with new version vector checks.
test/integration/object_test.go Updated assertions to use version vectors for garbage collection checks.
migrations/v0.5.3/add-version-vector.go Added migration functionality to include version vectors in the "changes" collection.
migrations/v0.5.3/drop-snapshots.go Introduced functionality to drop the "snapshots" collection during migration.
migrations/v0.5.3/main.go Created a migration package for version 0.5.3, orchestrating version vector updates and snapshot drops.

Possibly related PRs

Suggested reviewers

  • hackerwins: Suggested for review based on familiarity with the codebase and recent changes.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@hackerwins
Copy link
Member Author

@coderabbitai resume

Copy link

coderabbitai bot commented Jul 15, 2024

Actions performed

Reviews resumed.

Copy link

coderabbitai bot commented Jul 15, 2024

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@hackerwins hackerwins marked this pull request as ready for review July 15, 2024 09:25
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Outside diff range, codebase verification and nitpick comments (3)
api/yorkie/v1/resources.proto (1)

46-49: Issue Found: New fields in ChangePack are not being utilized.

The new fields snapshot_version_vector and min_synced_version_vector in the ChangePack message are defined but not used in the codebase. Ensure these fields are properly integrated and handled in the logic where ChangePack is utilized.

  • api/yorkie/v1/resources.proto
  • api/yorkie/v1/resources.pb.go
  • api/docs/yorkie/v1/resources.openapi.yaml
  • api/docs/yorkie/v1/yorkie.openapi.yaml
Analysis chain

LGTM! Verify the handling of the new fields.

The ChangePack message has new fields: VersionVector snapshot_version_vector and VersionVector min_synced_version_vector. Ensure that these fields are handled correctly in the codebase.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all usages of `ChangePack` handle the new fields correctly.

# Test: Search for the message usage. Expect: Proper handling of the new fields.
rg --type proto -A 5 $'ChangePack'

Length of output: 63


Script:

#!/bin/bash
# Description: Verify all usages of `ChangePack` handle the new fields correctly.

# Test: Search for the message usage. Expect: Proper handling of the new fields.
rg -A 5 'ChangePack'

Length of output: 69068


Script:

#!/bin/bash
# Description: Verify the correct handling of new fields in ChangePack.

# Test: Search for the new fields in ChangePack. Expect: Proper handling of the new fields.
rg -A 5 'snapshot_version_vector|min_synced_version_vector'

Length of output: 4285


Script:

#!/bin/bash
# Description: Verify the correct handling of new fields in ChangePack.

# Test: Search for the usage of the new fields in ChangePack. Expect: Proper handling of the new fields.
rg 'snapshot_version_vector|min_synced_version_vector'

Length of output: 1189

server/backend/database/mongo/client.go (1)

Line range hint 1104-1146:
LGTM! Updated UpdateAndFindMinSyncedTime to return VersionVector.

The update to return a VersionVector aligns with the PR's objective to handle version vectors. The TODO comment indicates that the implementation for building the version vector is pending.

Do you need assistance with implementing the version vector building logic?

api/docs/yorkie/v1/resources.openapi.yaml (1)

3-3: Update the description for clarity.

The description is clear and provides a good overview of Yorkie. However, consider adding more details about the new VersionVector feature to provide better context.

-  description: Yorkie is an open source document store for building collaborative
+  description: Yorkie is an open-source document store for building collaborative editing applications. The latest version introduces the `VersionVector` feature for better tracking and synchronization of document changes.
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between b468f8b and c815a09.

Files ignored due to path filters (2)
  • api/yorkie/v1/admin.pb.go is excluded by !**/*.pb.go
  • api/yorkie/v1/resources.pb.go is excluded by !**/*.pb.go
Files selected for processing (32)
  • admin/client.go (1 hunks)
  • api/converter/from_pb.go (2 hunks)
  • api/converter/to_pb.go (2 hunks)
  • api/docs/yorkie/v1/admin.openapi.yaml (53 hunks)
  • api/docs/yorkie/v1/resources.openapi.yaml (63 hunks)
  • api/docs/yorkie/v1/yorkie.openapi.yaml (49 hunks)
  • api/yorkie/v1/admin.proto (1 hunks)
  • api/yorkie/v1/resources.proto (2 hunks)
  • pkg/document/change/change.go (1 hunks)
  • pkg/document/change/id.go (5 hunks)
  • pkg/document/change/pack.go (2 hunks)
  • pkg/document/crdt/root_test.go (1 hunks)
  • pkg/document/document.go (2 hunks)
  • pkg/document/document_test.go (3 hunks)
  • pkg/document/internal_document.go (8 hunks)
  • pkg/document/time/actor_id.go (1 hunks)
  • pkg/document/time/ticket.go (1 hunks)
  • pkg/document/time/vection_vector.go (1 hunks)
  • server/backend/database/change_info.go (2 hunks)
  • server/backend/database/database.go (1 hunks)
  • server/backend/database/memory/database.go (7 hunks)
  • server/backend/database/mongo/client.go (5 hunks)
  • server/backend/database/mongo/registry.go (3 hunks)
  • server/backend/database/mongo/registry_test.go (1 hunks)
  • server/backend/database/snapshot_info.go (3 hunks)
  • server/packs/packs.go (3 hunks)
  • server/packs/pushpull.go (1 hunks)
  • server/packs/serverpacks.go (3 hunks)
  • server/packs/snapshots.go (1 hunks)
  • server/rpc/admin_server.go (1 hunks)
  • server/rpc/testcases/testcases.go (4 hunks)
  • test/integration/gc_test.go (1 hunks)
Files not summarized due to errors (1)
  • api/docs/yorkie/v1/admin.openapi.yaml: Error: Message exceeds token limit
Files skipped from review due to trivial changes (2)
  • api/docs/yorkie/v1/yorkie.openapi.yaml
  • pkg/document/time/ticket.go
Additional comments not posted (88)
server/backend/database/snapshot_info.go (3)

20-23: LGTM! Import alias change and new import.

The alias change avoids naming conflicts and the new import is necessary for handling version vectors.


43-44: LGTM! Added VersionVector field to SnapshotInfo struct.

The new field is necessary for tracking version vectors in snapshots.


60-67: LGTM! Updated DeepCopy method to include VersionVector.

The update ensures that the VersionVector field is copied when creating a deep copy of SnapshotInfo.

pkg/document/change/pack.go (2)

38-44: LGTM! Added SnapshotVersionVector and MinSyncedVersionVector fields to Pack struct.

The new fields are necessary for handling version vectors in change packs.


68-68: LGTM! Updated NewPack function to initialize new fields.

The update ensures that the new fields are properly initialized when creating a new Pack.

pkg/document/time/vection_vector.go (4)

27-29: LGTM! Introduced new VersionVector type.

The new type is necessary for managing version vectors.


31-34: LGTM! Introduced NewVersionVector function.

The new function is necessary for creating instances of VersionVector.


36-44: LGTM! Introduced Set and VersionOf methods.

The new methods are necessary for managing versions within a VersionVector.


46-98: LGTM! Introduced DeepCopy, Marshal, and CausallyAfter methods.

The new methods are necessary for deep copying, serializing, and comparing VersionVector instances.

server/packs/snapshots.go (1)

80-80: LGTM! Updated storeSnapshot function to include VersionVector parameter.

The update ensures that the version vector is properly handled when storing snapshots.

server/backend/database/mongo/registry_test.go (2)

38-47: LGTM!

The test case for types.ID correctly verifies the marshaling and unmarshaling process.


50-65: LGTM!

The test case for versionVector correctly verifies the marshaling and unmarshaling process.

pkg/document/change/change.go (1)

116-119: LGTM!

The CausallyAfter method correctly checks if the current change is causally after another change.

server/backend/database/change_info.go (2)

43-50: LGTM!

The VersionVector field is correctly added to the ChangeInfo struct.


97-97: LGTM!

The ToChange method correctly initializes the changeID with the VersionVector field.

api/yorkie/v1/admin.proto (1)

143-143: LGTM!

The VersionVector field is correctly added to the GetSnapshotMetaResponse message.

pkg/document/time/actor_id.go (1)

59-60: Good abstraction of internal type.

Changing the internal type of ActorID to bytes actorID improves encapsulation and reduces the risk of errors.

server/backend/database/mongo/registry.go (5)

27-31: Necessary imports added.

The new imports for proto, converter, and api/yorkie/v1 are necessary for handling the VersionVector type.


37-37: Added type for VersionVector.

The new type tVersionVector is required for encoding and decoding the VersionVector.


52-52: Registered VersionVector decoder.

The registration of the VersionVector decoder is necessary for decoding the VersionVector from BSON.


57-57: Registered VersionVector encoder.

The registration of the VersionVector encoder is necessary for encoding the VersionVector to BSON.


87-135: Added functions for VersionVector encoding and decoding.

The versionVectorEncoder and versionVectorDecoder functions are crucial for converting VersionVector to and from BSON format.

server/packs/serverpacks.go (2)

46-50: Added fields for version vectors.

The new fields SnapshotVersionVector and MinSyncedVersionVector in the ServerPack struct are necessary for handling version vectors.


Line range hint 102-150:
Updated ToPBChangePack method.

The ToPBChangePack method has been updated to handle SnapshotVersionVector, which is necessary for converting the ServerPack to the Protobuf format.

pkg/document/change/id.go (5)

31-31: Initialized InitialID with VersionVector.

The InitialID now includes the InitialVersionVector, ensuring proper tracking from the initial state.


54-58: Added versionVector field to ID struct.

The versionVector field in the ID struct is necessary for detecting the relationship between changes.


67-74: Updated NewID function.

The NewID function now includes the versionVector parameter, which is necessary for creating an ID with the version vector.


80-87: Updated Next method.

The Next method now includes the versionVector, ensuring that the next ID includes the updated version vector.


Line range hint 100-168:
Updated methods to handle versionVector.

The SyncClocks, SetClocks, SetActor, and SetServerSeq methods, and the VersionVector and CausallyAfter methods have been updated to handle the versionVector, ensuring proper synchronization and comparison of ID instances.

server/packs/pushpull.go (1)

174-177: LGTM! But verify the function usage in the codebase.

The code changes are approved.

However, ensure that all function calls to pullSnapshot correctly handle the new VersionVector logic.

server/packs/packs.go (2)

Line range hint 133-142:
LGTM! But verify the function usage in the codebase.

The code changes are approved.

However, ensure that all function calls to PushPull correctly handle the new minSyncedVersionVector logic.


236-236: LGTM! But verify the function usage in the codebase.

The code changes are approved.

However, ensure that all function calls to BuildDocumentForServerSeq correctly handle the new VersionVector logic.

admin/client.go (2)

339-339: LGTM! But verify the function usage in the codebase.

The code changes are approved.

However, ensure that all function calls to NewInternalDocumentFromSnapshot correctly handle the new VersionVector logic.

Verification successful

Let's gather more context around the calls to NewInternalDocumentFromSnapshot to ensure that the VersionVector logic is correctly handled.


Function usage verified

The function NewInternalDocumentFromSnapshot is correctly handling the new VersionVector logic in all identified instances.

  • server/packs/snapshots.go
  • server/packs/packs.go
  • admin/client.go
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all function calls to `NewInternalDocumentFromSnapshot` handle the new `VersionVector` logic.

# Test: Search for the function usage. Expect: Correct handling of `VersionVector`.
rg --type go -A 5 $'NewInternalDocumentFromSnapshot'

Length of output: 1398


Script:

#!/bin/bash
# Description: Gather more context around the calls to `NewInternalDocumentFromSnapshot` to verify correct handling of `VersionVector`.

# Fetching additional lines around the function calls to understand the context.
rg --type go -A 10 -B 10 $'NewInternalDocumentFromSnapshot'

Length of output: 3831


330-334: LGTM! But verify the function usage in the codebase.

The code changes are approved.

However, ensure that all function calls to ListChangeSummaries correctly handle the new VersionVector logic.

server/backend/database/database.go (1)

249-256: LGTM! Verify the handling of the new return type.

The function signature has been updated to return *time.VersionVector. Ensure that all calls to this function handle the new return type correctly.

Verification successful

The new return type is handled correctly.

The function UpdateAndFindMinSyncedTime returns *time.VersionVector, which is correctly assigned to minSyncedVersionVector and subsequently used in respPack.MinSyncedVersionVector.

  • server/packs/packs.go: minSyncedVersionVector is correctly assigned and used.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all calls to `UpdateAndFindMinSyncedTime` handle the new return type correctly.

# Test: Search for the function usage. Expect: Proper handling of the new return type.
rg --type go -A 5 $'UpdateAndFindMinSyncedTime'

Length of output: 2195


Script:

#!/bin/bash
# Description: Verify the handling of the new return type `*time.VersionVector` in `server/packs/packs.go`.

# Test: Search for the usage of `minSyncedVersionVector` in `server/packs/packs.go`.
rg --type go -A 10 'minSyncedVersionVector' server/packs/packs.go

Length of output: 639

api/yorkie/v1/resources.proto (2)

68-69: LGTM! Verify the handling of the new message.

The VersionVector message is newly added. Ensure that this message is handled correctly in the codebase.

Verification successful

LGTM! The VersionVector message is handled correctly across the codebase.

The search results indicate that VersionVector is integrated and used appropriately in various contexts, including initialization, serialization, deserialization, and operations. The presence of tests further confirms its correct handling.

  • pkg/document/document_test.go
  • pkg/document/internal_document.go
  • pkg/document/time/vection_vector.go
  • server/rpc/admin_server.go
  • server/rpc/testcases/testcases.go
  • pkg/document/change/pack.go
  • server/packs/serverpacks.go
  • server/backend/database/change_info.go
  • server/backend/database/snapshot_info.go
  • server/backend/database/mongo/registry.go
  • server/backend/database/memory/database.go
  • admin/client.go
  • api/yorkie/v1/resources.proto
  • api/converter/to_pb.go
  • api/converter/from_pb.go
  • api/docs/yorkie/v1/resources.openapi.yaml
  • api/docs/yorkie/v1/yorkie.openapi.yaml
  • api/docs/yorkie/v1/admin.openapi.yaml
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all usages of `VersionVector` handle the new message correctly.

# Test: Search for the message usage. Expect: Proper handling of the new message.
rg --type proto -A 5 $'VersionVector'

Length of output: 66


Script:

#!/bin/bash
# Description: Verify all usages of `VersionVector` handle the new message correctly.

# Test: Search for the message usage. Expect: Proper handling of the new message.
rg -A 5 'VersionVector'

Length of output: 47750


65-66: LGTM! Verify the handling of the new field.

The ChangeID message has a new field: VersionVector version_vector. Ensure that this field is handled correctly in the codebase.

Verification successful

The new VersionVector field in ChangeID is handled correctly.

The rg results show that the VersionVector field is used appropriately in various parts of the codebase, including test cases and conversion functions.

  • server/rpc/testcases/testcases.go
  • server/packs/serverpacks.go
  • api/converter/from_pb.go
  • api/converter/to_pb.go
  • api/yorkie/v1/resources.proto
  • api/docs/yorkie/v1/resources.openapi.yaml
  • api/docs/yorkie/v1/yorkie.openapi.yaml
  • api/docs/yorkie/v1/admin.openapi.yaml
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all usages of `ChangeID` handle the new field correctly.

# Test: Search for the message usage. Expect: Proper handling of the new field.
rg --type proto -A 5 $'ChangeID'

Length of output: 61


Script:

#!/bin/bash
# Description: Verify all usages of `ChangeID` handle the new field correctly.

# Test 1: Search for the message usage in all files. Expect: Proper handling of the new field.
rg -A 5 $'ChangeID'

# Test 2: Use ast-grep to search for `ChangeID` in the codebase.
ast-grep --lang proto --pattern $'message ChangeID { $$$ }'

Length of output: 14450

server/rpc/admin_server.go (1)

265-273: LGTM! Verify the handling of the new field.

The GetSnapshotMeta function has been updated to handle VersionVector in the response. Ensure that this field is handled correctly in the codebase.

Verification successful

Verify the handling of the VersionVector field in the client code and protobuf definitions.

The GetSnapshotMeta function has been updated to handle VersionVector in the response. Ensure that this field is handled correctly in the client code and that the protobuf definitions are consistent.


Verification Successful: VersionVector field is handled correctly.

The GetSnapshotMeta function has been updated to handle VersionVector in the response. The client code and protobuf definitions are consistent and correctly handle the VersionVector field.

  • admin/client.go: Proper handling of the VersionVector field.
  • api/yorkie/v1/admin.pb.go: Consistent protobuf definitions for VersionVector.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all usages of `GetSnapshotMeta` handle the new field correctly.

# Test: Search for the function usage. Expect: Proper handling of the new field.
rg --type go -A 5 $'GetSnapshotMeta'

Length of output: 16542


Script:

#!/bin/bash
# Description: Verify the handling of the `VersionVector` field in the client code and protobuf definitions.

# Test: Search for the handling of the `VersionVector` field in the client code. Expect: Proper handling of the field.
rg --type go -A 5 'VersionVector' admin/client.go

# Test: Verify the protobuf definitions for consistency. Expect: Consistent definitions for `VersionVector`.
rg --type go -A 5 'VersionVector' api/yorkie/v1/admin.pb.go

Length of output: 1560

pkg/document/internal_document.go (4)

149-149: LGTM! Verify the handling of the new field.

The ApplyChangePack method has been updated to handle SnapshotVersionVector. Ensure that this field is handled correctly in the codebase.

Verification successful

Verified: The SnapshotVersionVector is handled correctly in the applySnapshot method within the ApplyChangePack implementation.

  • The applySnapshot method in pkg/document/internal_document.go sets the changeID clocks using the SnapshotVersionVector, ensuring proper handling of the new field.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all usages of `ApplyChangePack` handle the new field correctly.

# Test: Search for the method usage. Expect: Proper handling of the new field.
rg --type go -A 5 $'ApplyChangePack'

Length of output: 5959


Script:

#!/bin/bash
# Description: Verify the implementation and usage of `applySnapshot` method to ensure `SnapshotVersionVector` is handled correctly.

# Search for the definition of `applySnapshot` method.
ast-grep --lang go --pattern 'func (d *$_) applySnapshot($_, $_, $_) $_' 

# Search for all usages of `applySnapshot` method.
rg --type go -A 5 'applySnapshot'

Length of output: 2285


223-226: LGTM! Verify the handling of the new method.

The VersionVector method is newly added. Ensure that this method is handled correctly in the codebase.

Verification successful

The VersionVector method is correctly integrated and handled across the codebase.

The new method VersionVector is used in various contexts such as snapshot handling, change application, and protobuf conversions, and appears to be correctly integrated.

  • server/packs/snapshots.go
  • server/packs/pushpull.go
  • server/rpc/testcases/testcases.go
  • server/rpc/admin_server.go
  • server/packs/packs.go
  • server/backend/database/change_info.go
  • server/backend/database/snapshot_info.go
  • server/packs/serverpacks.go
  • server/backend/database/mongo/registry_test.go
  • server/backend/database/memory/database.go
  • pkg/document/internal_document.go
  • pkg/document/document.go
  • pkg/document/document_test.go
  • pkg/document/time/vection_vector.go
  • pkg/document/change/pack.go
  • pkg/document/change/id.go
  • api/converter/to_pb.go
  • api/yorkie/v1/admin.pb.go
  • api/converter/from_pb.go
  • api/yorkie/v1/resources.pb.go
  • admin/client.go
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all usages of `VersionVector` handle the new method correctly.

# Test: Search for the method usage. Expect: Proper handling of the new method.
rg --type go -A 5 $'VersionVector'

Length of output: 39880


Line range hint 111-126:
LGTM! Verify the handling of the new parameter.

The NewInternalDocumentFromSnapshot method has been updated to include a VersionVector parameter. Ensure that this parameter is handled correctly in the codebase.

Verification successful

Verified: The VersionVector parameter is handled correctly.

The NewInternalDocumentFromSnapshot method correctly passes the VersionVector parameter in all identified usages:

  • server/packs/packs.go
  • server/packs/snapshots.go
  • admin/client.go

No issues found in the handling of the new parameter.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all usages of `NewInternalDocumentFromSnapshot` handle the new parameter correctly.

# Test: Search for the method usage. Expect: Proper handling of the new parameter.
rg --type go -A 5 $'NewInternalDocumentFromSnapshot'

Length of output: 1398


Script:

#!/bin/bash
# Verify the context around the usages of `NewInternalDocumentFromSnapshot` to ensure the correct `VersionVector` is passed.

# Inspect the usage in server/packs/packs.go
rg --type go -A 10 -B 10 'NewInternalDocumentFromSnapshot' server/packs/packs.go

# Inspect the usage in server/packs/snapshots.go
rg --type go -A 10 -B 10 'NewInternalDocumentFromSnapshot' server/packs/snapshots.go

# Inspect the usage in admin/client.go
rg --type go -A 10 -B 10 'NewInternalDocumentFromSnapshot' admin/client.go

Length of output: 1309


248-258: LGTM! Verify the handling of the new parameter.

The applySnapshot method has been updated to include a VersionVector parameter. Ensure that this parameter is handled correctly in the codebase.

Verification successful

LGTM! Verify the handling of the new parameter.

The applySnapshot method has been updated to include a VersionVector parameter. The SnapshotVersionVector is correctly instantiated and used across the codebase, ensuring proper handling of the new parameter.

  • server/packs/serverpacks.go
  • pkg/document/change/pack.go
  • api/yorkie/v1/resources.pb.go
  • server/packs/pushpull.go
  • pkg/document/document_test.go
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all usages of `applySnapshot` handle the new parameter correctly.

# Test: Search for the method usage. Expect: Proper handling of the new parameter.
rg --type go -A 5 $'applySnapshot'

Length of output: 1233


Script:

#!/bin/bash
# Description: Verify the definition and usage of `SnapshotVersionVector`

# Search for the definition of `SnapshotVersionVector`
rg --type go 'SnapshotVersionVector'

# Search for the instantiation of `SnapshotVersionVector`
rg --type go 'SnapshotVersionVector ='

Length of output: 1717

pkg/document/document.go (2)

188-188: Verify the handling of SnapshotVersionVector.

Ensure that the SnapshotVersionVector is correctly handled and integrated with the existing logic in applySnapshot.


296-299: LGTM! Ensure the integration of the new method.

The new method VersionVector looks good. Ensure that it is correctly integrated and used in the codebase.

api/converter/to_pb.go (2)

158-170: Verify the handling of VersionVector in ToChangeID.

Ensure that the VersionVector is correctly handled and integrated with the existing logic in ToChangeID.


172-186: LGTM! Ensure the integration of the new method.

The new method ToVersionVector looks good. Ensure that it is correctly integrated and used in the codebase.

pkg/document/document_test.go (2)

571-571: Verify the handling of SnapshotVersionVector in the test case.

Ensure that the SnapshotVersionVector is correctly handled and integrated with the existing logic in the test case.


585-641: LGTM! Ensure the coverage of the new test case.

The new test case version vector test looks good. Ensure that it provides adequate coverage for version vectors.

api/converter/from_pb.go (3)

105-121: Correctly handle Snapshot and SnapshotVersionVector.

The changes correctly handle the Snapshot and SnapshotVersionVector fields. Ensure that the SnapshotVersionVector is properly validated and used in the rest of the codebase.


159-170: Correctly handle VersionVector in fromChangeID.

The changes correctly handle the VersionVector field in the fromChangeID function. Ensure that the VersionVector is properly validated and used in the rest of the codebase.


174-192: New function FromVersionVector correctly converts Protobuf VersionVector to model format.

The function correctly converts the VersionVector and handles potential errors. The TODO comment regarding old clients is noted.

server/rpc/testcases/testcases.go (1)

Line range hint 311-366:
Correctly handle VersionVector in RunPushPullChangeTest.

The changes correctly handle the VersionVector field in the RunPushPullChangeTest function. Ensure that the test cases cover all relevant scenarios for VersionVector.

server/backend/database/memory/database.go (5)

850-851: Addition of VersionVector to ChangeInfo.

The VersionVector field has been added to the ChangeInfo struct. Ensure that this field is correctly populated and utilized in other parts of the code.


1014-1020: Addition of VersionVector to SnapshotInfo.

The VersionVector field has been added to the SnapshotInfo struct. Ensure that this field is correctly populated and utilized in other parts of the code.


1075-1080: Addition of VersionVector to SnapshotInfo.

The VersionVector field has been added to the SnapshotInfo struct. Ensure that this field is correctly populated and utilized in other parts of the code.


1091-1093: Initialization of SnapshotInfo with VersionVector.

A new SnapshotInfo struct is initialized with a VersionVector. Ensure that this initialization is correct and consistent with the rest of the code.


Line range hint 1128-1173:
Addition of UpdateAndFindMinSyncedTime function.

The new function UpdateAndFindMinSyncedTime includes handling for VersionVector. Ensure that the function logic is correct and the VersionVector is correctly utilized.

server/backend/database/mongo/client.go (3)

844-844: LGTM! Added version_vector field to CreateChangeInfos.

The addition of the version_vector field aligns with the PR's objective to handle version vectors for changes.


998-1004: LGTM! Added version_vector field to CreateSnapshotInfo.

The addition of the version_vector field aligns with the PR's objective to handle version vectors for snapshots.


1063-1063: LGTM! Initialized version_vector in FindClosestSnapshotInfo.

The initialization of the version_vector field ensures that it is always set, even when no snapshot is found.

api/docs/yorkie/v1/resources.openapi.yaml (24)

19-22: Ensure schema references are correct.

The schema references for connect.error are consistent and correctly updated.


31-47: Ensure enum values are correct and consistent.

The enum values for connect.error are correctly updated and consistent with the documentation.


Line range hint 166-185:
Ensure properties are correctly updated for yorkie.v1.Change.

The properties for yorkie.v1.Change are correctly updated to include the new VersionVector feature.


211-226: Ensure properties are correctly updated for yorkie.v1.ChangeID.

The properties for yorkie.v1.ChangeID are correctly updated to include the new VersionVector feature.


Line range hint 239-282:
Ensure properties are correctly updated for yorkie.v1.ChangePack.

The properties for yorkie.v1.ChangePack are correctly updated to include the new VersionVector feature.


298-299: Ensure properties are correctly updated for yorkie.v1.Checkpoint.

The properties for yorkie.v1.Checkpoint are correctly updated and consistent with the new VersionVector feature.


Line range hint 308-319:
Ensure properties are correctly updated for yorkie.v1.DocEvent.

The properties for yorkie.v1.DocEvent are correctly updated and consistent with the new VersionVector feature.


345-352: Ensure enum values are correct and consistent.

The enum values for yorkie.v1.DocEventType are correctly updated and consistent with the documentation.


Line range hint 360-387:
Ensure properties are correctly updated for yorkie.v1.DocumentSummary.

The properties for yorkie.v1.DocumentSummary are correctly updated and consistent with the new VersionVector feature.


Line range hint 399-496:
Ensure properties are correctly updated for yorkie.v1.JSONElement.

The properties for yorkie.v1.JSONElement are correctly updated and consistent with the new VersionVector feature.


Line range hint 508-528:
Ensure properties are correctly updated for yorkie.v1.JSONElement.JSONObject.

The properties for yorkie.v1.JSONElement.JSONObject are correctly updated and consistent with the new VersionVector feature.


540-552: Ensure properties are correctly updated for yorkie.v1.JSONElement.Primitive.

The properties for yorkie.v1.JSONElement.Primitive are correctly updated and consistent with the new VersionVector feature.


Line range hint 575-595:
Ensure properties are correctly updated for yorkie.v1.JSONElement.Text.

The properties for yorkie.v1.JSONElement.Text are correctly updated and consistent with the new VersionVector feature.


Line range hint 607-627:
Ensure properties are correctly updated for yorkie.v1.JSONElement.Tree.

The properties for yorkie.v1.JSONElement.Tree are correctly updated and consistent with the new VersionVector feature.


639-651: Ensure properties are correctly updated for yorkie.v1.JSONElementSimple.

The properties for yorkie.v1.JSONElementSimple are correctly updated and consistent with the new VersionVector feature.


Line range hint 679-684:
Ensure properties are correctly updated for yorkie.v1.NodeAttr.

The properties for yorkie.v1.NodeAttr are correctly updated and consistent with the new VersionVector feature.


696-750: Ensure properties are correctly updated for yorkie.v1.Operation.

The properties for yorkie.v1.Operation are correctly updated and consistent with the new VersionVector feature.


Line range hint 762-999:
Ensure properties are correctly updated for Operation subtypes.

The properties for various Operation subtypes are correctly updated and consistent with the new VersionVector feature.


1021-1039: Ensure properties are correctly updated for yorkie.v1.Operation.Style.

The properties for yorkie.v1.Operation.Style are correctly updated and consistent with the new VersionVector feature.


Line range hint 1087-1120:
Ensure properties are correctly updated for yorkie.v1.Operation.TreeEdit.

The properties for yorkie.v1.Operation.TreeEdit are correctly updated and consistent with the new VersionVector feature.


1166-1184: Ensure properties are correctly updated for yorkie.v1.Operation.TreeStyle.

The properties for yorkie.v1.Operation.TreeStyle are correctly updated and consistent with the new VersionVector feature.


1256-1262: Ensure properties are correctly updated for yorkie.v1.PresenceChange.

The properties for yorkie.v1.PresenceChange are correctly updated and consistent with the new VersionVector feature.


Line range hint 1303-1329:
Ensure properties are correctly updated for yorkie.v1.Project.

The properties for yorkie.v1.Project are correctly updated and consistent with the new VersionVector feature.


1341-1347: Ensure properties are correctly updated for yorkie.v1.RGANode.

The properties for yorkie.v1.RGANode are correctly updated and consistent with the new VersionVector feature.

api/docs/yorkie/v1/admin.openapi.yaml (4)

612-617: LGTM! Addition of versionVector property in ChangeID schema.

The addition of the versionVector property in the ChangeID schema looks correct and aligns with the PR objectives.


805-810: LGTM! Addition of versionVector property in GetSnapshotMetaResponse schema.

The addition of the versionVector property in the GetSnapshotMetaResponse schema looks correct and aligns with the PR objectives.


2036-2064: LGTM! Addition of VersionVector schema.

The addition of the new VersionVector schema looks correct and aligns with the PR objectives. The schema is well-defined with appropriate properties.


557-557: LGTM! Update to include ChangeID schema reference in Change schema.

The update to include a reference to the ChangeID schema in the Change schema looks correct and aligns with the PR objectives.

test/integration/gc_test.go Outdated Show resolved Hide resolved
pkg/document/crdt/root_test.go Outdated Show resolved Hide resolved
Copy link

coderabbitai bot commented Jul 15, 2024

Actions performed

Reviews resumed.

Copy link

coderabbitai bot commented Jul 15, 2024

Note

Currently processing new changes in this PR. This may take a few minutes, please wait...

Commits

Files that changed from the base of the PR and between b468f8b and c815a09.

Files ignored due to path filters (2)
  • api/yorkie/v1/admin.pb.go is excluded by !**/*.pb.go
  • api/yorkie/v1/resources.pb.go is excluded by !**/*.pb.go
Files selected for processing (32)
  • admin/client.go (1 hunks)
  • api/converter/from_pb.go (2 hunks)
  • api/converter/to_pb.go (2 hunks)
  • api/docs/yorkie/v1/admin.openapi.yaml (53 hunks)
  • api/docs/yorkie/v1/resources.openapi.yaml (63 hunks)
  • api/docs/yorkie/v1/yorkie.openapi.yaml (49 hunks)
  • api/yorkie/v1/admin.proto (1 hunks)
  • api/yorkie/v1/resources.proto (2 hunks)
  • pkg/document/change/change.go (1 hunks)
  • pkg/document/change/id.go (5 hunks)
  • pkg/document/change/pack.go (2 hunks)
  • pkg/document/crdt/root_test.go (1 hunks)
  • pkg/document/document.go (2 hunks)
  • pkg/document/document_test.go (3 hunks)
  • pkg/document/internal_document.go (8 hunks)
  • pkg/document/time/actor_id.go (1 hunks)
  • pkg/document/time/ticket.go (1 hunks)
  • pkg/document/time/vection_vector.go (1 hunks)
  • server/backend/database/change_info.go (2 hunks)
  • server/backend/database/database.go (1 hunks)
  • server/backend/database/memory/database.go (7 hunks)
  • server/backend/database/mongo/client.go (5 hunks)
  • server/backend/database/mongo/registry.go (3 hunks)
  • server/backend/database/mongo/registry_test.go (1 hunks)
  • server/backend/database/snapshot_info.go (3 hunks)
  • server/packs/packs.go (3 hunks)
  • server/packs/pushpull.go (1 hunks)
  • server/packs/serverpacks.go (3 hunks)
  • server/packs/snapshots.go (1 hunks)
  • server/rpc/admin_server.go (1 hunks)
  • server/rpc/testcases/testcases.go (4 hunks)
  • test/integration/gc_test.go (1 hunks)
________________________________
< Overly attached code reviewer. >
--------------------------------
 \
  \   (\__/)
      (•ㅅ•)
      /   づ

Tip

You can make CodeRabbit's review more strict using the `assertive` profile, if that's what you prefer.

Change the reviews.profile setting in your project's settings in CodeRabbit to assertive to make CodeRabbit's nitpick more issues in your PRs.


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Outside diff range, codebase verification and nitpick comments (1)
server/backend/database/memory/database.go (1)

Line range hint 1128-1173: Necessary function signature change and TODO comment.

The function signature change is necessary to accommodate the new return type. The TODO comment indicates that further work is needed to build the version vector.

Do you want me to help with building the version vector or open a GitHub issue to track this task?

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between b468f8b and c815a09.

Files ignored due to path filters (2)
  • api/yorkie/v1/admin.pb.go is excluded by !**/*.pb.go
  • api/yorkie/v1/resources.pb.go is excluded by !**/*.pb.go
Files selected for processing (32)
  • admin/client.go (1 hunks)
  • api/converter/from_pb.go (2 hunks)
  • api/converter/to_pb.go (2 hunks)
  • api/docs/yorkie/v1/admin.openapi.yaml (53 hunks)
  • api/docs/yorkie/v1/resources.openapi.yaml (63 hunks)
  • api/docs/yorkie/v1/yorkie.openapi.yaml (49 hunks)
  • api/yorkie/v1/admin.proto (1 hunks)
  • api/yorkie/v1/resources.proto (2 hunks)
  • pkg/document/change/change.go (1 hunks)
  • pkg/document/change/id.go (5 hunks)
  • pkg/document/change/pack.go (2 hunks)
  • pkg/document/crdt/root_test.go (1 hunks)
  • pkg/document/document.go (2 hunks)
  • pkg/document/document_test.go (3 hunks)
  • pkg/document/internal_document.go (8 hunks)
  • pkg/document/time/actor_id.go (1 hunks)
  • pkg/document/time/ticket.go (1 hunks)
  • pkg/document/time/vection_vector.go (1 hunks)
  • server/backend/database/change_info.go (2 hunks)
  • server/backend/database/database.go (1 hunks)
  • server/backend/database/memory/database.go (7 hunks)
  • server/backend/database/mongo/client.go (5 hunks)
  • server/backend/database/mongo/registry.go (3 hunks)
  • server/backend/database/mongo/registry_test.go (1 hunks)
  • server/backend/database/snapshot_info.go (3 hunks)
  • server/packs/packs.go (3 hunks)
  • server/packs/pushpull.go (1 hunks)
  • server/packs/serverpacks.go (3 hunks)
  • server/packs/snapshots.go (1 hunks)
  • server/rpc/admin_server.go (1 hunks)
  • server/rpc/testcases/testcases.go (4 hunks)
  • test/integration/gc_test.go (1 hunks)
Files not summarized due to errors (2)
  • api/docs/yorkie/v1/admin.openapi.yaml: Error: Message exceeds token limit
  • api/docs/yorkie/v1/yorkie.openapi.yaml: Error: Message exceeds token limit
Files skipped from review due to trivial changes (1)
  • pkg/document/crdt/root_test.go
Files skipped from review as they are similar to previous changes (23)
  • admin/client.go
  • api/converter/from_pb.go
  • api/converter/to_pb.go
  • api/docs/yorkie/v1/resources.openapi.yaml
  • api/yorkie/v1/admin.proto
  • api/yorkie/v1/resources.proto
  • pkg/document/change/change.go
  • pkg/document/change/id.go
  • pkg/document/change/pack.go
  • pkg/document/document.go
  • pkg/document/internal_document.go
  • pkg/document/time/actor_id.go
  • pkg/document/time/vection_vector.go
  • server/backend/database/change_info.go
  • server/backend/database/mongo/client.go
  • server/backend/database/mongo/registry_test.go
  • server/backend/database/snapshot_info.go
  • server/packs/packs.go
  • server/packs/pushpull.go
  • server/packs/serverpacks.go
  • server/packs/snapshots.go
  • server/rpc/testcases/testcases.go
  • test/integration/gc_test.go
Additional comments not posted (76)
pkg/document/time/ticket.go (1)

50-53: LGTM!

The comments provide clarity on the purpose and limitations of the Ticket struct.

server/backend/database/mongo/registry.go (4)

27-31: LGTM!

The new imports are necessary for handling VersionVector encoding and decoding.


37-37: LGTM!

The new variable is necessary for registering the VersionVector type with the BSON codec registry.


52-52: LGTM!

The decoder registration is necessary for decoding VersionVector from BSON.


57-57: LGTM!

The encoder registration is necessary for encoding VersionVector to BSON.

server/backend/database/database.go (1)

249-256: LGTM!

The changes to the UpdateAndFindMinSyncedTime method are necessary to support the new VersionVector functionality.

server/rpc/admin_server.go (1)

265-273: LGTM!

The changes to the GetSnapshotMeta method are necessary to include the VersionVector in the response.

pkg/document/document_test.go (2)

571-571: LGTM!

The inclusion of SnapshotVersionVector in the change pack is a good addition to maintain version consistency.


585-641: Well-structured tests for version vector functionality.

The tests comprehensively cover the creation, update, and verification of version vectors, including causal relationships and concurrent updates.

server/backend/database/memory/database.go (1)

850-851: Good addition of VersionVector to change info.

The inclusion of VersionVector in the change info is crucial for accurate change tracking.

api/docs/yorkie/v1/yorkie.openapi.yaml (61)

3-3: Ensure the description is accurate and complete.

The description provides a brief overview of the Yorkie service. Verify that it accurately reflects the current state of the service.


8-11: Verify server URLs.

Ensure that the URLs for the production and local servers are correct and accessible.


17-17: Check the request body reference.

Ensure that the reference to ActivateClientRequest is correct and the schema is defined in the components section.


20-22: Check the response references.

Ensure that the references to ActivateClientResponse and connect.error are correct and the schemas are defined in the components section.


24-24: Verify the tag.

Ensure that the tag yorkie.v1.YorkieService is correctly defined and used consistently across the API specification.


115-118: Check the schema reference for ActivateClientRequest.

Ensure that the schema for ActivateClientRequest is correctly defined and referenced.


124-127: Check the schema reference for AttachDocumentRequest.

Ensure that the schema for AttachDocumentRequest is correctly defined and referenced.


133-136: Check the schema reference for BroadcastRequest.

Ensure that the schema for BroadcastRequest is correctly defined and referenced.


142-145: Check the schema reference for DeactivateClientRequest.

Ensure that the schema for DeactivateClientRequest is correctly defined and referenced.


151-154: Check the schema reference for DetachDocumentRequest.

Ensure that the schema for DetachDocumentRequest is correctly defined and referenced.


160-163: Check the schema reference for PushPullChangesRequest.

Ensure that the schema for PushPullChangesRequest is correctly defined and referenced.


169-172: Check the schema reference for RemoveDocumentRequest.

Ensure that the schema for RemoveDocumentRequest is correctly defined and referenced.


182-185: Check the schema reference for connect.error.

Ensure that the schema for connect.error is correctly defined and referenced.


191-194: Check the schema reference for ActivateClientResponse.

Ensure that the schema for ActivateClientResponse is correctly defined and referenced.


200-203: Check the schema reference for AttachDocumentResponse.

Ensure that the schema for AttachDocumentResponse is correctly defined and referenced.


209-212: Check the schema reference for BroadcastResponse.

Ensure that the schema for BroadcastResponse is correctly defined and referenced.


218-221: Check the schema reference for DeactivateClientResponse.

Ensure that the schema for DeactivateClientResponse is correctly defined and referenced.


227-230: Check the schema reference for DetachDocumentResponse.

Ensure that the schema for DetachDocumentResponse is correctly defined and referenced.


236-239: Check the schema reference for PushPullChangesResponse.

Ensure that the schema for PushPullChangesResponse is correctly defined and referenced.


245-248: Check the schema reference for RemoveDocumentResponse.

Ensure that the schema for RemoveDocumentResponse is correctly defined and referenced.


255-275: Verify the connect.error schema.

Ensure that the connect.error schema is correctly defined and includes all necessary properties and examples.


308-310: Check the schema reference for ChangePack in AttachDocumentRequest.

Ensure that the schema for ChangePack is correctly defined and referenced.


325-327: Check the schema reference for ChangePack in AttachDocumentResponse.

Ensure that the schema for ChangePack is correctly defined and referenced.


Line range hint 374-388:
Check the schema reference for ChangeID in Change.

Ensure that the schema for ChangeID is correctly defined and referenced.


419-434: Verify the ChangeID schema.

Ensure that the ChangeID schema is correctly defined and includes the new versionVector property.


Line range hint 447-490:
Verify the ChangePack schema.

Ensure that the ChangePack schema is correctly defined and includes the new minSyncedVersionVector and snapshotVersionVector properties.


506-507: Check the schema reference for server_seq in Checkpoint.

Ensure that the schema for server_seq is correctly defined and referenced.


532-533: Check the schema reference for ChangePack in DetachDocumentRequest.

Ensure that the schema for ChangePack is correctly defined and referenced.


559-559: Check the schema reference for ChangePack in DetachDocumentResponse.

Ensure that the schema for ChangePack is correctly defined and referenced.


Line range hint 571-582:
Verify the DocEvent schema.

Ensure that the DocEvent schema is correctly defined and includes all necessary properties.


608-615: Verify the DocEventType schema.

Ensure that the DocEventType schema is correctly defined and includes all necessary enum values.


623-641: Check the schema references in JSONElementSimple.

Ensure that the schemas for createdAt, movedAt, removedAt, and type are correctly defined and referenced.


663-663: Check the schema reference for updated_at in NodeAttr.

Ensure that the schema for updated_at is correctly defined and referenced.


680-734: Verify the Operation schema and its nested schemas.

Ensure that the Operation schema and its nested schemas (Add, Edit, Increase, Move, Remove, Select, Set, Style, TreeEdit, TreeStyle) are correctly defined and include all necessary properties.


746-764: Check the schema reference for JSONElementSimple in Operation.Add.

Ensure that the schema for JSONElementSimple is correctly defined and referenced.


791-809: Check the schema references in Operation.Edit.

Ensure that the schemas for executed_at, from, and to are correctly defined and referenced.


854-866: Check the schema reference for JSONElementSimple in Operation.Increase.

Ensure that the schema for JSONElementSimple is correctly defined and referenced.


878-896: Check the schema references in Operation.Move.

Ensure that the schemas for created_at, executed_at, parent_created_at, and prev_created_at are correctly defined and referenced.


908-920: Check the schema references in Operation.Remove.

Ensure that the schemas for created_at, executed_at, and parent_created_at are correctly defined and referenced.


936-954: Check the schema references in Operation.Select.

Ensure that the schemas for executed_at, from, parent_created_at, and to are correctly defined and referenced.


Line range hint 966-983:
Check the schema references in Operation.Set.

Ensure that the schemas for executed_at, parent_created_at, and JSONElementSimple are correctly defined and referenced.


1005-1023: Check the schema references in Operation.Style.

Ensure that the schemas for executed_at, from, parent_created_at, and to are correctly defined and referenced.


Line range hint 1071-1104:
Check the schema references in Operation.TreeEdit.

Ensure that the schemas for contents, created_at_map_by_actor, executed_at, from, parent_created_at, and to are correctly defined and referenced.


1150-1168: Check the schema references in Operation.TreeStyle.

Ensure that the schemas for executed_at, from, parent_created_at, and to are correctly defined and referenced.


1201-1201: Check the schema reference for TimeTicket in Operation.TreeStyle.CreatedAtMapByActorEntry.

Ensure that the schema for TimeTicket is correctly defined and referenced.


1240-1246: Check the schema references in PresenceChange.

Ensure that the schemas for presence and type are correctly defined and referenced.


1270-1270: Check the schema reference for ChangePack in PushPullChangesRequest.

Ensure that the schema for ChangePack is correctly defined and referenced.


1297-1297: Check the schema reference for ChangePack in PushPullChangesResponse.

Ensure that the schema for ChangePack is correctly defined and referenced.


1309-1309: Check the schema reference for ChangePack in RemoveDocumentRequest.

Ensure that the schema for ChangePack is correctly defined and referenced.


1331-1331: Check the schema reference for ChangePack in RemoveDocumentResponse.

Ensure that the schema for ChangePack is correctly defined and referenced.


1343-1343: Check the schema reference for TimeTicket in TextNodePos.

Ensure that the schema for TimeTicket is correctly defined and referenced.


1399-1417: Check the schema references in TreeNode.

Ensure that the schemas for id, ins_next_id, ins_prev_id, and removed_at are correctly defined and referenced.


1444-1444: Check the schema reference for NodeAttr in TreeNode.AttributesEntry.

Ensure that the schema for NodeAttr is correctly defined and referenced.


1456-1456: Check the schema reference for TimeTicket in TreeNodeID.

Ensure that the schema for TimeTicket is correctly defined and referenced.


1476-1476: Check the schema reference for TreeNode in TreeNodes.

Ensure that the schema for TreeNode is correctly defined and referenced.


1487-1493: Check the schema references in TreePos.

Ensure that the schemas for left_sibling_id and parent_id are correctly defined and referenced.


1503-1530: Verify the ValueType schema.

Ensure that the ValueType schema is correctly defined and includes all necessary enum values.


1534-1561: Verify the VersionVector schema.

Ensure that the VersionVector schema is correctly defined and includes all necessary properties and nested schemas.


1583-1589: Check the schema references in WatchDocumentResponse.

Ensure that the schemas for event and initialization are correctly defined and referenced.


1615-1615: Check the security scheme reference for ApiKeyAuth.

Ensure that the security scheme for ApiKeyAuth is correctly defined and referenced.


1617-1618: Verify the tag description.

Ensure that the tag description for yorkie.v1.YorkieService is accurate and complete.

api/docs/yorkie/v1/admin.openapi.yaml (5)

3-3: LGTM!

The updated description of the Yorkie API is clear and concise.


8-11: LGTM!

The updated server URLs for the Yorkie API are correctly specified.


557-557: LGTM!

The Change schema now correctly references the yorkie.v1.ChangeID schema.


612-617: LGTM!

The ChangeID schema now includes a versionVector field, aligning with the PR's objective.


805-810: LGTM!

The GetSnapshotMetaResponse schema now includes a versionVector field, aligning with the PR's objective.

@hackerwins hackerwins force-pushed the hybrid-clock branch 7 times, most recently from d4ef095 to fb3a582 Compare July 18, 2024 06:42
Co-authored-by: Youngteac Hong <susukang98@gmail.com>
@hackerwins hackerwins force-pushed the hybrid-clock branch 2 times, most recently from 46081f7 to adb5e87 Compare September 26, 2024 10:24
JOOHOJANG and others added 5 commits October 2, 2024 14:13
Unify VersionVectors in ChangePack across three scenarios:

1. Pushing pack to server: represents document's current version vector
2. Pulling pack: represents minSyncedVersionVector for GC
3. Pulling pack(snapshot): represents snapshot's version vector at creation

This commit simplifies the codebase and ensures consistent version
vector handling throughout the document synchronization process.
* Update garbage collection design document

* Fix typo

* Update version vector design document

* Add contents to garbage collection design document

* Updated explanation based on review feedback

* Updated explanation based on review feedback
This commit establishes a framework for handling database migrations between
versions, with a specific implementation for version vector updates. The
migration process follows these steps:

1. Creates version-specific directories in migrations/
2. Implements migration script files for data transformation
3. Updates migrationMap in /cmd/migration.go to register new versions

Migration can be executed using the CLI command:
$ yorkie migration --from v0.5.1 --to v0.6.0 \
    --mongo-connection-uri mongodb://localhost:27017 \
    --batch-size 10000 \
    --database-name yorkie-meta

---------

Co-authored-by: Youngteac Hong <susukang98@gmail.com>
@hackerwins hackerwins marked this pull request as ready for review October 22, 2024 01:34
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 48

🧹 Outside diff range and nitpick comments (60)
server/backend/database/version_vector.go (2)

24-31: LGTM: Well-structured VersionVectorInfo definition.

The VersionVectorInfo struct is well-defined with appropriate fields and BSON tags. It clearly represents the necessary information for version vectors in the context of documents and clients.

Consider enhancing the comment to briefly explain the purpose of each field:

// VersionVectorInfo represents version vector information for documents and clients.
// It includes:
//   - ID: Unique identifier for this VersionVectorInfo
//   - ProjectID: ID of the associated project
//   - DocID: ID of the associated document
//   - ClientID: ID of the client
//   - VersionVector: The actual version vector data

24-31: Consider performance implications of VersionVectorInfo.

The introduction of VersionVectorInfo is a significant enhancement to the system's ability to track changes and relationships. However, it's important to consider the following:

  1. Scalability: As the number of documents and clients grows, the storage and processing of these structures may impact performance. Consider implementing efficient storage and retrieval mechanisms.

  2. Indexing: Ensure that appropriate database indexes are in place for the fields that will be frequently queried, especially ProjectID, DocID, and ClientID.

  3. Memory usage: The time.VersionVector field could potentially grow large for long-running or heavily collaborative documents. Consider implementing compression or pruning strategies if needed.

To address these concerns:

  • Implement database indexing strategies.
  • Consider adding a timestamp or last updated field to enable efficient pruning of old data.
  • Monitor the size and access patterns of VersionVectorInfo in production to identify optimization opportunities.
server/backend/database/snapshot_info.go (2)

20-23: LGTM! Consider using a more descriptive alias for the standard time package.

The addition of the new imports is correct and necessary for the changes made to the SnapshotInfo struct. However, to improve code readability, consider using a more descriptive alias for the standard time package.

You could use an alias like stdtime instead of gotime:

-gotime "time"
+stdtime "time"

This would make it clearer that it's the standard time package being aliased.


50-50: LGTM! Update the field type if the import alias is changed.

The change of the CreatedAt field type to use the aliased standard time package is correct.

If you decide to change the import alias as suggested earlier, remember to update this line accordingly:

-CreatedAt gotime.Time `bson:"created_at"`
+CreatedAt stdtime.Time `bson:"created_at"`
server/backend/database/mongo/registry_test.go (2)

38-47: LGTM! Consider adding an error check for ID creation.

The new sub-test for types.ID is well-structured and tests the essential functionality of marshaling and unmarshaling. Good job on separating this into its own sub-test for clarity.

Consider adding an error check when creating the ID:

-		id := types.ID(primitive.NewObjectID().Hex())
+		id, err := types.IDFromHex(primitive.NewObjectID().Hex())
+		assert.NoError(t, err)

This would ensure that the ID creation itself is also validated.


50-66: LGTM! Consider enhancing the assertion for VersionVector.

The new sub-test for versionVector is well-structured and tests the essential functionality of marshaling and unmarshaling. Good job on separating this into its own sub-test for clarity.

To improve the test coverage, consider adding an assertion to check the specific value set in the VersionVector:

 		assert.NoError(t, bson.UnmarshalWithRegistry(registry, data, &info))
 		assert.Equal(t, vector, info.VersionVector)
+		value, exists := info.VersionVector.Get(actorID)
+		assert.True(t, exists)
+		assert.Equal(t, uint64(1), value)

This addition would ensure that not only the overall VersionVector matches, but also that the specific value set for the ActorID is correctly preserved through the marshal/unmarshal process.

pkg/document/change/change.go (1)

116-119: LGTM! Consider adding a brief comment for clarity.

The implementation of AfterOrEqual is correct and aligns with the PR objectives. It effectively delegates the comparison to the ID struct's AfterOrEqual method, which is assumed to handle the version vector comparisons.

Consider adding a brief comment to explain the purpose of this method:

// AfterOrEqual returns true if this change is causally after or concurrent with the other change,
// based on their version vectors.
func (c *Change) AfterOrEqual(other *Change) bool {
	return c.id.AfterOrEqual(other.id)
}
pkg/document/time/ticket.go (1)

50-53: Approve documentation update with a minor suggestion.

The updated documentation for the Ticket struct provides clearer information about its purpose and limitations. This aligns well with the PR objectives of introducing a VersionVector for handling causal and concurrent relationships.

To further improve clarity, consider adding a brief mention of the VersionVector as the mechanism for detecting change relationships.

Suggested addition:

 // Ticket represents the logical clock. It is used to determine the order of
 // changes and identify elements and nodes in the document. It can't be used
 // to detect the relationship between changes whether they are causally related
-// or concurrent.
+// or concurrent. For detecting such relationships, refer to the VersionVector.
api/yorkie/v1/admin.proto (1)

165-165: LGTM! Consider adding a comment for clarity.

The addition of the VersionVector version_vector field to the GetSnapshotMetaResponse message aligns well with the PR objectives. It enhances the API's capability to provide version vector information along with snapshot metadata.

Consider adding a brief comment above this field to explain its purpose, for example:

// Version vector representing the causal history of the snapshot
VersionVector version_vector = 3;

This would improve the self-documentation of the protobuf definition.

pkg/document/gc_test.go (2)

143-143: LGTM: Consistent use of MaxVersionVector for final garbage collection.

The change from time.MaxTicket to helper.MaxVersionVector(doc.ActorID()) for the final garbage collection is consistent with the previous change and aligns with the PR's objective.

Consider refactoring the garbage collection calls to avoid code duplication. You could create a helper function within the test file:

func gcWithMaxVector(doc *document.Document) {
    doc.GarbageCollect(helper.MaxVersionVector(doc.ActorID()))
}

Then use this function in both places:

gcWithMaxVector(doc)

This would make the tests more maintainable and reduce the risk of inconsistencies if the garbage collection method changes in the future.


209-209: LGTM: Consistent use of MaxVersionVector in TestTextGC.

The changes from time.MaxTicket to helper.MaxVersionVector(doc.ActorID()) in the TestTextGC function are consistent with the changes made in TestTreeGC. This ensures that both tree and text elements use the new version vector-based approach for garbage collection.

As suggested for TestTreeGC, consider refactoring these garbage collection calls to use a helper function:

func gcWithMaxVector(doc *document.Document) {
    doc.GarbageCollect(helper.MaxVersionVector(doc.ActorID()))
}

Then use this function in both TestTreeGC and TestTextGC:

gcWithMaxVector(doc)

This refactoring would improve consistency and maintainability across both test functions.

Also applies to: 218-218

design/version-vector.md (5)

1-20: LGTM! Consider adding a comma for clarity.

The header and summary section provide a clear introduction to the topic of Version Vectors in CRDT systems. The goals and non-goals effectively set the scope of the document.

Consider adding a comma after "Version Vector" in line 11 for improved readability:

-Yorkie provides a special type of Version Vector called Lamport Synced Version Vector which consists of Lamport and version vector.
+Yorkie provides a special type of Version Vector called Lamport Synced Version Vector, which consists of Lamport and version vector.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~11-~11: Possible missing comma found.
Context: ...on Vector called Lamport Synced Version Vector which consists of Lamport and version v...

(AI_HYDRA_LEO_MISSING_COMMA)


23-36: LGTM! Consider minor improvements for clarity and formatting.

The explanation of Lamport Timestamps is clear and accurate, and the example effectively illustrates the concept.

  1. Consider adding a colon after "Lamport Timestamp" in the section header for consistency with other headers:
-### What is Lamport Timestamp
+### What is Lamport Timestamp:
  1. Specify a language for the code block to improve formatting:
-```
+```plaintext
 Here's a simplified example of Lamport Timestamps in a distributed system with three processes (P1, P2, P3):

 1. P1 performs an event: P1's clock becomes 1.
 ...
🧰 Tools
🪛 LanguageTool

[uncategorized] ~23-~23: Possible missing comma found.
Context: ...ructure. ## Proposal Details ### What is Lamport Timestamp Lamport Timestamps ...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~24-~24: The verb “are” doesn’t seem to fit in this context, “is” is probably more formally correct.
Context: ... Lamport Timestamp Lamport Timestamps are a logical clock mechanism used in distr...

(AI_HYDRA_LEO_CPT_ARE_IS)

🪛 Markdownlint

25-25: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


38-40: Consider expanding this section for clarity and completeness.

While this section provides a brief overview of Yorkie's logical clock system, it could benefit from more detail and a clearer connection to the previous section on Lamport Timestamps.

Consider expanding this section to:

  1. Explain how Yorkie's system relates to or differs from standard Lamport Timestamps.
  2. Provide more detail on what a TimeTicket is and how it's composed.
  3. Give a brief example of how a change is recorded and identified in Yorkie's system.

This additional information would help readers better understand Yorkie's specific implementation and its relation to the general concepts introduced earlier.


52-57: LGTM! Consider improving paragraph structure for readability.

This section effectively explains the differences between Lamport Timestamps and Version Vectors, and introduces the concept of Lamport Synced Version Vector as a solution.

Consider breaking the long paragraph (lines 55-57) into smaller, more focused paragraphs for improved readability. For example:

As previously mentioned, both CRDT node IDs and change IDs were composed of `TimeTickets`, which led to garbage collection (GC) issues. If we attempt to resolve causality problems by applying Version Vectors to change IDs, it would require converting all corresponding CRDT node IDs from Lamport-based to Version Vector-based.

This introduces memory overhead proportional to the number of participants, as Version Vectors consist of (ID, counter) pairs for all participants involved in the edits, leading to increased memory usage and computational complexity.

To address the issues that arise from introducing Version Vectors, we implemented a method that synchronizes Lamport Timestamps with Version Vectors, allowing both to be used together. This way, CRDT nodes continue to use `TimeTickets` as their IDs, while change IDs utilize the Lamport Synced Version Vector.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~53-~53: Possible missing comma found.
Context: ...ements received, without creating a new max by adding to the maximum found during t...

(AI_HYDRA_LEO_MISSING_COMMA)


59-66: LGTM! Consider refining the final sentence for clarity.

This section provides a clear and detailed explanation of the Lamport Synced Version Vector concept. The included diagram (lamport-synced-version-vector-and-version-vector.jpg) effectively illustrates the differences between traditional Version Vectors and Lamport Synced Version Vectors.

Consider refining the last sentence for improved clarity and conciseness:

-This ensures that causality is always satisfied, allowing the existing TimeTicket system to be used together with the version vector.
+This approach ensures causality while allowing the existing TimeTicket system to coexist with the version vector.
design/garbage-collection.md (7)

3-13: Document version and context updates look good.

The target version update to 0.6.0 and the new introductory section provide valuable context for the changes in this document. The reference to the pull request for understanding version vectors is particularly helpful.

Consider making a minor grammatical correction:

-You'd better read [this](https://github.com/yorkie-team/yorkie/pull/981) first to understand concepts and usage of version vector.
+You'd better read [this](https://github.com/yorkie-team/yorkie/pull/981) first to understand the concepts and usage of version vectors.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~11-~11: You might be missing the article “the” here.
Context: ...am/yorkie/pull/981) first to understand concepts and usage of version vector. ## Summa...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


30-46: Clear explanation of the transition to version vectors

The rationale for moving from Lamport timestamps to version vectors is well-explained. The introduction of minVersionVector and its role in garbage collection is clearly presented.

Consider adding a brief explanation of how minVersionVector is computed and maintained by the server, as this information would be valuable for implementers.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~31-~31: Possible missing article found.
Context: ...mple and lightweight to use, but it has crucial weakness that Lamport doesn't guarantee...

(AI_HYDRA_LEO_MISSING_THE)


[uncategorized] ~31-~31: Possible missing comma found.
Context: ... lightweight to use, but it has crucial weakness that Lamport doesn't guarantee causalit...

(AI_HYDRA_LEO_MISSING_COMMA)


[style] ~34-~34: This phrasing can be wordy, so consider using a more descriptive and concise alternative.
Context: ...-gc-issue](media/prev-gc-issue.jpg) As you can see from the image above, min_synced_seq doesn...

(AS_YOU_CAN_SEE)


[uncategorized] ~39-~39: A determiner appears to be missing. Consider inserting it.
Context: ...d remotely and purges them completely. Server records the version vector of the last ...

(AI_EN_LECTOR_MISSING_DETERMINER)


[uncategorized] ~40-~40: You might be missing the article “the” here.
Context: ...never the client requests PushPull. And Server returns the min version vector, `minVer...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~40-~40: Possible missing preposition found.
Context: ...rsionVectorof all clients in response PushPull to the client.minVersionVector` is us...

(AI_EN_LECTOR_MISSING_PREPOSITION)


[uncategorized] ~43-~43: You might be missing the article “the” here.
Context: ... vector is the vector which consists of minimum value of every version vector stored in...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~43-~43: You might be missing the article “a” here.
Context: ...ector stored in version vector table in database. Conceptually, min version vector is v...

(AI_EN_LECTOR_MISSING_DETERMINER_A)


[uncategorized] ~45-~45: You might be missing the article “a” here.
Context: ...e. Conceptually, min version vector is version vector that is uniformly applied to all...

(AI_EN_LECTOR_MISSING_DETERMINER_A)


48-67: Well-structured code examples for GC algorithm and min version vector calculation

The code blocks effectively demonstrate the GC algorithm and the process of finding the min version vector. These examples provide clear guidance for implementation.

Consider adding comments to the code blocks to explain each step, especially in the min version vector calculation example.

🧰 Tools
🪛 Markdownlint

48-48: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


56-56: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


Line range hint 69-107: Comprehensive example with state diagrams

The step-by-step example with state diagrams excellently illustrates the garbage collection process in action. This visual representation greatly aids in understanding the concept.

Consider adding a brief textual summary after the final state (State 6) to reinforce the key points demonstrated by the example.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~80-~80: You might be missing the article “the” here.
Context: ...And the Client a sends change 3a to server through PushPull API and receives as a ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~82-~82: You might be missing the article “the” here.
Context: ... and it has not been sent (pushpull) to server yet. #### State 3 ![garbage-collectio...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~88-~88: You might be missing the article “the” here.
Context: ....png) Client b pushes change 3b to server and receives as a response that `minVer...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~88-~88: You might be missing the article “the” here.
Context: ...ent applies change 4, the contents of document are changed to ac. This time, all cli...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[grammar] ~88-~88: The noun should probably be in the singular form.
Context: ...t's still marked as tombstone for every clients, because minVersionVector[a] = 1 < 3 ...

(EVERY_EACH_SINGULAR)


[formatting] ~88-~88: If the ‘because’ clause is essential to the meaning, do not use a comma before the clause.
Context: ...ll marked as tombstone for every clients, because minVersionVector[a] = 1 < 3 #### St...

(COMMA_BEFORE_BECAUSE)


[uncategorized] ~100-~100: Possible missing comma found.
Context: ...e-collection-5.png) Client b pushpull but nothing to push or pull. `minVersionVec...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~106-~106: Possible missing comma found.
Context: ...e-collection-6.png) Client a pushpull but nothing to push or pull. `minVersionVec...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~143-~143: Possible missing preposition found.
Context: ...ort from db.versionVector. So we choose remove only client's version vector from table...

(AI_HYDRA_LEO_MISSING_TO)


[uncategorized] ~143-~143: You might be missing the article “the” here.
Context: ...emove only client's version vector from table, and filter minVersionVector by active ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~158-~158: You might be missing the article “the” here.
Context: ....Filter([c3]) = {c1:3, c2:3} ``` After client receive this minVersionVector, it will ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~158-~158: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...([c3]) = {c1:3, c2:3} ``` After client receive this minVersionVector, it will filter i...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)


[uncategorized] ~159-~159: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...lamport. The next pushpull request will contains filtered version vector so that eventua...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)


[uncategorized] ~159-~159: You might be missing the article “a” here.
Context: ...The next pushpull request will contains filtered version vector so that eventually db.ve...

(AI_EN_LECTOR_MISSING_DETERMINER_A)


[uncategorized] ~159-~159: You might be missing the article “the” here.
Context: ...eventually db.version vector will store attached client's version vector only. ![filter-...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

🪛 Markdownlint

108-108: Punctuation: '.'
Trailing punctuation in heading

(MD026, no-trailing-punctuation)


113-113: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


145-145: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


108-161: Thorough explanation of detached client handling

The section on handling detached clients' version vectors is comprehensive and addresses an important edge case. The proposed solution to filter minVersionVector by active clients is a good approach to avoid the n+1 query problem.

Consider adding a code example or pseudocode for the filtering process of minVersionVector and client's local version vector. This would provide clearer guidance for implementation.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~143-~143: Possible missing preposition found.
Context: ...ort from db.versionVector. So we choose remove only client's version vector from table...

(AI_HYDRA_LEO_MISSING_TO)


[uncategorized] ~143-~143: You might be missing the article “the” here.
Context: ...emove only client's version vector from table, and filter minVersionVector by active ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~158-~158: You might be missing the article “the” here.
Context: ....Filter([c3]) = {c1:3, c2:3} ``` After client receive this minVersionVector, it will ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~158-~158: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...([c3]) = {c1:3, c2:3} ``` After client receive this minVersionVector, it will filter i...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)


[uncategorized] ~159-~159: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...lamport. The next pushpull request will contains filtered version vector so that eventua...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)


[uncategorized] ~159-~159: You might be missing the article “a” here.
Context: ...The next pushpull request will contains filtered version vector so that eventually db.ve...

(AI_EN_LECTOR_MISSING_DETERMINER_A)


[uncategorized] ~159-~159: You might be missing the article “the” here.
Context: ...eventually db.version vector will store attached client's version vector only. ![filter-...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

🪛 Markdownlint

108-108: Punctuation: '.'
Trailing punctuation in heading

(MD026, no-trailing-punctuation)


113-113: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


145-145: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


Line range hint 1-161: Minor language and formatting improvements

There are several minor language and formatting issues throughout the document. Addressing these would enhance readability and professionalism.

  1. Add language specifications to code blocks (e.g., ```go for Go code).
  2. Fix grammatical issues:
    • Add missing articles (e.g., "the" or "a") where appropriate.
    • Ensure subject-verb agreement (e.g., "After client receives" instead of "After client receive").
  3. Remove trailing punctuation from headings (e.g., "How we handle if min version vector includes detached client's lamport").
  4. Consider using more concise phrasing (e.g., replace "As you can see from the image above" with "The image above illustrates that").

Would you like assistance in making these minor corrections throughout the document?

🧰 Tools
🪛 LanguageTool

[uncategorized] ~80-~80: You might be missing the article “the” here.
Context: ...And the Client a sends change 3a to server through PushPull API and receives as a ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~82-~82: You might be missing the article “the” here.
Context: ... and it has not been sent (pushpull) to server yet. #### State 3 ![garbage-collectio...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~88-~88: You might be missing the article “the” here.
Context: ....png) Client b pushes change 3b to server and receives as a response that `minVer...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~88-~88: You might be missing the article “the” here.
Context: ...ent applies change 4, the contents of document are changed to ac. This time, all cli...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[grammar] ~88-~88: The noun should probably be in the singular form.
Context: ...t's still marked as tombstone for every clients, because minVersionVector[a] = 1 < 3 ...

(EVERY_EACH_SINGULAR)


[formatting] ~88-~88: If the ‘because’ clause is essential to the meaning, do not use a comma before the clause.
Context: ...ll marked as tombstone for every clients, because minVersionVector[a] = 1 < 3 #### St...

(COMMA_BEFORE_BECAUSE)


[uncategorized] ~100-~100: Possible missing comma found.
Context: ...e-collection-5.png) Client b pushpull but nothing to push or pull. `minVersionVec...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~106-~106: Possible missing comma found.
Context: ...e-collection-6.png) Client a pushpull but nothing to push or pull. `minVersionVec...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~143-~143: Possible missing preposition found.
Context: ...ort from db.versionVector. So we choose remove only client's version vector from table...

(AI_HYDRA_LEO_MISSING_TO)


[uncategorized] ~143-~143: You might be missing the article “the” here.
Context: ...emove only client's version vector from table, and filter minVersionVector by active ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~158-~158: You might be missing the article “the” here.
Context: ....Filter([c3]) = {c1:3, c2:3} ``` After client receive this minVersionVector, it will ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~158-~158: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...([c3]) = {c1:3, c2:3} ``` After client receive this minVersionVector, it will filter i...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)


[uncategorized] ~159-~159: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...lamport. The next pushpull request will contains filtered version vector so that eventua...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)


[uncategorized] ~159-~159: You might be missing the article “a” here.
Context: ...The next pushpull request will contains filtered version vector so that eventually db.ve...

(AI_EN_LECTOR_MISSING_DETERMINER_A)


[uncategorized] ~159-~159: You might be missing the article “the” here.
Context: ...eventually db.version vector will store attached client's version vector only. ![filter-...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

🪛 Markdownlint

108-108: Punctuation: '.'
Trailing punctuation in heading

(MD026, no-trailing-punctuation)


113-113: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


145-145: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


Line range hint 1-161: Excellent documentation of the new garbage collection system

This document provides a comprehensive and well-structured explanation of the transition from Lamport timestamps to version vectors for garbage collection. The use of examples, diagrams, and code snippets effectively illustrates the concepts and implementation details.

To further enhance this excellent document, consider the following suggestions:

  1. Add a brief conclusion section summarizing the key benefits of the new system.
  2. Include a section on potential performance implications or trade-offs of using version vectors.
  3. Provide guidance on migration strategies for existing systems using Lamport timestamps.

Would you like assistance in drafting any of these additional sections?

🧰 Tools
🪛 LanguageTool

[uncategorized] ~80-~80: You might be missing the article “the” here.
Context: ...And the Client a sends change 3a to server through PushPull API and receives as a ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~82-~82: You might be missing the article “the” here.
Context: ... and it has not been sent (pushpull) to server yet. #### State 3 ![garbage-collectio...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~88-~88: You might be missing the article “the” here.
Context: ....png) Client b pushes change 3b to server and receives as a response that `minVer...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~88-~88: You might be missing the article “the” here.
Context: ...ent applies change 4, the contents of document are changed to ac. This time, all cli...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[grammar] ~88-~88: The noun should probably be in the singular form.
Context: ...t's still marked as tombstone for every clients, because minVersionVector[a] = 1 < 3 ...

(EVERY_EACH_SINGULAR)


[formatting] ~88-~88: If the ‘because’ clause is essential to the meaning, do not use a comma before the clause.
Context: ...ll marked as tombstone for every clients, because minVersionVector[a] = 1 < 3 #### St...

(COMMA_BEFORE_BECAUSE)


[uncategorized] ~100-~100: Possible missing comma found.
Context: ...e-collection-5.png) Client b pushpull but nothing to push or pull. `minVersionVec...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~106-~106: Possible missing comma found.
Context: ...e-collection-6.png) Client a pushpull but nothing to push or pull. `minVersionVec...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~143-~143: Possible missing preposition found.
Context: ...ort from db.versionVector. So we choose remove only client's version vector from table...

(AI_HYDRA_LEO_MISSING_TO)


[uncategorized] ~143-~143: You might be missing the article “the” here.
Context: ...emove only client's version vector from table, and filter minVersionVector by active ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~158-~158: You might be missing the article “the” here.
Context: ....Filter([c3]) = {c1:3, c2:3} ``` After client receive this minVersionVector, it will ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~158-~158: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...([c3]) = {c1:3, c2:3} ``` After client receive this minVersionVector, it will filter i...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)


[uncategorized] ~159-~159: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...lamport. The next pushpull request will contains filtered version vector so that eventua...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)


[uncategorized] ~159-~159: You might be missing the article “a” here.
Context: ...The next pushpull request will contains filtered version vector so that eventually db.ve...

(AI_EN_LECTOR_MISSING_DETERMINER_A)


[uncategorized] ~159-~159: You might be missing the article “the” here.
Context: ...eventually db.version vector will store attached client's version vector only. ![filter-...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

🪛 Markdownlint

108-108: Punctuation: '.'
Trailing punctuation in heading

(MD026, no-trailing-punctuation)


113-113: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


145-145: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

pkg/document/crdt/root_test.go (1)

Line range hint 108-171: Suggestion: Remove duplicate test case

The "garbage collection for fragments of text" test case has been duplicated, with the only difference being the use of helper.MaxVersionVector() in the second instance. To improve maintainability and reduce redundancy, consider removing the duplicate test case and updating the original to use helper.MaxVersionVector().

Here's a suggested modification to the original test case:

 t.Run("garbage collection for fragments of text", func(t *testing.T) {
     // ... (keep the existing test setup)

     for _, tc := range steps {
         // ... (keep the existing test logic)
     }

-    n, err := root.GarbageCollect(time.MaxTicket)
+    n, err := root.GarbageCollect(helper.MaxVersionVector())
     assert.NoError(t, err)
     assert.Equal(t, 3, n)
     assert.Equal(t, 0, root.GarbageLen())
 })

- // Remove the entire duplicate test case
server/rpc/admin_server.go (1)

319-327: LGTM! Consider combining error checks for improved readability.

The addition of VersionVector to the GetSnapshotMeta response aligns well with the PR objective. The implementation correctly converts the document's version vector to a protobuf-compatible format and includes proper error handling.

Consider combining the error checks for SnapshotToBytes and ToVersionVector to improve readability:

-	snapshot, err := converter.SnapshotToBytes(doc.RootObject(), doc.AllPresences())
-	if err != nil {
-		return nil, err
-	}
-
-	pbVersionVector, err := converter.ToVersionVector(doc.VersionVector())
-	if err != nil {
-		return nil, err
-	}
+	snapshot, err := converter.SnapshotToBytes(doc.RootObject(), doc.AllPresences())
+	if err != nil {
+		return nil, fmt.Errorf("failed to convert snapshot to bytes: %w", err)
+	}
+
+	pbVersionVector, err := converter.ToVersionVector(doc.VersionVector())
+	if err != nil {
+		return nil, fmt.Errorf("failed to convert version vector: %w", err)
+	}

This change wraps the errors with more context, which can be helpful for debugging.

test/helper/helper.go (1)

144-156: Consider minor optimizations and add documentation

The implementation looks correct and serves its purpose well. However, consider the following suggestions:

  1. For better performance, especially when dealing with a large number of actors, consider pre-allocating the slice:

    actors = make([]*time.ActorID, 0, 1)

    This is more efficient than appending to a nil slice.

  2. Add a comment explaining the purpose of this function, especially its use in testing scenarios. For example:

    // MaxVersionVector returns the max version vector of the given actors.
    // This is particularly useful in testing scenarios, such as garbage collection tests,
    // where we need to simulate a version vector with maximum values.
test/integration/array_test.go (1)

Line range hint 1-569: Impact analysis of garbage collection changes

The changes to the garbage collection mechanism in this file are part of a larger refactoring effort to use version vectors instead of time-based tickets. This update improves the accuracy of garbage collection in distributed scenarios without altering the core logic of the tests.

Key points:

  1. Only two lines (267 and 469) were modified, both related to garbage collection.
  2. The changes are consistent and focused, suggesting a systematic update across the codebase.
  3. The test cases themselves remain unchanged, maintaining their original intent and coverage.

These modifications enhance the robustness of the tests in distributed environments while preserving the existing test scenarios.

Consider updating the test documentation or adding comments to explain the significance of using helper.MaxVersionVector(d1.ActorID()) for garbage collection, as it may not be immediately clear to developers unfamiliar with the change.

test/bench/document_bench_test.go (3)

620-620: LGTM. Consider refactoring for DRY principle.

The update to use helper.MaxVersionVector(doc.ActorID()) is consistent with the previous changes and supports the new version vector implementation.

Consider extracting the garbage collection logic into a helper function to reduce code duplication across benchmark tests. For example:

+func performGarbageCollection(doc *document.Document) int {
+    return doc.GarbageCollect(helper.MaxVersionVector(doc.ActorID()))
+}

 func benchmarkTreeSplitGC(cnt int, b *testing.B) {
     // ... (existing code)
-    assert.Equal(b, cnt, doc.GarbageCollect(helper.MaxVersionVector(doc.ActorID())))
+    assert.Equal(b, cnt, performGarbageCollection(doc))
 }

This refactoring would make the code more maintainable and reduce the risk of inconsistencies in future updates.


689-689: LGTM. Reiterate refactoring suggestion.

The update to use helper.MaxVersionVector(doc.ActorID()) is consistent with the previous changes and supports the new version vector implementation.

As mentioned earlier, consider extracting the garbage collection logic into a helper function to reduce code duplication across benchmark tests. This would apply to this instance as well:

 func benchmarkTextEditGC(cnt int, b *testing.B) {
     // ... (existing code)
-    assert.Equal(b, cnt, doc.GarbageCollect(helper.MaxVersionVector(doc.ActorID())))
+    assert.Equal(b, cnt, performGarbageCollection(doc))
 }

This refactoring would improve code maintainability and consistency across all benchmark tests.


721-721: LGTM. Reiterate refactoring suggestion.

The update to use helper.MaxVersionVector(doc.ActorID()) is consistent with the previous changes and supports the new version vector implementation.

As suggested earlier, consider extracting the garbage collection logic into a helper function to reduce code duplication across benchmark tests. This would apply to this instance as well:

 func benchmarkTextSplitGC(cnt int, b *testing.B) {
     // ... (existing code)
-    assert.Equal(b, cnt, doc.GarbageCollect(helper.MaxVersionVector(doc.ActorID())))
+    assert.Equal(b, cnt, performGarbageCollection(doc))
 }

Implementing this refactoring across all benchmark tests would significantly improve code maintainability and consistency.

test/integration/object_test.go (2)

287-287: LGTM: Consistent VersionVector usage, consider refactoring

This change is consistent with the previous updates, replacing time.MaxTicket with helper.MaxVersionVector(d1.ActorID()) for garbage collection.

Consider extracting the helper.MaxVersionVector(d.ActorID()) call into a helper function within the test file to reduce duplication and improve maintainability. For example:

func maxVersionVector(d *document.Document) *time.Ticket {
    return helper.MaxVersionVector(d.ActorID())
}

Then you can use it like this:

assert.Equal(t, 3, d1.GarbageCollect(maxVersionVector(d1)))

This would make the tests more readable and easier to update if the version vector logic changes in the future.


Line range hint 225-287: Overall assessment: Well-implemented VersionVector integration

The changes in this file consistently update the garbage collection mechanism to use version vectors instead of time-based tickets. This aligns well with the PR's objective of introducing VersionVector for Change.ID. The implementation is consistent throughout the file, which is commendable.

To further improve the code:

  1. Consider the previously suggested refactoring to extract the helper.MaxVersionVector(d.ActorID()) call into a helper function.
  2. Update the test case descriptions or add comments to clarify that these tests now use version vectors for garbage collection, which might be helpful for future maintainers.
  3. If not already done, ensure that similar changes have been made consistently across other test files that involve garbage collection.
api/docs/yorkie/v1/yorkie.openapi.yaml (2)

Line range hint 470-484: LGTM: ChangePack updated with VersionVector, consider updating documentation.

The ChangePack schema has been appropriately updated to include the new versionVector property, and the minSyncedTicket has been correctly marked as deprecated. These changes align well with the PR objectives.

Consider updating the documentation or adding comments to explain the transition from minSyncedTicket to versionVector and its implications for users of the API.


Line range hint 1-1591: Summary: VersionVector successfully integrated, consider documentation enhancements.

The changes in this file successfully introduce the VersionVector concept to the Yorkie API specification. Key points:

  1. The API version has been appropriately updated to reflect these significant changes.
  2. VersionVector has been integrated into relevant schemas (ChangeID, ChangePack).
  3. New schemas for VersionVector and VectorEntry have been added.

These changes align well with the PR objectives of introducing VersionVector to detect relationships between changes.

To enhance the API documentation:

  1. Consider adding more detailed descriptions for the new VersionVector and VectorEntry schemas.
  2. Provide usage examples or explanations for the flexible value type in VectorEntry.
  3. Include information about the transition from minSyncedTicket to versionVector in the ChangePack schema.

These additions would greatly assist API consumers in understanding and implementing these new concepts.

api/docs/yorkie/v1/admin.openapi.yaml (1)

2240-2268: New VersionVector schema added

The introduction of the VersionVector schema, along with its nested VectorEntry structure, is a key addition that supports the PR's main objective of distinguishing between Concurrent and Causal relationships among changes.

The schema structure looks appropriate for representing version vectors. However, to improve clarity and maintainability, consider adding a brief description to the VersionVector and VectorEntry schemas.

Consider adding descriptions to the VersionVector and VectorEntry schemas. For example:

yorkie.v1.VersionVector:
  description: "Represents a version vector used to track the state of distributed replicas."
  # ... rest of the schema ...

yorkie.v1.VersionVector.VectorEntry:
  description: "Represents an entry in the version vector, mapping an identifier to its corresponding version number."
  # ... rest of the schema ...
migrations/v0.6.0/drop-snapshots.go (2)

61-61: Use structured logging instead of fmt.Println

Using fmt.Println for logging may not be suitable for production code. Consider using a structured logging library or the standard log package to provide consistent and configurable logging.

Apply this diff to use the log package:

-    fmt.Println("drop snapshots completed")
+    log.Println("drop snapshots completed")

19-26: Organize imports into standard, third-party, and local groups

For better readability and maintenance, it's recommended to group imports into three sections: standard library packages, third-party packages, and local packages, each separated by a blank line.

Apply this diff to organize the imports:

 import (
     "context"
     "fmt"
     "log"

+    "go.mongodb.org/mongo-driver/bson"
+    "go.mongodb.org/mongo-driver/mongo"
 )
pkg/document/change/pack.go (1)

57-65: Ensure consistent ordering of parameters and struct assignments

In the NewPack function, the order of parameters includes versionVector before snapshot. However, in the struct initialization, Snapshot is assigned after VersionVector. For consistency and readability, consider aligning the order of parameters with the order of struct field assignments.

server/backend/database/mongo/registry.go (4)

47-52: Update comment to include tVersionVector decoder registration

The comment // Register the decoders for types.ID. does not mention that tVersionVector is also being registered. For clarity and completeness, consider updating the comment to reflect all registered decoders.

Proposed change:

-	// Register the decoders for types.ID.
+	// Register the decoders for types.ID and time.VersionVector.

54-57: Update comment to reflect all registered encoders

The comment // Register the encoders for types.ID and time.ActorID. does not mention that tVersionVector is also being registered. For consistency and clarity, please update the comment to include time.VersionVector.

Proposed change:

-	// Register the encoders for types.ID and time.ActorID.
+	// Register the encoders for types.ID, time.ActorID, and time.VersionVector.

93-95: Wrap error for consistency when conversion fails in versionVectorEncoder

When converter.ToVersionVector returns an error, it is currently returned directly. For consistency with other error handling in the function and to provide more context, consider wrapping the error.

Proposed change:

	pbChangeVector, err := converter.ToVersionVector(val.Interface().(time.VersionVector))
	if err != nil {
-		return err
+		return fmt.Errorf("version vector conversion error: %w", err)
	}

127-129: Wrap error for consistency when conversion fails in versionVectorDecoder

When converter.FromVersionVector returns an error, it is returned directly. For consistency and to provide better error context, consider wrapping the error with additional information.

Proposed change:

	vector, err := converter.FromVersionVector(&pbVector)
	if err != nil {
-		return err
+		return fmt.Errorf("version vector conversion error: %w", err)
	}
pkg/document/change/id.go (2)

54-58: Clarify the comment for versionVector field

The current comment can be improved for clarity and readability. Consider rephrasing it as follows.

Suggested revision:

- // versionVector is similar to vector clock, and it is used to detect the
- // relationship between changes whether they are causally related or concurrent.
- // It is synced with lamport timestamp of the change for mapping TimeTicket to
- // the change.
+ // versionVector is similar to a vector clock and is used to detect
+ // whether changes are causally related or concurrent.
+ // It is synchronized with the Lamport timestamp of the change
+ // for mapping TimeTicket to the change.

162-165: Return a copy of versionVector in VersionVector method

Currently, the VersionVector method returns the internal versionVector, which could be modified externally. To maintain encapsulation and prevent unintended side effects, consider returning a copy.

Suggested change:

func (id ID) VersionVector() time.VersionVector {
-   return id.versionVector
+   return id.versionVector.DeepCopy()
}
pkg/document/crdt/root.go (2)

Line range hint 150-154: Add nil check before comparing RemovedAt() to prevent potential nil pointer dereference.

In the loop over gcElementPairMap, when comparing vector.EqualToOrAfter(pair.elem.RemovedAt()), if pair.elem.RemovedAt() is nil, this could cause a runtime error. Consider adding a nil check for pair.elem.RemovedAt() before performing the comparison to ensure stability.

Apply this diff to add the nil check:

         // Existing code
-        if vector.EqualToOrAfter(pair.elem.RemovedAt()) {
+        if pair.elem.RemovedAt() != nil && vector.EqualToOrAfter(pair.elem.RemovedAt()) {
             if err := pair.parent.Purge(pair.elem); err != nil {
                 return 0, err
             }

Line range hint 160-164: Add nil check before comparing RemovedAt() of pair.Child to prevent potential nil pointer dereference.

In the loop over gcNodePairMap, when using vector.EqualToOrAfter(pair.Child.RemovedAt()), ensure that pair.Child.RemovedAt() is not nil to avoid possible runtime errors.

Apply this diff to add the nil check:

         // Existing code
-        if vector.EqualToOrAfter(pair.Child.RemovedAt()) {
+        if pair.Child.RemovedAt() != nil && vector.EqualToOrAfter(pair.Child.RemovedAt()) {
             if err := pair.Parent.Purge(pair.Child); err != nil {
                 return 0, err
             }
cmd/yorkie/migration.go (4)

166-167: Correct the typographical error in the error message

The error message has a typo: "to version must be larger then from version". The word "then" should be "than".

Apply this diff to fix the typo:

 return fmt.Errorf("to version must be larger then from version: %s %s", from, to)
+return fmt.Errorf("to version must be larger than from version: %s %s", from, to)

70-73: Enhance the error message for invalid version formats

In the parseVersion function, when an invalid version segment is encountered, the error message could be more informative by specifying the problematic segment.

Apply this diff to improve the error message:

 return nil, fmt.Errorf("invalid version format: %s", version)
+return nil, fmt.Errorf("invalid version segment '%s' in version '%s'", part, version)

156-198: Provide user feedback during the migration process

Enhance the user experience by adding progress indicators or summaries after each migration step. This helps users understand the migration progress, especially for long-running migrations.

Consider adding logs or a progress bar within the runMigrationsInRange function:

for i, version := range versions {
    fmt.Printf("Running migration %d of %d: %s\n", i+1, len(versions), version)
    // Existing migration code
}

154-201: Add validation for required flags

While you mark certain flags as required, it might be helpful to add validation logic to ensure that the provided values meet expected formats, such as version string formats.

Implement additional validation in the RunE function:

if !isValidVersionFormat(from) || !isValidVersionFormat(to) {
    return fmt.Errorf("invalid version format for 'from' or 'to'")
}

You can define isValidVersionFormat to check the version string against a regex pattern.

server/packs/packs.go (1)

148-151: Address the TODO: Remove deprecated code when appropriate

The code block starting at line 148 is marked with a TODO to remove support for previous SDKs that do not use the version vector. Consider checking if all clients have been updated to the new SDKs. If so, you can safely remove this legacy code to simplify maintenance.

Would you like assistance in determining if it's safe to remove this code now?

pkg/document/internal_document.go (3)

111-113: Add documentation for the new vector parameter

The function NewInternalDocumentFromSnapshot now accepts a vector time.VersionVector parameter. Consider updating the function's documentation to explain the purpose of this parameter and how it affects the creation of an InternalDocument.


205-206: Update comments for GarbageCollect parameter change

The GarbageCollect method now accepts a time.VersionVector instead of a *time.Ticket. Consider updating the method's comment to reflect this change, providing clarity on how the version vector is used during garbage collection.


234-235: Implement the TODO: Update the root object in SetActor

There's a TODO indicating the need to update the root object when setting the actor:

// TODO(hackerwins): We need to update the root object as well.

Would you like assistance in implementing this update to ensure that the root object's state is consistent with the new actor?

pkg/document/document.go (1)

74-75: Consider using 'TODO' instead of 'NOTE' for temporary comments

It's a common convention to use 'TODO' comments to indicate sections of code that require future attention. Replacing 'NOTE' with 'TODO' can improve consistency and assist in tracking tasks that need to be addressed.

api/converter/to_pb.go (1)

181-186: Simplify slice usage in actor conversion

In ToVersionVector, the expression actor[:] can be simplified to actor since actor is already a slice of bytes. This makes the code cleaner and avoids unnecessary slicing.

Apply this diff to simplify the code:

-		id, err := time.ActorIDFromBytes(actor[:])
+		id, err := time.ActorIDFromBytes(actor)
api/converter/from_pb.go (1)

176-179: Plan for future removal of compatibility code

The TODO comment indicates that old clients do not send VersionVector, and this check should be removed in the future. Consider tracking this in your project management tool to ensure this compatibility code is cleaned up once support for old clients is deprecated.

server/backend/database/memory/database.go (1)

1256-1313: Consider adding error logging for better observability

While the UpdateVersionVector function correctly updates or deletes a client's version vector based on their attachment status, adding logging statements for error cases can aid in diagnostics and improve maintainability.

test/integration/gc_test.go (2)

872-873: Chained Edit calls on Text can be simplified

In the Update function, multiple Edit calls are chained on the Text object:

root.SetNewText("text").
    Edit(0, 0, "a").
    Edit(1, 1, "b").
    Edit(2, 2, "c")

While chaining is acceptable, consider separating the edits for clarity:

text := root.SetNewText("text")
text.Edit(0, 0, "a")
text.Edit(1, 1, "b")
text.Edit(2, 2, "c")

This can improve readability and make it easier to understand each edit operation.


1100-1102: Error handling in nested Update calls

Ensure that errors returned by nested calls within the Update function are properly handled. While the current implementation may not produce errors, explicitly checking for errors can enhance robustness.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 93c192b and bb9c7bb.

⛔ Files ignored due to path filters (13)
  • api/yorkie/v1/admin.pb.go is excluded by !**/*.pb.go
  • api/yorkie/v1/resources.pb.go is excluded by !**/*.pb.go
  • design/media/detached-user-version-vector.jpg is excluded by !**/*.jpg
  • design/media/filter-version-vector.jpg is excluded by !**/*.jpg
  • design/media/garbage-collection-1.png is excluded by !**/*.png
  • design/media/garbage-collection-2.png is excluded by !**/*.png
  • design/media/garbage-collection-3.png is excluded by !**/*.png
  • design/media/garbage-collection-4.png is excluded by !**/*.png
  • design/media/garbage-collection-5.png is excluded by !**/*.png
  • design/media/garbage-collection-6.png is excluded by !**/*.png
  • design/media/lamport-synced-version-vector-and-version-vector.jpg is excluded by !**/*.jpg
  • design/media/prev-gc-issue.jpg is excluded by !**/*.jpg
  • design/media/version-vector-example.jpg is excluded by !**/*.jpg
📒 Files selected for processing (49)
  • admin/client.go (1 hunks)
  • api/converter/from_pb.go (2 hunks)
  • api/converter/to_pb.go (3 hunks)
  • api/docs/yorkie/v1/admin.openapi.yaml (3 hunks)
  • api/docs/yorkie/v1/resources.openapi.yaml (4 hunks)
  • api/docs/yorkie/v1/yorkie.openapi.yaml (4 hunks)
  • api/yorkie/v1/admin.proto (1 hunks)
  • api/yorkie/v1/resources.proto (2 hunks)
  • cmd/yorkie/migration.go (1 hunks)
  • design/garbage-collection.md (3 hunks)
  • design/version-vector.md (1 hunks)
  • migrations/v0.6.0/add-version-vector.go (1 hunks)
  • migrations/v0.6.0/drop-snapshots.go (1 hunks)
  • migrations/v0.6.0/main.go (1 hunks)
  • pkg/document/change/change.go (1 hunks)
  • pkg/document/change/id.go (5 hunks)
  • pkg/document/change/pack.go (1 hunks)
  • pkg/document/crdt/root.go (2 hunks)
  • pkg/document/crdt/root_test.go (5 hunks)
  • pkg/document/document.go (5 hunks)
  • pkg/document/document_test.go (2 hunks)
  • pkg/document/gc_test.go (5 hunks)
  • pkg/document/internal_document.go (11 hunks)
  • pkg/document/time/actor_id.go (1 hunks)
  • pkg/document/time/ticket.go (1 hunks)
  • pkg/document/time/version_vector.go (1 hunks)
  • server/backend/database/change_info.go (2 hunks)
  • server/backend/database/database.go (3 hunks)
  • server/backend/database/memory/database.go (7 hunks)
  • server/backend/database/memory/indexes.go (3 hunks)
  • server/backend/database/mongo/client.go (13 hunks)
  • server/backend/database/mongo/indexes.go (3 hunks)
  • server/backend/database/mongo/registry.go (3 hunks)
  • server/backend/database/mongo/registry_test.go (1 hunks)
  • server/backend/database/snapshot_info.go (3 hunks)
  • server/backend/database/testcases/testcases.go (2 hunks)
  • server/backend/database/version_vector.go (1 hunks)
  • server/clients/clients.go (1 hunks)
  • server/packs/packs.go (5 hunks)
  • server/packs/pushpull.go (2 hunks)
  • server/packs/serverpacks.go (3 hunks)
  • server/packs/snapshots.go (4 hunks)
  • server/rpc/admin_server.go (1 hunks)
  • server/rpc/testcases/testcases.go (4 hunks)
  • test/bench/document_bench_test.go (6 hunks)
  • test/helper/helper.go (1 hunks)
  • test/integration/array_test.go (2 hunks)
  • test/integration/gc_test.go (14 hunks)
  • test/integration/object_test.go (3 hunks)
🧰 Additional context used
🪛 LanguageTool
design/garbage-collection.md

[uncategorized] ~11-~11: You might be missing the article “the” here.
Context: ...am/yorkie/pull/981) first to understand concepts and usage of version vector. ## Summa...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~31-~31: Possible missing article found.
Context: ...mple and lightweight to use, but it has crucial weakness that Lamport doesn't guarantee...

(AI_HYDRA_LEO_MISSING_THE)


[uncategorized] ~31-~31: Possible missing comma found.
Context: ... lightweight to use, but it has crucial weakness that Lamport doesn't guarantee causalit...

(AI_HYDRA_LEO_MISSING_COMMA)


[style] ~34-~34: This phrasing can be wordy, so consider using a more descriptive and concise alternative.
Context: ...-gc-issue](media/prev-gc-issue.jpg) As you can see from the image above, min_synced_seq doesn...

(AS_YOU_CAN_SEE)


[uncategorized] ~39-~39: A determiner appears to be missing. Consider inserting it.
Context: ...d remotely and purges them completely. Server records the version vector of the last ...

(AI_EN_LECTOR_MISSING_DETERMINER)


[uncategorized] ~40-~40: You might be missing the article “the” here.
Context: ...never the client requests PushPull. And Server returns the min version vector, `minVer...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~40-~40: Possible missing preposition found.
Context: ...rsionVectorof all clients in response PushPull to the client.minVersionVector` is us...

(AI_EN_LECTOR_MISSING_PREPOSITION)


[uncategorized] ~43-~43: You might be missing the article “the” here.
Context: ... vector is the vector which consists of minimum value of every version vector stored in...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~43-~43: You might be missing the article “a” here.
Context: ...ector stored in version vector table in database. Conceptually, min version vector is v...

(AI_EN_LECTOR_MISSING_DETERMINER_A)


[uncategorized] ~45-~45: You might be missing the article “a” here.
Context: ...e. Conceptually, min version vector is version vector that is uniformly applied to all...

(AI_EN_LECTOR_MISSING_DETERMINER_A)


[uncategorized] ~80-~80: You might be missing the article “the” here.
Context: ...And the Client a sends change 3a to server through PushPull API and receives as a ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~82-~82: You might be missing the article “the” here.
Context: ... and it has not been sent (pushpull) to server yet. #### State 3 ![garbage-collectio...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~88-~88: You might be missing the article “the” here.
Context: ....png) Client b pushes change 3b to server and receives as a response that `minVer...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~88-~88: You might be missing the article “the” here.
Context: ...ent applies change 4, the contents of document are changed to ac. This time, all cli...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[grammar] ~88-~88: The noun should probably be in the singular form.
Context: ...t's still marked as tombstone for every clients, because minVersionVector[a] = 1 < 3 ...

(EVERY_EACH_SINGULAR)


[formatting] ~88-~88: If the ‘because’ clause is essential to the meaning, do not use a comma before the clause.
Context: ...ll marked as tombstone for every clients, because minVersionVector[a] = 1 < 3 #### St...

(COMMA_BEFORE_BECAUSE)


[uncategorized] ~100-~100: Possible missing comma found.
Context: ...e-collection-5.png) Client b pushpull but nothing to push or pull. `minVersionVec...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~106-~106: Possible missing comma found.
Context: ...e-collection-6.png) Client a pushpull but nothing to push or pull. `minVersionVec...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~143-~143: Possible missing preposition found.
Context: ...ort from db.versionVector. So we choose remove only client's version vector from table...

(AI_HYDRA_LEO_MISSING_TO)


[uncategorized] ~143-~143: You might be missing the article “the” here.
Context: ...emove only client's version vector from table, and filter minVersionVector by active ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~158-~158: You might be missing the article “the” here.
Context: ....Filter([c3]) = {c1:3, c2:3} ``` After client receive this minVersionVector, it will ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~158-~158: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...([c3]) = {c1:3, c2:3} ``` After client receive this minVersionVector, it will filter i...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)


[uncategorized] ~159-~159: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...lamport. The next pushpull request will contains filtered version vector so that eventua...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)


[uncategorized] ~159-~159: You might be missing the article “a” here.
Context: ...The next pushpull request will contains filtered version vector so that eventually db.ve...

(AI_EN_LECTOR_MISSING_DETERMINER_A)


[uncategorized] ~159-~159: You might be missing the article “the” here.
Context: ...eventually db.version vector will store attached client's version vector only. ![filter-...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

design/version-vector.md

[uncategorized] ~11-~11: Possible missing comma found.
Context: ...on Vector called Lamport Synced Version Vector which consists of Lamport and version v...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~23-~23: Possible missing comma found.
Context: ...ructure. ## Proposal Details ### What is Lamport Timestamp Lamport Timestamps ...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~24-~24: The verb “are” doesn’t seem to fit in this context, “is” is probably more formally correct.
Context: ... Lamport Timestamp Lamport Timestamps are a logical clock mechanism used in distr...

(AI_HYDRA_LEO_CPT_ARE_IS)


[uncategorized] ~53-~53: Possible missing comma found.
Context: ...ements received, without creating a new max by adding to the maximum found during t...

(AI_HYDRA_LEO_MISSING_COMMA)

🪛 Markdownlint
design/garbage-collection.md

69-69: Punctuation: ':'
Trailing punctuation in heading

(MD026, no-trailing-punctuation)


48-48: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


56-56: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


108-108: Punctuation: '.'
Trailing punctuation in heading

(MD026, no-trailing-punctuation)


113-113: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


145-145: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

design/version-vector.md

25-25: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (115)
server/backend/database/version_vector.go (2)

17-22: LGTM: Package declaration and imports are appropriate.

The package declaration and imports are correct and relevant for the VersionVectorInfo struct being defined in this file.


1-31: LGTM: Approval for the introduction of VersionVectorInfo.

Overall, the introduction of VersionVectorInfo in this new file is a well-structured and valuable addition to the project. It enhances the system's capability to manage version vectors for documents and clients. The code is clean, well-organized, and follows good practices.

Key points:

  1. Appropriate package declaration and imports.
  2. Well-defined struct with clear field names and BSON tags.
  3. Good use of custom types for consistency (types.ID and time.VersionVector).

Please consider the minor suggestions for comment improvements and keep in mind the potential performance implications as the system scales. Great work on this addition!

migrations/v0.6.0/main.go (2)

1-18: LGTM: License header and package documentation are appropriate.

The license header is correctly formatted, and the package documentation provides a clear description of the package's purpose.


20-25: LGTM: Imports are appropriate and necessary.

The imports include all the required packages for the functionality of this migration file.

server/backend/database/snapshot_info.go (3)

43-45: LGTM! The VersionVector field has been correctly added.

The addition of the VersionVector field to the SnapshotInfo struct is in line with the PR objectives. The field type and BSON tag are correctly implemented.


60-67: LGTM! The DeepCopy method has been correctly updated.

The DeepCopy method has been properly updated to include the new VersionVector field. This ensures that deep copies of SnapshotInfo will correctly include all fields.


Line range hint 1-85: Summary: VersionVector integration enhances snapshot management capabilities.

The changes to SnapshotInfo successfully introduce the VersionVector field and update related components. These modifications align with the PR objectives and enhance the system's ability to track relationships between document changes. The updates to imports, struct fields, and the DeepCopy method are all implemented correctly.

Key improvements:

  1. Addition of VersionVector field for better change relationship tracking.
  2. Updated CreatedAt field to use the standard time package.
  3. Proper deep copying of all fields, including the new VersionVector.

These changes lay a solid foundation for improved conflict resolution and synchronization in the system.

server/clients/clients.go (1)

91-101: Approve the addition of UpdateVersionVector call with suggestions.

The new code aligns well with the PR objective of introducing VersionVector for Change.ID. It ensures that the version vector is updated when a client is deactivated, which is crucial for maintaining the correct state of document changes.

However, I have a couple of points for consideration:

  1. Could you please clarify the purpose of the nil parameter in the UpdateVersionVector call? Understanding its role would help ensure the correctness of this implementation.

  2. As this update is performed in a loop for each attached document, there might be a performance impact for clients with many documents. Have you considered implementing a batch update mechanism to optimize this process?

To verify the impact and usage of this new call, please run the following script:

✅ Verification successful

Verified that UpdateVersionVector is only called with nil in server/clients/clients.go. No issues found.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for other occurrences of UpdateVersionVector and its parameter usage

# Test: Search for UpdateVersionVector calls
echo "Occurrences of UpdateVersionVector:"
rg "UpdateVersionVector" --type go

# Test: Check for nil usage with UpdateVersionVector
echo "\nOccurrences of UpdateVersionVector with nil:"
rg "UpdateVersionVector.*nil" --type go

Length of output: 1393

server/backend/database/mongo/registry_test.go (1)

38-66: Great addition of tests for VersionVector functionality!

These new sub-tests align well with the PR objectives of introducing VersionVector to detect relationships between changes. They ensure that the types.ID and VersionVector can be correctly marshaled and unmarshaled, which is crucial for the proper functioning of the new features in MongoDB operations.

The separation into distinct sub-tests improves the clarity and maintainability of the test suite. Good job on enhancing the test coverage for these important components!

pkg/document/change/change.go (1)

Line range hint 1-119: Summary of changes and potential impact

The addition of the AfterOrEqual method to the Change struct is a well-implemented change that aligns with the PR objectives of introducing version vector functionality. This method will allow for more precise detection of causal relationships between changes, which is crucial for conflict resolution in distributed systems.

Key points:

  1. The implementation is correct and concise.
  2. It integrates well with the existing Change struct.
  3. The change is minimal but has a potentially significant impact on how changes are compared throughout the system.

To ensure the success of this change:

  1. Verify that the ID struct's AfterOrEqual method has been updated to handle version vector comparisons correctly.
  2. Update other parts of the codebase to use this new method where appropriate.
  3. Add comprehensive unit tests to cover various scenarios of change comparisons.
  4. Update documentation to reflect this new capability in the Change struct.
pkg/document/time/ticket.go (1)

Line range hint 54-190: Verify consistency between documentation and implementation.

The updated documentation accurately reflects the unchanged implementation of the Ticket struct and its methods. The Ticket continues to handle ordering and identification of changes, while the detection of causal and concurrent relationships is delegated to the newly introduced VersionVector (in other files).

To ensure full alignment with the new VersionVector system:

Run the following script to check for any missed updates related to Ticket usage:

This script will help identify any areas where Ticket might be misused for detecting causal or concurrent relationships, and confirm that VersionVector is being used appropriately across the codebase.

✅ Verification successful

Verification Successful: The documentation is consistent with the implementation of the Ticket struct and its methods. No discrepancies found.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that Ticket usage is consistent with its updated role

# Test 1: Check for any comparisons or ordering operations using Ticket
echo "Checking for Ticket comparisons and ordering:"
rg -n '\.Compare\(' --type go
rg -n '\.After\(' --type go

# Test 2: Look for potential places where VersionVector should be used instead of Ticket
echo "Checking for potential VersionVector usage:"
rg -n 'causal|concurrent' --type go

# Test 3: Verify VersionVector is used for relationship detection
echo "Verifying VersionVector usage:"
rg -n 'VersionVector' --type go

Length of output: 61951

server/backend/database/change_info.go (3)

43-53: LGTM: VersionVector field added to ChangeInfo struct

The addition of the VersionVector field to the ChangeInfo struct aligns well with the PR objectives. The field is correctly typed and tagged for BSON serialization, which is consistent with the existing fields in the struct.


97-97: LGTM: ToChange method updated to include VersionVector

The ToChange method has been correctly updated to include the VersionVector when creating a new change ID. This modification aligns with the PR objectives and ensures that the version vector information is properly incorporated into the change model.


Line range hint 43-97: Summary: VersionVector successfully integrated into ChangeInfo

The changes in this file successfully integrate the VersionVector into the ChangeInfo struct and its related methods. These modifications align well with the PR objectives of introducing version vector functionality to detect relationships between changes. The implementation maintains consistency with the existing code structure and follows good practices for BSON serialization and change management.

pkg/document/time/actor_id.go (1)

59-60: Excellent addition of the actorID type.

The introduction of the actorID type enhances type safety and encapsulation. It provides a clear, domain-specific type for actor IDs, which can improve code readability and maintainability. This change aligns well with the broader update to integrate version vectors into the system.

server/backend/database/mongo/indexes.go (4)

43-44: LGTM: New constant for VersionVectors collection.

The addition of ColVersionVectors constant is consistent with the existing naming convention and accurately represents the new collection for version vectors. This aligns well with the PR objective of introducing VersionVector functionality.


165-175: LGTM: New collection info for ColVersionVectors.

The addition of a new collection info for ColVersionVectors with a unique index on project_id, doc_id, and client_id is well-structured and aligns with the PR objective. This ensures that there's only one version vector per project, document, and client combination.

To ensure proper integration:

Please run the following script to check if the new collection and its index are being used correctly throughout the codebase:

#!/bin/bash
# Description: Verify usage of ColVersionVectors collection and its index

# Test 1: Check for insertions into the ColVersionVectors collection
rg -i 'insert.*ColVersionVectors' --type go

# Test 2: Check for queries using project_id, doc_id, and client_id on ColVersionVectors
rg -i 'find\(.*project_id.*doc_id.*client_id.*ColVersionVectors' --type go

Ensure that the results show appropriate usage of the new collection and its index in the relevant parts of the system.


Line range hint 46-54: Verify update to Collections slice.

The AI summary mentions that the Collections slice has been updated to include ColVersionVectors. However, this change is not visible in the provided code snippet. Please ensure that ColVersionVectors has been added to the Collections slice.

#!/bin/bash
# Description: Verify that ColVersionVectors has been added to the Collections slice.

# Test: Search for the Collections slice definition
rg -A 10 'var Collections = \[\]string\{'

150-155: LGTM: New index for ColSyncedSeqs collection.

The addition of a new index on doc_id, project_id, and server_seq for the ColSyncedSeqs collection is consistent with the existing indexing strategy. This index may improve query performance for operations involving these fields.

However, to ensure its necessity:

Please run the following script to check if there are queries that would benefit from this index:

If no matching queries are found, consider removing this index to avoid unnecessary overhead during write operations.

pkg/document/gc_test.go (3)

28-28: LGTM: Import statement added for helper package.

The addition of the helper package import is necessary for using the MaxVersionVector function, which aligns with the PR's objective of introducing VersionVector.


Line range hint 1-221: Overall assessment: Changes are well-implemented and align with PR objectives.

The modifications consistently replace time-based garbage collection with version vector-based garbage collection throughout the file. This aligns well with the PR's objective of introducing VersionVector for change relationship detection.

Suggestions for improvement:

  1. Refactor the garbage collection calls into a helper function to improve maintainability and consistency.
  2. Consider adding or updating tests to specifically verify the behavior of the new version vector-based garbage collection, especially in scenarios involving concurrent changes.

These changes represent a significant shift in how garbage collection is performed, which should lead to more precise handling of document changes. Great work on maintaining consistency across different document element types (tree and text).


134-134: LGTM: Updated garbage collection to use MaxVersionVector.

The change from time.MaxTicket to helper.MaxVersionVector(doc.ActorID()) aligns with the PR's objective of introducing VersionVector. This shift from time-based to version-based garbage collection is likely to improve the precision of determining which changes can be safely collected.

To ensure the new garbage collection behavior works as expected, consider adding or updating tests that specifically verify:

  1. The correct identification of garbage based on version vectors.
  2. The proper handling of concurrent changes during garbage collection.
✅ Verification successful

Verified: Garbage collection tests are updated correctly.

Both TestTreeGC and TestTextGC functions utilize helper.MaxVersionVector(doc.ActorID()), ensuring that the new version vector-based garbage collection is properly tested. The tests cover various scenarios and validate the expected behavior of the garbage collection mechanism.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify if there are any new or updated tests for version vector-based garbage collection
rg -p 'func Test.*GC' pkg/document/gc_test.go

Length of output: 118


Script:

#!/bin/bash
# Extract TestTreeGC function implementation
ast-grep --pattern $'func TestTreeGC(t *testing.T) {\n  $$$\n}' pkg/document/gc_test.go

# Extract TestTextGC function implementation
ast-grep --pattern $'func TestTextGC(t *testing.T) {\n  $$$\n}' pkg/document/gc_test.go

Length of output: 10273

design/version-vector.md (2)

45-50: LGTM! Clear explanation with helpful visual aid.

This section provides a clear and detailed explanation of Version Vectors. The inclusion of an image (version-vector-example.jpg) is particularly helpful for visualizing the concept.


1-66: Excellent document overall. Well-structured and informative.

This design document provides a comprehensive and clear explanation of Version Vectors and their implementation in Yorkie's logical clock system. The content is well-structured, with helpful examples and diagrams that effectively illustrate the concepts.

The document successfully achieves its stated goals of introducing Yorkie's logical clock system and explaining the transition from a Lamport-based system to a Version Vector approach. The introduction of the Lamport Synced Version Vector as a solution to the challenges of this transition is particularly well-explained.

While some minor improvements in formatting and language have been suggested, these do not detract from the overall high quality and value of the document. Great job on creating a thorough and informative design document!

🧰 Tools
🪛 LanguageTool

[uncategorized] ~11-~11: Possible missing comma found.
Context: ...on Vector called Lamport Synced Version Vector which consists of Lamport and version v...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~23-~23: Possible missing comma found.
Context: ...ructure. ## Proposal Details ### What is Lamport Timestamp Lamport Timestamps ...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~24-~24: The verb “are” doesn’t seem to fit in this context, “is” is probably more formally correct.
Context: ... Lamport Timestamp Lamport Timestamps are a logical clock mechanism used in distr...

(AI_HYDRA_LEO_CPT_ARE_IS)


[uncategorized] ~53-~53: Possible missing comma found.
Context: ...ements received, without creating a new max by adding to the maximum found during t...

(AI_HYDRA_LEO_MISSING_COMMA)

🪛 Markdownlint

25-25: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

server/packs/pushpull.go (4)

175-175: LGTM: Improved readability

Splitting the NewServerPack function call into multiple lines improves code readability. This is a good practice, especially for function calls with multiple parameters.


Line range hint 154-177: Summary of changes in pushpull.go

The changes in this file contribute to the introduction of VersionVector for detecting relationships between changes:

  1. A new parameter was added to the change.NewPack function call, which may be related to VersionVector functionality.
  2. The ServerPack now includes a VersionVector field, populated from the document's VersionVector.

These modifications align with the PR objectives and should enhance the system's ability to distinguish between concurrent and causal relationships among changes. The changes are well-integrated into the existing code structure.

To ensure full compatibility and correctness:

  1. Verify that all callers of change.NewPack have been updated if necessary.
  2. Confirm that consumers of ServerPack are prepared to handle the new VersionVector field.
  3. Consider adding unit tests to validate the VersionVector functionality in various scenarios.

176-177: LGTM: Added VersionVector to ServerPack

The addition of VersionVector to the ServerPack aligns with the PR objectives of introducing version vectors to detect relationships between changes. This enhancement will allow for better distinction between concurrent and causal relationships among changes.

To ensure the correct implementation of VersionVector, please run the following command to check its definition and usage:

rg -A 5 "func \(.*\) VersionVector\(" && rg "\.VersionVector\(\)"

154-154: Verify the purpose of the new parameter in change.NewPack

A new nil parameter has been added to the change.NewPack function call. Please confirm:

  1. The purpose of this new parameter in the change.NewPack function.
  2. Whether nil is the appropriate value in this context.
  3. If any documentation needs to be updated to reflect this change.

To help verify the usage of this new parameter, you can run the following command to find its definition and other usages:

✅ Verification successful

Verified the addition of the new nil parameter in change.NewPack

The snapshot parameter has been added to the change.NewPack function and is appropriately passed as nil in all current usages.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

rg -A 5 "func NewPack\(" && rg "change\.NewPack\("

Length of output: 20366


Script:

rg "func NewPack(" 

Length of output: 97


Script:

rg "func NewPack\\(" -A 5 && rg "change\\.NewPack\\("

Length of output: 999

pkg/document/crdt/root_test.go (4)

56-59: LGTM: Updated garbage collection test for array

The test case has been successfully updated to use helper.MaxVersionVector() instead of time.MaxTicket. The assertions correctly reflect the expected behavior of the new garbage collection mechanism.


63-105: Great improvement in text garbage collection test coverage

This new test case provides a more comprehensive examination of the garbage collection process for text. It covers various scenarios including insertions, deletions, and partial edits, which will help ensure the robustness of the garbage collection mechanism. The use of helper.MaxVersionVector() is consistent with the PR objectives.


Line range hint 204-208: LGTM: Updated garbage collection test for rich text

The test case has been successfully updated to use helper.MaxVersionVector() instead of time.MaxTicket. The existing test logic and assertions remain intact, ensuring that the test coverage is maintained while adapting to the new VersionVector implementation.


242-245: LGTM: Updated garbage collection test for container

The test case has been successfully updated to use helper.MaxVersionVector() instead of time.MaxTicket. The multiple garbage collection calls help verify the behavior after different operations, providing good coverage of the container garbage collection process.

Also applies to: 252-255

test/helper/helper.go (1)

144-156: LGTM! The new MaxVersionVector function enhances testing capabilities

The MaxVersionVector function is a valuable addition to the helper package. It aligns well with the PR objectives of introducing version vectors to detect relationships between changes. This function will be particularly useful in testing scenarios, especially for garbage collection tests where simulating maximum version vector values is necessary.

The implementation is correct and consistent with the broader transition to a version vector system throughout the application. It provides a flexible way to generate maximum version vectors for given actors, defaulting to the initial actor if none are provided.

test/integration/array_test.go (1)

469-469: Consistent update to garbage collection mechanism

This change is consistent with the update on line 267, replacing time.MaxTicket with helper.MaxVersionVector(d1.ActorID()) for garbage collection. This ensures that the new version vector-based approach is used consistently throughout the test suite.

To verify that this change has been applied consistently in test files, let's search for any remaining uses of GarbageCollect with time.MaxTicket:

#!/bin/bash
# Search for any remaining uses of GarbageCollect with time.MaxTicket in test files
rg 'GarbageCollect\(time\.MaxTicket\)' --type go --glob 'test/**/*_test.go'
test/bench/document_bench_test.go (3)

33-33: LGTM. Verify helper package functionality.

The change from time to helper package import is consistent with the updates in the garbage collection calls. This modification supports the new version vector implementation.

To ensure the helper package is correctly implemented and imported, run the following:

#!/bin/bash
# Verify the helper package implementation
rg -p 'func MaxVersionVector' test/helper

759-759: LGTM. Final refactoring suggestion and verification.

The update to use helper.MaxVersionVector(doc.ActorID()) is consistent with the previous changes and supports the new version vector implementation.

As suggested earlier, consider extracting the garbage collection logic into a helper function to reduce code duplication across all benchmark tests:

 func benchmarkArrayGC(cnt int, b *testing.B) {
     // ... (existing code)
-    assert.Equal(b, cnt+1, doc.GarbageCollect(helper.MaxVersionVector(doc.ActorID())))
+    assert.Equal(b, cnt+1, performGarbageCollection(doc))
 }

To ensure all instances of GarbageCollect have been updated consistently, run the following:

This will help identify any missed updates and ensure consistency across the entire file.

✅ Verification successful

All GarbageCollect calls are correctly updated. Verification successful.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify all GarbageCollect calls use the new MaxVersionVector
rg -p 'GarbageCollect\(' test/bench

Length of output: 231


Script:

#!/bin/bash
# Verify all GarbageCollect calls use helper.MaxVersionVector(doc.ActorID())
rg 'GarbageCollect\s*\(\s*helper\.MaxVersionVector\s*\(\s*doc\.ActorID\(\)\s*\)\s*\)' test/bench

Length of output: 685


585-585: LGTM. Verify MaxVersionVector behavior.

The update to use helper.MaxVersionVector(doc.ActorID()) instead of time.MaxTicket is consistent with the new version vector implementation. This change should provide more accurate garbage collection based on the document's version history.

To ensure the MaxVersionVector function is correctly implemented and used, run the following:

✅ Verification successful

Verified.

The MaxVersionVector function is correctly implemented in test/helper/helper.go and appropriately utilized in test/bench/document_bench_test.go. This update ensures that garbage collection accurately reflects the document's version history based on the actor's version vector.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify the MaxVersionVector function implementation and usage
rg -p 'func MaxVersionVector' test/helper
rg -p 'MaxVersionVector\(' test/bench

Length of output: 231


Script:

#!/bin/bash
# Corrected script to verify the MaxVersionVector function implementation and usage

# Search for the definition of MaxVersionVector in test/helper
rg 'func MaxVersionVector' test/helper/

# Search for usages of MaxVersionVector in test/bench
rg 'MaxVersionVector\(' test/bench/

Length of output: 756

test/integration/object_test.go (2)

225-225: LGTM: Consistent with VersionVector introduction

This change from time.MaxTicket to helper.MaxVersionVector(d1.ActorID()) aligns with the PR's objective of introducing VersionVector for Change.ID. It's a good update that shifts from time-based tickets to version vectors for garbage collection.


227-227: LGTM: Consistent VersionVector usage

This change mirrors the previous one, updating the garbage collection call for d2 to use helper.MaxVersionVector(d2.ActorID()). It maintains consistency in the implementation of the new VersionVector system.

api/docs/yorkie/v1/yorkie.openapi.yaml (3)

Line range hint 5-5: LGTM: API version updated appropriately.

The API version has been incremented to v0.5.1, which correctly reflects the significant changes introduced, such as the VersionVector concept.


429-434: LGTM: VersionVector integrated into ChangeID.

The addition of the versionVector property to the ChangeID schema is a key change that supports the new functionality for detecting relationships between changes. This integration is well-structured and consistent with the PR objectives.


1574-1591: LGTM: VectorEntry schema added, clarify value type usage.

The VectorEntry schema for VersionVector has been successfully added, providing a structure for the vector entries.

The value property allows both string and number types. Could you clarify the intended usage and provide examples of when each type would be used? This information would be valuable for API consumers.

#!/bin/bash
# Search for any usage examples or additional context for VectorEntry
rg -i "vectorentry|vector entry" api/docs/yorkie/v1/yorkie.openapi.yaml
api/docs/yorkie/v1/resources.openapi.yaml (5)

221-226: Addition of VersionVector to ChangeID schema

The versionVector property has been added to the yorkie.v1.ChangeID schema. This addition is consistent with the PR objectives of introducing a VersionVector to detect relationships between changes.


Line range hint 262-276: Updates to ChangePack schema

  1. The minSyncedTicket property has been marked as deprecated (line 262). This is good practice for maintaining backwards compatibility while signaling future removal.

  2. A new versionVector property has been added (lines 271-276), which is consistent with the ChangeID schema update and the PR objectives.

These changes align well with the goal of introducing VersionVector for change relationship detection.


1766-1776: New VersionVector schema

The yorkie.v1.VersionVector schema has been introduced. It contains a single property vector of type object, which is appropriate for representing a version vector.


1777-1794: New VersionVector.VectorEntry schema

The yorkie.v1.VersionVector.VectorEntry schema has been added to define the structure of entries in the version vector. It includes:

  1. A key property of type string (lines 1781-1785)
  2. A value property that can be either a string or a number (lines 1786-1792)

This structure allows for flexible representation of version vector entries, which is suitable for various use cases.


221-226: Summary of VersionVector introduction

The changes successfully introduce the VersionVector concept to the Yorkie API specification:

  1. ChangeID and ChangePack schemas now include a versionVector property.
  2. A new VersionVector schema has been added with a flexible structure.
  3. The minSyncedTicket property in ChangePack has been appropriately marked as deprecated.

These modifications align well with the PR objectives and should enable better detection of relationships between changes in the Yorkie system. The flexible typing of VersionVector entries (string or number) allows for various implementation strategies, which is a good design choice.

Also applies to: 262-276, 1766-1794

api/docs/yorkie/v1/admin.openapi.yaml (3)

702-707: New versionVector property added to ChangeID schema

The ChangeID schema has been updated to include a new versionVector property. This addition aligns with the PR objective of introducing a VersionVector for Change.ID to distinguish between Concurrent and Causal relationships among changes.


973-978: New versionVector property added to GetSnapshotMetaResponse schema

The GetSnapshotMetaResponse schema has been updated to include a versionVector property. This change is consistent with the introduction of the VersionVector concept and enhances the API's capability to provide versioning information in snapshot metadata responses.


Line range hint 1-2268: Summary of changes to the OpenAPI specification

The changes made to this OpenAPI specification file successfully introduce the VersionVector concept and update related schemas. These modifications enhance the API's capability to handle and communicate version information, which is crucial for distinguishing between Concurrent and Causal relationships among changes.

Key changes include:

  1. Addition of a versionVector property to the ChangeID schema.
  2. Addition of a versionVector property to the GetSnapshotMetaResponse schema.
  3. Introduction of a new VersionVector schema with a nested VectorEntry structure.

These changes align well with the PR objectives and provide a solid foundation for implementing the version vector functionality in the Yorkie API.

server/packs/snapshots.go (3)

34-34: Function signature update is appropriate.

The storeSnapshot function now accepts minSyncedVersionVector as a parameter, aligning with the new version vector synchronization mechanism.


80-81: Proper initialization with VersionVector.

Passing snapshotInfo.VersionVector to document.NewInternalDocumentFromSnapshot ensures the document is initialized with the correct versioning information.


92-92: Verify the use of nil in change.NewPack.

Ensure that passing nil as an argument to change.NewPack is intentional and does not affect the subsequent operations.

Run the following script to confirm that change.NewPack handles a nil argument appropriately:

✅ Verification successful

Passing nil to change.NewPack is safe and intentional.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that `change.NewPack` can accept `nil` as the fourth argument without issues.

# Test: Search for the definition of `NewPack` and check parameter handling.
rg 'func NewPack\(' --context 5

Length of output: 595


Script:

#!/bin/bash
# Search for the definition of 'VersionVector' in the 'time' package to verify its type.

rg 'type VersionVector' pkg/document/change/pack.go --context 5

Length of output: 65


Script:

#!/bin/bash
# Search for the definition of 'VersionVector' across the entire codebase to verify its type.

rg 'type VersionVector' --context 5

Length of output: 2483


Script:

#!/bin/bash
# Extract the full definition of the 'NewPack' function to inspect how 'versionVector' is handled.

rg -A 50 'func NewPack(' pkg/document/change/pack.go

Length of output: 131


Script:

#!/bin/bash
# Extract the full definition of the 'NewPack' function to inspect how 'versionVector' is handled.

rg -A 50 'func NewPack\(' pkg/document/change/pack.go

Length of output: 891

server/backend/database/mongo/registry.go (1)

87-107: Implementation of versionVectorEncoder looks good

The versionVectorEncoder function correctly encodes time.VersionVector into BSON binary data using Protobuf serialization. Error handling is properly implemented, and the function adheres to the expected encoder interface.

server/packs/serverpacks.go (4)

103-103: Update to change.NewID includes VersionVector

The addition of info.VersionVector to the change.NewID function call reflects the new versioning mechanism. This update appears correct and aligns with the intended enhancements for version tracking.


119-122: Proper conversion of changeID with error handling

The conversion of changeID to pbChangeID using converter.ToChangeID is correctly implemented with appropriate error handling. This ensures that any issues during conversion are captured and dealt with promptly.


125-125: Correct assignment of pbChangeID to Id field

Assigning pbChangeID to the Id field in the pbChanges slice is correctly done, ensuring that the Protobuf change pack has the proper identifiers.


141-147: Inclusion of VersionVector in pbPack with error handling

The conversion of VersionVector to its Protobuf representation and its inclusion in pbPack is properly handled. The error checking ensures robustness, preventing potential runtime issues due to serialization errors.

pkg/document/change/id.go (2)

100-110: Ensure correct synchronization in SyncClocks method

Verify that the logic in SyncClocks correctly synchronizes the Lamport timestamp and merges the versionVector using the Max function. This ensures proper causal ordering when integrating changes from different sources.

Run the following script to examine the usage of SyncClocks():

#!/bin/bash
# Description: Find all usages of the `SyncClocks` method to verify proper synchronization.

# Test: Search for calls to `SyncClocks()` and review their context.
rg -A 5 '\.SyncClocks\('

80-87: Verify the update logic in the Next method

Ensure that the versionVector is correctly updated in the Next method to accurately reflect the incremented Lamport timestamp. This is crucial for maintaining correct causal relationships between changes.

Run the following script to check the usage of Next() and verify consistency:

✅ Verification successful

[/run_scripts]


The Next method's update logic for versionVector has been verified and is correctly implemented.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find all usages of the `Next` method to verify versionVector updates.

# Test: Search for calls to `Next()` in the codebase.
# Expect: Confirm that `versionVector` updates are handled properly.
rg -A 3 '\.Next\('

Length of output: 6839


Script:

#!/bin/bash
# Description: Find all usages of the `Next` method to verify versionVector updates.

# Test: Search for calls to `Next()` in the codebase.
# Expect: Confirm that `versionVector` updates are handled properly.
rg -A 3 '\.Next\('

Length of output: 6441

pkg/document/crdt/root.go (1)

146-146: Ensure all callers of GarbageCollect are updated to accept VersionVector.

The GarbageCollect method's signature has changed to accept a time.VersionVector parameter. Please verify that all invocations of this method throughout the codebase have been updated accordingly to prevent any breaking changes or compilation errors.

You can run the following script to find all usages of GarbageCollect that may need updating:

✅ Verification successful

All GarbageCollect callers are correctly updated with VersionVector.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find all calls to `GarbageCollect` that may still use the old parameter type.

# Expected: All calls should match the new signature.

# Command:
rg --type go --context 2 'GarbageCollect\('

Length of output: 11117

pkg/document/time/version_vector.go (1)

97-113: ⚠️ Potential issue

Review the logic in the AfterOrEqual method

The AfterOrEqual method may not correctly determine causal relationships according to vector clock semantics. The current implementation might fail to account for concurrent events accurately.

Please review and verify the vector clock comparison logic to ensure it aligns with the standard definitions. Consider the following:

  • For two vectors V and W, V ≥ W if and only if V[i] ≥ W[i] for all i.
  • The current implementation might return false prematurely if any actor's version in other is not present in v.

To assist with verification, you can test the method with various VersionVector instances representing different causal relationships.

server/backend/database/memory/indexes.go (2)

29-29: Addition of tblVersionVectors is consistent and appropriate

The introduction of tblVersionVectors = "versionvectors" aligns with the existing naming conventions for table variables and integrates well into the schema.


210-219: Verify uniqueness constraint on doc_id_server_seq index in tblSyncedSeqs

A new unique index doc_id_server_seq has been added to the tblSyncedSeqs table, indexing DocID and ServerSeq. Please verify that the combination of DocID and ServerSeq is intended to be unique within this table. This ensures that the uniqueness constraint does not conflict with the expected data patterns or existing records.

If the uniqueness is confirmed, this index will enhance query performance for operations involving synchronization sequences.

To automate the verification, you can run the following script to check for any duplicate entries:

api/yorkie/v1/resources.proto (3)

48-49: Addition of version_vector field to ChangePack

The new field VersionVector version_vector = 7; in the ChangePack message is appropriately added. Field numbers are correctly assigned, and existing field numbers are maintained for backward compatibility.


65-66: Addition of version_vector field to ChangeID

The addition of VersionVector version_vector = 5; to the ChangeID message is appropriate. Field numbers are sequential and maintain backward compatibility.


68-69: Definition of VersionVector message

The VersionVector message is properly defined with a map of string to int64. This effectively represents the version vectors needed for concurrency control.

server/packs/packs.go (4)

132-147: Verify error handling for version vector updates

The introduction of UpdateAndFindMinSyncedVersionVectorAfterPushPull enhances garbage collection using version vectors. Please ensure that any errors returned by this method are appropriately handled and logged to aid in debugging and maintenance.


220-220: Ensure consistent usage of minSyncedVersionVector in storeSnapshot

Passing minSyncedVersionVector to storeSnapshot is crucial for accurate garbage collection. Verify that this variable correctly reflects the minimum synced state across all clients to prevent potential data inconsistencies.


273-274: Include VersionVector when building the internal document

Including snapshotInfo.VersionVector ensures that the internal document accurately represents the state of the document with respect to version vectors. This is a good practice for maintaining consistency during document reconstruction.


298-298: Check for unnecessary nil arguments

At line 298, a nil is passed as an argument in the ApplyChangePack method. Confirm whether this nil is intentional or if a valid value should be provided to avoid potential nil pointer dereferences.

admin/client.go (2)

332-335: Proper error handling for FromVersionVector conversion

The error handling after converting VersionVector is appropriate and ensures that any conversion errors are propagated correctly.


341-341: vector parameter correctly added to NewInternalDocumentFromSnapshot

Adding the vector parameter to the NewInternalDocumentFromSnapshot function call ensures that the version vector is used when reconstructing the document from the snapshot.

server/backend/database/database.go (1)

50-52: New error constant ErrChangeNotFound added successfully

The addition of ErrChangeNotFound aligns with the existing error handling conventions.

pkg/document/internal_document.go (6)

126-126: Ensure correct initialization of changeID with SetClocks

The changeID is being initialized with SetClocks(lamport, vector). Verify that this correctly sets both the lamport timestamp and the version vector as intended.


148-148: Verify version vector consistency in SyncCheckpoint

When synchronizing the checkpoint, you're setting the changeID with the current version vector using d.VersionVector(). Ensure that the version vector at this point reflects the correct state of the document to prevent inconsistencies during synchronization.


185-189: Confirm Garbage Collection condition logic

The condition for triggering garbage collection has been updated:

if !disableGC && pack.VersionVector != nil && !hasSnapshot {

Verify that this logic aligns with the intended behavior, ensuring that garbage collection occurs when appropriate and does not overlook necessary scenarios.


191-198: Validate removal of detached clients from the version vector

The code filters the version vector to remove lamports of detached clients:

actorIDs, err := pack.VersionVector.Keys()
// ...
d.changeID = d.changeID.SetVersionVector(d.changeID.VersionVector().Filter(actorIDs))

Ensure that actorIDs correctly represent the clients to retain and that the Filter method effectively removes the detached clients' entries from the version vector.


248-252: Good addition of VersionVector accessor

Introducing the VersionVector method enhances encapsulation by providing access to the version vector through the InternalDocument. This addition aligns with best practices.


333-333: Ensure SyncClocks correctly updates changeID

When applying changes, you're synchronizing clocks with:

d.changeID = d.changeID.SyncClocks(c.ID())

Verify that SyncClocks correctly merges both the lamport timestamps and the version vectors to maintain the correct causal relationship between changes.

pkg/document/document.go (4)

187-192: Use of 'hasSnapshot' enhances code readability

Introducing the hasSnapshot boolean variable clarifies the condition checks, making the code easier to understand and maintain.


220-222: Conditional Garbage Collection aligns with configuration

The added condition ensures that garbage collection is performed only when DisableGC is false and no snapshot is present, which respects the configuration options and optimizes performance.


224-231: Version vector filtered to exclude detached clients

Updating the changeID by filtering out actor IDs of detached clients maintains an accurate version vector, which is crucial for consistent document state management.


310-313: Added 'VersionVector' method to expose document's version vector

Introducing the VersionVector() method provides external access to the document's version vector, supporting enhanced synchronization and conflict resolution mechanisms.

api/converter/to_pb.go (5)

139-143: Proper error handling in ToChangePack

The addition of error handling when converting the VersionVector ensures that any issues during conversion are correctly propagated.


150-150: Including VersionVector in ChangePack

Adding the VersionVector field to the ChangePack aligns with the updated data model and ensures that version vectors are properly serialized and transmitted.


178-192: Implementation of ToVersionVector is correct

The ToVersionVector function correctly converts the VersionVector to its Protobuf representation, handling potential errors during actor ID conversion. This ensures accurate serialization of version vectors for synchronization.


274-278: Proper error handling in ToChanges after updating ToChangeID

The updated code correctly handles the new error return from ToChangeID, ensuring that errors are propagated appropriately. This maintains robust error handling within the ToChanges function.


164-176: Ensure all callers handle the new error return from ToChangeID

By modifying ToChangeID to return an error, it's important to verify that all places where this function is called have been updated to handle the potential error. This prevents unhandled exceptions and maintains the integrity of error propagation throughout the application.

Run the following script to identify all call sites of ToChangeID and check for proper error handling:

✅ Verification successful

All ToChangeID call sites correctly handle errors

Verified that all instances of ToChangeID handle the returned error appropriately.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find all call sites of ToChangeID and ensure they handle errors.

# Search for calls to ToChangeID and display the following lines to inspect error handling.
rg "ToChangeID\(" -A 2

Length of output: 551

pkg/document/document_test.go (3)

491-493: LGTM! The Garbage Collection Update is Appropriate

The use of helper.MaxVersionVector(doc.ActorID()) in doc.GarbageCollect() correctly updates the garbage collection mechanism to work with version vectors.


524-526: LGTM! Consistent Use of Version Vectors in Garbage Collection

The adjustment to use helper.MaxVersionVector(doc.ActorID()) ensures consistency in garbage collection across the tests.


570-578: Verify Correctness of Version Vector Versions After Updates

The version vectors after the updates should reflect the correct versions for each actor. Ensure that the versions at lines 574 and 575 accurately represent the state after the concurrent updates.

To confirm, you can run the following script to inspect the version vectors:

api/converter/from_pb.go (5)

100-104: Handle VersionVector conversion in FromChangePack

The code correctly adds the conversion of pbPack.VersionVector by calling FromVersionVector and handles potential errors. This ensures that the versionVector is properly initialized and integrated into the change.Pack.


116-118: Include VersionVector and MinSyncedTicket in change.Pack initialization

Adding VersionVector: versionVector, and MinSyncedTicket: minSyncedTicket, to the change.Pack struct ensures that these new fields are appropriately set, enhancing the functionality of the change pack with versioning information.


159-163: Convert VersionVector in fromChangeID

By incorporating the conversion of id.VersionVector using FromVersionVector, the code ensures that the change.ID now includes the VersionVector, which is essential for distinguishing between concurrent and causal relationships among changes.


169-169: Pass VersionVector to change.NewID

Including vector, in the change.NewID function call correctly passes the VersionVector to the new change ID, ensuring that the versioning information is part of the change identification process.


173-191: Implement FromVersionVector function for Protobuf conversion

The new FromVersionVector function effectively converts a Protobuf VersionVector into the model's time.VersionVector. The function includes:

  • Handling of nil pbVersionVector to maintain compatibility with older clients.
  • Iteration over pbVersionVector.Vector to populate the versionVector.
  • Conversion of actorID from hex and proper error handling.
server/backend/database/memory/database.go (2)

916-917: LGTM!

The addition of Lamport and ActorID fields to ChangeInfo correctly integrates the version vector functionality. This enhances change tracking and synchronization.


1090-1097: LGTM!

Including Lamport and VersionVector in SnapshotInfo improves the accuracy of snapshots and supports the new version vector mechanism effectively.

server/backend/database/mongo/client.go (12)

520-520: Consistent use of StatusKey constant enhances maintainability

Replacing the hardcoded "status" string with the StatusKey constant improves code consistency and reduces potential errors from typos.


611-612: Refactored key construction using clientDocInfoKey function

Using clientDocInfoKey to dynamically build keys for client document information enhances code readability and reduces hardcoding of strings.


615-616: Improved maintainability with clientDocInfoKey usage

Applying clientDocInfoKey for constructing keys ensures consistency and reduces the potential for errors in key names.


628-631: Consistent key construction for unattached clients

Using clientDocInfoKey function when handling unattached clients maintains code consistency and clarity.


664-664: Consistent use of StatusKey constant in client search

Replacing the hardcoded "status" with StatusKey improves code consistency and maintainability.


882-882: Storing version_vector in changes enhances conflict resolution

Including version_vector in the changes collection allows for better tracking of change dependencies and improves conflict resolution mechanisms.


1046-1052: Capturing version_vector in snapshot information

Adding the version_vector field to snapshots ensures that the document's versioning information is preserved, which is essential for accurate synchronization between clients.


1111-1111: Initializing VersionVector when no snapshot is found

Setting snapshotInfo.VersionVector to a new VersionVector when no snapshot exists prevents potential nil reference errors and ensures consistent behavior.


1173-1173: Using errors.Is for error comparison improves error handling

Replacing direct error comparison with errors.Is enhances robustness by correctly identifying wrapped errors.


1200-1316: Introduction of methods for version vector management

The new methods UpdateAndFindMinSyncedVersionVectorAfterPushPull and UpdateVersionVector effectively handle version vector updates and synchronization, improving document synchronization across clients.


1449-1450: Consistent key usage with clientDocInfoKey and StatusKey

Applying clientDocInfoKey and StatusKey ensures consistent access to client document information and improves code maintainability.


1535-1537: Helper function clientDocInfoKey centralizes key construction

Introducing the clientDocInfoKey function centralizes key construction logic, enhancing code clarity and reducing duplication.

server/backend/database/testcases/testcases.go (3)

552-554: Confirm VersionVector is correctly initialized before use

The doc.VersionVector() is added to the change.NewPack call. Ensure that doc.VersionVector() is properly initialized and accurately reflects the document's state before this call, to prevent any inconsistencies during snapshot creation.


559-561: Maintain consistency in passing VersionVector to NewPack

Good job including doc.VersionVector() in the subsequent change.NewPack call. Consistently passing the VersionVector ensures that all snapshots and change packs have the necessary versioning information.


435-437: Ensure all NewPack calls include the VersionVector parameter

The addition of doc.VersionVector() to the change.NewPack call is necessary for the version vector functionality. Please verify that all other calls to change.NewPack across the codebase are updated to include the VersionVector parameter where appropriate, to maintain consistency and ensure proper version tracking.

Run the following script to identify any NewPack calls missing the VersionVector parameter:

test/integration/gc_test.go (1)

37-65: Addition of helper functions for version vector verification

The introduction of the versionInfo struct and helper functions checkVV and versionOf enhances the readability and maintainability of the tests by abstracting the version vector assertions.

Comment on lines 27 to 40
// RunMigration runs migrations for v0.6.0
func RunMigration(ctx context.Context, db *mongo.Client, databaseName string, batchSize int) error {
if err := AddVersionVector(ctx, db, databaseName, batchSize); err != nil {
return err
}

if err := DropSnapshots(ctx, db, databaseName); err != nil {
return err
}

fmt.Println("v0.6.0 migration completed")

return nil
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider enhancing error handling and logging.

The RunMigration function is well-structured, but consider the following improvements:

  1. The batchSize parameter is only used in AddVersionVector. Consider adding a comment explaining its purpose or use it consistently across both operations if applicable.

  2. Replace the fmt.Println with a proper logging mechanism. This allows for better control over output in different environments.

  3. Consider adding progress tracking or more detailed logging to help monitor the migration process, especially for large databases.

Here's a suggested improvement:

import (
    "context"
    "log"
    "go.mongodb.org/mongo-driver/mongo"
)

func RunMigration(ctx context.Context, db *mongo.Client, databaseName string, batchSize int) error {
    log.Println("Starting v0.6.0 migration...")

    if err := AddVersionVector(ctx, db, databaseName, batchSize); err != nil {
        log.Printf("Error adding version vector: %v", err)
        return err
    }
    log.Println("Version vector addition completed.")

    if err := DropSnapshots(ctx, db, databaseName); err != nil {
        log.Printf("Error dropping snapshots: %v", err)
        return err
    }
    log.Println("Snapshots dropped successfully.")

    log.Println("v0.6.0 migration completed successfully")
    return nil
}

This version uses the log package for better logging control and provides more detailed progress information.

Comment on lines +116 to +119
// AfterOrEqual returns whether this change is after or equal to the given change.
func (c *Change) AfterOrEqual(other *Change) bool {
return c.id.AfterOrEqual(other.id)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

AfterOrEqual method is not utilized in the main codebase

The AfterOrEqual method is only used in pkg/document/document_test.go. Ensure that this method is employed in the main codebase where change comparisons occur to align with the version vector mechanism.

🔗 Analysis chain

Verify usage of AfterOrEqual method across the codebase

The introduction of this method is a significant change in how changes are compared. To ensure its correct usage:

  1. Review other parts of the codebase where changes are currently compared.
  2. Update those comparisons to use this new method where appropriate.
  3. Add unit tests to verify the behavior of this method with different version vector scenarios.

To help with this verification, you can run the following script to find potential places where this method should be used:

This will help identify areas that might need updating to use the new AfterOrEqual method.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find potential places where AfterOrEqual should be used

# Search for comparisons involving Change objects
rg -n "Change.*\.(Before|After|Equal)" --type go

# Search for time-based comparisons that might need to be updated
rg -n "(time\.Time|Timestamp).*\.(Before|After|Equal)" --type go

Length of output: 519


Script:

#!/bin/bash
# Description: Find usages of AfterOrEqual method in non-test Go files

rg -n "AfterOrEqual" --type go -g '!*.test.go'

Length of output: 1214

// ActorID represents the unique ID of the client. It is composed of 12 bytes.
// It caches the string representation of ActorID to reduce the number of calls
// to hex.EncodeToString. This causes a multi-routine problem, so it is
// recommended to use it in a single routine or to use it after locking.
type ActorID struct {
bytes [actorIDSize]byte

bytes actorID
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Incomplete replacement of actorIDSize type detected.

Multiple instances of [actorIDSize]byte remain in pkg/document/time/actor_id.go, indicating that the actorID type replacement is not fully implemented. Please update the following usages to ensure consistency and type safety:

  • bytes: [actorIDSize]byte{
  • type actorID [actorIDSize]byte
  • result, err := json.Marshal(&struct{ Bytes [actorIDSize]byte }{
  • temp := &(struct{ Bytes [actorIDSize]byte }{
🔗 Analysis chain

LGTM: Consistent use of the new actorID type.

The update of the bytes field in the ActorID struct to use the new actorID type is consistent with the introduction of the new type. This change maintains the same functionality while improving type safety. The public interface of the ActorID struct remains unchanged, ensuring backward compatibility.

To ensure consistency, let's verify that all usages of the bytes field have been updated:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any remaining usages of [actorIDSize]byte in the file

# Test: Search for [actorIDSize]byte. Expect: No results except in comments or constants.
rg '\[actorIDSize\]byte' pkg/document/time/actor_id.go

Length of output: 226

@@ -265,7 +264,7 @@ func TestArray(t *testing.T) {
}))

assert.Equal(t, 8, d1.GarbageLen())
assert.Equal(t, 8, d1.GarbageCollect(time.MaxTicket))
assert.Equal(t, 8, d1.GarbageCollect(helper.MaxVersionVector(d1.ActorID())))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Remaining Uses of time.MaxTicket Found

The following files still use time.MaxTicket and should be updated to use helper.MaxVersionVector for consistency:

  • api/converter/converter_test.go
  • api/converter/from_bytes.go
  • pkg/document/crdt/rga_tree_split.go
  • pkg/document/crdt/tree_test.go
  • pkg/document/crdt/tree.go
  • pkg/document/crdt/text.go
🔗 Analysis chain

Update to garbage collection mechanism

The garbage collection mechanism has been updated to use helper.MaxVersionVector(d1.ActorID()) instead of time.MaxTicket. This change aligns with the new version vector-based approach for tracking changes and should provide more accurate garbage collection in distributed scenarios.

To ensure this change is consistent across the codebase, let's check for any remaining uses of time.MaxTicket:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any remaining uses of time.MaxTicket
rg 'time\.MaxTicket' --type go

Length of output: 77


Script:

#!/bin/bash
# Comprehensive search for any remaining uses of time.MaxTicket in all Go files
rg 'time\.MaxTicket' --type go

Length of output: 652

Comment on lines +1563 to +1573
yorkie.v1.VersionVector:
additionalProperties: false
description: ""
properties:
vector:
additionalProperties: false
description: ""
title: vector
type: object
title: VersionVector
type: object
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

VersionVector schema missing details for the 'vector' property.

Please add descriptions or examples for the vector object to clarify its structure and usage for API consumers.

🔗 Analysis chain

LGTM: VersionVector schema added, consider adding more details.

The new VersionVector schema has been successfully added, which is crucial for the new functionality to detect relationships between changes.

Could you provide more information about the structure of the vector object? It might be helpful to add a description or example to clarify its usage and format for API consumers.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any additional comments or descriptions about VersionVector
rg -i "versionvector|version vector" api/docs/yorkie/v1/yorkie.openapi.yaml

Length of output: 222


Script:

#!/bin/bash
# Corrected script to search for descriptions or examples of VectorEntry
rg -i "VectorEntry" api/docs/yorkie/v1/yorkie.openapi.yaml

Length of output: 124

@@ -308,6 +308,7 @@
assert.NoError(t, err)

actorID, _ := hex.DecodeString(activateResp.Msg.ClientId)
pbVector, _ := converter.ToVersionVector(time.NewVersionVector())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Handle potential error returned by ToVersionVector

At line 311, the error returned by converter.ToVersionVector is being ignored. It's important to check and handle errors to ensure that any issues during the conversion process are caught and managed appropriately.

Apply this diff to handle the error:

-pbVector, _ := converter.ToVersionVector(time.NewVersionVector())
+pbVector, err := converter.ToVersionVector(time.NewVersionVector())
+if err != nil {
+    t.Fatalf("failed to convert VersionVector: %v", err)
+}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pbVector, _ := converter.ToVersionVector(time.NewVersionVector())
pbVector, err := converter.ToVersionVector(time.NewVersionVector())
if err != nil {
t.Fatalf("failed to convert VersionVector: %v", err)
}

Comment on lines +1326 to +1370
var attachedActorIDs []*time.ActorID
attached, err := clientInfo.IsAttached(docRefKey.DocID)
if err != nil {
return nil, err
}

if attached {
actorID, err := clientInfo.ID.ToActorID()
if err != nil {
return nil, err
}
attachedActorIDs = append(attachedActorIDs, actorID)
}

// 02. Find all version vectors of the given document from DB.
txn := d.db.Txn(true)
defer txn.Abort()
iterator, err := txn.Get(tblVersionVectors, "doc_id", docRefKey.DocID.String())
if err != nil {
return nil, fmt.Errorf("find all version vectors: %w", err)
}

// 03. Compute min version vector.
var minVersionVector time.VersionVector

// 03-1. Compute min version vector of other clients and collect attachedActorIDs.
for raw := iterator.Next(); raw != nil; raw = iterator.Next() {
vvi := raw.(*database.VersionVectorInfo)
if clientInfo.ID == vvi.ClientID {
continue
}

actorID, err := vvi.ClientID.ToActorID()
if err != nil {
return nil, err
}
attachedActorIDs = append(attachedActorIDs, actorID)

if minVersionVector == nil {
minVersionVector = vvi.VersionVector
continue
}

minVersionVector = minVersionVector.Min(vvi.VersionVector)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ensure only attached clients are included in attachedActorIDs

In UpdateAndFindMinSyncedVersionVectorAfterPushPull, when computing attachedActorIDs, the code collects all actorIDs from VersionVectorInfo entries without verifying if the clients are currently attached to the document. This may result in including detached clients, potentially leading to incorrect computation of the minVersionVector.

To ensure accuracy, consider checking the attachment status of each client before appending their actorID to attachedActorIDs.

Apply this diff to include the attachment check:

 for raw := iterator.Next(); raw != nil; raw = iterator.Next() {
     vvi := raw.(*database.VersionVectorInfo)
     if clientInfo.ID == vvi.ClientID {
         continue
     }

+    // Check if the client is attached to the document
+    client, err := d.FindClientInfoByID(vvi.ClientID)
+    if err != nil {
+        return nil, fmt.Errorf("find client by ID: %w", err)
+    }
+    attached, err := client.IsAttached(docRefKey.DocID)
+    if err != nil {
+        return nil, fmt.Errorf("client attachment status: %w", err)
+    }
+    if !attached {
+        continue
+    }

     actorID, err := vvi.ClientID.ToActorID()
     if err != nil {
         return nil, err
     }
     attachedActorIDs = append(attachedActorIDs, actorID)

     if minVersionVector == nil {
         minVersionVector = vvi.VersionVector
         continue
     }
     minVersionVector = minVersionVector.Min(vvi.VersionVector)
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
var attachedActorIDs []*time.ActorID
attached, err := clientInfo.IsAttached(docRefKey.DocID)
if err != nil {
return nil, err
}
if attached {
actorID, err := clientInfo.ID.ToActorID()
if err != nil {
return nil, err
}
attachedActorIDs = append(attachedActorIDs, actorID)
}
// 02. Find all version vectors of the given document from DB.
txn := d.db.Txn(true)
defer txn.Abort()
iterator, err := txn.Get(tblVersionVectors, "doc_id", docRefKey.DocID.String())
if err != nil {
return nil, fmt.Errorf("find all version vectors: %w", err)
}
// 03. Compute min version vector.
var minVersionVector time.VersionVector
// 03-1. Compute min version vector of other clients and collect attachedActorIDs.
for raw := iterator.Next(); raw != nil; raw = iterator.Next() {
vvi := raw.(*database.VersionVectorInfo)
if clientInfo.ID == vvi.ClientID {
continue
}
actorID, err := vvi.ClientID.ToActorID()
if err != nil {
return nil, err
}
attachedActorIDs = append(attachedActorIDs, actorID)
if minVersionVector == nil {
minVersionVector = vvi.VersionVector
continue
}
minVersionVector = minVersionVector.Min(vvi.VersionVector)
}
var attachedActorIDs []*time.ActorID
attached, err := clientInfo.IsAttached(docRefKey.DocID)
if err != nil {
return nil, err
}
if attached {
actorID, err := clientInfo.ID.ToActorID()
if err != nil {
return nil, err
}
attachedActorIDs = append(attachedActorIDs, actorID)
}
// 02. Find all version vectors of the given document from DB.
txn := d.db.Txn(true)
defer txn.Abort()
iterator, err := txn.Get(tblVersionVectors, "doc_id", docRefKey.DocID.String())
if err != nil {
return nil, fmt.Errorf("find all version vectors: %w", err)
}
// 03. Compute min version vector.
var minVersionVector time.VersionVector
// 03-1. Compute min version vector of other clients and collect attachedActorIDs.
for raw := iterator.Next(); raw != nil; raw = iterator.Next() {
vvi := raw.(*database.VersionVectorInfo)
if clientInfo.ID == vvi.ClientID {
continue
}
// Check if the client is attached to the document
client, err := d.FindClientInfoByID(vvi.ClientID)
if err != nil {
return nil, fmt.Errorf("find client by ID: %w", err)
}
attached, err := client.IsAttached(docRefKey.DocID)
if err != nil {
return nil, fmt.Errorf("client attachment status: %w", err)
}
if !attached {
continue
}
actorID, err := vvi.ClientID.ToActorID()
if err != nil {
return nil, err
}
attachedActorIDs = append(attachedActorIDs, actorID)
if minVersionVector == nil {
minVersionVector = vvi.VersionVector
continue
}
minVersionVector = minVersionVector.Min(vvi.VersionVector)
}

Comment on lines +1226 to +1238
// 02. Find all version vectors of the given document from DB.
cursor, err := c.collection(ColVersionVectors).Find(ctx, bson.M{
"project_id": docRefKey.ProjectID,
"doc_id": docRefKey.DocID,
})
if err != nil {
return nil, fmt.Errorf("find all version vectors: %w", err)
}

if err := cursor.All(ctx, &versionVectorInfos); err != nil {
return nil, fmt.Errorf("decode version vectors: %w", err)
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Potential performance considerations when fetching version vectors

Fetching all version vectors for a document may impact performance with a large number of clients. Consider optimizing this operation by limiting the data retrieved or implementing pagination if necessary.

assert.NoError(t, err)
assert.NoError(t, c1.Attach(ctx, d1))
// d1.vv = [c1:1], minvv = [c1:1], db.vv = {c1: [c1:1]}
assert.Equal(t, checkVV(d1.VersionVector(), versionOf(d1.ActorID(), 1)), true)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Incorrect argument order in assert.Equal calls

The assert.Equal function expects the arguments in the order assert.Equal(t, expected, actual). In the following lines, the actual value is placed before the expected value, which may lead to confusing test failure messages:

Lines: 76, 91, 97, 101, 108, 167, 173, 178, 190, 214, 221, 251, 267, 274, 329, 335, 339, 346, 360, 370, 391, 410, 417, 421, 437, 460, 469, 493, 500, 503, 514, 571, 587, 700, 708, 716, 741, 752, 761, 764, 772, 775, 782, 785, 793, 797, 800, 808, 820, 831, 858, 864, 869, 876, 892, 897, 900, 908, 911, 921, 926, 933, 942, 944, 950, 957, 973, 989, 997, 1003, 1007, 1014, 1020, 1023, 1045, 1051, 1058, 1060, 1068, 1071, 1078, 1085, 1090, 1098, 1104, 1122, 1127, 1132, 1135, 1141, 1156, 1163, 1171, 1177, 1181, 1188, 1194, 1200, 1206

Apply the following changes to correct the argument order:

- assert.Equal(t, checkVV(...), true)
+ assert.Equal(t, true, checkVV(...))

Alternatively, consider using assert.True for clarity:

- assert.Equal(t, checkVV(...), true)
+ assert.True(t, checkVV(...))

Also applies to: 91-91, 97-97, 101-101, 108-108, 167-167, 173-173, 178-178, 190-190, 214-214, 221-221, 251-251, 267-267, 274-274, 329-329, 335-335, 339-339, 346-346, 360-360, 370-370, 391-391, 410-410, 417-417, 421-421, 437-437, 460-460, 469-469, 493-493, 500-500, 503-503, 514-514, 571-571, 587-587, 700-700, 708-708, 716-716, 741-741, 752-752, 761-761, 764-764, 772-772, 775-775, 782-782, 785-785, 793-793, 797-797, 800-800, 808-808, 820-820, 831-831, 858-858, 864-864, 869-869, 876-876, 892-892, 897-897, 900-900, 908-908, 911-911, 921-921, 926-926, 933-933, 942-942, 944-944, 950-950, 957-957, 973-973, 989-989, 997-997, 1003-1003, 1007-1007, 1014-1014, 1020-1020, 1023-1023, 1045-1045, 1051-1051, 1058-1058, 1060-1060, 1068-1068, 1071-1071, 1078-1078, 1085-1085, 1090-1090, 1098-1098, 1104-1104, 1122-1122, 1127-1127, 1132-1132, 1135-1135, 1141-1141, 1156-1156, 1163-1163, 1171-1171, 1177-1177, 1181-1181, 1188-1188, 1194-1194, 1200-1200, 1206-1206

ctx := context.Background()
d1 := document.New(helper.TestDocKey(t))
assert.NoError(t, c1.Attach(ctx, d1))
// d2.vv =[c1:1], minvv =[c1:1], db.vv {c1: [c1:1]}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Typo in code comment

The comment references d2.vv, but it should be d1.vv based on the context:

- // d2.vv =[c1:1], minvv =[c1:1], db.vv {c1: [c1:1]}
+ // d1.vv = [c1:1], minvv = [c1:1], db.vv = {c1: [c1:1]}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// d2.vv =[c1:1], minvv =[c1:1], db.vv {c1: [c1:1]}
// d1.vv = [c1:1], minvv = [c1:1], db.vv = {c1: [c1:1]}

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (21)
migrations/v0.5.3/main.go (1)

27-40: LGTM with minor suggestions: Well-structured migration function.

The RunMigration function is well-implemented with proper error handling and a logical order of operations. Consider the following minor improvements:

  1. Replace fmt.Println with a logging statement for better integration with logging systems.
  2. Verify if batchSize should also be passed to DropSnapshots or if it's intentionally only used in AddVersionVector.

Here's a suggested modification for the logging:

-	fmt.Println("v0.5.3 migration completed")
+	log.Println("v0.5.3 migration completed")

Don't forget to import the "log" package if you make this change.

design/version-vector.md (5)

11-11: Minor grammatical improvement needed.

Consider adding a comma after "Vector" for better readability:

-Yorkie provides a special type of Version Vector called Lamport Synced Version Vector which consists of Lamport and version vector.
+Yorkie provides a special type of Version Vector called Lamport Synced Version Vector, which consists of Lamport and version vector.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~11-~11: Possible missing comma found.
Context: ...on Vector called Lamport Synced Version Vector which consists of Lamport and version v...

(AI_HYDRA_LEO_MISSING_COMMA)


24-24: Minor grammatical correction needed.

Change "are" to "is" for subject-verb agreement:

-`Lamport Timestamps` are a logical clock mechanism used in distributed systems to order events without relying on synchronized physical clocks.
+`Lamport Timestamps` is a logical clock mechanism used in distributed systems to order events without relying on synchronized physical clocks.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~24-~24: The verb “are” doesn’t seem to fit in this context, “is” is probably more formally correct.
Context: ... Lamport Timestamp Lamport Timestamps are a logical clock mechanism used in distr...

(AI_HYDRA_LEO_CPT_ARE_IS)


25-36: Add language specification to code block.

To improve formatting and potentially enable syntax highlighting, add a language specification to the code block:

-```
+```text
 Here's a simplified example of Lamport Timestamps in a distributed system with three processes (P1, P2, P3):

 1. P1 performs an event: P1's clock becomes 1.
 2. P1 sends a message to P2: P1 increments its clock to 2 and sends the message with timestamp 2.
 3. P2 receives the message: P2 updates its clock to max(0, 2) + 1 = 3.
 4. P2 sends a message to P3: P2's clock becomes 4.
 5. P3 receives the message: P3 updates its clock to max(0, 4) + 1 = 5.
 6. P3 sends a message to P1: P3's clock becomes 6.
 7. P1 receives the message: P1 updates its clock to max(2, 6) + 1 = 7.
 8. The timestamps help order events across processes even without synchronized physical clocks.
🧰 Tools
🪛 Markdownlint

25-25: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


52-57: Consider restructuring for clarity.

The information in this section is valuable, but it could be presented more clearly. Consider restructuring it into distinct subsections:

  1. Differences between Lamport Timestamps and Version Vectors
  2. Challenges in converting from Lamport-based to Version Vector-based system
  3. Proposed solution: Synchronizing Lamport Timestamps with Version Vectors

This structure would make it easier for readers to follow the progression of ideas and understand the rationale behind the proposed solution.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~53-~53: Possible missing comma found.
Context: ...ements received, without creating a new max by adding to the maximum found during t...

(AI_HYDRA_LEO_MISSING_COMMA)


53-53: Minor grammatical improvement needed.

Consider adding a comma for better readability:

-As explained above, Lamport Timestamps increase their max value with each merge, while Version Vectors do not. Version Vectors simply take the maximum value of the elements received, without creating a new max by adding to the maximum found during the merge process.
+As explained above, Lamport Timestamps increase their max value with each merge, while Version Vectors do not. Version Vectors simply take the maximum value of the elements received, without creating a new max by adding to the maximum found during the merge process.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~53-~53: Possible missing comma found.
Context: ...ements received, without creating a new max by adding to the maximum found during t...

(AI_HYDRA_LEO_MISSING_COMMA)

design/garbage-collection.md (7)

3-11: Approve content changes and suggest minor grammatical fix

The update to the target version and the addition of the new section are valuable improvements. However, there's a minor grammatical issue in the new section title.

Consider changing "Before watch this document" to "Before reading this document" for better grammatical correctness.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~8-~8: This verb may not be in the correct form. Consider using a different form for this context.
Context: ....3 --- # Garbage Collection ## Before watch this document In this document, the te...

(AI_EN_LECTOR_REPLACEMENT_VERB_FORM)


[uncategorized] ~11-~11: You might be missing the article “the” here.
Context: ...am/yorkie/pull/981) first to understand concepts and usage of version vector. ## Summa...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


30-35: Approve content and suggest minor grammatical improvements

The new section effectively explains the limitations of Lamport timestamps and introduces version vectors as a solution. The included image enhances understanding.

Consider the following minor grammatical improvements:

  1. Change "Why We Need Version Vector rather than only use Lamport" to "Why We Need Version Vectors Rather Than Only Using Lamport Timestamps"
  2. In the first sentence, add "a" before "crucial weakness"
  3. In the last sentence, consider rephrasing to "As shown in the image above, ..."
🧰 Tools
🪛 LanguageTool

[uncategorized] ~31-~31: Possible missing article found.
Context: ...mple and lightweight to use, but it has crucial weakness that Lamport doesn't guarantee...

(AI_HYDRA_LEO_MISSING_THE)


[uncategorized] ~31-~31: Possible missing comma found.
Context: ... lightweight to use, but it has crucial weakness that Lamport doesn't guarantee causalit...

(AI_HYDRA_LEO_MISSING_COMMA)


[style] ~34-~34: This phrasing can be wordy, so consider using a more descriptive and concise alternative.
Context: ...-gc-issue](media/prev-gc-issue.jpg) As you can see from the image above, min_synced_seq doesn...

(AS_YOU_CAN_SEE)


37-46: Approve content and suggest minor improvements

The updated "How It Works" section effectively explains the new process using version vectors and introduces the concept of minVersionVector.

Consider the following improvements:

  1. Add "The" at the beginning of the first sentence in line 40.
  2. In line 40, change "in response PushPull" to "in response to a PushPull request".
  3. In line 43, change "Min version vector is the vector which consists of minimum value" to "The min version vector is the vector that consists of the minimum values".
  4. In line 45, change "Conceptually, min version vector is version vector" to "Conceptually, the min version vector is a version vector".
🧰 Tools
🪛 LanguageTool

[uncategorized] ~39-~39: A determiner appears to be missing. Consider inserting it.
Context: ...d remotely and purges them completely. Server records the version vector of the last ...

(AI_EN_LECTOR_MISSING_DETERMINER)


[uncategorized] ~40-~40: You might be missing the article “the” here.
Context: ...never the client requests PushPull. And Server returns the min version vector, `minVer...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~40-~40: Possible missing preposition found.
Context: ...rsionVectorof all clients in response PushPull to the client.minVersionVector` is us...

(AI_EN_LECTOR_MISSING_PREPOSITION)


[uncategorized] ~43-~43: You might be missing the article “the” here.
Context: ... vector is the vector which consists of minimum value of every version vector stored in...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~43-~43: You might be missing the article “a” here.
Context: ...ector stored in version vector table in database. Conceptually, min version vector is v...

(AI_EN_LECTOR_MISSING_DETERMINER_A)


[uncategorized] ~45-~45: You might be missing the article “a” here.
Context: ...e. Conceptually, min version vector is version vector that is uniformly applied to all...

(AI_EN_LECTOR_MISSING_DETERMINER_A)


48-67: Approve content and suggest minor formatting improvement

The GC algorithm and min version vector calculation examples effectively illustrate the concepts.

Consider adding language specifications to the code blocks. For example:

-```
+```python
 # GC algorithm

 if (removedAt.lamport <= minVersionVector[removedAt.actor]) {
     runGC()
 }

This improvement will enhance syntax highlighting and readability.

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 Markdownlint</summary><blockquote>

48-48: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

56-56: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</blockquote></details>

</details>

---

Line range hint `69-107`: **Approve content and suggest minor grammatical improvements**

The garbage collection example states provide a clear and detailed illustration of the process. The diagrams enhance understanding significantly.



Consider the following minor grammatical improvements:
1. In State 2 (line 80), change "Client a sends change" to "Client a sends the change".
2. In State 2 (line 82), change "client b inserts" to "Client b inserts".
3. In State 3 (line 88), change "Client b pushes change" to "Client b pushes the change".
4. In State 3 (line 88), change "for every clients" to "for every client".
5. In State 5 (line 100), change "Client b pushpull" to "Client b performs a pushpull".
6. In State 6 (line 106), change "Client a pushpull" to "Client a performs a pushpull".

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 LanguageTool</summary><blockquote>

[uncategorized] ~80-~80: You might be missing the article “the” here.
Context: ...And the `Client a` sends change `3a` to server through PushPull API and receives as a ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

---

[uncategorized] ~82-~82: You might be missing the article “the” here.
Context: ... and it has not been sent (pushpull) to server yet.  #### State 3  ![garbage-collectio...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

---

[uncategorized] ~88-~88: You might be missing the article “the” here.
Context: ....png)  `Client b` pushes change `3b` to server and receives as a response that `minVer...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

---

[uncategorized] ~88-~88: You might be missing the article “the” here.
Context: ...ent applies change `4`, the contents of document are changed to `ac`. This time, all cli...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

---

[grammar] ~88-~88: The noun should probably be in the singular form.
Context: ...t's still marked as tombstone for every clients, because `minVersionVector[a] = 1 < 3` ...

(EVERY_EACH_SINGULAR)

---

[formatting] ~88-~88: If the ‘because’ clause is essential to the meaning, do not use a comma before the clause.
Context: ...ll marked as tombstone for every clients, because `minVersionVector[a] = 1 < 3`   #### St...

(COMMA_BEFORE_BECAUSE)

---

[uncategorized] ~143-~143: Possible missing preposition found.
Context: ...ort from db.versionVector. So we choose remove only client's version vector from table...

(AI_HYDRA_LEO_MISSING_TO)

---

[uncategorized] ~143-~143: You might be missing the article “the” here.
Context: ...emove only client's version vector from table, and filter minVersionVector by active ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

---

[uncategorized] ~158-~158: You might be missing the article “the” here.
Context: ....Filter([c3]) = {c1:3, c2:3} ```  After client receive this minVersionVector, it will ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

---

[uncategorized] ~158-~158: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...([c3]) = {c1:3, c2:3} ```  After client receive this minVersionVector, it will filter i...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)

---

[uncategorized] ~159-~159: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...lamport. The next pushpull request will contains filtered version vector so that eventua...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)

---

[uncategorized] ~159-~159: You might be missing the article “a” here.
Context: ...The next pushpull request will contains filtered version vector so that eventually db.ve...

(AI_EN_LECTOR_MISSING_DETERMINER_A)

---

[uncategorized] ~159-~159: You might be missing the article “the” here.
Context: ...eventually db.version vector will store attached client's version vector only. ![filter-...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

</blockquote></details>
<details>
<summary>🪛 Markdownlint</summary><blockquote>

108-108: Punctuation: '.'
Trailing punctuation in heading

(MD026, no-trailing-punctuation)

---

113-113: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

145-145: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</blockquote></details>

</details>

---

`108-160`: **Approve content and suggest minor improvements**

The section on handling detached client's Lamport timestamps provides a clear explanation of the problem and the chosen solution. The examples effectively illustrate the process.



Consider the following improvements:
1. In line 143, change "So we choose remove" to "So we choose to remove".
2. In line 158, change "After client receive" to "After the client receives".
3. In line 159, change "will contains" to "will contain".
4. Add language specifications to the code blocks for better syntax highlighting.
5. Consider breaking down the long paragraph in lines 139-144 into smaller, more digestible chunks for improved readability.

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 LanguageTool</summary><blockquote>

[uncategorized] ~143-~143: Possible missing preposition found.
Context: ...ort from db.versionVector. So we choose remove only client's version vector from table...

(AI_HYDRA_LEO_MISSING_TO)

---

[uncategorized] ~143-~143: You might be missing the article “the” here.
Context: ...emove only client's version vector from table, and filter minVersionVector by active ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

---

[uncategorized] ~158-~158: You might be missing the article “the” here.
Context: ....Filter([c3]) = {c1:3, c2:3} ```  After client receive this minVersionVector, it will ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

---

[uncategorized] ~158-~158: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...([c3]) = {c1:3, c2:3} ```  After client receive this minVersionVector, it will filter i...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)

---

[uncategorized] ~159-~159: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...lamport. The next pushpull request will contains filtered version vector so that eventua...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)

---

[uncategorized] ~159-~159: You might be missing the article “a” here.
Context: ...The next pushpull request will contains filtered version vector so that eventually db.ve...

(AI_EN_LECTOR_MISSING_DETERMINER_A)

---

[uncategorized] ~159-~159: You might be missing the article “the” here.
Context: ...eventually db.version vector will store attached client's version vector only. ![filter-...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

</blockquote></details>
<details>
<summary>🪛 Markdownlint</summary><blockquote>

108-108: Punctuation: '.'
Trailing punctuation in heading

(MD026, no-trailing-punctuation)

---

113-113: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

145-145: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</blockquote></details>

</details>

---

Line range hint `1-161`: **Approve document updates and commend comprehensive explanation**

The updated garbage collection document provides a thorough and well-structured explanation of the new mechanism using version vectors. Key improvements include:

1. Clear justification for transitioning from Lamport timestamps to version vectors.
2. Detailed explanation of the `minVersionVector` concept and its role in garbage collection.
3. Step-by-step example with diagrams illustrating the garbage collection process.
4. Explanation of how to handle detached clients' version vectors.

These updates significantly enhance the document's value for understanding the new garbage collection mechanism.



Consider adding a brief section on the performance implications of this new approach, including any potential trade-offs between the old and new methods. This would provide a more complete picture of the change's impact on the system.

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 LanguageTool</summary><blockquote>

[uncategorized] ~80-~80: You might be missing the article “the” here.
Context: ...And the `Client a` sends change `3a` to server through PushPull API and receives as a ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

---

[uncategorized] ~82-~82: You might be missing the article “the” here.
Context: ... and it has not been sent (pushpull) to server yet.  #### State 3  ![garbage-collectio...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

---

[uncategorized] ~88-~88: You might be missing the article “the” here.
Context: ....png)  `Client b` pushes change `3b` to server and receives as a response that `minVer...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

---

[uncategorized] ~88-~88: You might be missing the article “the” here.
Context: ...ent applies change `4`, the contents of document are changed to `ac`. This time, all cli...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

---

[grammar] ~88-~88: The noun should probably be in the singular form.
Context: ...t's still marked as tombstone for every clients, because `minVersionVector[a] = 1 < 3` ...

(EVERY_EACH_SINGULAR)

---

[formatting] ~88-~88: If the ‘because’ clause is essential to the meaning, do not use a comma before the clause.
Context: ...ll marked as tombstone for every clients, because `minVersionVector[a] = 1 < 3`   #### St...

(COMMA_BEFORE_BECAUSE)

---

[uncategorized] ~143-~143: Possible missing preposition found.
Context: ...ort from db.versionVector. So we choose remove only client's version vector from table...

(AI_HYDRA_LEO_MISSING_TO)

---

[uncategorized] ~143-~143: You might be missing the article “the” here.
Context: ...emove only client's version vector from table, and filter minVersionVector by active ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

---

[uncategorized] ~158-~158: You might be missing the article “the” here.
Context: ....Filter([c3]) = {c1:3, c2:3} ```  After client receive this minVersionVector, it will ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

---

[uncategorized] ~158-~158: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...([c3]) = {c1:3, c2:3} ```  After client receive this minVersionVector, it will filter i...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)

---

[uncategorized] ~159-~159: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...lamport. The next pushpull request will contains filtered version vector so that eventua...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)

---

[uncategorized] ~159-~159: You might be missing the article “a” here.
Context: ...The next pushpull request will contains filtered version vector so that eventually db.ve...

(AI_EN_LECTOR_MISSING_DETERMINER_A)

---

[uncategorized] ~159-~159: You might be missing the article “the” here.
Context: ...eventually db.version vector will store attached client's version vector only. ![filter-...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

</blockquote></details>
<details>
<summary>🪛 Markdownlint</summary><blockquote>

108-108: Punctuation: '.'
Trailing punctuation in heading

(MD026, no-trailing-punctuation)

---

113-113: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

---

145-145: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</blockquote></details>

</details>

</blockquote></details>
<details>
<summary>migrations/v0.5.3/drop-snapshots.go (2)</summary><blockquote>

`22-22`: **Remove unused import**

After removing `log.Fatal`, the import `"log"` is no longer needed.



Apply this diff to clean up the imports:

```diff
 import (
 	"context"
 	"fmt"
-	"log"

 	"go.mongodb.org/mongo-driver/bson"
 	"go.mongodb.org/mongo-driver/mongo"
 )

61-61: Use logging instead of fmt.Println

Using fmt.Println for logging messages in library code is not recommended. Consider using a logging framework or removing the print statement if it's not necessary.

Apply this diff to adjust the output:

-	fmt.Println("drop snapshots completed")
+	// Optionally, log completion message if needed
+	// log.Printf("DropSnapshots: snapshots collection dropped successfully")
migrations/v0.5.3/add-version-vector.go (6)

58-60: Correct grammatical error in error message

The error message in line 59 should correctly reflect the possibility of multiple actors.

Apply this diff to fix the grammatical error:

-       return fmt.Errorf("found %d actor in version vector", len(actors))
+       return fmt.Errorf("found %d actors in version vector", len(actors))

79-113: Handle empty batch scenarios in processMigrationBatch

In the processMigrationBatch function, if infos is empty, the function proceeds without performing any operations. While the current check if len(operations) > 0 prevents unnecessary writes, it's good practice to return early if there's nothing to process.

Consider adding a check at the beginning of the function:

func processMigrationBatch(
    ctx context.Context,
    collection *mongo.Collection,
    infos []database.ChangeInfo,
) error {
+   if len(infos) == 0 {
+       return nil
+   }
    var operations []mongo.WriteModel

130-155: Close cursor after iteration to release resources

Ensure that the cursor is properly closed after use to prevent resource leaks.

Add defer statements to close the cursor:

cursor, err := collection.Find(ctx, bson.M{})
if err != nil {
    return err
}
+defer cursor.Close(ctx)

157-161: Process remaining documents after loop ends

Make sure to update the processedCount after processing the final batch to accurately report progress if necessary.

if len(infos) > 0 {
    if err := processMigrationBatch(ctx, collection, infos); err != nil {
        return fmt.Errorf("process final batch: %w", err)
    }
+   processedCount += float64(len(infos))
}

163-165: Handle validation errors appropriately

Currently, if the validateAddVersionVector function returns an error, the migration stops. Consider logging the error with more context or providing a way to retry or skip the validation step if necessary.

You might want to wrap the error with additional context:

if err = validateAddVersionVector(ctx, db, databaseName); err != nil {
-   return err
+   return fmt.Errorf("validation failed after migration: %w", err)
}

This provides clearer information about where the failure occurred.


167-168: Standardize completion message formatting

For consistency, consider adjusting the completion message to match the format of progress updates.

- fmt.Println("add version vector completed")
+ fmt.Printf("%s.changes version vector migration 100%% completed\n", databaseName)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between bb9c7bb and 5c98c41.

📒 Files selected for processing (6)
  • cmd/yorkie/migration.go (1 hunks)
  • design/garbage-collection.md (3 hunks)
  • design/version-vector.md (1 hunks)
  • migrations/v0.5.3/add-version-vector.go (1 hunks)
  • migrations/v0.5.3/drop-snapshots.go (1 hunks)
  • migrations/v0.5.3/main.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • cmd/yorkie/migration.go
🧰 Additional context used
🪛 LanguageTool
design/garbage-collection.md

[uncategorized] ~8-~8: This verb may not be in the correct form. Consider using a different form for this context.
Context: ....3 --- # Garbage Collection ## Before watch this document In this document, the te...

(AI_EN_LECTOR_REPLACEMENT_VERB_FORM)


[uncategorized] ~11-~11: You might be missing the article “the” here.
Context: ...am/yorkie/pull/981) first to understand concepts and usage of version vector. ## Summa...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~31-~31: Possible missing article found.
Context: ...mple and lightweight to use, but it has crucial weakness that Lamport doesn't guarantee...

(AI_HYDRA_LEO_MISSING_THE)


[uncategorized] ~31-~31: Possible missing comma found.
Context: ... lightweight to use, but it has crucial weakness that Lamport doesn't guarantee causalit...

(AI_HYDRA_LEO_MISSING_COMMA)


[style] ~34-~34: This phrasing can be wordy, so consider using a more descriptive and concise alternative.
Context: ...-gc-issue](media/prev-gc-issue.jpg) As you can see from the image above, min_synced_seq doesn...

(AS_YOU_CAN_SEE)


[uncategorized] ~39-~39: A determiner appears to be missing. Consider inserting it.
Context: ...d remotely and purges them completely. Server records the version vector of the last ...

(AI_EN_LECTOR_MISSING_DETERMINER)


[uncategorized] ~40-~40: You might be missing the article “the” here.
Context: ...never the client requests PushPull. And Server returns the min version vector, `minVer...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~40-~40: Possible missing preposition found.
Context: ...rsionVectorof all clients in response PushPull to the client.minVersionVector` is us...

(AI_EN_LECTOR_MISSING_PREPOSITION)


[uncategorized] ~43-~43: You might be missing the article “the” here.
Context: ... vector is the vector which consists of minimum value of every version vector stored in...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~43-~43: You might be missing the article “a” here.
Context: ...ector stored in version vector table in database. Conceptually, min version vector is v...

(AI_EN_LECTOR_MISSING_DETERMINER_A)


[uncategorized] ~45-~45: You might be missing the article “a” here.
Context: ...e. Conceptually, min version vector is version vector that is uniformly applied to all...

(AI_EN_LECTOR_MISSING_DETERMINER_A)


[uncategorized] ~80-~80: You might be missing the article “the” here.
Context: ...And the Client a sends change 3a to server through PushPull API and receives as a ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~82-~82: You might be missing the article “the” here.
Context: ... and it has not been sent (pushpull) to server yet. #### State 3 ![garbage-collectio...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~88-~88: You might be missing the article “the” here.
Context: ....png) Client b pushes change 3b to server and receives as a response that `minVer...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~88-~88: You might be missing the article “the” here.
Context: ...ent applies change 4, the contents of document are changed to ac. This time, all cli...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[grammar] ~88-~88: The noun should probably be in the singular form.
Context: ...t's still marked as tombstone for every clients, because minVersionVector[a] = 1 < 3 ...

(EVERY_EACH_SINGULAR)


[formatting] ~88-~88: If the ‘because’ clause is essential to the meaning, do not use a comma before the clause.
Context: ...ll marked as tombstone for every clients, because minVersionVector[a] = 1 < 3 #### St...

(COMMA_BEFORE_BECAUSE)


[uncategorized] ~143-~143: Possible missing preposition found.
Context: ...ort from db.versionVector. So we choose remove only client's version vector from table...

(AI_HYDRA_LEO_MISSING_TO)


[uncategorized] ~143-~143: You might be missing the article “the” here.
Context: ...emove only client's version vector from table, and filter minVersionVector by active ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~158-~158: You might be missing the article “the” here.
Context: ....Filter([c3]) = {c1:3, c2:3} ``` After client receive this minVersionVector, it will ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)


[uncategorized] ~158-~158: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...([c3]) = {c1:3, c2:3} ``` After client receive this minVersionVector, it will filter i...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)


[uncategorized] ~159-~159: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...lamport. The next pushpull request will contains filtered version vector so that eventua...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)


[uncategorized] ~159-~159: You might be missing the article “a” here.
Context: ...The next pushpull request will contains filtered version vector so that eventually db.ve...

(AI_EN_LECTOR_MISSING_DETERMINER_A)


[uncategorized] ~159-~159: You might be missing the article “the” here.
Context: ...eventually db.version vector will store attached client's version vector only. ![filter-...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

design/version-vector.md

[uncategorized] ~11-~11: Possible missing comma found.
Context: ...on Vector called Lamport Synced Version Vector which consists of Lamport and version v...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~23-~23: Possible missing comma found.
Context: ...ructure. ## Proposal Details ### What is Lamport Timestamp Lamport Timestamps ...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~24-~24: The verb “are” doesn’t seem to fit in this context, “is” is probably more formally correct.
Context: ... Lamport Timestamp Lamport Timestamps are a logical clock mechanism used in distr...

(AI_HYDRA_LEO_CPT_ARE_IS)


[uncategorized] ~53-~53: Possible missing comma found.
Context: ...ements received, without creating a new max by adding to the maximum found during t...

(AI_HYDRA_LEO_MISSING_COMMA)

🪛 Markdownlint
design/garbage-collection.md

69-69: Punctuation: ':'
Trailing punctuation in heading

(MD026, no-trailing-punctuation)


48-48: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


56-56: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


108-108: Punctuation: '.'
Trailing punctuation in heading

(MD026, no-trailing-punctuation)


113-113: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)


145-145: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

design/version-vector.md

25-25: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (6)
migrations/v0.5.3/main.go (3)

1-18: LGTM: Proper file structure and package declaration.

The file structure is well-organized with a correct license header, concise package documentation, and an appropriate package name for version-specific migrations.


20-25: LGTM: Appropriate and well-organized imports.

The imports are correctly organized and include only the necessary packages for the file's functionality.


1-40: Overall: Well-implemented migration file with minor suggestions for improvement.

This migration file for version 0.5.3 is well-structured, properly documented, and follows good coding practices. The RunMigration function effectively orchestrates the migration steps with appropriate error handling. The minor suggestions provided earlier would further enhance its quality and maintainability.

design/version-vector.md (3)

38-50: Clear and informative explanation.

The explanation of Yorkie's current logical clock system, the motivation for change, and the introduction of Version Vectors is well-structured and informative. The inclusion of the image (version-vector-example.jpg) enhances understanding.


59-66: Excellent explanation of Lamport Synced Version Vector.

The introduction and explanation of the Lamport Synced Version Vector concept are clear and well-structured. The inclusion of the diagram (lamport-synced-version-vector-and-version-vector.jpg) significantly enhances understanding by providing a visual comparison between traditional Version Vectors and the new Lamport Synced Version Vector.


1-66: Excellent design document with minor improvements suggested.

This design document provides a comprehensive and well-structured explanation of Version Vectors and the new Lamport Synced Version Vector concept. The content is informative, and the included diagrams significantly enhance understanding. The document effectively communicates the rationale behind the proposed changes to Yorkie's logical clock system.

A few minor grammatical and formatting improvements have been suggested to further enhance readability. Consider implementing these changes and potentially restructuring the "Inconsistency of Lamport and Version Vector" section for improved clarity.

Overall, this is a high-quality design document that effectively introduces and explains the proposed changes to Yorkie's logical clock system.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~11-~11: Possible missing comma found.
Context: ...on Vector called Lamport Synced Version Vector which consists of Lamport and version v...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~23-~23: Possible missing comma found.
Context: ...ructure. ## Proposal Details ### What is Lamport Timestamp Lamport Timestamps ...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~24-~24: The verb “are” doesn’t seem to fit in this context, “is” is probably more formally correct.
Context: ... Lamport Timestamp Lamport Timestamps are a logical clock mechanism used in distr...

(AI_HYDRA_LEO_CPT_ARE_IS)


[uncategorized] ~53-~53: Possible missing comma found.
Context: ...ements received, without creating a new max by adding to the maximum found during t...

(AI_HYDRA_LEO_MISSING_COMMA)

🪛 Markdownlint

25-25: null
Fenced code blocks should have a language specified

(MD040, fenced-code-language)

func validateDropSnapshot(ctx context.Context, db *mongo.Client, databaseName string) error {
collections, err := db.Database(databaseName).ListCollectionNames(ctx, bson.D{})
if err != nil {
log.Fatal(err)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Avoid using log.Fatal; return the error instead

Using log.Fatal(err) will terminate the entire program, which is not suitable in this context. It's better to return the error to allow the calling function to handle it appropriately.

Apply this diff to fix the issue:

-		log.Fatal(err)
+		return fmt.Errorf("failed to list collection names: %w", err)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
log.Fatal(err)
return fmt.Errorf("failed to list collection names: %w", err)

Comment on lines +94 to +103
versionVector.Set(actorID, info.Lamport)
operation := mongo.NewUpdateOneModel().SetFilter(bson.M{
"project_id": info.ProjectID,
"doc_id": info.DocID,
"server_seq": info.ServerSeq,
}).SetUpdate(bson.M{"$set": bson.M{
"version_vector": versionVector,
}})
operations = append(operations, operation)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Optimize bulk write operation by preallocating slice

To improve performance, consider preallocating the operations slice with an initial capacity equal to len(infos). This avoids unnecessary reallocations during the append operations.

Modify the code as follows:

-   var operations []mongo.WriteModel
+   operations := make([]mongo.WriteModel, 0, len(infos))

    for _, info := range infos {
        versionVector := time.NewVersionVector()
        actorID, err := info.ActorID.ToActorID()
        if err != nil {
            return err
        }

        versionVector.Set(actorID, info.Lamport)
        operation := mongo.NewUpdateOneModel().SetFilter(bson.M{
            "project_id": info.ProjectID,
            "doc_id":     info.DocID,
            "server_seq": info.ServerSeq,
        }).SetUpdate(bson.M{"$set": bson.M{
            "version_vector": versionVector,
        }})
        operations = append(operations, operation)
    }

Committable suggestion was skipped due to low confidence.

Comment on lines +115 to +170
// AddVersionVector migrates the changes collection to add version vector.
func AddVersionVector(ctx context.Context, db *mongo.Client, databaseName string, batchSize int) error {
collection := db.Database(databaseName).Collection("changes")
totalCount, err := collection.CountDocuments(ctx, bson.M{})
if err != nil {
return err
}
batchCount := 1
prevPercentage := 0

cursor, err := collection.Find(ctx, bson.M{})
if err != nil {
return err
}

var infos []database.ChangeInfo

for cursor.Next(ctx) {
var info database.ChangeInfo
if err := cursor.Decode(&info); err != nil {
return fmt.Errorf("decode change info: %w", err)
}

infos = append(infos, info)

if len(infos) >= batchSize {
if err := processMigrationBatch(ctx, collection, infos); err != nil {
return err
}

percentage := int(float64(batchSize*batchCount) / float64(totalCount) * 100)

if percentage != prevPercentage {
fmt.Printf("%s.changes version vector migration %d%% completed \n", databaseName, percentage)
prevPercentage = percentage
}

infos = infos[:0]
batchCount++
}
}

if len(infos) > 0 {
if err := processMigrationBatch(ctx, collection, infos); err != nil {
return fmt.Errorf("process final batch: %w", err)
}
}

if err = validateAddVersionVector(ctx, db, databaseName); err != nil {
return err
}

fmt.Println("add version vector completed")

return nil
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ensure accurate progress calculation in AddVersionVector

In the AddVersionVector function, the percentage calculation may not accurately reflect progress, especially when the last batch is not full. This could result in reporting more than 100% completion.

Consider updating the percentage calculation to use the actual count of processed documents:

    totalCount, err := collection.CountDocuments(ctx, bson.M{})
    if err != nil {
        return err
    }
    var processedCount float64
    prevPercentage := 0

    // Inside the loop after processing a batch
    if len(infos) >= batchSize {
        if err := processMigrationBatch(ctx, collection, infos); err != nil {
            return err
        }

+       processedCount += float64(len(infos))
        percentage := int(processedCount / float64(totalCount) * 100)

        if percentage != prevPercentage {
            fmt.Printf("%s.changes version vector migration %d%% completed\n", databaseName, percentage)
            prevPercentage = percentage
        }

        infos = infos[:0]
        // batchCount++
    }

Also, remove the unused batchCount variable since processedCount now tracks progress.

Committable suggestion was skipped due to low confidence.

Comment on lines +30 to +77
// validateAddVersionVector validates the changes collection to add version vector.
func validateAddVersionVector(ctx context.Context, db *mongo.Client, databaseName string) error {
collection := db.Database(databaseName).Collection("changes")
totalCount, err := collection.CountDocuments(ctx, bson.M{})
if err != nil {
return err
}

prevPercentage := 0
count := float64(0)

cursor, err := collection.Find(ctx, bson.M{})
if err != nil {
return err
}

for cursor.Next(ctx) {
var info database.ChangeInfo
if err := cursor.Decode(&info); err != nil {
return fmt.Errorf("decode change info: %w", err)
}

versionVector := info.VersionVector
actors, err := versionVector.Keys()
if err != nil {
return err
}

if len(actors) > 1 {
return fmt.Errorf("found %d actor in version vector", len(actors))
}

if versionVector.VersionOf(actors[0]) != info.Lamport {
return fmt.Errorf("wrong lamport in version vector")
}

percentage := int(count / float64(totalCount) * 100)

if percentage != prevPercentage {
fmt.Printf("%s.changes validate version vector %d%% completed.\n", databaseName, percentage)
prevPercentage = percentage
}

count++
}

return nil
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ensure consistent percentage calculation and progress reporting

In the validateAddVersionVector function, there might be an inconsistency in the percentage calculation for progress reporting. The count variable is incremented after calculating the percentage, which could lead to off-by-one errors in reporting progress.

Consider updating the code to increment count before calculating the percentage:

+       count++
        percentage := int(count / float64(totalCount) * 100)

        if percentage != prevPercentage {
            fmt.Printf("%s.changes validate version vector %d%% completed.\n", databaseName, percentage)
            prevPercentage = percentage
        }

-       count++
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// validateAddVersionVector validates the changes collection to add version vector.
func validateAddVersionVector(ctx context.Context, db *mongo.Client, databaseName string) error {
collection := db.Database(databaseName).Collection("changes")
totalCount, err := collection.CountDocuments(ctx, bson.M{})
if err != nil {
return err
}
prevPercentage := 0
count := float64(0)
cursor, err := collection.Find(ctx, bson.M{})
if err != nil {
return err
}
for cursor.Next(ctx) {
var info database.ChangeInfo
if err := cursor.Decode(&info); err != nil {
return fmt.Errorf("decode change info: %w", err)
}
versionVector := info.VersionVector
actors, err := versionVector.Keys()
if err != nil {
return err
}
if len(actors) > 1 {
return fmt.Errorf("found %d actor in version vector", len(actors))
}
if versionVector.VersionOf(actors[0]) != info.Lamport {
return fmt.Errorf("wrong lamport in version vector")
}
percentage := int(count / float64(totalCount) * 100)
if percentage != prevPercentage {
fmt.Printf("%s.changes validate version vector %d%% completed.\n", databaseName, percentage)
prevPercentage = percentage
}
count++
}
return nil
}
// validateAddVersionVector validates the changes collection to add version vector.
func validateAddVersionVector(ctx context.Context, db *mongo.Client, databaseName string) error {
collection := db.Database(databaseName).Collection("changes")
totalCount, err := collection.CountDocuments(ctx, bson.M{})
if err != nil {
return err
}
prevPercentage := 0
count := float64(0)
cursor, err := collection.Find(ctx, bson.M{})
if err != nil {
return err
}
for cursor.Next(ctx) {
var info database.ChangeInfo
if err := cursor.Decode(&info); err != nil {
return fmt.Errorf("decode change info: %w", err)
}
versionVector := info.VersionVector
actors, err := versionVector.Keys()
if err != nil {
return err
}
if len(actors) > 1 {
return fmt.Errorf("found %d actor in version vector", len(actors))
}
if versionVector.VersionOf(actors[0]) != info.Lamport {
return fmt.Errorf("wrong lamport in version vector")
}
count++
percentage := int(count / float64(totalCount) * 100)
if percentage != prevPercentage {
fmt.Printf("%s.changes validate version vector %d%% completed.\n", databaseName, percentage)
prevPercentage = percentage
}
}
return nil
}

@hackerwins hackerwins closed this Oct 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants