[Remote Store + Segrep] Support parallel segment file upload & download in replication path #8187

Open
ashking94 opened this issue Jun 21, 2023 · 4 comments
Labels
enhancement (Enhancement or improvement to existing feature or request)
Storage:Durability (Issues and PRs related to the durability framework)
Storage (Issues and PRs relating to data and metadata storage)

Comments

@ashking94
Member

Is your feature request related to a problem? Please describe.
Currently, segment upload from the primary and segment download on replicas happen sequentially. Only after the primary has uploaded all segment files, followed by the segment metadata file, does it notify the replicas to start the segment sync. Since segment files are already uploaded in parallel, we could initiate the download on replicas as soon as each file finishes uploading. This would reduce the overall replication time.
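
To illustrate the potential saving, here is a rough timing model, not a measurement. It assumes unlimited parallelism for uploads and downloads, assumes a replica can only finish once the metadata file has been uploaded, and ignores the time to download the metadata file. The function names and all durations are hypothetical.

```python
def sequential_finish(upload_s, download_s, metadata_s):
    """Replicas start downloading only after every upload and the metadata upload."""
    return max(upload_s) + metadata_s + max(download_s)


def pipelined_finish(upload_s, download_s, metadata_s):
    """Each download starts as soon as the matching upload is done."""
    files_ready = max(u + d for u, d in zip(upload_s, download_s))
    metadata_ready = max(upload_s) + metadata_s
    return max(files_ready, metadata_ready)


# Hypothetical per-file durations in seconds.
uploads = [4.0, 1.0, 2.0]
downloads = [1.0, 3.0, 2.0]
metadata = 0.5

print(sequential_finish(uploads, downloads, metadata))  # 7.5
print(pipelined_finish(uploads, downloads, metadata))   # 5.0
```

Under this model the gain is largest when the slowest upload and the slowest download belong to different files.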

Describe the solution you'd like
The segrep publisher could publish to the replicas as each individual segment file finishes uploading, rather than once at the end. We can decide the exact approach based on discussion.
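
A minimal sketch of the idea, written in Python for brevity rather than as OpenSearch code. It assumes plain dicts stand in for the primary's files, the remote store, and the replica's local store. `replicate_pipelined` and its parameters are hypothetical names, and the file names are only illustrative.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def replicate_pipelined(segment_files, remote_store, replica_store, workers=4):
    """Upload files in parallel and start each download once its upload is done.

    All arguments are hypothetical stand-ins: dicts mapping file name -> bytes.
    Returns the order in which files were published to the replica.
    """

    def upload(name):
        remote_store[name] = segment_files[name]  # primary -> remote store
        return name

    def download(name):
        replica_store[name] = remote_store[name]  # remote store -> replica

    published = []
    with ThreadPoolExecutor(max_workers=workers) as uploads, \
            ThreadPoolExecutor(max_workers=workers) as downloads:
        upload_futures = [uploads.submit(upload, name) for name in segment_files]
        download_futures = []
        # Publish per file as uploads complete, instead of once after all of them.
        for done in as_completed(upload_futures):
            name = done.result()
            published.append(name)
            download_futures.append(downloads.submit(download, name))
        for future in download_futures:
            future.result()  # surface any download error
    return published


files = {"_0.cfs": b"a", "_0.si": b"b", "_1.cfs": b"c"}
remote, replica = {}, {}
order = replicate_pipelined(files, remote, replica)
print(sorted(order), replica == files)
```

The sketch leaves out the segment metadata file. A real implementation would presumably still need replicas to wait for it before treating the downloaded segments as a complete set.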

Describe alternatives you've considered
NA

Additional context
NA

@ashking94 added the enhancement and Storage:Durability labels Jun 21, 2023
@ashking94
Member Author

ashking94 commented Jun 21, 2023

@mch2 @sachinpkale @Bukhtawar @gbbafna @ankitkala What do you think?

@ashking94 changed the title from "[Remote Store + Segrep] Support parallel segment file upload & download" to "[Remote Store + Segrep] Support parallel segment file upload & download in replication path" Jun 21, 2023
@anasalkouz
Member

@kotwanikunal does this overlap with multipart download?

@kotwanikunal
Member

> @kotwanikunal does this overlap with multipart download?

This seems more like an enhancement, if I understand correctly. The approach defined here would start each segment download on the replicas as soon as that segment's upload is complete on the primary, instead of waiting for all the segments to be uploaded. Is that the correct understanding, @ashking94?

@mch2
Member

mch2 commented Jul 15, 2023

I think this is definitely worth exploring, but it isn't a trivial refactor. We could publish checkpoints with the expected files up front and then poll the store on an interval with backoff. Alternatively, we could publish as files become available.
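
A rough sketch of the first option (expected files known up front, then polling with exponential backoff), in Python for brevity. Everything here is an assumption for illustration: `poll_until_available`, the `exists` and `on_available` callbacks, and the delay values are hypothetical and do not correspond to any OpenSearch API.

```python
import time


def poll_until_available(expected_files, exists, on_available,
                         initial_delay=0.05, max_delay=1.0, timeout=5.0,
                         sleep=time.sleep):
    """Poll a store for the expected files, backing off between attempts.

    exists(name) -> bool reports whether a file is in the store (hypothetical).
    on_available(name) is called once per file, e.g. to start its download.
    """
    pending = set(expected_files)
    delay = initial_delay
    waited = 0.0
    while pending:
        ready = sorted(name for name in pending if exists(name))
        for name in ready:
            on_available(name)
        pending -= set(ready)
        if not pending:
            break
        if waited >= timeout:
            raise TimeoutError(f"still missing: {sorted(pending)}")
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay)  # exponential backoff, capped


# Usage with an in-memory set standing in for the remote store.
store = {"_0.cfs", "_0.si"}
fetched = []
poll_until_available(["_0.cfs", "_0.si"], store.__contains__, fetched.append)
print(fetched)  # ['_0.cfs', '_0.si']
```

The second option, publishing as files become available, avoids polling but means the primary sends more messages per refresh.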

@Bukhtawar added the Storage label Jul 27, 2023