Send "want hints" to peers based on sync download setting #686

Closed
EvanHahn opened this issue May 27, 2024 · 2 comments · Fixed by #944

Comments

@EvanHahn
Contributor

EvanHahn commented May 27, 2024

In order to know whether a connected peer has finished syncing, a device needs to know what the peer wants to download. Currently we assume peers want everything, i.e. sync is not considered complete until a connected peer has everything. However, with selective sync of media, a connected peer might consider sync to be "complete" from its side when it has downloaded only some of the media.

Sending bitfields as want-hints is one option, but it has quite a high overhead, and it requires the connected peer to sync blobIndex first (so that it can calculate what it wants). It also requires the connected peer to be able to read the blobIndex, which is incompatible with a "blind" server (e.g. encrypted blobIndex). Since a "blind" server would not be able to do selective sync anyway, it is OK that a blind server cannot send a want hint.

The other option is to send "path filters" for the hyperdrives, i.e. patterns that match the paths to sync. Because attachment types and variants have a defined file layout, this should be all that is needed.

A device receiving a want hint can then read its own hyperdrive and determine which blocks the connected peer wants.
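
A minimal sketch of that last step, assuming the receiving device can list its own drive entries as a path plus a contiguous block range. The `DriveEntry` shape and the `wantedBlocks` helper are hypothetical names for illustration, not the Hyperdrive API, and the filter here is a plain list of folder prefixes:

```typescript
// Hypothetical shape of a drive entry: a path plus the contiguous range of
// blocks that hold the file's data. This is NOT the real Hyperdrive API.
interface DriveEntry {
  path: string;
  blockOffset: number;
  blockLength: number;
}

// True when `path` lies inside one of the wanted folders.
function matchesFilter(path: string, folders: string[]): boolean {
  return folders.some((folder) => {
    const prefix = folder.endsWith("/") ? folder : folder + "/";
    return path.startsWith(prefix);
  });
}

// Collect the block indexes the connected peer wants, given its path filter.
function wantedBlocks(entries: DriveEntry[], folders: string[]): Set<number> {
  const wanted = new Set<number>();
  for (const entry of entries) {
    if (!matchesFilter(entry.path, folders)) continue;
    for (let i = 0; i < entry.blockLength; i++) {
      wanted.add(entry.blockOffset + i);
    }
  }
  return wanted;
}
```

Sync with that peer could then be treated as complete once the peer has every block in the returned set, rather than every block in the core.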

@EvanHahn EvanHahn self-assigned this Aug 26, 2024
@EvanHahn EvanHahn removed their assignment Oct 10, 2024
@gmaclennan gmaclennan changed the title from "Update "want hints" to avoid asking for data if not an archive device" to "Send "want hints" to peers based on sync download setting" Oct 17, 2024
@gmaclennan gmaclennan self-assigned this Oct 17, 2024
@gmaclennan
Member

I wrote an initial version of a "want hint" protobuf message as:

message WantExtension {
  // Not using _unspecified for default enum, because this enum is always used
  // as a repeated field, so we don't need to handle the case where it is
  // not specified.
  enum BlobVariant {
    original = 0;
    thumbnail = 1;
    preview = 2;
  }
  repeated BlobVariant photo = 1;
  repeated BlobVariant audio = 2;
  repeated BlobVariant video = 3;
}

However, this doesn't have great forwards compatibility. A want hint is meant to indicate "this is what I intend to download from you". A peer on a newer version of CoMapeo might download media types and media variants that are unknown to the older peer, which would result in the older peer thinking sync was complete before it actually is complete.
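
The failure mode can be sketched as follows. This assumes the older peer simply ignores enum numbers it does not recognise; the variant number `3` and the helper `resolveVariants` are hypothetical and only illustrate the problem:

```typescript
// Variant numbers known to a hypothetical OLDER peer (from the enum above).
const KNOWN_VARIANTS = new Map<number, string>([
  [0, "original"],
  [1, "thumbnail"],
  [2, "preview"],
]);

// Assumption: the older peer drops any variant number it does not know.
function resolveVariants(wanted: number[]): string[] {
  const names: string[] = [];
  for (const n of wanted) {
    const name = KNOWN_VARIANTS.get(n);
    if (name !== undefined) names.push(name);
  }
  return names;
}

// A newer peer asks for thumbnails plus a hypothetical new variant 3.
const understood = resolveVariants([1, 3]);
// The older peer only sees "thumbnail", so it would consider sync complete
// once thumbnails are sent, while the newer peer is still downloading 3.
```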

The alternative I can think of is to pass a path filter that matches hyperdrive paths as the want hint. This ensures that a newer client can always communicate to an older client what it is going to download.

I'm not sure what the best way to define a path filter is:

  1. A single glob pattern, using something like minimatch or micromatch, e.g. pathFilter: '/(photo|video)/(original|preview)/**'
  2. A list of folder names, e.g. folders: ['/photo/original', '/video/original', '/photo/preview'], then const matched = folders.some(p => entryPath.startsWith(p.replace(/\/$/, '') + '/'))
  3. A list of paths with wildcards, e.g. paths: ['/photo/*', '/video/original/*'], then some kind of fast & safe matcher.
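
For option 3, one possible "fast & safe" matcher avoids regular expressions entirely. This is only a sketch under one assumption about semantics: `*` is the only special character and matches any run of characters, including `/`. The names `matchWildcard` and `matchesAny` are hypothetical:

```typescript
// Match `path` against `pattern`, where `*` matches any run of characters
// (including "/"). Uses only string searches, so there is no regex
// backtracking to worry about with untrusted patterns.
function matchWildcard(pattern: string, path: string): boolean {
  const parts = pattern.split("*");
  if (parts.length === 1) return pattern === path; // no wildcard: exact match

  const first = parts[0];
  const last = parts[parts.length - 1];
  if (!path.startsWith(first)) return false;

  // Each middle literal must appear, in order, after the previous one.
  let pos = first.length;
  for (let i = 1; i < parts.length - 1; i++) {
    const idx = path.indexOf(parts[i], pos);
    if (idx === -1) return false;
    pos = idx + parts[i].length;
  }

  // The final literal must be a suffix that does not overlap earlier matches.
  return path.length - last.length >= pos && path.endsWith(last);
}

// True when any pattern in the list matches the path.
function matchesAny(patterns: string[], path: string): boolean {
  return patterns.some((pattern) => matchWildcard(pattern, path));
}
```

With `paths: ['/photo/*', '/video/original/*']`, this would match `/photo/preview/a.jpg` and `/video/original/b.mp4` but not `/video/preview/b.mp4`.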

None of these solutions is great, because I had intended blob paths to be an implementation detail, with only blob IDs used outside the BlobStore. That said, this only leaks the implementation detail into extension messages, and keeps it hidden from the "public" API.

@gmaclennan
Member

Following up with what we discussed in a huddle:

message DownloadIntentExtension {
  message DownloadIntent {
    repeated string variants = 1;
  }
  map<string, DownloadIntent> downloadIntents = 1;
}

The goal is something structured (a map of blob type to blob variants) that can handle future new blob types and new variants, and that could later have other criteria added to the DownloadIntent message, e.g. size, or some other metadata.
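
A rough sketch of how a receiving peer might consume this message once decoded. The TypeScript shapes below mirror the protobuf above, but the decoded representation and the `peerWantsBlob` helper are assumptions for illustration, and the `pointcloud` type and `lowres` variant are made up to stand in for future additions:

```typescript
// Assumed decoded form of DownloadIntentExtension: blob type -> intent.
interface DownloadIntent {
  variants: string[];
}
type DownloadIntents = Map<string, DownloadIntent>;

// True when the peer has said it intends to download this type + variant.
// Types and variants are plain strings, so values this peer has never
// heard of need no schema change to be carried and compared.
function peerWantsBlob(
  intents: DownloadIntents,
  type: string,
  variant: string
): boolean {
  const intent = intents.get(type);
  return intent !== undefined && intent.variants.includes(variant);
}

// Example intents from a (hypothetical) newer peer.
const exampleIntents: DownloadIntents = new Map([
  ["photo", { variants: ["preview", "thumbnail"] }],
  ["pointcloud", { variants: ["lowres"] }], // made-up future blob type
]);
```

Because the check is a string comparison rather than an enum lookup, an older peer that happens to hold a blob of a type or variant it has no special knowledge of can still tell whether the connected peer wants it.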
