
Point in time snapshots #1471

Open · wants to merge 4 commits into master

Conversation

@gauravsarma1992 (Contributor) commented Feb 9, 2025

This PR implements snapshotting a single store to disk using a copy-on-write approach.

Check the README.md in the PR for more details on the design and implementation.

The PR doesn't implement multi-shard snapshots yet; that will be covered in a subsequent PR.

Test results

Snapshot time for 1 million keys:

  • With no reads and no writes: 190 ms
  • With reads and no writes: 230 ms
  • With reads and writes during the snapshot: 800 ms

The reads and writes mentioned above are performed on the actual store object during the snapshotting process.

A few improvements that can be made:

  • Use protobuf instead of gob
  • Use buffered writes instead of chunk writes
  • Improve the locking mechanism of the SnapshotMap

StartedAt: time.Now(),

store: store,
exitCh: make(chan bool, 5),
Contributor:

For multiple store instances, there would be a snapshot run triggered at one point in time; in that case, why would we need multiple exitCh channels?
From my understanding, the flow is as follows:

  • Each Store has one snapshot
  • Each snapshot has one flusher
  • Each flusher sends an exit signal after the flush completes for all objects in that store

Also, my assumption is that each store will create its own snapshotter and it won't be shared among stores; is that correct?
Please correct me if I'm wrong.

Contributor Author:

Yes, the size of the exit channel can be 1 as well.

> Also, my assumption is that each store will create its own snapshotter and it won't be shared among stores; is that correct?

Yes, this is right. However, this PR will not cover the SnapshotManager that will spawn a snapshotter process for each of the stores.

ID uint64

store SnapshotStore
totalStoreShot uint8
Contributor:

What are we using this variable for?

Contributor Author:

I think this isn't required now; I'll remove it.

Comment on lines +60 to +70
for {
	select {
	case <-pit.ctx.Done():
		pit.Close()
		return
	case <-pit.exitCh:
		pit.Close()
		return
	}
}
return
Contributor:

nit: Shall we update this to the following?

func (pit *PointInTimeSnapshot) processStoreUpdates() error {
	select {
	case <-pit.ctx.Done():
	case <-pit.exitCh:
	}
	pit.Close()
	return nil
}

@gauravsarma1992 gauravsarma1992 marked this pull request as ready for review February 13, 2025 05:35