wallet/db_sqlite3.c: Support direct replication of SQLITE3 backends. #4890
Conversation
(force-pushed from 1267d3e to b4198cc)
Having this built in would be really nice.
This brings up an issue that we had with the backup plugin itself: how to enforce that … The backup plugin verifies that the …
The Backup API of SQLITE3 will always overwrite the backup copy --- in the case of this PR, we always start a full backup at … The upshot is that in your example, …
The API allows incremental backups, but the implementation here does not use incremental backups (which would be dangerous in our application, as some updates may not be picked up in an increment). Incremental backups are intended for where multiple updaters may exist on the source db, but that is not our use-case --- only the … So in terms of correctness the code is correct; the issue is whether it is efficient enough not to excessively delay operation (it will probably delay startup, due to the initial copy).
I really like the idea!
Interesting, Concept ACK if it works!
I was totally unaware that sqlite3 had a backup facility built in, hence my ignorance about its inner workings. I'll test this ASAP, and if it works this is an excellent way of creating same-machine backups.
The current code seems not to work (^^;;;). A backup file is generated, but it seems to be a snapshot of the file at the start, and is not updated regularly. Seems a misunderstanding of the SQLITE3 Backup API docs. On the other hand, this seems a fairly popular pull request. Maybe at the start we can check if both databases have a …
(force-pushed from b4198cc to a438369)
Current code does the replication "manually", i.e. it just forwards writes to both databases. We only use the Backup API at the start to force-sync the two databases. As an optimization we can check …
@ZmnSCPxj …
An error that causes a database constraint violation on the main db should also cause a database constraint violation on the backup db (and would be a bug anyway). Transactions are committed on main before committing on backup, and if a disk-level error occurs at the main then the backup is not updated. If a disk-level error occurs on backup but not on main, then main is updated, but we also abort after the failure on the backup, which is equivalent to an uncontrolled shutdown after updating the database but before sending anything to the counterparty, a scenario the protocol is prepared for.

Either the non-updated backup (if, e.g., the disk-level failure was due to, say, the NFS mount disappearing because some clumsy flesh foot tripped on the network cable, but the remote disk is still alive) or the updated main would be safe to resume with, again due to the protocol being prepared for either scenario --- as long as a message is not yet sent to the counterparty (and we only send that if both main and backup db were updated OK), we would be safe with either the pre- or post-transaction state.

Perhaps I should note though that …

Of course, if your disk bitrots your data under you, then the main and backup can indeed go out-of-sync. But we do not have any bitrot protection in …
(force-pushed from 05abab9 to 3dcc776)
Is that not the wrong way around? Committing to the backup only after the main DB would allow commits to get lost if there is a crash in between. Admittedly the risk of a crash hitting between commits, followed by a corruption of the main DB, is extremely unlikely, but I think the backup should get the commit first. But I also see your point about the …
@cdecker the LN safety model allows us to recover from either the pre-db-transaction or the post-db-transaction state. Consider that your imagined case can be reversed: …
Thus, whether we write backup first or main first, the same problem can occur regardless. Different media can write at different speeds, wire delays exist, etc.; in a relativistic world there is no "simultaneous", so there will always exist a race condition. As long as we get entire db-transactions, we should be fine; that is what …

Basically: whether we write to backup first, or main first, is arbitrary, because a race condition will always exist in either case. Fortunately, we make sure not to proceed unless we confirm both have synched. That is the only important case: either everything has synched and we can send revocation keys, or some of them are not up-to-date but we have not sent revocation keys (and …
Checked, and at least on my throwaway node it looks like we can recover the data safely. The backup is not a bit-for-bit same copy (maybe I should add that to …).

For more extreme optimization, we should really launch a new thread or process and open the backup there, and just use a pipe or socket to send queries (including begin transaction and commit) to the backup process/thread. We only need a response from the backup process/thread at commit time, and that is when it can send any error messages that happened in the backup. That ensures that slow backup media writes are at least done in parallel with main media writes. But that is optimization; correctness comes first, and the feature is usable as-is.
(force-pushed from 3dcc776 to ef7a712)
Added a note that the backup and main are not byte-for-byte identical, and rebased on master.
ack ef7a712
With some small comments on the docs
ChangeLog-Added: With the `sqlite3://` scheme for `--wallet` option, you can now specify a second file path for real-time database backup by separating it from the main file path with a `:` character.
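As a worked example of that option format (paths here are hypothetical), the part after `sqlite3://` splits at the first `:` into the main and backup paths:

```shell
# Hypothetical invocation:
#   lightningd --wallet=sqlite3://<main-path>:<backup-path>
WALLET_OPT="sqlite3:///tmp/main.sqlite3:/tmp/backup.sqlite3"

PATHS="${WALLET_OPT#sqlite3://}"   # strip the scheme
MAIN_DB="${PATHS%%:*}"             # everything before the first ':'
BACKUP_DB="${PATHS#*:}"            # everything after the first ':'

echo "$MAIN_DB"      # /tmp/main.sqlite3
echo "$BACKUP_DB"    # /tmp/backup.sqlite3
```

Note that because `:` is the separator, this sketch assumes the paths themselves contain no extra colons.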
(force-pushed from ef7a712 to 3211983)
Updated docs as per review.
ack 3211983
Nice work!
First of all, this is great. Second: it's a terrifying first-time experience. My …
Some sort of log status about the migration would help calm the nerves.

On a related note, what is actually stored in lightningd.sqlite3? Is it the bare essentials to safely use channels, or does it contain other stuff like gossip? Since the backup file is probably on a slower, perhaps even remote, disk, I would think the less stuff in there, the better. Non-radioactive, yet important, things like invoice metadata are probably something I could back up separately.
Wow, thanks for reminding me about this PR @Sjors. I just tried this as well and it seems to work. I have a bit more peace of mind now.
ChangeLog-Added: With the `sqlite3://` scheme for the `--wallet` option, you can now specify a second file path for real-time database backup by separating it from the main file path with a `:` character.

Created as draft since not yet sufficiently tested, lemme check a bit. Milestoned next version.