
litestream recommendation may lead to crashes? #4860

Closed
ZmnSCPxj opened this issue Oct 12, 2021 · 3 comments · Fixed by #4867

Comments

@ZmnSCPxj
Collaborator

In BACKUP.md we have a recommendation to use litestream. However, I snuck a look at the official litestream online docs:

Litestream works by effectively taking over the checkpointing process. It starts a long-running read transaction to prevent any other process from checkpointing and restarting the WAL file.

In filesystem terms, that "read transaction" is an advisory shared lock on the database file. However, when the SQLite instance running inside lightningd wants to write to the database file (and we write all the time; you should see the blinkenlights on the SSD of my Pi), it demands an advisory exclusive lock.

SQLite will auto-retry the advisory lock for some time (I believe we actually explicitly set this timeout somewhere...? need to dig). However, after a while it will give up with "database is locked", and then lightningd kills itself.
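For illustration, here is a minimal sketch of that failure mode, using Python's stdlib sqlite3 bindings rather than lightningd's actual C code: one connection holds a long-lived write lock, and a second connection's write gives up with "database is locked" once its busy timeout expires.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# Connection A plays the part of the long-running lock holder.
a = sqlite3.connect(path, isolation_level=None)
a.execute("CREATE TABLE t (x INTEGER)")
a.execute("BEGIN IMMEDIATE")  # grab and hold the write lock

# Connection B plays the part of lightningd's writer; `timeout` is
# SQLite's busy timeout (how long it auto-retries a busy lock).
b = sqlite3.connect(path, timeout=0.1, isolation_level=None)
err = None
try:
    b.execute("INSERT INTO t VALUES (1)")
except sqlite3.OperationalError as e:
    err = e
    print(e)  # "database is locked"
```

The scenario in the issue differs only in who holds the lock and for how long; the give-up-after-timeout behavior is the same.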

Note that in BACKUP.md we recommend against using sqlite3 .dump or VACUUM INTO (see the last paragraph). Litestream is actually similar to an efficient .dump (it only copies (part of) the WAL file instead of dumping the entire database), but this is a difference in quantity, not quality: the backup process read-locks the file, we write-lock very often, and on slow HDDs or high-load systems the timeout on the write-lock retry can be reached. If we recommend against sqlite3 .dump, I think we should recommend against litestream as well (or, if the quantity is low enough, at the very least put up warnings, and move litestream to after the "copy the file" section).

@cdecker
Member

cdecker commented Oct 15, 2021

Hm, ok. I was actually wondering how it does replication, and someone said it uses the WAL and does not lock the DB. If it crashes despite following the setup instructions in the document (enabling WAL mode), we should remove that recommendation altogether.

@ZmnSCPxj
Collaborator Author

ZmnSCPxj commented Oct 15, 2021

I strongly suspect #4857 root cause is litestream replication; let us wait for response there from OP to see if patching helps.

Note that we now have a 5-second timer on sqlite3_busy_timeout, which will prevent crashing from litestream in most cases (#4857 uses a version which predates this! #4857 (comment)). However, in high load cases or very slow media (or both!) 5 seconds may be too short still. See #4867
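As a sketch of what that knob looks like from the application side (Python stdlib sqlite3 here, not the actual lightningd C change), the same value set by `sqlite3_busy_timeout()` is also exposed as `PRAGMA busy_timeout`, in milliseconds:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# How many milliseconds SQLite keeps retrying a busy lock
# before giving up with SQLITE_BUSY ("database is locked").
conn.execute("PRAGMA busy_timeout = 60000")  # 60 seconds
timeout_ms = conn.execute("PRAGMA busy_timeout").fetchone()[0]
print(timeout_ms)  # 60000
```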

@ZmnSCPxj
Collaborator Author

ZmnSCPxj commented Oct 15, 2021

Anyway, to clarify: litestream has to lock the main database file. SQLite uses the WAL file as a FIFO ring buffer, meaning it could start overwriting old log entries before litestream can read them if the main database file is not locked. As noted in the actual litestream documentation, it uses a long-running read transaction (which is, in fact, a shared advisory lock on the file) in order to prevent the WAL file from wrapping around before litestream can read the older log entries.
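That behavior is easy to reproduce with a sketch (stdlib sqlite3 again, standing in for litestream): in WAL mode an open read transaction does not block writers, but it does block a checkpoint from restarting the WAL, which is exactly the lock litestream relies on.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "wal.db")
# timeout=0: report busy immediately instead of retrying for 5 s.
w = sqlite3.connect(path, timeout=0, isolation_level=None)
w.execute("PRAGMA journal_mode=WAL")
w.execute("CREATE TABLE t (x INTEGER)")

# A long-running read transaction, standing in for litestream's.
r = sqlite3.connect(path, isolation_level=None)
r.execute("BEGIN")
r.execute("SELECT count(*) FROM t").fetchone()

# Writers are NOT blocked by the reader in WAL mode...
w.execute("INSERT INTO t VALUES (1)")

# ...but restarting the WAL is: the checkpoint reports it was blocked,
# because the reader's snapshot predates the newest WAL frames.
blocked, wal_frames, copied = w.execute(
    "PRAGMA wal_checkpoint(RESTART)").fetchone()
print(blocked)  # 1 = the reader prevented the WAL restart
```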

The advantage of using the WAL file is that litestream only needs to lock for very short periods of time (relatively short, anyway: it is still a "long-running" read transaction that locks for entire hundreds of milliseconds, when most read transactions take less than a hundred milliseconds tops), because it only needs to read very short WAL entries. Something like sqlite3 .dump or VACUUM INTO locks for several seconds because it has to read far more data from the database, so relative to those solutions, litestream is better.
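For contrast, a sketch of the heavyweight end of that spectrum (filenames here are illustrative): `VACUUM INTO` rebuilds a complete copy of the database inside a single read transaction, so its lock is held for however long it takes to read everything.

```python
import os
import sqlite3
import tempfile

src = os.path.join(tempfile.mkdtemp(), "src.db")
dst = src + ".backup"

conn = sqlite3.connect(src)
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])
conn.commit()

# Copies the WHOLE database under one read transaction; the larger
# the database, the longer the shared lock is held (SQLite >= 3.27).
conn.execute("VACUUM INTO ?", (dst,))

rows = sqlite3.connect(dst).execute("SELECT count(*) FROM t").fetchone()[0]
print(rows)  # 1000
```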

ZmnSCPxj added a commit to ZmnSCPxj/lightning that referenced this issue Oct 16, 2021
Closes: ElementsProject#4860

ChangeLog-Added: With `sqlite3` db backend we now use a 60-second busy timer, to allow backup processes like `litestream` to operate safely.
ZmnSCPxj added a commit that referenced this issue Oct 17, 2021
Closes: #4860

ChangeLog-Added: With `sqlite3` db backend we now use a 60-second busy timer, to allow backup processes like `litestream` to operate safely.