-
Notifications
You must be signed in to change notification settings - Fork 314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: call fsync before rename #1134
Conversation
According to the POSIX spec, one needs to call fsync on the file descriptor used for writing, see: https://pubs.opengroup.org/onlinepubs/9699919799/functions/fsync.html Moreover, fail if database.db.tmp exists, since this is an indication something related to database.db writing failed in the past and we did not notice.
NB, this should address issue koenk/zigbee2mqtt#11759, or well, at least improve the situation. |
If I understand correctly, the current code doesn't do anything because a fsync is called on a file descriptor where nothing was written to?
Yes, otherwise we cannot get out of the situation without manual intervention. I propose to, before opening the fd, do a check whether the tmp db file exists, if it does, postfix it with the unix timestamp and log a warning message. |
Indeed, that can happen in theory. File systems add many checks to make sure the data does hit the disk in time, but especially when running on SD cards stuff may get cached aggressively before it appears on the disk (Raspberry PI users, for example).
Good point, so there is nothing else that catches the existence of a database.db.tmp. I added a slightly adapted behavior: if database.tmp already exists, rename it to database.tmp. and warn. In that case, only one warning is emitted. I broke CI since this adds an untested branch. Let me know whether you like the new behavior, because then I will add a test for it. |
Looks good! |
💯 |
When writing
database.db
to disk, we should call fsync on the file descriptor used for writing. Seehttps://pubs.opengroup.org/onlinepubs/9699919799/functions/fsync.html
Which explicitly states that "The fsync() function shall request that all data for the open file descriptor ..." (emphasis mine)
I also enforced that database.db.tmp is not created if it already exists, in which case an Exception is thrown. That should be an indication that something went wrong in the past, but maybe this may be annoying. Should this behavior be properly handled in a try/catch and a nice logging message?