Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
btrfs-progs: Exhaust delayed refs and dirty block groups to prevent d…
…elayed refs lost [BUG] Btrfs-progs sometimes fails to find certain extent backref when committing transaction. The most reliable way to reproduce it is fsck-test/013 on 64K page sized system: [...] adding new data backref on 315859712 root 287 owner 292 offset 0 found 1 btrfs unable to find ref byte nr 31850496 parent 0 root 2 owner 0 offset 0 Failed to find [30867456, 168, 65536] Also there are some github bug reports related to this problem. [CAUSE] Commit 909357e ("btrfs-progs: Wire up delayed refs") introduced delayed refs in btrfs-progs. However in that commit, delayed refs are not run at correct timing. That commit calls btrfs_run_delayed_refs() before btrfs_write_dirty_block_groups(), which needs to update BLOCK_GROUP_ITEMs in extent tree, thus could cause new delayed refs. This means each time we commit a transaction, we may screw up the extent tree by dropping some pending delayed refs, like: Transaction 711: btrfs_commit_transaction() |- btrfs_run_delayed_refs() | Now all delayed refs are written to extent tree | |- btrfs_write_dirty_block_groups() | Needs to update extent tree root | ADD_DELAYED_REF to 315859712. | Delayed refs are attached to current trans handle. | |- __commit_transaction() |- write_ctree_super() |- btrfs_finish_extent_commit() |- kfree(trans) Now delayed ref for 315859712 are lost Transaction 712: Tree block 315859712 get dropped btrfs_commit_transaction() |- btrfs_run_delayed_refs() |- run_one_delayed_ref() |- __free_extent() As previous ADD_DELAYED_REF to 315859712 is lost, extent tree doesn't have any backref for 315859712, causing the bug In fact, commit c31edf6 ("btrfs-progs: Fix false ENOSPC alert by tracking used space correctly") detects the tree block leakage, but in the reproducer we have too much noise, thus nobody notices the leakage warning. [FIX] We can't just move btrfs_run_delayed_refs() after btrfs_write_dirty_block_groups(), as during btrfs_run_delayed_refs(), we can re-dirty block groups. Thus we need to exhaust both delayed refs and dirty blocks. This patch will call btrfs_write_dirty_block_groups() and btrfs_run_delayed_refs() in a loop until both delayed refs and dirty blocks are exhausted. Much like what we do in commit_cowonly_roots() in kernel. Also, to prevent such problem from happening again (and not to debug such problem again), add extra check on delayed refs before freeing the transaction handle. Reported-by: Klemens Schölhorn <klemens@schoelhorn.eu> Issue: #187 Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
- Loading branch information