-
Notifications
You must be signed in to change notification settings - Fork 249
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
btrfs-convert: source-ext2.c:845: ext2_copy_inodes: BUG_ON ret
triggered, value -5
#187
Comments
Thank you for the report and detailed debug output. It shows that we have the correct METADATA_ITEM key, but delayed tree still failed to locate it. |
Please try this debug branch: Although I'm pretty sure it's That debug branch would output needed debug info. Thanks. |
Thanks for the reply! Here is the log from the same command as above using your debug branch (dfa4977), this time also including the progress output (as I didn't have to copy it from my terminal buffer 😉): btrfs-convert-dfa4977b40-20190706T133750.log (Attached as a file, because it contains some invalid utf-8 in the leaf flags debug output. |
Thanks for your log! Unfortunately (or fortunately) I could finally reproduce the bug itself, by running fsck-tests/013 on a 64K page size system. And since it's reproduced reliably, it's quite easy to debug, and here comes the fix: It's based on devel branch, which includes other delayed ref fixes. Thanks. |
I had the same problem on my PowerMac G4, trying to convert 48 GiB ext4 to btrfs on my SSD. I reported the bug on bugzilla. Luckily I found this report here later! I can confirm the fix (adam900710's delayed_ref_fix tree) works for me. Thanks @adam900710 ! |
I can also confirm that the conversion was successful. (I was using only aaafe6f following the discussion on the mailing list) I guess you don't need anything else and I can start using the converted fs? Thank you for your quick fix! |
…elayed refs lost [BUG] Btrfs-progs sometimes fails to find certain extent backref when committing transaction. The most reliable way to reproduce it is fsck-test/013 on 64K page sized system: [...] adding new data backref on 315859712 root 287 owner 292 offset 0 found 1 btrfs unable to find ref byte nr 31850496 parent 0 root 2 owner 0 offset 0 Failed to find [30867456, 168, 65536] Also there are some github bug reports related to this problem. [CAUSE] Commit 909357e ("btrfs-progs: Wire up delayed refs") introduced delayed refs in btrfs-progs. However in that commit, delayed refs are not run at correct timing. That commit calls btrfs_run_delayed_refs() before btrfs_write_dirty_block_groups(), which needs to update BLOCK_GROUP_ITEMs in extent tree, thus could cause new delayed refs. This means each time we commit a transaction, we may screw up the extent tree by dropping some pending delayed refs, like: Transaction 711: btrfs_commit_transaction() |- btrfs_run_delayed_refs() | Now all delayed refs are written to extent tree | |- btrfs_write_dirty_block_groups() | Needs to update extent tree root | ADD_DELAYED_REF to 315859712. | Delayed refs are attached to current trans handle. | |- __commit_transaction() |- write_ctree_super() |- btrfs_finish_extent_commit() |- kfree(trans) Now delayed ref for 315859712 are lost Transaction 712: Tree block 315859712 get dropped btrfs_commit_transaction() |- btrfs_run_delayed_refs() |- run_one_delayed_ref() |- __free_extent() As previous ADD_DELAYED_REF to 315859712 is lost, extent tree doesn't have any backref for 315859712, causing the bug In fact, commit c31edf6 ("btrfs-progs: Fix false ENOSPC alert by tracking used space correctly") detects the tree block leakage, but in the reproducer we have too much noise, thus nobody notices the leakage warning. [FIX] We can't just move btrfs_run_delayed_refs() after btrfs_write_dirty_block_groups(), as during btrfs_run_delayed_refs(), we can re-dirty block groups. Thus we need to exhaust both delayed refs and dirty blocks. This patch will call btrfs_write_dirty_block_groups() and btrfs_run_delayed_refs() in a loop until both delayed refs and dirty blocks are exhausted. Much like what we do in commit_cowonly_roots() in kernel. Also, to prevent such problem from happening again (and not to debug such problem again), add extra check on delayed refs before freeing the transaction handle. Reported-by: Klemens Schölhorn <klemens@schoelhorn.eu> Issue: #187 Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
…elayed refs lost [BUG] Btrfs-progs sometimes fails to find certain extent backref when committing transaction. The most reliable way to reproduce it is fsck-test/013 on 64K page sized system: [...] adding new data backref on 315859712 root 287 owner 292 offset 0 found 1 btrfs unable to find ref byte nr 31850496 parent 0 root 2 owner 0 offset 0 Failed to find [30867456, 168, 65536] Also there are some github bug reports related to this problem. [CAUSE] Commit 909357e ("btrfs-progs: Wire up delayed refs") introduced delayed refs in btrfs-progs. However in that commit, delayed refs are not run at correct timing. That commit calls btrfs_run_delayed_refs() before btrfs_write_dirty_block_groups(), which needs to update BLOCK_GROUP_ITEMs in extent tree, thus could cause new delayed refs. This means each time we commit a transaction, we may screw up the extent tree by dropping some pending delayed refs, like: Transaction 711: btrfs_commit_transaction() |- btrfs_run_delayed_refs() | Now all delayed refs are written to extent tree | |- btrfs_write_dirty_block_groups() | Needs to update extent tree root | ADD_DELAYED_REF to 315859712. | Delayed refs are attached to current trans handle. | |- __commit_transaction() |- write_ctree_super() |- btrfs_finish_extent_commit() |- kfree(trans) Now delayed ref for 315859712 are lost Transaction 712: Tree block 315859712 get dropped btrfs_commit_transaction() |- btrfs_run_delayed_refs() |- run_one_delayed_ref() |- __free_extent() As previous ADD_DELAYED_REF to 315859712 is lost, extent tree doesn't have any backref for 315859712, causing the bug In fact, commit c31edf6 ("btrfs-progs: Fix false ENOSPC alert by tracking used space correctly") detects the tree block leakage, but in the reproducer we have too much noise, thus nobody notices the leakage warning. [FIX] We can't just move btrfs_run_delayed_refs() after btrfs_write_dirty_block_groups(), as during btrfs_run_delayed_refs(), we can re-dirty block groups. Thus we need to exhaust both delayed refs and dirty blocks. This patch will call btrfs_write_dirty_block_groups() and btrfs_run_delayed_refs() in a loop until both delayed refs and dirty blocks are exhausted. Much like what we do in commit_cowonly_roots() in kernel. Also, to prevent such problem from happening again (and not to debug such problem again), add extra check on delayed refs before freeing the transaction handle. Reported-by: Klemens Schölhorn <klemens@schoelhorn.eu> Issue: #187 Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
I guess this can be closed? |
Possibly continuing from #123, I am hitting the same issue trying to convert a 4 TiB ext4 filesystem with version 5.1.1 on linux 4.19.53. The fs has around 800 GiB free space and was defragmented using e4defrag.
@adam900710: It seems the output already contains the additional debug output you mentioned in #123.
PS: Sorry for accidentally opening #186, I hit enter while entering the title…
The text was updated successfully, but these errors were encountered: