AddressSanitizer reports use-after-free within zstd mempool #12215

Closed
ahrens opened this issue Jun 10, 2021 · 9 comments · Fixed by #12928
Labels
Status: Triage Needed New issue which needs to be triaged Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@ahrens
Member

ahrens commented Jun 10, 2021

System information

Type Version/Name
Distribution Name
Distribution Version
Linux Kernel
Architecture x86_64
ZFS Version 860051f
SPL Version

Describe the problem you're observing

Describe how to reproduce the problem

./configure --enable-asan, then run zloop. AddressSanitizer reports use-after-poison.

I can make the problem go away by changing zstd_mempool_alloc() to always use vmem_alloc() (by commenting out the code above the "try lazy allocation" comment).
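To make the workaround concrete, here is a minimal single-threaded sketch of a pool-backed allocator with a plain-allocation fallback. It is not the code in zfs_zstd.c: every name here (sketch_alloc, sketch_free, sketch_pool_enabled) is hypothetical, and malloc() stands in for vmem_alloc(). Setting sketch_pool_enabled to false plays the role of commenting out the pool path so that every request takes the fallback.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_POOL_SLOTS 4

/* Hypothetical slot: memory stays allocated while the slot is idle. */
struct sketch_slot {
	void *mem;
	size_t size;
	bool busy;
};

static struct sketch_slot sketch_pool[SKETCH_POOL_SLOTS];

/* Set to false to mimic the workaround: every request goes to malloc(). */
bool sketch_pool_enabled = true;

void *sketch_alloc(size_t size)
{
	if (sketch_pool_enabled) {
		/* Fast path: reuse an idle cached buffer that is large enough. */
		for (int i = 0; i < SKETCH_POOL_SLOTS; i++) {
			struct sketch_slot *s = &sketch_pool[i];
			if (s->mem != NULL && !s->busy && s->size >= size) {
				s->busy = true;
				return s->mem;
			}
		}
		/* Otherwise fill an empty slot with a fresh buffer. */
		for (int i = 0; i < SKETCH_POOL_SLOTS; i++) {
			struct sketch_slot *s = &sketch_pool[i];
			if (s->mem == NULL) {
				s->mem = malloc(size);
				if (s->mem == NULL)
					return NULL;
				s->size = size;
				s->busy = true;
				return s->mem;
			}
		}
	}
	/* Fallback, and the only path when the pool is disabled. */
	return malloc(size);
}

void sketch_free(void *p)
{
	if (p == NULL)
		return;
	for (int i = 0; i < SKETCH_POOL_SLOTS; i++) {
		if (sketch_pool[i].mem == p) {
			sketch_pool[i].busy = false;	/* cached, not freed */
			return;
		}
	}
	free(p);	/* came from the fallback path */
}
```

With the pool enabled, a freed buffer is handed out again at the same address; with it disabled, each request gets its own fresh allocation, which is the behavior the workaround forces.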

=================================================================
==130608==ERROR: AddressSanitizer: use-after-poison on address 0x63300001c818 at pc 0x7ff61da02d1a bp 0x7ff6158efb40 sp 0x7ff6158ef2e8
WRITE of size 1152 at 0x63300001c818 thread T105
==130608==AddressSanitizer: while reporting a bug found another one. Ignoring.
  3.41 sec in ztest_dmu_write_parallel
    #0 0x7ff61da02d19  (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x5ed19)
    #1 0x7ff61d30ff77 in memset /usr/include/x86_64-linux-gnu/bits/string_fortified.h:71
    #2 0x7ff61d30ff77 in ZSTD_initCCtx ../../module/zstd/lib/zstd.c:13156
    #3 0x7ff61d3100ab in zfs_ZSTD_createCCtx_advanced ../../module/zstd/lib/zstd.c:13172
    #4 0x7ff61d32ac00 in zfs_zstd_compress ../../module/zstd/zfs_zstd.c:386
    #5 0x7ff61d194071 in zio_compress_data ../../module/zfs/zio_compress.c:166
    #6 0x7ff61d17a505 in zio_write_compress ../../module/zfs/zio.c:1711
    #7 0x7ff61d1742fd in __zio_execute ../../module/zfs/zio.c:2195
    #8 0x7ff61d1742fd in zio_execute ../../module/zfs/zio.c:2108
    #9 0x7ff61ce51cf9 in taskq_thread /export/home/delphix/zfs/lib/libzpool/taskq.c:237
    #10 0x7ff61c2a66da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da)
    #11 0x7ff61bfcf71e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x12171e)

0x63300001c818 is located 24 bytes inside of 106613-byte region [0x63300001c800,0x633000036875)
allocated by thread T103 here:
    #0 0x7ff61da82b40 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb40)
    #1 0x7ff61d329fac in umem_alloc ../../lib/libspl/include/umem.h:92
    #2 0x7ff61d32a106 in zstd_mempool_alloc ../../module/zstd/zfs_zstd.c:297
    #3 0x7ff61d32a3b3 in zstd_alloc ../../module/zstd/zfs_zstd.c:560
    #4 0x7ff61d2beb1a in zfs_ZSTD_malloc ../../module/zstd/lib/zstd.c:7376
    #5 0x7ff61d2beb61 in ZSTD_cwksp_create ../../module/zstd/lib/zstd.c:9673
    #6 0x7ff61d3139c1 in zfs_ZSTD_resetCCtx_internal ../../module/zstd/lib/zstd.c:14579
    #7 0x7ff61d3197db in ZSTD_compressBegin_internal ../../module/zstd/lib/zstd.c:16187
    #8 0x7ff61d31afc5 in ZSTD_resetCStream_internal ../../module/zstd/lib/zstd.c:16739
    #9 0x7ff61d31c4aa in zfs_ZSTD_compressStream2 ../../module/zstd/lib/zstd.c:17093
    #10 0x7ff61d31c8a2 in zfs_ZSTD_compressStream2_simpleArgs ../../module/zstd/lib/zstd.c:17142
    #11 0x7ff61d31cb60 in zfs_ZSTD_compress2 ../../module/zstd/lib/zstd.c:17156
    #12 0x7ff61d32ac95 in zfs_zstd_compress ../../module/zstd/zfs_zstd.c:410
    #13 0x7ff61d194071 in zio_compress_data ../../module/zfs/zio_compress.c:166
    #14 0x7ff61d17a505 in zio_write_compress ../../module/zfs/zio.c:1711
    #15 0x7ff61d1742fd in __zio_execute ../../module/zfs/zio.c:2195
    #16 0x7ff61d1742fd in zio_execute ../../module/zfs/zio.c:2108
    #17 0x7ff61ce51cf9 in taskq_thread /export/home/delphix/zfs/lib/libzpool/taskq.c:237
    #18 0x7ff61c2a66da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da)
...
SUMMARY: AddressSanitizer: use-after-poison (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x5ed19)
Shadow bytes around the buggy address:
  0x0c667fffb8b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c667fffb8c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c667fffb8d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c667fffb8e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c667fffb8f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c667fffb900: 00 00 00[f7]f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
  0x0c667fffb910: f7 f7 f7 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c667fffb920: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c667fffb930: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c667fffb940: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c667fffb950: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==130608==ABORTING
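The shadow dump above can be cross-checked by hand. The sketch below assumes the default AddressSanitizer mapping on x86_64 Linux (shift by 3, offset 0x7fff8000); the macro and function names are illustrative only. Under that assumption the faulting address 0x63300001c818 maps to shadow byte 0x0c667fffb903, which is the bracketed [f7] ("Poisoned by user") entry in the row starting at 0x0c667fffb900, and the three 00 bytes before it cover the 24 addressable bytes at the start of the region.

```c
#include <stdint.h>

/* Assumption: default ASan shadow mapping for x86_64 Linux. */
#define SKETCH_SHADOW_SCALE	3
#define SKETCH_SHADOW_OFFSET	UINT64_C(0x7fff8000)

/* One shadow byte describes 8 application bytes. */
uint64_t sketch_mem_to_shadow(uint64_t addr)
{
	return (addr >> SKETCH_SHADOW_SCALE) + SKETCH_SHADOW_OFFSET;
}

/* Column of the shadow byte within a 16-byte row of the dump. */
unsigned sketch_shadow_column(uint64_t addr)
{
	return (unsigned)(sketch_mem_to_shadow(addr) & 0xf);
}
```

For the report above, sketch_mem_to_shadow(0x63300001c800) gives the row start 0x0c667fffb900 and sketch_shadow_column(0x63300001c818) gives 3.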

Include any warning/errors/backtraces from the system logs

@ahrens ahrens added Type: Defect Incorrect behavior (e.g. crash, hang) Status: Triage Needed New issue which needs to be triaged labels Jun 10, 2021
@ahrens
Member Author

ahrens commented Jun 10, 2021

cc @c0d3z3r0 @allanjude

@PrivatePuffin
Contributor

cc @BrainSlayer (as he wrote most of the current allocator)

@BrainSlayer
Contributor

BrainSlayer commented Jun 16, 2021

I'm not sure whether this is a false positive caused by the way this allocator works. It is not an allocator that frees memory after it has been allocated and used; it is a memory cache that keeps allocated memory around for reuse to avoid reallocation delays. I also can't find any bug in the code while reviewing it, and no crash has ever been observed in daily use. A use-after-free would lead to crashes at some point. If someone finds something, I would be happy to know about it, but I cannot find any issue in the code here.

Edit: running the same test now to see if it's reproducible.

@BrainSlayer
Contributor

no error so far

06/16 14:36:48 /xfs/zfs_zstd/zfs_zstd/asan/bin/ztest -G -VVVVV -K raidz -m 2 -r 0 -D 0 -S 0 -R 1 -v 4 -a 12 -C special=random -T 55 -P 12 -s 512m -f /var/tmp/zloop-run
06/16 14:38:01 /xfs/zfs_zstd/zfs_zstd/asan/bin/ztest -G -VVVVV -K raidz -m 0 -r 0 -D 0 -S 0 -R 3 -v 5 -a 12 -C special=random -T 103 -P 24 -s 512m -f /var/tmp/zloop-run
06/16 14:40:14 /xfs/zfs_zstd/zfs_zstd/asan/bin/ztest -G -VVVVV -K raidz -m 0 -r 11 -D 0 -S 0 -R 2 -v 5 -a 12 -C special=random -T 47 -P 12 -s 512m -f /var/tmp/zloop-run
06/16 14:41:22 /xfs/zfs_zstd/zfs_zstd/asan/bin/ztest -G -VVVVV -K raidz -m 0 -r 9 -D 0 -S 0 -R 2 -v 4 -a 9 -C special=random -T 39 -P 14 -s 512m -f /var/tmp/zloop-run
06/16 14:43:12 /xfs/zfs_zstd/zfs_zstd/asan/bin/ztest -G -VVVVV -K draid -m 0 -r 33 -D 7 -S 1 -R 1 -v 0 -a 9 -C special=random -T 50 -P 11 -s 512m -f /var/tmp/zloop-run
06/16 14:44:26 /xfs/zfs_zstd/zfs_zstd/asan/bin/ztest -G -VVVVV -K raidz -m 2 -r 0 -D 0 -S 0 -R 1 -v 2 -a 12 -C special=random -T 39 -P 18 -s 512m -f /var/tmp/zloop-run
06/16 14:45:32 /xfs/zfs_zstd/zfs_zstd/asan/bin/ztest -G -VVVVV -K draid -m 0 -r 30 -D 5 -S 2 -R 2 -v 0 -a 12 -C special=random -T 108 -P 41 -s 512m -f /var/tmp/zloop-run
06/16 14:47:43 /xfs/zfs_zstd/zfs_zstd/asan/bin/ztest -G -VVVVV -K raidz -m 0 -r 4 -D 0 -S 0 -R 3 -v 5 -a 9 -C special=random -T 101 -P 23 -s 512m -f /var/tmp/zloop-run
06/16 14:51:06 /xfs/zfs_zstd/zfs_zstd/asan/bin/ztest -G -VVVVV -K raidz -m 1 -r 5 -D 0 -S 0 -R 3 -v 4 -a 12 -C special=random -T 105 -P 37 -s 512m -f /var/tmp/zloop-run
06/16 14:53:09 /xfs/zfs_zstd/zfs_zstd/asan/bin/ztest -G -VVVVV -K raidz -m 2 -r 0 -D 0 -S 0 -R 1 -v 2 -a 9 -C special=random -T 67 -P 12 -s 512m -f /var/tmp/zloop-run
06/16 14:54:33 /xfs/zfs_zstd/zfs_zstd/asan/bin/ztest -G -VVVVV -K draid -m 0 -r 38 -D 10 -S 2 -R 2 -v 1 -a 9 -C special=random -T 60 -P 20 -s 512m -f /var/tmp/zloop-run
06/16 14:55:53 /xfs/zfs_zstd/zfs_zstd/asan/bin/ztest -G -VVVVV -K draid -m 0 -r 23 -D 8 -S 2 -R 1 -v 2 -a 9 -C special=random -T 70 -P 18 -s 512m -f /var/tmp/zloop-run
06/16 14:57:25 /xfs/zfs_zstd/zfs_zstd/asan/bin/ztest -G -VVVVV -K draid -m 0 -r 41 -D 7 -S 3 -R 2 -v 1 -a 9 -C special=random -T 66 -P 31 -s 512m -f /var/tmp/zloop-run
06/16 14:59:11 /xfs/zfs_zstd/zfs_zstd/asan/bin/ztest -G -VVVVV -K raidz -m 1 -r 8 -D 0 -S 0 -R 2 -v 5 -a 12 -C special=random -T 33 -P 19 -s 512m -f /var/tmp/zloop-run
06/16 15:01:04 /xfs/zfs_zstd/zfs_zstd/asan/bin/ztest -G -VVVVV -K draid -m 0 -r 20 -D 4 -S 2 -R 1 -v 1 -a 12 -C special=random -T 40 -P 20 -s 512m -f /var/tmp/zloop-run
06/16 15:02:09 /xfs/zfs_zstd/zfs_zstd/asan/bin/ztest -G -VVVVV -K draid -m 0 -r 30 -D 6 -S 1 -R 1 -v 2 -a 9 -C special=random -T 118 -P 11 -s 512m -f /var/tmp/zloop-run
06/16 15:04:28 /xfs/zfs_zstd/zfs_zstd/asan/bin/ztest -G -VVVVV -K raidz -m 2 -r 0 -D 0 -S 0 -R 1 -v 2 -a 12 -C special=random -T 108 -P 20 -s 512m -f /var/tmp/zloop-run
06/16 15:06:39 /xfs/zfs_zstd/zfs_zstd/asan/bin/ztest -G -VVVVV -K raidz -m 2 -r 0 -D 0 -S 0 -R 1 -v 2 -a 12 -C special=random -T 51 -P 15 -s 512m -f /var/tmp/zloop-run

@dioni21
Contributor

dioni21 commented Jun 16, 2021

Could this bug depend on compiler and library versions? Are you both running the same system?

@BrainSlayer
Contributor

Could this bug depend on compiler and library versions? Are you both running the same system?

Of course not. I can only speak for my own system, which is openSUSE-based with kernel 5.13, using the latest trunk source and gcc 10.2.1.

@szubersk
Contributor

I confirm that ztest compiled with asan and ubsan throws tantrums

==654029==ERROR: AddressSanitizer: use-after-poison on address 0x7f835d165818 at pc 0x7f83698f128e bp 0x7f8360668860 sp 0x7f8360668010
WRITE of size 1152 at 0x7f835d165818 thread T107                 
    #0 0x7f83698f128d in __interceptor_memset ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:799
    #1 0x7f836857d666 in ZSTD_initCCtx (/lib/x86_64-linux-gnu/libzpool.so.5+0x1865666)
    #2 0x7f836857d8f7 in zfs_ZSTD_createCCtx_advanced (/lib/x86_64-linux-gnu/libzpool.so.5+0x18658f7)
    #3 0x7f83685b4475 in zfs_zstd_compress (/lib/x86_64-linux-gnu/libzpool.so.5+0x189c475)
    #4 0x7f83681c0f32 in zio_compress_data (/lib/x86_64-linux-gnu/libzpool.so.5+0x14a8f32)
    #5 0x7f836818703f in zio_write_compress (/lib/x86_64-linux-gnu/libzpool.so.5+0x146f03f)
    #6 0x7f8368178ada in zio_execute (/lib/x86_64-linux-gnu/libzpool.so.5+0x1460ada)
    #7 0x7f8367b0d9cd in taskq_thread (/lib/x86_64-linux-gnu/libzpool.so.5+0xdf59cd)
    #8 0x7f8366120d7f in start_thread nptl/pthread_create.c:481  
    #9 0x7f8366044b6e in clone (/lib/x86_64-linux-gnu/libc.so.6+0xfcb6e)
                                                                                               
0x7f835d165818 is located 24 bytes inside of 161937-byte region [0x7f835d165800,0x7f835d18d091) 
allocated by thread T107 here:                
    #0 0x7f83699667cf in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x7f83685b1ea7 in umem_alloc (/lib/x86_64-linux-gnu/libzpool.so.5+0x1899ea7)
    #2 0x7f83685b2b81 in zstd_mempool_alloc (/lib/x86_64-linux-gnu/libzpool.so.5+0x189ab81)
    #3 0x7f83685b3207 in zstd_alloc (/lib/x86_64-linux-gnu/libzpool.so.5+0x189b207)
    #4 0x7f83684be020 in zfs_ZSTD_malloc (/lib/x86_64-linux-gnu/libzpool.so.5+0x17a6020)
    #5 0x7f83684be122 in ZSTD_cwksp_create (/lib/x86_64-linux-gnu/libzpool.so.5+0x17a6122)
    #6 0x7f836858441e in zfs_ZSTD_resetCCtx_internal (/lib/x86_64-linux-gnu/libzpool.so.5+0x186c41e)
    #7 0x7f8368592b11 in ZSTD_compressBegin_internal (/lib/x86_64-linux-gnu/libzpool.so.5+0x187ab11)
    #8 0x7f83685951ac in ZSTD_resetCStream_internal (/lib/x86_64-linux-gnu/libzpool.so.5+0x187d1ac)
    #9 0x7f8368597668 in zfs_ZSTD_compressStream2 (/lib/x86_64-linux-gnu/libzpool.so.5+0x187f668)
    #10 0x7f8368597f9a in zfs_ZSTD_compressStream2_simpleArgs (/lib/x86_64-linux-gnu/libzpool.so.5+0x187ff9a)
    #11 0x7f83685983a3 in zfs_ZSTD_compress2 (/lib/x86_64-linux-gnu/libzpool.so.5+0x18803a3)
    #12 0x7f83685b452c in zfs_zstd_compress (/lib/x86_64-linux-gnu/libzpool.so.5+0x189c52c)
    #13 0x7f83681c0f32 in zio_compress_data (/lib/x86_64-linux-gnu/libzpool.so.5+0x14a8f32)
    #14 0x7f836818703f in zio_write_compress (/lib/x86_64-linux-gnu/libzpool.so.5+0x146f03f)
    #15 0x7f8368178ada in zio_execute (/lib/x86_64-linux-gnu/libzpool.so.5+0x1460ada)
    #16 0x7f8367b0d9cd in taskq_thread (/lib/x86_64-linux-gnu/libzpool.so.5+0xdf59cd)
    #17 0x7f8366120d7f in start_thread nptl/pthread_create.c:481

On top of the ASan problems, UBSan also reports the following:

../../module/zfs/vdev.c:4528:15: runtime error: member access within null pointer of type 'struct vdev_t'                                                                                     
../../module/zfs/vdev.c:4529:18: runtime error: member access within null pointer of type 'struct vdev_t'                                                                           
../../module/icp/io/sha2_mod.c:735:2: runtime error: null pointer passed as argument 1, which is declared to never be null                                                                    
../../module/icp/io/sha2_mod.c:736:2: runtime error: null pointer passed as argument 1, which is declared to never be null                                                          
../../module/zcommon/zfs_fletcher.c:324:4: runtime error: member access within misaligned address 0x7ffc970195d0 for type 'union fletcher_4_ctx_t', which requires 64 byte alignment

@szubersk
Contributor

@ahrens this issue was fixed in #12928 by removing ASan poisoning for zstd due to unusual memory management used there.
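For readers unfamiliar with manual poisoning, the sketch below shows in general terms how user poisoning interacts with a buffer that is cached rather than freed. It does not reproduce the change in #12928 and makes no claim about where the poisoning happened in the zstd code. The struct and function names are hypothetical; the only real API used is `__asan_poison_memory_region` / `__asan_unpoison_memory_region` from `<sanitizer/asan_interface.h>`, and the macros compile to no-ops when ASan is off.

```c
#include <stddef.h>
#include <stdlib.h>

/* Detect ASan under clang (__has_feature) and gcc (__SANITIZE_ADDRESS__). */
#if defined(__has_feature)
#if __has_feature(address_sanitizer)
#define SKETCH_ASAN 1
#endif
#endif
#if defined(__SANITIZE_ADDRESS__) && !defined(SKETCH_ASAN)
#define SKETCH_ASAN 1
#endif

#ifdef SKETCH_ASAN
#include <sanitizer/asan_interface.h>
#define SKETCH_POISON(p, n)	__asan_poison_memory_region((p), (n))
#define SKETCH_UNPOISON(p, n)	__asan_unpoison_memory_region((p), (n))
#else
#define SKETCH_POISON(p, n)	((void)(p), (void)(n))
#define SKETCH_UNPOISON(p, n)	((void)(p), (void)(n))
#endif

/* Hypothetical cached buffer: the memory stays allocated between uses. */
struct cached_buf {
	void *mem;
	size_t size;
	int poisoned;	/* mirrors the ASan state so the sketch is checkable */
};

int cached_buf_init(struct cached_buf *b, size_t size)
{
	b->mem = malloc(size);
	b->size = size;
	b->poisoned = 0;
	return (b->mem != NULL) ? 0 : -1;
}

/* The user is done: keep the memory, but mark it off-limits. */
void cached_buf_release(struct cached_buf *b)
{
	SKETCH_POISON(b->mem, b->size);
	b->poisoned = 1;
}

/*
 * Hand the same memory out again. Skipping the unpoison step here would
 * make the next write a use-after-poison report, even though the memory
 * is still validly allocated.
 */
void *cached_buf_acquire(struct cached_buf *b)
{
	if (b->poisoned) {
		SKETCH_UNPOISON(b->mem, b->size);
		b->poisoned = 0;
	}
	return b->mem;
}

void cached_buf_fini(struct cached_buf *b)
{
	(void) cached_buf_acquire(b);	/* unpoison before freeing */
	free(b->mem);
	b->mem = NULL;
}
```

The two ways out of such a report are to unpoison on every reuse, as cached_buf_acquire() does here, or to drop the poisoning for that memory altogether, which is how the comment above describes the fix.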

@ahrens
Member Author

ahrens commented Feb 16, 2022

@szubersk great!

@ahrens ahrens closed this as completed Feb 16, 2022