-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stop UB complaints and autovectorization of scalar fletcher4 implementation #13631
Changes from all commits
f4338b5
a4dee02
2e096eb
896773d
9b13f28
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,6 +5,22 @@ libzpool_la_CPPFLAGS = $(AM_CPPFLAGS) $(FORCEDEBUG_CPPFLAGS) | |
libzpool_la_CPPFLAGS += -I$(srcdir)/include/os/@ac_system_l@/zfs | ||
libzpool_la_CPPFLAGS += -DLIB_ZPOOL_BUILD | ||
|
||
# Less dire than if you override this in the kernel, but autovectorizing | ||
# random portions seems like a great way to encounter crashes in places | ||
# that look, on the surface, like there's no SIMD involved... | ||
# | ||
# We would keep -mgeneral-regs-only, but on e.g. x64, things like atof | ||
# can't be defined without implying using FPU regs, so it's a compile | ||
# time error. | ||
NOVECTOR := -fno-tree-vectorize | ||
$(addprefix module/zcommon/libzpool_la-,zfs_fletcher.$(OBJEXT) zfs_fletcher.l$(OBJEXT)) : CFLAGS += $(NOVECTOR) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The same as above. |
||
$(addprefix module/zcommon/libzpool_la-,zfs_fletcher_aarch64_neon.$(OBJEXT) zfs_fletcher_aarch64_neon.l$(OBJEXT)) : CFLAGS += $(NOVECTOR) | ||
$(addprefix module/zcommon/libzpool_la-,zfs_fletcher_avx512.$(OBJEXT) zfs_fletcher_avx512.l$(OBJEXT)) : CFLAGS += $(NOVECTOR) | ||
$(addprefix module/zcommon/libzpool_la-,zfs_fletcher_intel.$(OBJEXT) zfs_fletcher_intel.l$(OBJEXT)) : CFLAGS += $(NOVECTOR) | ||
$(addprefix module/zcommon/libzpool_la-,zfs_fletcher_sse.$(OBJEXT) zfs_fletcher_sse.l$(OBJEXT)) : CFLAGS += $(NOVECTOR) | ||
$(addprefix module/zcommon/libzpool_la-,zfs_fletcher_superscalar.$(OBJEXT) zfs_fletcher_superscalar.l$(OBJEXT)) : CFLAGS += $(NOVECTOR) | ||
$(addprefix module/zcommon/libzpool_la-,zfs_fletcher_superscalar4.$(OBJEXT) zfs_fletcher_superscalar4.l$(OBJEXT)) : CFLAGS += $(NOVECTOR) | ||
|
||
lib_LTLIBRARIES += libzpool.la | ||
CPPCHECKTARGETS += libzpool.la | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -39,6 +39,15 @@ ifneq ($(KBUILD_EXTMOD),) | |
@CONFIG_QAT_TRUE@KBUILD_EXTRA_SYMBOLS += @QAT_SYMBOLS@ | ||
endif | ||
|
||
# Woe betide ye who override this, for yours is the kingdom of mysterious segfaults | ||
# absolutely everywhere. | ||
# | ||
# Free warning to anyone considering it - -march=native or the like later in the line | ||
# will undo -mgeneral-regs-only, and gcc's -O2 starting in 12 does autovectorizing. | ||
# | ||
# Good luck. | ||
ZFS_MODULE_CFLAGS += -mgeneral-regs-only -fno-tree-vectorize | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The Linux kernel really should handle this for us by disabling all of the vector instructions such that there is no need to pass either -mgeneral-regs-only or -fno-tree-vectorize.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I didn't just add it for fun, no. As the comment above explains, if the user forcibly overrides -march later in the line, then these are overridden as well, and boom goes the dynamite. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Alright. -mgeneral-regs-only is not backward compatible with older compilers so unless we write an autotools check for it, we probably should drop it in favor of only doing -fno-tree-vectorize. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was going to write an autoconf check, yes. I think -fno-tree-vectorize might also need a check of that nature, but it's been a while since I looked, so I'll check when I get back to this. |
||
|
||
asflags-y := $(ZFS_MODULE_CFLAGS) $(ZFS_MODULE_CPPFLAGS) | ||
ccflags-y := $(ZFS_MODULE_CFLAGS) $(ZFS_MODULE_CPPFLAGS) | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -423,25 +423,34 @@ CFLAGS.zil.c= -Wno-cast-qual | |
CFLAGS.zio.c= -Wno-cast-qual | ||
CFLAGS.zrlock.c= -Wno-cast-qual | ||
CFLAGS.zfs_zstd.c= -Wno-cast-qual -Wno-pointer-arith | ||
CFLAGS.entropy_common.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.error_private.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.fse_decompress.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.pool.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.xxhash.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.zstd_common.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.fse_compress.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.hist.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.huf_compress.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.zstd_compress.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.zstd_compress_literals.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.zstd_compress_sequences.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.zstd_compress_superblock.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.zstd_double_fast.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.zstd_fast.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.zstd_lazy.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.zstd_ldm.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.zstd_opt.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.huf_decompress.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.zstd_ddict.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.zstd_decompress.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.zstd_decompress_block.c= -fno-tree-vectorize -U__BMI__ | ||
CFLAGS.entropy_common.c=-U__BMI__ | ||
CFLAGS.error_private.c= -U__BMI__ | ||
CFLAGS.fse_decompress.c= -U__BMI__ | ||
CFLAGS.pool.c= -U__BMI__ | ||
CFLAGS.xxhash.c= -U__BMI__ | ||
CFLAGS.zstd_common.c= -U__BMI__ | ||
CFLAGS.fse_compress.c= -U__BMI__ | ||
CFLAGS.hist.c= -U__BMI__ | ||
CFLAGS.huf_compress.c= -U__BMI__ | ||
CFLAGS.zstd_compress.c= -U__BMI__ | ||
CFLAGS.zstd_compress_literals.c= -U__BMI__ | ||
CFLAGS.zstd_compress_sequences.c= -U__BMI__ | ||
CFLAGS.zstd_compress_superblock.c= -U__BMI__ | ||
CFLAGS.zstd_double_fast.c= -U__BMI__ | ||
CFLAGS.zstd_fast.c= -U__BMI__ | ||
CFLAGS.zstd_lazy.c= -U__BMI__ | ||
CFLAGS.zstd_ldm.c= -U__BMI__ | ||
CFLAGS.zstd_opt.c= -U__BMI__ | ||
CFLAGS.huf_decompress.c= -U__BMI__ | ||
CFLAGS.zstd_ddict.c= -U__BMI__ | ||
CFLAGS.zstd_decompress.c= -U__BMI__ | ||
CFLAGS.zstd_decompress_block.c= -U__BMI__ | ||
|
||
# Woe betide ye who override this, for yours is the kingdom of mysterious segfaults | ||
# absolutely everywhere. | ||
# | ||
# Free warning to anyone considering it - -march=native or the like later in the line | ||
# will undo -mgeneral-regs-only, and gcc's -O2 starting in 12 does autovectorizing. | ||
# | ||
# Good luck. | ||
CFLAGS += -mgeneral-regs-only -fno-tree-vectorize | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am not sure how far back FreeBSD's compiler support goes, but the same remark about |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this fix a real problem or is it just a precaution? I would rather not disable auto-vectorization in userspace since we expect the compiler to attempt these things in userspace, where using SIMD should always be safe.
That said, I know for a fact that the compiler will sometimes vectorize the mixing matrix calculations. I am not sure if its vectorization makes a difference, but I am inclined to let it try.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#13605 says hello.