Vectorized field trait #216

nbgl · 2021-09-03T16:17:30Z

Update:
I’m breaking this massive PR up into a few smaller ones to ease review. This one is for the PackedField and Packable traits, as well as the Singleton struct. A PackedField is a wrapper around a vector of field elements, permitting a subset of field operations. Packable extends Field with the default PackedField for a particular Field. It defaults to Singleton a dummy container for a scalar, and is specialized per platform.

Original comment:
This PR introduces PackedFields, which permit SIMD field arithmetic. The default packing defers to scalar code, but can be overridden for particular architecture/field combinations.

I implemented Crandall arithmetic in AVX2, and rewrote fft_classic in vector as a test case. The vectorized FFT is 30-ish% faster (needs more thorough benchmarks) on Skylake.

While functional, this branch needs a cleanup before it can be reviewed. There’s some debug logging/unused imports/confusing names/undocumented tricks. But I welcome big-picture comments!

nbgl · 2021-09-03T16:24:52Z

It looks like the build can’t compile num-bigint, so I suspect it would fail on main as well

dlubarov · 2021-09-03T17:03:23Z

Yep the latest nightly seems to break main. I'll look into it

dlubarov · 2021-09-03T17:05:54Z

Looks like we'll need to wait for a change in num -- see rust-num/num-bigint/issues/218. It affects lots of users so hopefully it will come soon.

dlubarov · 2021-09-03T19:51:37Z

src/field/packed_field.rs

+
+    fn broadcast(x: Self::FieldType) -> Self;
+
+    fn from_arr(arr: &[Self::FieldType]) -> Self;


Maybe call this from_slice since it technically takes a slice? Or have it take [Self::FieldType; WIDTH]

Similarly for the method below, I'd either call it to_vec or have it return [Self::FieldType; WIDTH]

I agree, this is already on the to-do list 😊

dlubarov · 2021-09-03T20:06:34Z

src/field/packed_crandall_avx2.rs

+
+#[derive(Copy, Clone)]
+#[repr(C, align(4))]
+pub struct PackedCrandallAVX2(pub [u64; 4]);


Hm would it be simpler to wrap the __m256i? Seems like we convert to it rather than doing anything directly with the [u64; 4]

This is an awful trick to stop Rust assuming 32-byte alignment. I’ve added the following comment to explain it:

// PackedCrandallAVX2 wraps an array of four u64s, with the new and get methods to convert that // array to and from __m256i, which is the type we actually operate on. This indirection is a // terrible trick to change PackedCrandallAVX2's alignment. // We'd like to be able to cast slices of CrandallField to slices of PackedCrandallAVX2. Rust // aligns __m256i to 32 bytes but CrandallField has a lower alignment. That alignment extends to // PackedCrandallAVX2 and it appears that it cannot be lowered with #[repr(C, blah)]. It is // important for Rust not to assume 32-byte alignment, so we cannot wrap __m256i directly. // There are two versions of vectorized load/store instructions on x86: aligned (vmovaps and // friends) and unaligned (vmovups etc.). The difference between them is that aligned loads and // stores are permitted to segfault on unaligned accesses. Historically, the aligned instructions // were faster, and although this is no longer the case, compilers prefer the aligned versions if // they know that the address is aligned. Using aligned instructions on unaligned addresses leads to // bugs that can be frustrating to diagnose. Hence, we can't have Rust assuming alignment, and // therefore PackedCrandallAVX2 wraps [u64; 4] and not __m256i.

Please let me know if you can think of a nicer way to do this 😊

dlubarov

LGTM

nbgl mentioned this pull request Sep 3, 2021

Crandall Field vector math #204

Closed

dlubarov reviewed Sep 3, 2021

View reviewed changes

PackedField trait

f284b35

nbgl force-pushed the jakub/vector-field-trait branch from c9a8dc3 to f284b35 Compare September 3, 2021 22:35

nbgl changed the title ~~WIP: Vectorized fields~~ Vectorized field trait Sep 3, 2021

nbgl requested a review from dlubarov September 3, 2021 22:39

dlubarov approved these changes Sep 3, 2021

View reviewed changes

Daniel comments

9031060

nbgl merged commit 032e2fe into main Sep 4, 2021

nbgl deleted the jakub/vector-field-trait branch September 4, 2021 00:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Vectorized field trait #216

Vectorized field trait #216

nbgl commented Sep 3, 2021 •

edited

Loading

nbgl commented Sep 3, 2021

dlubarov commented Sep 3, 2021

dlubarov commented Sep 3, 2021

dlubarov Sep 3, 2021 •

edited

Loading

nbgl Sep 3, 2021

dlubarov Sep 3, 2021

nbgl Sep 3, 2021

dlubarov left a comment


		fn broadcast(x: Self::FieldType) -> Self;

		fn from_arr(arr: &[Self::FieldType]) -> Self;

Vectorized field trait #216

Vectorized field trait #216

Conversation

nbgl commented Sep 3, 2021 • edited Loading

nbgl commented Sep 3, 2021

dlubarov commented Sep 3, 2021

dlubarov commented Sep 3, 2021

dlubarov Sep 3, 2021 • edited Loading

Choose a reason for hiding this comment

nbgl Sep 3, 2021

Choose a reason for hiding this comment

dlubarov Sep 3, 2021

Choose a reason for hiding this comment

nbgl Sep 3, 2021

Choose a reason for hiding this comment

dlubarov left a comment

Choose a reason for hiding this comment

nbgl commented Sep 3, 2021 •

edited

Loading

dlubarov Sep 3, 2021 •

edited

Loading