Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: ntt mixed-radix bug regarding large ntts (and/or batch) #523

Merged
merged 3 commits into from
May 20, 2024

Conversation

yshekel
Copy link
Collaborator

@yshekel yshekel commented May 19, 2024

in some cases 32b values would wrap around and cause invalid accesses to wrong elements and memory addresses

@yshekel yshekel force-pushed the yshekel/large_ntt_bugfix_ branch 2 times, most recently from 5d6b098 to 7cc5d44 Compare May 19, 2024 15:45
@yshekel yshekel force-pushed the yshekel/large_ntt_bugfix_ branch from 232b275 to 2d33f60 Compare May 19, 2024 16:44
Copy link
Contributor

@HadarIngonyama HadarIngonyama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks fine, do these changes affect performance in any way?

@yshekel
Copy link
Collaborator Author

yshekel commented May 20, 2024

looks fine, do these changes affect performance in any way?

Yes, for some cases (size+batch) it is a little slower. Sometimes slightly faster (which is odd but).
It's probably since now the kernels use u64 which are more expensive instructions or more instructions or something.

I tried to minimize it. For example I initially made the meta.block_id u64 (Which make all ops with it u64) but then reverted and casted only where I need it (so only some ops are u64 and the struct is smaller etc.).

Do you have an idea how to support >4G elements but avoid it? I don't really think it's possible.

@HadarIngonyama

@yshekel yshekel requested a review from HadarIngonyama May 20, 2024 09:57
yshekel added 2 commits May 20, 2024 13:06
in some cases 32b values would wrap around and cause invalid accesses to wrong elements and memory addresses
@yshekel yshekel force-pushed the yshekel/large_ntt_bugfix_ branch from 2d33f60 to 0d3a51f Compare May 20, 2024 10:06
@yshekel yshekel merged commit 4e3aa63 into main May 20, 2024
26 checks passed
@yshekel yshekel deleted the yshekel/large_ntt_bugfix_ branch May 20, 2024 13:42
yshekel added a commit that referenced this pull request May 23, 2024
in some cases 32b values would wrap around and cause invalid accesses to
wrong elements and memory addresses
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants