-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update Rust apis #262
Update Rust apis #262
Conversation
* refactor * refactor * revert * refactor: clang format * Update icicle/appUtils/msm/msm.cu
…fc7ed6068e8eab9d81a8d6002949524597a19' into develop/vhnat/feat-233-update-api-fix-rust-build
…lop/vhnat/feat-233-update-api-fix-rust-build
…vhnat/feat-233-update-api-fix-rust-build
…vhnat/feat-233-update-api
…pi' into develop/dima/feat-233-update-rust-apis
icicle/appUtils/lde/lde.cuh
Outdated
* evaluation domains of polynomials. It also contains methods for element-wise manipulation of vectors, which is useful | ||
* for working with polynomials in evaluation domain. | ||
* LDE (stands for low degree extension) contains [NTT](@ref ntt)-based methods for translating between coefficient and evaluation domains of polynomials. | ||
* It also contains methods for element-wise manipulation of vectors, which is useful for working with polynomials in evaluation domain. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But they are just general-purpose arithmetic. Wouldn't it be better to move these into some other file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're right...
My initial logic was like this: in arkworks there is this Evaluations
struct which implements all the operations one might perform on polynomials in evaluation form, including element-wise add, sub and mult and I wanted to do something similar in lde. But we might also need vector operations elsewhere (e.g. in tree builders right?) so it's best to factor these out into a more general place
I think I'm going to remove lde for now then, and will (try to) implement a higher-level polynomial functionality in the next PR. Would that make sense?
icicle/appUtils/lde/lde.cuh
Outdated
* LDE (stands for low degree extension) contains [NTT](@ref ntt)-based methods for translating between coefficient and | ||
* evaluation domains of polynomials. It also contains methods for element-wise manipulation of vectors, which is useful | ||
* for working with polynomials in evaluation domain. | ||
* LDE (stands for low degree extension) contains [NTT](@ref ntt)-based methods for translating between coefficient and evaluation domains of polynomials. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't LDE a part of NTT and not vice versa?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partially answered in the previous comment: I imagine future LDE as something like arkworks' Evaluation
or a mix of Evaluation
and DensePolynomial
. So, it's a higher-level wrapper over NTT. For example, to evaluate a polynomial of degree N
on a domain of size 2N
you need 2 size N
NTTs, or to multiply two polynomials you need some NTTs for translating between evaluation and coefficient forms
@vhnatyk a couple of questions I have about the current NTT design in this branch:
@jeremyfelder @ChickenLover please feel free to comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I don't understand the purpose of
Domain
......
I agree
- Here are my other general preferences....:
- I would take
inout
,size
andis_inverse
out from config and put these fields into the NTT function separately. My intuition is that fields that have a natural default should be inside cfg but something that you almost definitely need to set manually shouldn't. Also imontt(data, true, cfg)
is more readable thanntt(cfg)
.
I agree about inout
and size
. I'm not sure why is_inverse
would be different than is_coset
though?
- Maybe it's better to have just one
on_device
flag instead of two (inputs_on_device
andoutputs_on_device
) inNTTConfig
? It just feels safer and more intuitive to me to not change where the pointerinout
is pointing to (which will inevitably happen if inputs are on device and outputs on host let's say) in the function which is supposedly "in-place". From my experience, it's almost always either both inputs and outputs on device, or both on host in real-life NTTs
I agree that it will be less error prone but this will force a user to manually move the initial data and resulting data to/from device if they want to use on_device=true
which might also be error prone.
- I completely agree that we need convenience wrappers for NTT functions, just thought maybe we could leave a minimal number of expressive calls for now and later on grow the wrappers organically. I just don't really believe in our ability to thoughtfully design good quality wrappers right now without delaying the PR.
👍🏻
- We currently have three similar fields in
NTTConfig
:Ordering
,Decimation
andButterfly
. They are not independent, for exampleDecimation::kDIT
andButterfly::kCooleyTukey
are basically the same thing. I think there's an issue in case for exampleDecimation::kDIF
andButterfly::kCooleyTukey
are passed, in which case one has to be ignored. The reason why we haveDecimation
andButterfly
in the first place is for compatibility with other codebases (e.g. gnark uses DIT and DIF in their codebase) but imo it'd best to just document which decimation/butterfly options correspond to whichOrdering
options and removeDecimation
andButterfly
fields altogether.
Yes, if multiple options are essentially the same outcome but different names, we should give one option and document the alternative names for that option.
template <typename S> | ||
int get_optimal_c(int bitsize) | ||
{ | ||
return ceil(log2(bitsize)) - 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be able to be negative?
@@ -61,7 +58,7 @@ namespace ntt { | |||
int number_of_threads = MAX_THREADS_BATCH; | |||
int number_of_blocks = (n * batch_size + number_of_threads - 1) / number_of_threads; | |||
reverse_order_kernel<<<number_of_blocks, number_of_threads, 0, stream>>>(arr, arr_reversed, n, logn, batch_size); | |||
cudaMemcpyAsync(arr, arr_reversed, n * batch_size * sizeof(E), cudaMemcpyDeviceToDevice, stream); | |||
cudaMemcpyAsync(arr, arr_reversed, n * batch_size * sizeof(E), cudaMemcpyDefault, stream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cudaMemcpyDefault is only on devices that support unified virtual addressing...what devices are these? We should update docs for supported devices
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vhnatyk I also don't understand the purpose of this tbh
.define("BUILD_TESTS", "OFF") //TODO: feature | ||
// .define("CURVE", "bls12_381") | ||
.define("CURVE", "bn254") | ||
// .define("ECNTT_DEFINED", "") //TODO: feature |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// .define("ECNTT_DEFINED", "") //TODO: feature | |
// .define("ECNTT_DEFINED", "") //TODO: feature | |
// .define("G2_DEFINED", "") |
if (config.batch_size == 1) | ||
bucket_method_msm( | ||
config.bitsize, config.c, scalars, points, msm_size, results, config.are_scalars_on_device, config.big_triangle, | ||
config.large_bucket_factor, config.ctx.stream); | ||
bitsize, 16, scalars, points, msm_size, results, config.are_scalars_on_device, | ||
config.are_scalars_montgomery_form, config.are_points_on_device, config.are_points_montgomery_form, | ||
config.are_results_on_device, config.is_big_triangle, config.large_bucket_factor, config.is_async, | ||
config.ctx.stream); | ||
else | ||
batched_bucket_method_msm( | ||
config.bitsize, config.c, scalars, points, config.batch_size, msm_size, results, config.are_scalars_on_device, | ||
config.ctx.stream); | ||
bitsize, (config.c == 0) ? get_optimal_c<S>(bitsize) : config.c, scalars, points, config.batch_size, msm_size, | ||
results, config.are_scalars_on_device, config.ctx.stream); | ||
return cudaSuccess; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see us using config->point_size
even though the comments on MSMConfig say it is used instead of msm_size in certain cases. This parameter in the config is a bit confusing, can't we always just use msm_size and batch_size to get the correct values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, it's currently unused, I tried to mark this fact with the following comment: https://github.com/ingonyama-zk/icicle/pull/262/files#diff-cdd3b67200a359e20797b18b99ca7176f36d4c9d48d9843d22500c5424fe1c7dR106. The idea of this parameter is to control whether you want to re-use the same set of bases in all MSMs or have custom bases for each one. Current default is the latter, although it's really easy to make this parameter functional and from my understanding this will help you in halo2 where you could reduce memory footprint from bases by sharing bases between MSMs, right?
This PR aims to update Rust APIs to the minimal usable standard