-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Koren/v3 msm #581
Koren/v3 msm #581
Conversation
…e verified against another model (i.e. gpu)
…e verified against another model (i.e. gpu)
…e verified against another model (i.e. gpu)
…al from wrong field
…erywhere instead of specifying the install dir in the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good but I had some comments.
Please ask Miki/Hadar to review the algorithm part. I did not really review it.
…remainder and the tests now reflect that
…y modifies c if it divides scalar width
@@ -22,7 +22,6 @@ class Projective | |||
FF x; | |||
FF y; | |||
FF z; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
const Device& device, const scalar_t* scalars, const A* bases, int msm_size, const MSMConfig& config, P* results) | ||
{ | ||
int c = config.c; | ||
if (c < 1) { c = std::max((int)std::log2(msm_size) - 1, 8); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this optimal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's an approximation given the derivative of num of addition (of the single-threaded no precompute solution)
it can be optimized - I would like to know how you determined optimal value
int c = config.c; | ||
if (c < 1) { c = std::max((int)std::log2(msm_size) - 1, 8); } | ||
if (scalar_t::NBITS % c == 0) { | ||
std::cerr << "Currerntly c (" << c << ") mustn't divide scalar width (" << scalar_t::NBITS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't Niall's trick solve this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does I currently have a bug regarding this
fixing it is on the todo list
|
||
template <typename A, typename P> | ||
void Msm<A, P>::run_msm( | ||
const scalar_t* scalars, const A* bases, const unsigned int msm_size, const unsigned int batch_idx, P* results) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's stay consistent with calling them points and not bases please. the msm always used the points notation but somehow the bases notation slipped through
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yshekel is it ok? (given the existing api)
template <typename A, typename P> | ||
void Msm<A, P>::phase2_bm_sum(std::vector<BmSumSegment>& segments) | ||
{ | ||
phase2_setup(segments); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as we discussed this is probably extraneous
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will be dealt within a future version
int return_idx = task->m_return_idx; | ||
// Due to the choice of num segments being less than half of total tasks there ought to be an idle task for | ||
// the line sum | ||
task = manager.get_idle_task(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as we discussed, use get idle for both tasks or assign thread per segment in advance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note the rust msm is not right. You need to handle the TODOs in msm test in rust and remove the test in v3 dir since it was renamed
Describe the changes
MSM multithreaded implementation on CPU
Linked Issues
Resolves #