-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch from ocl to opencl3 #34
Conversation
src/opencl/mod.rs
Outdated
let kernels = opencl3::kernel::create_program_kernels(&program)?; | ||
let kernels_by_name = kernels | ||
.into_iter() | ||
.map(|kernel| (kernel.function_name().unwrap(), kernel)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is it safe to unwrap
here?
I've converted this to a draft PR. It turns out that this work is not needed for the the UUID work at #30, hence I get more time to polish it more, like removing |
82b6b1d
to
9e0746b
Compare
I have addressed previous comments and also cleaned-up the API a bit. It's ready for review :) |
F: FnOnce() -> Result<R, E>, | ||
E: From<GPUError>, | ||
{ | ||
fun() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so much fun
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a couple small things, but otherwise lgtm
f625fa6
to
b549f7b
Compare
I've removed the last commit (I'm not sure if that change is possible, once there is CUDA as well). Better be safe than sorry. It can be merged if the corresponding neptune and bellperson changes are also merged. |
I see failures in |
@dignifiedquire I think nothing changed since the last review, is it OK if I merge it? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you make sure to add doc comments please to all new functions
Ok(()) | ||
} | ||
buffer: opencl3::memory::Buffer<T>, | ||
length: usize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add a comment if the length
is in in bytes or in T
elements
let binaries = program | ||
.get_binaries() | ||
.map_err(|_| GPUError::ProgramInfoNotAvailable(CL_PROGRAM_BINARIES))?; | ||
std::fs::write(cached, binaries[0].clone())?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if there are multiple or no binary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No idea. This code was just ported over from: https://github.com/filecoin-project/rust-gpu-tools/pull/34/files/e212d9d1af6d46c0030c2794dc0607939c031ff2#diff-f16a498f7eb37b8af9a85fcbd3bba8f972198602f61b6f919bf15aeb939da2c2L256
Should this be addressed? If yes, then I'd do this in a separate PR after all these reviews.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets open an issue, to at least investigate this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let buff = opencl3::memory::Buffer::create( | ||
&self.context, | ||
CL_MEM_READ_WRITE, | ||
length, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is length
now in numbers of T
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, length is the number of T
s. Reason is that we don't use as underlying buffer an OpenCL <u8>
buffer anymore, but a <T>
one (see https://github.com/filecoin-project/rust-gpu-tools/pull/34/files/e212d9d1af6d46c0030c2794dc0607939c031ff2#diff-f16a498f7eb37b8af9a85fcbd3bba8f972198602f61b6f919bf15aeb939da2c2R56). This simplifies the code quite, we need to do less size calculations and memory transformations. The tests still pass.
pub fn write_from_buffer<T>( | ||
&self, | ||
buffer: &Buffer<T>, | ||
offset: usize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what unit does offset
have?
src/opencl/mod.rs
Outdated
offset: usize, | ||
data: &[T], | ||
) -> GPUResult<()> { | ||
assert!(offset + data.len() <= buffer.length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A useful error message would be nice
.create_sub_buffer(CL_MEM_READ_WRITE, offset, data.len())?; | ||
|
||
self.queue | ||
.enqueue_write_buffer(&buff, CL_BLOCKING, 0, data, &[])?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add a comment why passing an empty slice is the right thing to do here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was about to add // We are not waiting for events, hence the last parameter is an empty slice
, but that didn't really make things clearer. Also it would need to be added to all 3 occurrences. I'm in favour of not having a comment, but if you think I should that comment I can add it.
Also refactor the code. Instead of operating on the `Buffer`s, operate on the `Program` and pass in the `Buffer`s. This way there is no need to expose the queue to the outside. Kernel creation might now return an error if the kernel with the given name doesn't exist.
This change is needed to make the API the same that is used by the CUDA implementation.
This method is needed by the CUDA implementation, on OpenCL it doesn't really do anything other then executing the closure.
There is no need to keep a copy of a device, we only need the device name.
In CUDA it is non-optional, hence make it also mandatory for OpenCL.
There is no need to own it.
Remove `unwrap()` with proper error propagation.
There are functions/macros that are not used by `bellperson` or `neptune`. Remove the unused ones or make them private if they are used internally.
Instead of using a generic type in our `Buffer` type and do manual size calculations pass it on directly into opencl3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be OK if I add documentation to all public functions once we've merged everything (before the release)?
let binaries = program | ||
.get_binaries() | ||
.map_err(|_| GPUError::ProgramInfoNotAvailable(CL_PROGRAM_BINARIES))?; | ||
std::fs::write(cached, binaries[0].clone())?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No idea. This code was just ported over from: https://github.com/filecoin-project/rust-gpu-tools/pull/34/files/e212d9d1af6d46c0030c2794dc0607939c031ff2#diff-f16a498f7eb37b8af9a85fcbd3bba8f972198602f61b6f919bf15aeb939da2c2L256
Should this be addressed? If yes, then I'd do this in a separate PR after all these reviews.
let buff = opencl3::memory::Buffer::create( | ||
&self.context, | ||
CL_MEM_READ_WRITE, | ||
length, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, length is the number of T
s. Reason is that we don't use as underlying buffer an OpenCL <u8>
buffer anymore, but a <T>
one (see https://github.com/filecoin-project/rust-gpu-tools/pull/34/files/e212d9d1af6d46c0030c2794dc0607939c031ff2#diff-f16a498f7eb37b8af9a85fcbd3bba8f972198602f61b6f919bf15aeb939da2c2R56). This simplifies the code quite, we need to do less size calculations and memory transformations. The tests still pass.
.create_sub_buffer(CL_MEM_READ_WRITE, offset, data.len())?; | ||
|
||
self.queue | ||
.enqueue_write_buffer(&buff, CL_BLOCKING, 0, data, &[])?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was about to add // We are not waiting for events, hence the last parameter is an empty slice
, but that didn't really make things clearer. Also it would need to be added to all 3 occurrences. I'm in favour of not having a comment, but if you think I should that comment I can add it.
Sure, just open an issue to ensure that things are documented. |
I opened an issue: #41 Everything else should be addressed (which unit |
This PR makes the switch from
ocl
toopencl3
. This is mostly what the first commit. On top of that there are other improvements which make sense in the light of future CUDA integration.For easier review you might want to review this PR commit per commit (also see the commit messages for more information).
This PR also needs changes to the upstream dependants, which can be found here: