-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIXES #81: 400x faster slice assignment #134
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Could you be sure to add some tests for this in the test suite, additionally for the cases that throw an error?
thank you for your quick response. I have so far demonstrated two cases one is the histogram matching image filter https://github.com/muayyad-alsadi/wasm-demos/blob/main/cdb_djp_hash/ I believe the later one is a good use case for tests (the wasm is ~500 bytes only)
unfortunately we can't change the signature of https://github.com/muayyad-alsadi/wasm-demos/blob/main/cdb_djp_hash/wasmtime_fast_memory.py which is used like this https://github.com/muayyad-alsadi/wasm-demos/blob/main/cdb_djp_hash/cdb_djp_hash.py store = Store()
module = Module.from_file(store.engine, 'cdb_djp_hash.wasm')
instance = Instance(store, module, [])
memory = instance.exports(store)['memory']
fast_mem = FastMemory(memory, store)
# ...
fast_mem[start:stop] = input
output = fast_mem[start_output:stop_output] any suggestion for its name?
let me double check |
I've added test, please let me know if you want me to move my logic into a different class and please suggest its name |
I don't know how do you want to solved those two type hints
should I change return to Any? |
I realize this since they're Python builtins, which is why I was wondering if something like renaming the methods to
Personally I think the "stateful" approach you have right now where class Memory:
def view(self, store: Storelike) -> MemoryView:
# ... where In either case though I think what's important is to avoid the stateful setting-of-a-store and instead make it explicit as part of the API.
Sorry I'm not familiar enough with mypy myself to know what's going on here. |
I've called it |
Thinking more on this, can you say more as to why a dedicated |
we can do them, but I have two concern
https://github.com/muayyad-alsadi/wasm-demos/blob/main/cdb_djp_hash/cdb_djp_hash.py#L44 # cdb_djp_hash.py
fast_mem[start:stop]=input
# histo_wasmtime_fast.py
fast_mem[0:] = a_ref
fast_mem[len(a_ref):] = a_in
dst = fast_mem[len(a_ref)+len(a_in): len(a_ref)+len(a_in)+len(a_in)]
def read(self, store, key=None):
start, stop, step = key.indices(size)
def write(self, store, value, key=None):
start, stop, step = key.indices(size) or def read(self, store, start=0, stop=None, step=None):
key = slice(start, stop, step)
start, stop, step = key.indices(size)
# ...
def write(self, store, value, start=0, stop=None, step=None):
key = slice(start, stop, step)
start, stop, step = key.indices(size)
# ...
|
no, it's totally fine, it will work, I've tested it if you insist I'll remove this comment.
and bytearray and array.array are done without copying if you like I can change the type of value to be
this is the standard behavior in python of all indexable array like objecrs
as you can see in the code, we did not implement the logic, we left it to python builtin slice to document and implement this.
this will do all the cases like
and as mentioned before in python the slice might under-return as in the following (we asked to 200 items, but got 26)
I prefer to delegate to python builtin logic than to implement my own, because if we do a custom then we need to do more documentation and more reasoning I hope I answered all user concerns |
Sorry this PR is taking quite a lot more time than I though it would to review, so I'm going to have to find some dedicated time to review this in the future as it's not working for me to fit it in "gaps" in my day. |
TYT and I'm open to any suggestion or question. |
Firstly, I think this is an important topic and I would like to help where I can. I think much of the still-open discussions come from slice vs. no slice interface distinction. Both the question about the necessity of the @muayyad-alsadi I understand that you would like the interface to behave like a slice. All your arguments on the above two matters are correct for slices. As a python programmer, I would also expect that
That is not at all what I would expect when writing to a location in memory, and also is (luckily) not what
is completely appropriate. I would vote to remove the |
wasmtime/_memory.py
Outdated
def write( | ||
self, | ||
store: Storelike, | ||
value: typing.Any, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I said "works with mypy" better I was referring to the fact that this is listed as typing.Any
. While I don't dispute "mypy runs without errors" falling back on typing.Any
defeats the purpose of using type ascription in the first place.
This function does not take any type in Python. I think you provided a compelling case for it shouldn't just take bytearray
, but in that case the list of supported types I think should be listed here under typing.Union[...]
or something similar.
wasmtime/_memory.py
Outdated
if slice_size != val_size: | ||
raise IndexError(f"mismatched value size, value of size [{val_size}] with slice of size [{slice_size}]") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that this logic is correct? Here slice_size
is always equal to val_size
, so this if
never fires?
wasmtime/_memory.py
Outdated
src_ptr = (ptr_type).from_buffer(value) | ||
dst_ptr = (ptr_type).from_address(ctypes.addressof(data_ptr.contents) + start) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that the parentheses around ptr_type
are necessary here.
wasmtime/_memory.py
Outdated
write a bytearray value into a possibly large slice of memory | ||
negative start is allowed in a way similat to list slice mylist[-10:] | ||
if value is not bytearray it will be used to construct an intermediate bytearray | ||
return number of bytes written |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be good to expand this documentation to explain what happens for the raise
cases below, for example writing beyond the end of memory. My impression is that it behaves differently than native Python slicing which I think would be good to call out.
I've addressed all of those points in my last push. |
wasmtime/_memory.py
Outdated
if start >= size: | ||
raise IndexError("index out of range") | ||
# value must be bytearray ex. cast bytes() to bytearray | ||
if not isinstance(value, array.array) and not isinstance(value, bytearray): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that if array.array
is to be supported here it needs more work and more tests. Currently there are no tests using array.array
and for any size bigger than u8 I don't believe it's correct. For example an array.array
of 32-bit integers doesn't correctly copy the size of the array into memory. Furthermore endianness is a concern with array.array
where I don't think that larger than 1-byte values should be supported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are totally right, should I check for value.itemsize==1
and throw ValueError or remove array.array
wasmtime/_memory.py
Outdated
raise IndexError("index out of range") | ||
# value must be bytearray ex. cast bytes() to bytearray | ||
if not isinstance(value, array.array) and not isinstance(value, bytearray): | ||
# value = array.array('B', value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this line can be removed?
# value must be bytearray ex. cast bytes() to bytearray | ||
if not isinstance(value, array.array) and not isinstance(value, bytearray): | ||
# value = array.array('B', value) | ||
value = bytearray(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is bytearray
the best choice of conversion here? If the input is a bytes
, for example, wouldn't this create a temporary copy before then copying into the linear memory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, bytes
does not work because src_ptr = ptr_type.from_buffer(value)
would give
TypeError: underlying buffer is not writable
I would love to include thewtex suggestion def get_buffer_ptr(self, store):
"""
buffer suitable for creating zero-copy writable NumPy Buffer Protocol
np_mem = np.frombuffer(memory.get_buffer_ptr(store), dtype=np.uint8)
np_mem[start:end] = A
B = np_mem[start:end]
"""
ptr_type = ctypes.c_ubyte * (self.data_len(store))
return ptr_type.from_address(ctypes.addressof(self.data_ptr(store).contents)) |
…ction and document it
…ction and document it
…ction and document it
…ction and document it
…ction and document it
…ction and document it
…ction and document it
…ction and document it
Thanks! This looks good so I'll merge this. I think that method might make a good follow-up PR if you're interested. |
for more information check #81