-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement macros {{{ ptrToIdx() }}}, {{{ convertPtrToIdx() }}}, {{{ idxToPtr() }}} #19289
base: main
Are you sure you want to change the base?
Conversation
41551a4
to
4710fe3
Compare
Perhaps "index" could be "i53"? That seems more clear than either index or number, I think, as i53 makes it obvious it is an integer, but it's in 53 bits so it's stored in a JS number.
How about "convert" => "update", since it updates the value in place? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, these comments are maybe old but I thought I would send them anyway.. I still ned to revisit this since you updated some stuff.
@@ -194,7 +194,7 @@ mergeInto(LibraryManager.library, { | |||
emscripten_resize_heap: 'ip', | |||
emscripten_resize_heap: function(requestedSize) { | |||
var oldSize = HEAPU8.length; | |||
requestedSize = requestedSize >>> 0; | |||
{{{ convertPtrToIdx('requestedSize') }}}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about just dropping the convert and just calling this ptrToIdx
? I'm also not very clear that "Index" is the right thing to be calling this. I don't love either of these but what about ptrToNumber
or wasmToJSPtr
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because emscripten_resize_heap has its argument marked as p
I don't think we need the full conversion here do we? i.e. we didn't need from64
here before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a separate convertPtrToIdx('variable')
function has the benefit that it avoids a redundant assignment to self in wasm2gb builds. I.e. if one writes
variable = {{{ ptrToIdx('variable') }}};
Then it will generate variable = variable;
in <2GB builds, and one will need to use Closure to fix that up.
src/library.js
Outdated
@@ -3128,7 +3128,7 @@ mergeInto(LibraryManager.library, { | |||
// Used by wasm-emscripten-finalize to implement STACK_OVERFLOW_CHECK | |||
__handle_stack_overflow__deps: ['emscripten_stack_get_base', 'emscripten_stack_get_end', '$ptrToString'], | |||
__handle_stack_overflow: function(requested) { | |||
requested = requested >>> 0; | |||
{{{ convertPtrToIdx('requested') }}}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I think the incoming param is already a number so conversion from BigInt here is not needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't there need to be a __handle_stack_overflow__sig
for the automatic conversion to occur? There isn't one for __handle_stack_overflow
?
src/library.js
Outdated
// Function pointers are 64-bit, but wasmTable.get() requires a Number. | ||
// https://github.com/emscripten-core/emscripten/issues/18200 | ||
funcPtr = Number(funcPtr); | ||
#endif | ||
{{{ convertPtrToIdx(funcPtr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this line can be removed because funcPtr should already have been converted to a number when was first passed to JS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, these comments are maybe old but I thought I would send them anyway.. I still ned to revisit this since you updated some stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, these comments are maybe old but I thought I would send them anyway.. I still ned to revisit this since you updated some stuff.
@@ -78,10 +78,11 @@ var LibraryHtml5WebGL = { | |||
'$JSEvents', '$emscripten_webgl_power_preferences', '$findEventTarget', '$findCanvasEventTarget'], | |||
// This function performs proxying manually, depending on the style of context that is to be created. | |||
emscripten_webgl_do_create_context: function(target, attributes) { | |||
{{{ convertPtrToIdx('target') }}}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, because emscripten_webgl_do_create_context
has it signature set in library_sigs.js
to ipp
these incoming pointers should already have been converted (at least in the wasm64 case). This should be handled automatically I believe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, it wasn't.. I was getting a crash here in Unity, and had to add this here to fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps that was an older version before the __sig
attribute was added? If you look at the generated code for emscripten_webgl_do_create_context you should see that it has auto-generated conversion code that is generated by convertPointerParams
in src/jsifier.js
.
src/library.js
Outdated
// magic here. This is because the __sig wrapper uses arrow function | ||
// notation which causes the inner call to traverseStack to fail. | ||
{{{ from64('str') }}}; | ||
{{{ convertPtrToIdx('str') }}}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should no longer be needed since #19820 landed.
After working on #19980 I think something like I'm not sure about the removal of If you have time would you like to update this PR, perhaps with the to64/from64 stuff kept separate? Otherwise, I'd be up for carving more parts of this out and landing in chunks if that is OK with you? |
I landed this ptrToIdx PR to our Unity Emscripten branch in June, and verified its behavior against Unity in 4GB mode. That uncovered a number of failures in tests, which I have been now in the process of going through. I still would like to land these types of macros {{{ ptrToIdx() }}}, {{{ convertPtrToIdx() }}}, {{{ idxToPtr() }}} into general Emscripten JS library use, I think they are the right type of abstraction that allows users to manually control the exact codegen. These could be paired with the automatic __sig-based type conversion for use cases where people are ok to rely on automatic (but maybe slightly code size pessimized) conversions and do not feel as pedantic about code size as we do. |
I agree that When reading the code I think its useful to be able to distinguish between these to cases.
I can see that it might sometimes be useful to combine these two when it makes sense. Note that (1) is no-op on wasm32 Most of our JS library can save on code size by assuming they are always working with numbers (i53) and that i53 conversions has already been performed. i.e. it only needs to worry about (2). So I think it makes sense to make (1) and (2) distinct operations in the macro system. |
Hmm, can you give an example of the kind of code where it would be beneficial to separate the two? I am not sure I follow. I mean, certainly could be done but my mind is drawing blank on what the code construct/patterns with that would be intended to look like, or how that would be used best? |
Any function that takes a pointer and then does a bunch of shift heap accesses using that pointer would only want to do the i53 convertion (cast of Number) once .. and not for each heap access. If we to do the conversions once then heap accesses (which are more common) should be smaller. Another way of putting it: if possible, we would not want to do the i53 conversion (cast to Number) more than once for any given incoming or outgoing pointer. |
…dxToPtr() }}} to help 2GB, 4GB and MEMORY64 memory modes. Remove {{{ to64() }}} and {{{ from64() }}} since they had an awkward mismatch of in-place vs expression styles. This is a looser form of PR emscripten-core#15433, and it embraces the idea that pointer values in JavaScript side will not need to retain the full 64-bit values, but can be truncated to 53 bits. Update library_html5.js for >2GB Address review and fix convertPtrToIdx on non-byte shifts
8809d51
to
b5f597f
Compare
Shouldn't such code want to do the shift heap accesses also only once? Multiple shifts would be redundant? |
I agree - also I think we should not do the shifts more than once for a pointer. So in my view having the common |
Found a bug in Chrome with WebGL 2 around > 2GB WebGL support: https://bugs.chromium.org/p/chromium/issues/detail?id=1476859 . CC @kenrussell |
The things is that in almost all cases the convertion to Number happens in the automatic wrapper generated by The convertion to heap offset (e.g. HEAP8 or the HEAPU32) I agree would ideally only happen once, but a given function might want to access other the HEAP8 and HEAPU32 via the same pointer, so several shifts might still be needed in a single function, even if we try make just one per type. BTW, I've been making some good progress on this using the existing Lines 294 to 304 in 7858816
Isn't One problem with the "try to only do one shift per function" is that a lot of our existing code relies on |
BTW did you see that we can now run the entire Aside from a couple of exceptions all the test now pass after a bunch of changes in the last few weeks. See: Lines 9893 to 9894 in 7858816
Lines 9882 to 9884 in 7858816
|
Ah I see, you have landed a new
Relying on a combination of First is that it requires using the addToLibrary({
foo__sig: 'vp',
foo: (ptr) => {
HEAPU32[{{{ getHeapOffset('ptr', 'u32') }}}] = 42;
}
}); it generates var bigintToI53Checked = num => (num < MIN_INT53 || num > MAX_INT53) ? NaN : Number(num);
function _foo(ptr) {
ptr = bigintToI53Checked(ptr);
HEAPU32[((ptr) / 4)] = 42;
} With addToLibrary({
foo: (ptr) => {
HEAPU32[{{{ ptrToIdx('ptr', 2) }}}] = 42;
}
}); one will get a better output that reads var _foo = ptr => {
HEAPU32[Number(ptr >> 2n)] = 42;
}; which to me is the optimal form for such an access. No extra function calls or divisions. I would advise against having that division in there. (maybe for Note btw that the effect of Another suboptimality with __sig is that if you have a function that only holds a pointer and does not do anything with it, except to pass it back to another function back to wasm (prime example being a |
This is superb. I am running against this now, and checking also some WebGL builds with Although in general, I wonder if there may be wins to be had by assuming that global static data section pointers will always be < 2GB pointers. |
ca62199
to
e97af5a
Compare
It sounds like you are arguing for allowing pointers to flow in the JS as bigint value whereas the current approach that I've been aiming for is to assume pointers a numbers everywhere in JS (except on the boundary). I don't mind revisiting that debate, but perhaps after we have all tests passing in the current mode? I hope that the abstractions we are using today |
I updated |
05da82a
to
1805e3a
Compare
I'm now working to fix the long tail of wasm64 issues. See #21177. I'm running into issues that this PR addresses. I'm not sure if we should try to land this PR but I'm certainly using it as inspiration for my changes. |
Implement macros
{{{ ptrToIdx() }}}
,{{{ convertPtrToIdx() }}}
,{{{ idxToPtr() }}}
and{{{ isPtrAligned() }}}
to help 2GB, 4GB and MEMORY64 memory modes. Remove{{{ to64() }}}
and{{{ from64() }}}
since they had an awkward mismatch of in-place vs new returned expression styles.This is a looser form of PR #15433, and it embraces the idea that pointer values in JavaScript side will not need to retain the full 64-bit values, but can be truncated to 53 bits. In other words, pointers can always be leisurely converted to Numbers in JS side.
The idea of this new function family will be to be public to JS library authors, so looking to add API tests and documentation to accommodate this PR, if people agree on the contents?
The main motivation is to provide a single mechanism to deal with 64-bit pointers (and
size_t
) in a way that works in efficient code-size manner (no redundantx = x;
expressions, shifts or casts in any build mode) and gives developers a single API to allow covering each of 2GB, 4GB and MEMORY64 build modes.