[embind] Reuse signature codes from em_asm. NFC #24611

Merged 1 commit on Jul 8, 2025

102 changes: 20 additions & 82 deletions system/include/emscripten/bind.h
@@ -22,7 +22,7 @@
#include <optional>
#endif

#include <emscripten/em_macros.h>
#include <emscripten/em_asm.h>
#include <emscripten/val.h>
#include <emscripten/wire.h>

@@ -565,97 +565,35 @@ struct FunctorInvoker<ReturnPolicy, FunctorType, void, Args...> {

namespace internal {

template<typename T>
struct SignatureCode {};

template<>
struct SignatureCode<int> {
static constexpr char get() {
return 'i';
}
};

template<>
struct SignatureCode<void> {
static constexpr char get() {
return 'v';
}
};
// TODO: this is a historical default, but we should probably use 'p' instead,
// and only enable it for smart_ptr_trait<> descendants.
template<typename T, typename = decltype(__em_asm_sig<int>::value)>
struct SignatureCode : __em_asm_sig<int> {};

template<>
struct SignatureCode<float> {
static constexpr char get() {
return 'f';
}
};
template<typename T>
struct SignatureCode<T, decltype(__em_asm_sig<T>::value)> : __em_asm_sig<T> {};

// TODO: should we add this override to em_asm?

Collaborator Author

@sbc100 Would love your thoughts on this one. In other places we represent size_t aka unsigned long as number, even though it's 32 bit on wasm32 and 64 on wasm64, so it's lossy on wasm64.

Should we change EM_ASM behaviour to match that, or should we change those other places to prevent precision loss for unsigned long, or is it expected that they handle it differently?

Collaborator Author

Hm maybe "most" is an exaggeration. Embind itself also encodes size_t aka unsigned long as a BigInt on wasm64 and number otherwise.

Shame that you can't distinguish between size_t and "plain" longs.

Collaborator

(Without looking at this specific instance yet) I think the only times we want to use i53 numbers for 64-bit values are:

  1. When we know that the thing is a pointer type (e.g. when p is used in the signature of a JS function).
  2. When there is an explicit opt-in (i.e. the __i53abi tag on JS functions).

Otherwise, 64-bit values should be preserved, either via BigInt or as a pair of numbers.
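
To make the precision concern concrete, here is a minimal standalone sketch (plain C++, not Emscripten code) of why routing an arbitrary 64-bit value through an i53-style JS number is lossy. The names `kMaxSafeInteger`, `fitsInI53`, and `roundTripThroughDouble` are hypothetical; the sketch only relies on the IEEE-754 double behaviour that JS numbers share.

```cpp
#include <cstdint>

// Largest integer n such that every integer in [0, n] is exactly
// representable as an IEEE-754 double (and so as a JS number): 2^53 - 1.
constexpr std::uint64_t kMaxSafeInteger = (std::uint64_t{1} << 53) - 1;

// Hypothetical check: would the value survive being stored as a JS number?
constexpr bool fitsInI53(std::uint64_t v) {
  return v <= kMaxSafeInteger;
}

// Simulates an i53-style conversion: 64-bit integer -> double -> 64-bit integer.
// Only use this with inputs whose rounded double stays below 2^64; converting
// an out-of-range double back to an integer is undefined behaviour.
constexpr std::uint64_t roundTripThroughDouble(std::uint64_t v) {
  return static_cast<std::uint64_t>(static_cast<double>(v));
}
```

Values up to `kMaxSafeInteger` round-trip unchanged, while `kMaxSafeInteger + 2` (that is, 2^53 + 1) comes back as a different integer, which is the kind of silent change a BigInt or pair-of-numbers representation avoids.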

Collaborator Author

Makes sense, but in that case sounds like we should instead remove this override from Embind, as it will currently use i53 for any unsigned long.

Does that sound right / should I do that in the same PR?

Collaborator

If this PR is NFC then let's keep it that way... is it?

Collaborator

Changing the behavior in another PR to be consistent sounds good to me. FWIW, I thought size_t was already a bigint on memory64, but I'm guessing we left it int32 to avoid users having to deal with BigInt.

Collaborator

We handle size_t like we handle pointers, yes. They both have pointer-like behaviour in that they have a different size on wasm32 and wasm64.

Other types such as uint64 have the same size on both wasm32 and wasm64.
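
As a rough illustration of that distinction, the following standalone sketch contrasts a fixed-width type with a target-dependent one. `widthBasedCode` and `pointerLikeCode` are hypothetical helpers, not Emscripten APIs; the letters `i`, `j`, and `p` follow the signature codes that appear in this diff.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: pick a code purely from the integer's width,
// 'i' for anything narrower than 8 bytes and 'j' for 8-byte integers.
constexpr char widthBasedCode(std::size_t bytes) {
  return bytes == 8 ? 'j' : 'i';
}

// Pointer-like values get 'p' regardless of their width on the target.
constexpr char pointerLikeCode() {
  return 'p';
}

// uint64_t is 8 bytes on every target, so its code never changes.
constexpr char kUint64Code = widthBasedCode(sizeof(std::uint64_t));

// unsigned long changes width between targets, so a width-based scheme
// gives it a different code depending on where the code is compiled.
constexpr char kUnsignedLongCode = widthBasedCode(sizeof(unsigned long));
```

Under a width-based scheme only `kUnsignedLongCode` varies with the target, whereas treating the type as pointer-like pins it to `p` everywhere.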

Collaborator Author

> They both have pointer-like behaviour in that they have a different size on wasm32 and wasm64.

But for pointers we translate them to int53 aka plain numbers in JS code, including in Embind, whereas size_t is different, as it will become bigint on wasm64.

Collaborator

This is, I suppose, a limitation of the type system. I assume it's not possible to distinguish between unsigned long and size_t? If it is possible, we should treat size_t like we do pointers.

Collaborator Author

> This is, I suppose, a limitation of the type system. I assume it's not possible to distinguish between unsigned long and size_t?

Indeed. That's what I said above too:

> Shame that you can't distinguish between size_t and "plain" longs.
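
The limitation is easy to reproduce outside Emscripten. In the sketch below (hypothetical names, plain C++), a specialization written for `std::size_t` is necessarily also the specialization for whichever builtin unsigned type `size_t` aliases on the target. It assumes a mainstream toolchain where that type is `unsigned int`, `unsigned long`, or `unsigned long long`.

```cpp
#include <cstddef>

// Hypothetical trait: every type defaults to the integer code 'i'.
template<typename T>
struct Code {
  static constexpr char value = 'i';
};

// Intended to mark only size_t as pointer-like...
template<>
struct Code<std::size_t> {
  static constexpr char value = 'p';
};

// ...but size_t is a typedef, so exactly one builtin unsigned type names
// the very same specialization and reports 'p' as well.
constexpr int kBuiltinsSeenAsPointerLike =
    (Code<unsigned int>::value == 'p') +
    (Code<unsigned long>::value == 'p') +
    (Code<unsigned long long>::value == 'p');
```

Whichever builtin type that is on a given target, a "plain" value of that type is indistinguishable from a `size_t` at the template level.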

// Most places, including Embind, use `p` for `size_t` (aka `unsigned long`) but
// `em_asm` uses platform-specific code instead which represents `unsigned long`
// as a JavaScript `number` on wasm32 and as a `BigInt` on wasm64.
template<>
struct SignatureCode<double> {
static constexpr char get() {
return 'd';
}
};
struct SignatureCode<size_t> : __em_asm_sig<void*> {};

template<>
struct SignatureCode<void*> {
static constexpr char get() {
return 'p';
}
};
template<>
struct SignatureCode<size_t> {
static constexpr char get() {
return 'p';
}
};

template<>
struct SignatureCode<long long> {
static constexpr char get() {
return 'j';
}
};
template<typename T>
struct SignatureCode<T&> : SignatureCode<T*> {};

#ifdef __wasm64__
template<>
struct SignatureCode<long> {
static constexpr char get() {
return 'j';
}
struct SignatureCode<void> {
static constexpr char value = 'v';
};
#endif

template<typename... Args>
const char* getGenericSignature() {
static constexpr char signature[] = { SignatureCode<Args>::get()..., 0 };
return signature;
}

template<typename T> struct SignatureTranslator { using type = int; };
template<> struct SignatureTranslator<void> { using type = void; };
template<> struct SignatureTranslator<float> { using type = float; };
template<> struct SignatureTranslator<double> { using type = double; };
#ifdef __wasm64__
template<> struct SignatureTranslator<long> { using type = long; };
#endif
template<> struct SignatureTranslator<long long> { using type = long long; };
template<> struct SignatureTranslator<unsigned long long> { using type = long long; };
template<> struct SignatureTranslator<size_t> { using type = size_t; };
template<typename PtrType>
struct SignatureTranslator<PtrType*> { using type = void*; };
template<typename PtrType>
struct SignatureTranslator<PtrType&> { using type = void*; };
template<typename ReturnType, typename... Args>
struct SignatureTranslator<ReturnType (*)(Args...)> { using type = void*; };

template<typename... Args>
EMSCRIPTEN_ALWAYS_INLINE const char* getSpecificSignature() {
return getGenericSignature<typename SignatureTranslator<Args>::type...>();
}
constexpr const char Signature[] = { SignatureCode<Args>::value..., 0 };

template<typename Return, typename... Args>
EMSCRIPTEN_ALWAYS_INLINE const char* getSignature(Return (*)(Args...)) {
return getSpecificSignature<Return, Args...>();
constexpr const char* getSignature(Return (*)(Args...)) {
return Signature<Return, Args...>;
}

} // end namespace internal
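
For readers who want to experiment with the resulting shape outside Emscripten, here is a minimal standalone sketch of the pattern this hunk moves to: a fallback primary template, a partial specialization selected when the underlying trait has a `value` member, and a variable template that packs the codes into a null-terminated string. `AsmSig` is a hypothetical stand-in for `__em_asm_sig`, whose real definition is not shown in this diff, and its set of specializations is an assumption. The explicit array bound is a choice of this sketch, not taken from the PR.

```cpp
#include <cstddef>

namespace sketch {

// Stand-in trait (assumption): the primary template has no `value`
// member, so types without a specialization count as "unknown".
template<typename T> struct AsmSig {};
template<> struct AsmSig<int>       { static constexpr char value = 'i'; };
template<> struct AsmSig<float>     { static constexpr char value = 'f'; };
template<> struct AsmSig<double>    { static constexpr char value = 'd'; };
template<> struct AsmSig<long long> { static constexpr char value = 'j'; };
template<typename T> struct AsmSig<T*> { static constexpr char value = 'p'; };

// Fallback: any type the trait does not know is encoded like `int`.
template<typename T, typename = decltype(AsmSig<int>::value)>
struct SignatureCode : AsmSig<int> {};

// Preferred: only viable when AsmSig<T>::value exists (SFINAE otherwise).
template<typename T>
struct SignatureCode<T, decltype(AsmSig<T>::value)> : AsmSig<T> {};

// size_t is treated as pointer-like, references reuse the pointer code,
// and void gets its own code.
template<> struct SignatureCode<std::size_t> : AsmSig<void*> {};
template<typename T> struct SignatureCode<T&> : SignatureCode<T*> {};
template<> struct SignatureCode<void> { static constexpr char value = 'v'; };

// One code per type, followed by a terminating 0.
template<typename... Args>
constexpr const char Signature[sizeof...(Args) + 1] = {
    SignatureCode<Args>::value..., 0};

// Deduces the return and argument types from a function pointer.
template<typename Return, typename... Args>
constexpr const char* getSignature(Return (*)(Args...)) {
  return Signature<Return, Args...>;
}

}  // namespace sketch
```

In this sketch `sketch::Signature<void, int, double>` holds the characters `v`, `i`, `d` and a terminator, and a type the trait does not know, such as `bool`, falls back to `i`.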
@@ -2159,7 +2097,7 @@ struct MapAccess {

} // end namespace internal

template<typename K, typename V, class Compare = std::less<K>,
template<typename K, typename V, class Compare = std::less<K>,
class Allocator = std::allocator<std::pair<const K, V>>>
class_<std::map<K, V, Compare, Allocator>> register_map(const char* name) {
typedef std::map<K,V, Compare, Allocator> MapType;