Measure performance with Canvas #77
Comments
I ran the Canvas benchmarks on my Windows machine, comparing results from before and after the NAPI port (using the same build of NAPI-enabled node). While some benchmarks show no measurable change, others are up to 5x slower on NAPI:
Note that the benchmarks that show a significant slowdown under NAPI are the ones with a high number of operations per second; that is, they make very frequent calls through the NAPI layer. The first data point there,
With the current APIs, every call from JavaScript to C++ requires 4-6 NAPI calls, not including any additional parameter type validation and retrieval that the called C++ function may do. The sequence is (in pseudocode):
argc = napi_get_cb_args_length();
argv = malloc(argc * sizeof(napi_value));
napi_get_cb_args(argv);
callbackData = napi_get_cb_data();
thisWrapper = napi_get_cb_this(); // Even static methods in JS have a 'this' (usually 'global')
thisArg = napi_unwrap(thisWrapper); // Only called for instance methods
returnValue = callbackData->method(thisArg, argv, callbackData->userData); // Call the user function
napi_set_return_value(returnValue); // Only called for methods that have a non-void return type
I did some experiments and some math, and found that on my machine every NAPI call costs approximately 25ns. That's actually not much, but I think there are some things we can do to reduce the number of NAPI calls required. To be continued... |
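To make the arithmetic above concrete, here is a minimal sketch of the overhead estimate. It assumes the roughly 25ns per NAPI call measured above and a fixed 4-6 NAPI calls per JS-to-C++ invocation; the function names and the `native_ns` parameter (the cost of the wrapped C++ function itself) are hypothetical and exist only for illustration.

```c
#include <assert.h>

/* Assumed figure from the measurement above: roughly 25 ns per NAPI call. */
#define NAPI_CALL_COST_NS 25.0

/* Fixed overhead added to one JS-to-C++ call that makes `napi_calls` NAPI calls. */
double napi_overhead_ns(int napi_calls)
{
    assert(napi_calls >= 0);
    return napi_calls * NAPI_CALL_COST_NS;
}

/* Ratio of wrapped call time to bare native call time.
 * `native_ns` is a hypothetical cost of the C++ function itself. */
double napi_slowdown(double native_ns, int napi_calls)
{
    assert(native_ns > 0.0);
    return (native_ns + napi_overhead_ns(napi_calls)) / native_ns;
}
```

Under these assumptions the per-invocation overhead is 100-150ns. A native function that itself takes only 50ns would appear 4x slower with 6 NAPI calls, while one that takes 15 microseconds would slow down by about 1%, which would be consistent with only the high-frequency benchmarks regressing.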
To reduce the number of NAPI calls required for every call, we could define an ugly API that looks something like this, to retrieve all the callback info at once:
napi_status napi_get_cb_info(
    napi_env e,                 // [in] NAPI environment handle
    napi_callback_info cbinfo,  // [in] Opaque callback-info handle
    int* argc,                  // [in-out] Specifies the size of the provided argv and argt arrays
                                // and receives the actual count of args.
    napi_value* argv,           // [out] Array of values
    napi_valuetype* argt,       // [out] Optional array of value types, for optimizing arg validation
    napi_value* thisArg,        // [out] Receives the JS 'this' arg for the call
    void** data);               // [out] Receives the data pointer for the callback.
While we could skip the optional In the case of the |
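The in-out `argc` contract in that proposal can be sketched with a small self-contained mock. Everything here is a stand-in and an assumption: `mock_value`, `mock_callback_info`, and `mock_get_cb_info` are hypothetical names, not the real NAPI types or implementation, and the optional `argt` array is left out to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the opaque NAPI types; not the real definitions. */
typedef int mock_value;

typedef struct {
    const mock_value *args;     /* arguments passed from JS */
    int               argc;     /* actual argument count */
    mock_value        this_arg; /* the JS 'this' for the call */
    void             *data;     /* data pointer registered with the callback */
} mock_callback_info;

/* Mock of the proposed single-call API: fills up to *argc slots of argv,
 * then stores the actual argument count in *argc. Returns 0 on success. */
int mock_get_cb_info(const mock_callback_info *cbinfo,
                     int *argc, mock_value *argv,
                     mock_value *this_arg, void **data)
{
    if (cbinfo == NULL || argc == NULL) return 1;

    int capacity = *argc;
    int n = cbinfo->argc < capacity ? cbinfo->argc : capacity;
    if (argv != NULL) {
        for (int i = 0; i < n; i++) argv[i] = cbinfo->args[i];
    }
    *argc = cbinfo->argc;                 /* report the real count */
    if (this_arg != NULL) *this_arg = cbinfo->this_arg;
    if (data != NULL) *data = cbinfo->data;
    return 0;
}
```

With this shape, a callback wrapper makes one call instead of four separate ones for the argument count, the arguments, `this`, and the data pointer. The caller can compare the returned `*argc` with the capacity it passed in to detect that more arguments were supplied than the buffer could hold.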
I still want to test canvas perf on a non-Windows system, since we might find different performance characteristics for calling through the NAPI layer. |
Are you actually doing a malloc as shown in the pseudocode? That's going to be a killer, I think. We should do a stack allocation (even if we have to overestimate the size we need) up to a certain number of parameters, since 99% of the time the count will probably be less than ~6. |
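The stack-allocation idea could look roughly like the sketch below: a fixed-size buffer on the stack covers the common case, and the heap is used only when a call carries more arguments than the buffer holds. The threshold of 6 comes from the estimate above; `INLINE_ARGS`, `sum_args`, and the use of plain `int` arguments are hypothetical simplifications, not the actual NAPI wrapper code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INLINE_ARGS 6  /* assumed threshold, taken from the "~6" estimate above */

/* Copies `argc` arguments into scratch storage and sums them into *total.
 * Sets *used_heap to 1 only when the stack buffer was too small.
 * Returns 0 on success, -1 if the fallback allocation fails. */
int sum_args(const int *args, size_t argc, long *total, int *used_heap)
{
    assert(total != NULL && used_heap != NULL);

    int  stack_buf[INLINE_ARGS];
    int *argv = stack_buf;            /* fast path: no allocation */

    *used_heap = 0;
    if (argc > INLINE_ARGS) {
        argv = malloc(argc * sizeof *argv);   /* rare path: too many args */
        if (argv == NULL) return -1;
        *used_heap = 1;
    }
    if (argc > 0) memcpy(argv, args, argc * sizeof *argv);

    *total = 0;
    for (size_t i = 0; i < argc; i++) *total += argv[i];

    if (argv != stack_buf) free(argv);
    return 0;
}
```

Calls with six or fewer arguments never touch the allocator, so the common case pays only for the copy; the cost is a slightly larger stack frame for every call.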
Currently it's using a |
I think the I also put the idea of one big ugly API as you propose here in the back of my mind to explore later if we ever hit a case where the performance overhead was significant, so +1 to this proposal. |
The above benchmark data was collected on a 5-year-old workstation PC running Windows, with a Xeon W3530 @2.8GHz, 20 GB RAM. I also ran on a 1.5-year-old Mac Mini and the results were very similar percentage-wise. |
The performance improvements in my PR reduce the worst case canvas benchmark from 505% to 277%. Other benchmarks that stress the JS-to-C++ NAPI callback layer show similar improvements. |