Incorrect encoding of null character in buffer #394
Comments
You should use
In v0.2.x this worked:
I stumbled over this, too. It feels like an arbitrary workaround for a string terminator problem somewhere else. Buffer.byteLength('\u0000') actually does return 1, but new Buffer('\u0000') cuts off a trailing null byte (see https://github.com/ry/node/blob/master/src/node_buffer.cc#L430). The 'binary' encoding works, but according to the docs "this encoding method is depreciated and should be avoided in favor of Buffer objects where possible. This encoding will be removed in future versions of Node." (http://nodejs.org/docs/v0.3.2/api/buffers.html#buffers)
This is still a problem. In v0.4.7,
It's a bug / feature in v8::String::WriteAscii() that translates nul bytes into spaces [1]. V8 has always done that, but Node didn't trigger that code path in 0.2.x.
[1] https://github.com/v8/v8/blob/e3319f4/src/api.cc#L3615 (warning, big file)
v8::String::Utf8Write() appends '\0' if there is room left in the output buffer. Pass in the exact decoded length obtained with v8::String::Utf8Length() to make it stop doing that. Fixes nodejs#394.
Assert that the trailing '\0' is not swallowed in UTF-8 input to the Buffer constructor.
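The check described in the commit message above can be sketched roughly as follows. This is only an illustration, not the test that was committed: it uses the modern `Buffer.from()` API instead of the `new Buffer(...)` constructor discussed in this thread, and `utf8RoundTrip` is a hypothetical helper name.

```javascript
// Sketch only: uses Buffer.from() (modern Node) rather than the
// `new Buffer(str)` constructor that this issue is about.

// Hypothetical helper: encodes `str` as UTF-8 and reports the byte count
// predicted by Buffer.byteLength() next to the actual buffer length.
function utf8RoundTrip(str) {
  const expected = Buffer.byteLength(str, 'utf8');
  const buf = Buffer.from(str, 'utf8');
  return {
    expected,                       // bytes the encoder says it needs
    actual: buf.length,             // bytes that ended up in the buffer
    lastByte: buf[buf.length - 1],  // undefined for an empty buffer
  };
}

// A lone NUL and a string with a trailing NUL, the two cases from the thread.
const lone = utf8RoundTrip('\u0000');
const trailing = utf8RoundTrip('abc\u0000');

console.log(lone, trailing);
```

If the trailing NUL were swallowed, `actual` would be one less than `expected` in both cases.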
The problem is that V8 appends a nul byte if there is still room left in the output buffer. We can work around that by passing in the exact byte length as returned by
Current master:
With 22fabd4 applied:
An alternative take is to patch V8 to not append '\0' if requested. That might be necessary for #297 anyway (0x00 is converted to 0x20 in ASCII input) because there is no way right now to side-step that. @ry Can you review?
long string version:
current master:
with 22fabd4 applied:
with 6079f1a applied:
I can't wait |
I see a similar ~40% performance drop with I suppose we should lobby the V8 guys to fix it, but I suspect the patch from that issue you linked to won't pass muster.
Never mind, I agree, Ben. My patch is a workaround to avoid calling May I tag "v8" on this issue? :-)
@koichik Go for it. :)
In node v0.3.0 buffers are not encoding null character '\0' correctly
Totally breaks creation of null terminated strings within buffers
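For the null-terminated string use case, one way to avoid depending on how the string encoder treats '\0' is to allocate the terminator separately. The following is a minimal sketch under the assumption of a modern Node runtime (`Buffer.alloc` did not exist in v0.3.0); `toCString` is a hypothetical helper, not part of Node's API.

```javascript
// Sketch only: assumes a Node version that provides Buffer.alloc().

// Hypothetical helper: returns a buffer holding `str` encoded as UTF-8,
// followed by a single 0x00 terminator byte.
function toCString(str) {
  const len = Buffer.byteLength(str, 'utf8');
  // Buffer.alloc() zero-fills, so the extra final byte is already 0x00.
  const buf = Buffer.alloc(len + 1);
  // Write exactly `len` bytes of string data; the terminator stays untouched.
  buf.write(str, 0, len, 'utf8');
  return buf;
}

const cstr = toCString('hi');
console.log(cstr); // three bytes: 'h', 'i', and the terminator
```

Because the terminator comes from the zero-filled allocation and not from the string itself, the result does not rely on '\0' surviving the encoding step.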