-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Buffer.indexOf #4
Comments
|
Hey 👋 Regarding what I read in this issue, I thought that having something like that: if (encoding === 'utf8' || encoding === 'utf-8') {
return Uint8ArrayPrototype.indexOf.call(buffer, val, byteOffset);
} in https://github.com/nodejs/node/blob/main/lib/buffer.js#L943 could make the trick. Am I correct? |
Yes, but do not forgot also check for |
You also need to check the type of the search value. |
I give it a first try: nodejs/node@main...tony-go:node:buff-utf8-fast-path It doesn't work, even with this simple test: const buf = Buffer.from('abcde');
const index = buf.indexOf('b');
console.log('index of b is:', index); // "index of b is: -1" Whereas when I do this, it works perfectly: const buf = Buffer.from('abcde');
const index = buf.indexOf('b');
console.log(index); // 1
const index2 = Uint8Array.prototype.indexOf.call(buf, 'b');
console.log(index); // 1 I didn't have more time than that to investigate today, I'll continue later |
Val needs to be a number not a string |
I misunderstood this: "where the char size is 8", making me think that it was for string 🙈 But yeah, when we look at your benchmark you only use numbers. Thanks 🙌 |
I have an issue with these tests: // Test truncation of Number arguments to uint8
{
const buf = Buffer.from('this is a test');
assert.strictEqual(buf.indexOf(0x6973), 3);
assert.strictEqual(buf.indexOf(0x697320), 4);
assert.strictEqual(buf.indexOf(0x69732069), 2);
assert.strictEqual(buf.indexOf(0x697374657374), 0);
assert.strictEqual(buf.indexOf(0x69737374), 0);
assert.strictEqual(buf.indexOf(0x69737465), 11);
assert.strictEqual(buf.indexOf(-140), 0); // <======= here!
assert.strictEqual(buf.indexOf(-152), 1);
assert.strictEqual(buf.indexOf(0xff), -1);
assert.strictEqual(buf.indexOf(0xffff), -1);
} I updated my code to handle the positive hex ones: const isUtf8 = encoding === 'utf8' || encoding === 'utf-8' || encoding === undefined;
if (isUtf8 && typeof val === 'number') {
const method = dir ? 'indexOf' : 'lastIndexOf';
let value = val % 256;
return Uint8ArrayPrototype[method].call(buffer, value, byteOffset); } Indeed, I don't really know how to truncate the number into uint8. Also, |
It seems neither V8 nor Node uses simd for this... Something like: // 0 for not found, non-zero for found. (Bit position tells you where).
unsigned contains(__m128i data, uint8_t needle) {
__m128i k = _mm_set1_epi8(needle);
__m128i cmp = _mm_cmpeq_epi8(data, k); // vector mask
return _mm_movemask_epi8(cmp); // integer bitmask
} Could provide a significant speedup. |
Thanks for sharing @ronag 👍 Is that: https://www.cs.virginia.edu/~cr4bd/3330/S2018/simdref.html what you refer? |
A bit stuck here: https://github.com/nodejs/node/pull/45627/files#r1032668045 Maybe I could cast it with bitwise, I'll browse the web. EDIT: I think I found something |
It seems
Buffer.indexOf
is much slower thanUint8Array.indexOf
. Not sure why we implement our ownindexOf
instead of just using the base classindexOf
.The text was updated successfully, but these errors were encountered: