Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RESP2 parser error with large, multi-chunk responses #2419

Closed
fast-facts opened this issue Feb 23, 2023 · 20 comments
Closed

RESP2 parser error with large, multi-chunk responses #2419

fast-facts opened this issue Feb 23, 2023 · 20 comments
Labels

Comments

@fast-facts
Copy link
Contributor

fast-facts commented Feb 23, 2023

Description

I'm running an FT.Search command that returns 5000 items (using a limit argument). Upon debugging, I noticed the data usually comes in through 8-10 chunks. 95% of the time, the command just hangs. I have to reduce the limit argument to about 1000 for the command to succeed regularly. It seems to be there's some race condition with a larger number of chunks.

I confirmed the issue is not with the redis server or with the network as I was running it directly from the redis-cli command without any issues.

After much debugging, I boiled down the issue to the write function in the RESP2 decoder.

After adding various logging, I noticed that you can guarantee the function finishes, at least for me, if you slow down the parseBulkString function, but obviously the function takes way longer to finish. I added a console.trace on line 178 to achieve this. It is possible that I could just slow down some other parts of the file as well to achieve the same result. It might not be anything special about that function.

I'm unsure of the nuances of it all to be able to resolve this on my own, but here's the information that I've gathered, mostly through inserting various logging. I'll keep working on it and providing feedback that might be helpful.

I modified the function as such to add logging

    write(chunk) {
        console.log('start:', this.cursor, chunk.length) // <---------- added this line
        while (this.cursor < chunk.length) {
            if (!this.type) {
                this.currentStringComposer = this.options.returnStringsAsBuffers() ?
                    this.bufferComposer :
                    this.stringComposer;
                this.type = chunk[this.cursor];
                if (++this.cursor >= chunk.length) {
                    break;
                }
            }
            const reply = this.parseType(chunk, this.type);
            if (reply start: undefined) {
                break;
            }
            this.type = undefined;
            this.options.onReply(reply);
        }
        this.cursor -= chunk.length;
        console.log('end:', this.cursor, chunk.length) // <---------- added this line
    }

Here's a sample of the log it produces during a successful run

start: 0 1460
end: 0 1460
start: 0 13140
end: 2 13140
start: 2 2920
end: 0 2920
start: 0 1460
end: 0 1460
start: 0 24820
end: 0 24820
start: 0 4380
end: 1 4380
start: 1 1460
end: 0 1460
start: 0 10220
end: 0 10220
start: 0 24795
end: 0 24795

Here's a sample of a hanging run (notice the negative numbers)

start: 0 1460
end: 0 1460
start: 0 13140
end: 2 13140
start: 2 1460
end: 0 1460
start: 0 7300
end: -7298 7300
start: -7298 20440
end: -27737 20440
start: -27737 1460
end: -29196 1460
start: -29196 14600
end: -43795 14600
start: -43795 24795
end: -68589 24795

Node.js Version

18.12.1

Redis Server Version

7.0.8

Node Redis Version

4.6.4

Platform

Windows and Linux

Logs

No response

@fast-facts fast-facts added the Bug label Feb 23, 2023
@fast-facts
Copy link
Contributor Author

@leibale, tagging you here since you made that decoder and no one has modified it since. Hope you don't mind.

@leibale
Copy link
Contributor

leibale commented Feb 23, 2023

@fast-facts thanks for reporting! Can you please share code that reproduce the issue so I'll be able to debug that?

@fast-facts
Copy link
Contributor Author

See below. I did also notice if I remove the RETURN: '0', the issue goes away. However, I do want to do NOCONTENT and that argument isn't available in client.ft.search and the ft.search page states that RETURN: '0' acts like NOCONTENT.

import { createClient } from 'redis';

void (async () => {
  const client = createClient({ url: 'redis://url' });

  await client.connect();
  console.log('Connected');

  const { total, documents } = await client.ft.search('idx', '*', { RETURN: '0', LIMIT: { from: 0, size: 5000 }, DIALECT: 2 });

  console.log(total);
  console.log(documents.length);
})();

@leibale
Copy link
Contributor

leibale commented Feb 23, 2023

@fast-facts tnx, i'll use this to debug it tomorrow
Regarding RETURN: 0 - use RETURN: [] instead. NOCONTENT is indeed not implemented, but that's not high on the list (cause you can do RETURN: [] as a workaround).

edit: I guess I'll need to generate 5k records first... can you share your schema, please?
another edit: BTW, I'm currently working on a new decoder that will have support for RESP3, hopefully its already fixed in the new parser

@fast-facts
Copy link
Contributor Author

Return: [] looks to have resolved the original problem (I'm able to query arbitrarily large data sets without issue). Thank you for that, Return: 0 was clearly a stupid mistake on my part.

However, Return: [] produces an incorrect result. More specifically, each second document is used to create the object for the prior document. To provide a simple example, I loaded just 2 records into the database, 1642500 and 2497885 (they are just internal identifiers to external documents). Here's what I get when I use Return: [].

Notice how the second id, 2497885, gets converted into {2: 4, 9: 7, 8: 8, 5: undefined} and is the value associated with the first id.

[
  {
    id: '1642500',
    value: [Object: null prototype] {
      '2': '4',
      '5': undefined,
      '8': '8',
      '9': '7'
    }
  }
]

Full code:

import { createClient } from 'redis';

void (async () => {
  const client = createClient({ url: 'redis://url' });

  await client.connect();
  console.log('Connected');

  const { total, documents } = await client.ft.search('idx', '*', { RETURN: [], LIMIT: { from: 0, size: 5000 }, DIALECT: 2 });

  console.log(documents);

  process.exit();
})();

Running the command directly:

127.0.0.1:6379> "FT.SEARCH" "idx" "*" "RETURN" "0" "LIMIT" "0" "5000" "DIALECT" "2"
1) (integer) 2
2) "1642500"
3) "2497885"

@acai2046
Copy link

See below. I did also notice if I remove the RETURN: '0', the issue goes away. However, I do want to do NOCONTENT and that argument isn't available in client.ft.search and the ft.search page states that RETURN: '0' acts like NOCONTENT.

import { createClient } from 'redis';

void (async () => {
  const client = createClient({ url: 'redis://url' });

  await client.connect();
  console.log('Connected');

  const { total, documents } = await client.ft.search('idx', '*', { RETURN: '0', LIMIT: { from: 0, size: 5000 }, DIALECT: 2 });

  console.log(total);
  console.log(documents.length);
})();

err:
Uncaught (in promise) TypeError: url_1.URL is not a constructor

@fast-facts
Copy link
Contributor Author

See below. I did also notice if I remove the RETURN: '0', the issue goes away. However, I do want to do NOCONTENT and that argument isn't available in client.ft.search and the ft.search page states that RETURN: '0' acts like NOCONTENT.

import { createClient } from 'redis';

void (async () => {
  const client = createClient({ url: 'redis://url' });

  await client.connect();
  console.log('Connected');

  const { total, documents } = await client.ft.search('idx', '*', { RETURN: '0', LIMIT: { from: 0, size: 5000 }, DIALECT: 2 });

  console.log(total);
  console.log(documents.length);
})();

err: Uncaught (in promise) TypeError: url_1.URL is not a constructor

I removed the url from actual sample code and replaced it with 'redis://url' as the original url I had is A) sensitive and B) inaccessible to the public.

@leibale
Copy link
Contributor

leibale commented Feb 23, 2023

@fast-facts which version of redisearch are you using?

@fast-facts
Copy link
Contributor Author

fast-facts commented Feb 23, 2023

@leibale Built from source using the guide: https://redis.io/docs/stack/search/quick_start/

Latest commit:

commit f2727687e3e680822f155713b96f73f17c1dee79 (HEAD -> master, origin/master, origin/HEAD)
Author: Rafi Einstein <rafi@redislabs.com>
Date:   Thu Jan 26 15:28:35 2023 +0200

    MOD-4560: Updated libuv (v1.44.2), also fixing crash during cluster failover (#3354)

@leibale
Copy link
Contributor

leibale commented Feb 23, 2023

@fast-facts I just checked with the latest version of RediSearch and looks like RETURN 0 omits the document from the reply (instead of empty array/null), I think it wasn't like that in the older versions of RedisSearch, but I'm not sure. I'll address this issue as well.

@fast-facts
Copy link
Contributor Author

@leibale Thank you, I appreciate your help on this.

@leibale
Copy link
Contributor

leibale commented Feb 23, 2023

@fast-facts I just tried with RedisSearch 2.0.16, the reply didn't change, it's #2366 that never worked with more than one document.. 🤦‍♂️
I'll fix that tomorrow, and hopefully I'll manage to fix the original issue as well and release a new version, but no promises..

Thanks for helping and answering all the questions, it's very helpful :)

edit:
#2421 - fix for RETURN: []

leibale added a commit to leibale/node-redis that referenced this issue Feb 23, 2023
@leibale
Copy link
Contributor

leibale commented Feb 23, 2023

@fast-facts I tried to reproduce the original issue using this code

import { createClient, SchemaFieldTypes } from 'redis';

const client = createClient();
client.on('error', err => console.error('Redis Client Error', err));

await client.connect();

const promises = [
  client.flushAll(),
  client.ft.create('idx', {
    field: SchemaFieldTypes.TEXT
  })
];
for (let i = 0; i < 100000; i++) {
  promises.push(client.hSet(i.toString(), 'field', 'value'));
}
await Promise.all(promises);

for (let i = 0; i < 100; i++) {
  console.log(`Starting ${i}`);
  const { documents: { length } } = await client.ft.search('idx', '*', { LIMIT: { from: 0, size: 100000 } }); 
  console.log(`${i} returnd ${length} documents`);
}
await client.disconnect();

but it's working... (make sure not to run this code with your production redis-server, it's executing the FLUSHALL command)

@fast-facts
Copy link
Contributor Author

fast-facts commented Feb 23, 2023

Yeah, I confirmed that the original issue is resolved when I changed from RETURN: 0 to RETURN: []. Though the new issue popped up which is what your PR is fixing.

Not sure how the RETURN: 0 caused the random hangs, but it was clearly bad syntax on my part anyway.

@leibale
Copy link
Contributor

leibale commented Feb 23, 2023

@fast-facts I'll try with RETURN: '0', I want to understand what is going on..

leibale added a commit to leibale/node-redis that referenced this issue Feb 23, 2023
@fast-facts
Copy link
Contributor Author

@leibale Do you, per chance, have an idea of when this would be merged in and a new release created?

@leibale
Copy link
Contributor

leibale commented Feb 24, 2023

@fast-facts I think it'll be ready tomorrow, I want to review #2424 again and do #2411. You can follow the progress here.

Sorry for the release delay...

@fast-facts
Copy link
Contributor Author

No problem, I appreciate your help.

leibale added a commit that referenced this issue Feb 24, 2023
* ref #2419 - fix FT.SEARCH RETURN []

* fix transformReply

* fix PROFILE SEARCH as well

* fix PROFILE SEARCH preserve

* move preserve login to `pushSearchOptions`

* attach preserve only if true

* fix RETURN: [] test
@leibale
Copy link
Contributor

leibale commented Feb 24, 2023

@fast-facts redis@4.6.5/@redis/client@1.5.6 is on npm 🎉

@fast-facts
Copy link
Contributor Author

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants