sqlsmith bug hunt: channels not propagating errors #3908
Comments
This is because the channel we use between executors passes And so when we read from the channel, the errors are not propagated: I am thinking of changing the Item type in the fifo channel to |
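A minimal sketch of the idea above, assuming the channel item becomes a `Result` so that the consumer sees the underlying error instead of a closed channel. It uses `std::sync::mpsc` as a stand-in for the fifo channel; `Chunk`, `ExecError`, `run_producer`, `run_consumer`, and `run_pipeline` are hypothetical names for illustration, not RisingWave's actual types.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a batch of rows passed between executors.
#[derive(Debug, PartialEq)]
pub struct Chunk(pub Vec<i32>);

/// Hypothetical executor error type.
#[derive(Debug, PartialEq)]
pub enum ExecError {
    NumericOutOfRange,
}

/// The channel item carries either a chunk or the error that stopped the producer.
pub type Item = Result<Chunk, ExecError>;

/// Upstream executor: on failure it sends the error itself, then stops.
pub fn run_producer(inputs: Vec<i32>, tx: mpsc::Sender<Item>) {
    for v in inputs {
        // checked_mul returns None on overflow, which we map to a typed error.
        let item = v
            .checked_mul(1000)
            .map(|x| Chunk(vec![x]))
            .ok_or(ExecError::NumericOutOfRange);
        let failed = item.is_err();
        // Stop if the receiver is gone or if we just reported a failure.
        if tx.send(item).is_err() || failed {
            return;
        }
    }
}

/// Downstream executor: collects chunks and returns the first error it reads.
pub fn run_consumer(rx: mpsc::Receiver<Item>) -> Result<Vec<Chunk>, ExecError> {
    rx.into_iter().collect()
}

/// Wires one producer to one consumer over the channel.
pub fn run_pipeline(inputs: Vec<i32>) -> Result<Vec<Chunk>, ExecError> {
    let (tx, rx) = mpsc::channel();
    let producer = thread::spawn(move || run_producer(inputs, tx));
    let out = run_consumer(rx);
    producer.join().expect("producer thread panicked");
    out
}
```

With this shape, `run_pipeline(vec![1, i32::MAX, 3])` returns `Err(ExecError::NumericOutOfRange)` rather than a generic "broken channel" error, because the failure travels through the channel as data.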
We can modify the error message to be more user friendly, like "[E2333] internal error when computing", where E2333 = fifo channel broken. Then developers can go into the log dashboard (maybe ELK in the future) to find what's happening. |
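A rough sketch of the error-code suggestion, assuming a small enum of internal codes whose user-facing message hides the detail while a separate log line keeps it. Only `E2333` and the message text come from the comment above; `InternalCode`, `UserFacingError`, and `log_line` are hypothetical.

```rust
use std::fmt;

/// Hypothetical catalogue of internal error codes.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum InternalCode {
    FifoChannelBroken,
}

impl InternalCode {
    /// Numeric code shown to the user, e.g. 2333 for a broken fifo channel.
    pub fn code(self) -> u32 {
        match self {
            InternalCode::FifoChannelBroken => 2333,
        }
    }
}

/// What the client sees: a stable code and a generic message.
#[derive(Debug, PartialEq)]
pub struct UserFacingError {
    pub code: InternalCode,
}

impl fmt::Display for UserFacingError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "[E{}] internal error when computing", self.code.code())
    }
}

/// What goes to the log dashboard: the same code plus the internal detail.
pub fn log_line(code: InternalCode, detail: &str) -> String {
    format!("E{} {:?}: {}", code.code(), code, detail)
}
```

Developers could then search the logs for the code printed in the client error to find the detailed cause.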
That makes sense. The above behaviour should be undefined, following PostgreSQL, so it is alright to return an internal error. Instead, our fuzzers need to generate queries without undefined behaviour, since those may raise internal errors as above (which we can't distinguish from actual internal system errors). |
Actually what about other defined errors? E.g. division by zero. These errors are also not propagated in RisingWave, but in Postgres they return the relevant error code. 🤔
RisingWave:
dev=> create table t(v int);
CREATE_TABLE
dev=> insert into t values(0);
INSERT 0 1
dev=> select (SMALLINT '33' / SMALLINT v) from t;
ERROR: RPC error: Status { code: Internal, message: "internal error: broken fifo_channel", metadata: MetadataMap { headers: {"risingwave-error-bin": "CAESI2ludGVybmFsIGVycm9yOiBicm9rZW4gZmlmb19jaGFubmVs"} }, source: None }
Postgres should throw division by zero error:
Related: #3939 |
Related issue: #2473 |
Had an offline discussion with @lmatz and he suggested the following:
This currently blocks sqlsmith work, since we can't know what the source of This issue will take a fair amount of work to resolve. |
For stream, we can collect the actor errors here: https://github.com/singularity-data/risingwave/blob/b64d8294c091413fa517eb70a0659ae3c6bcd3d0/src/stream/src/task/stream_manager.rs#L621 and then pipe it directly from the node to the frontend. |
related issue #3730 |
Created an issue related to this: #4811 |
Some expressions might only fail during runtime, for instance if they only fetch columns.
Errors should be propagated. In the following cases, the underlying error is NumericOutOfRangeError, but it is not propagated to downstream executors.