Invoke fuse to ensure output of heterogeneous csv #1271
Comments
…by philrz

This is an auto-generated commit with a zq dependency update. The zq PR brimdata/zed#1300, authored by @philrz, has been merged.

Output format changes: Add "csv", remove "types"

While verifying brimdata/zed#1237, I noticed that CSV is not yet listed among the output formats. I wondered if maybe we were intentionally holding off on revealing it until we address brimdata/zed#1271, but it seems useful enough in its present form that I'm proposing here that we reveal it now.

I'd also recalled seeing @mccanne mention recently that `types` was removed as an output format. Indeed, as of `zq` commit `4bce00d`:

```
$ zq -version
Version: v0.21.0-27-g4bce00d
```

Therefore I'm also taking that out while I'm at it.
This is an auto-generated commit with a zq dependency update. The zq PR brimdata/zed#1880, authored by @nwt, has been merged.

Extract Fuser from proc/fuse.Proc

Groundwork for brimdata/zed#1271.
This is an auto-generated commit with a zq dependency update. The zq PR brimdata/zed#1908, authored by @nwt, has been merged.

Invoke fuse for CSV output

Send records through a `fuse.Fuser` for

1. `/search?format=csv` in `zqd`
2. `zq -f csv` unless `-csvfuse=false`

Depends on brimdata/zed#1880. Closes brimdata/zed#1271.
Verified in Circling back to the previous behavior, in the last GA
Now with the benefit of the enhancement, the output proceeds to completion.
And if the user is confident their data should conform to a single record definition and hence wants to avoid the two passes through the data to guarantee a
Thanks @nwt!
The CSV writer should be able to write output zng data that comes from different record types. Including the new `fuse` processor at the end of the ZQL pipeline ensures this is possible today. However, `fuse` requires making two passes through the data, which has a performance cost and delays the immediate stream of output. Power users that are confident their data already conforms to a single record definition may want to avoid this penalty.

As a group we discussed adding a flag to determine this behavior when CSV output format is requested. In one mode it would always implicitly add `fuse` to the pipeline even if the user didn't request it, ensuring successful CSV output no matter what. There was consensus that this behavior would be invoked by the Brim app for CSV export. The other mode would follow the current behavior where only a single pass is made through the data and output stops as soon as a record is encountered in the stream that doesn't match the schema for the header already printed, at which point the user would see a message that effectively tells them to rework their query or explicitly add `fuse`.

There still seemed to be some room for debate on whether `zq` at the command line should also default to the "always `fuse`" behavior planned for the Brim app or if the `zq` default should flip to this more "power user" mode.
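
To make the trade-off between the two modes concrete, here is a rough Python sketch. It is not the `zq` implementation: plain dicts stand in for zng records, "fusing" is approximated as taking the union of field names in first-seen order, and the names `write_csv_fused`, `write_csv_strict`, and `SchemaMismatch` are hypothetical, chosen only for this illustration.

```python
import csv
import io


class SchemaMismatch(Exception):
    """Hypothetical error for a record that does not fit the printed header."""


def write_csv_fused(records):
    """Two-pass mode: build one header covering every record, then write all rows."""
    records = list(records)  # must buffer: no output can start until all input is seen
    columns = []
    for rec in records:  # pass 1: union of field names, in first-seen order
        for name in rec:
            if name not in columns:
                columns.append(name)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    for rec in records:  # pass 2: fields a record lacks are written as empty cells
        writer.writerow(rec)
    return out.getvalue()


def write_csv_strict(records, out):
    """Single-pass mode: header comes from the first record; stop at the first mismatch."""
    writer = None
    columns = None
    for rec in records:
        if writer is None:
            columns = list(rec)
            writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
            writer.writeheader()
        elif list(rec) != columns:
            raise SchemaMismatch(
                "record fields %r do not match header %r; "
                "rework the query or add fuse" % (list(rec), columns)
            )
        writer.writerow(rec)  # rows stream out immediately, one per input record
```

With the input `[{"a": 1, "b": 2}, {"a": 3, "c": 4}]`, the fused writer returns a CSV with header `a,b,c` and rows `1,2,` and `3,,4`, while the strict writer emits `a,b` and `1,2` and then raises on the second record. The sketch also shows where the cost comes from: the fused path cannot print its header until every record has been read, whereas the strict path writes each row as it arrives.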