feature(format): refactor output format #5422

Merged
merged 14 commits into from
May 18, 2022

sundy-li
Member

@sundy-li sundy-li commented May 17, 2022

I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/

Summary

Summary about this PR:

  1. Make the ClickHouse handler support output formats.
  2. Make the ClickHouse handler accept SQL sent in the POST body.
  3. Introduce the parquet, csv, and ndjson output formats.
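
As an illustration of the third point, the three format names could be mapped to a small enum. This is only a sketch: the `OutputFormat` type, its variants, and the `is_binary` helper are hypothetical names chosen for this example and are not taken from the PR's code.

```rust
use std::str::FromStr;

/// Hypothetical enum covering the three formats named in the summary.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OutputFormat {
    Csv,
    NdJson,
    Parquet,
}

impl FromStr for OutputFormat {
    type Err = String;

    /// Parse a format name case-insensitively, ignoring surrounding whitespace.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "csv" => Ok(OutputFormat::Csv),
            "ndjson" => Ok(OutputFormat::NdJson),
            "parquet" => Ok(OutputFormat::Parquet),
            other => Err(format!("unsupported output format: {}", other)),
        }
    }
}

impl OutputFormat {
    /// Parquet is a binary format; CSV and NDJSON are line-oriented text.
    pub fn is_binary(self) -> bool {
        matches!(self, OutputFormat::Parquet)
    }
}
```

With this sketch, `"CSV".parse::<OutputFormat>()` yields `Ok(OutputFormat::Csv)`, and an unknown name such as `"xml"` yields an error.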

Changelog

  • New Feature

Related Issues

Fixes #5433

@sundy-li sundy-li changed the title from "feature(format): refactor-output-format" to "feature(format): refactor output format" May 17, 2022
@mergify
Contributor

mergify bot commented May 17, 2022

Thanks for the contribution!
I have applied any labels matching special text in your PR Changelog.

Please review the labels and make any necessary changes.

@mergify mergify bot added the pr-feature label ("this PR introduces a new feature to the codebase") May 17, 2022
@sundy-li sundy-li marked this pull request as ready for review May 18, 2022 03:55
@sundy-li sundy-li requested a review from BohuTANG as a code owner May 18, 2022 03:55
@sundy-li sundy-li requested a review from youngsofun May 18, 2022 03:56
Member

@BohuTANG BohuTANG left a comment

Nice job!
I think this is useful for the result cache. cc @youngsofun @flaneur2020

@BohuTANG
Member

BohuTANG commented May 18, 2022

The unit-test is skipped because the disk space is full:

[test_unit](https://github.com/datafuselabs/databend/runs/6482164666?check_suite_focus=true)
System.IO.IOException: No space left on device : '/home/runner/runners/2.291.1/_diag/Worker_20220518-030049-utc.log'
   at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset)
   at System.IO.Strategies.BufferedFileStreamStrategy.FlushWrite()
   at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
   at System.Diagnostics.TextWriterTraceListener.Flush()
   at GitHub.Runner.Common.HostTraceListener.WriteHeader(String source, TraceEventType eventType, Int32 id)
   at GitHub.Runner.Common.HostTraceListener.TraceEvent(TraceEventCache eventCache, String source, TraceEventType eventType, Int32 id, String message)
   at System.Diagnostics.TraceSource.TraceEvent(TraceEventType eventType, Int32 id, String message)
   at GitHub.Runner.Worker.Worker.RunAsync(String pipeIn, String pipeOut)
   at GitHub.Runner.Worker.Program.MainAsync(IHostContext context, String[] args)
System.IO.IOException: No space left on device : '/home/runner/runners/2.291.1/_diag/Worker_20220518-030049-utc.log'
   at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset)
   at System.IO.Strategies.BufferedFileStreamStrategy.FlushWrite()
   at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
   at System.Diagnostics.TextWriterTraceListener.Flush()
   at GitHub.Runner.Common.HostTraceListener.WriteHeader(String source, TraceEventType eventType, Int32 id)
   at GitHub.Runner.Common.HostTraceListener.TraceEvent(TraceEventCache eventCache, String source, TraceEventType eventType, Int32 id, String message)
   at System.Diagnostics.TraceSource.TraceEvent(TraceEventType eventType, Int32 id, String message)
   at GitHub.Runner.Common.Tracing.Error(Exception exception)
   at GitHub.Runner.Worker.Program.MainAsync(IHostContext context, String[] args)
Unhandled exception. System.IO.IOException: No space left on device : '/home/runner/runners/2.291.1/_diag/Worker_20220518-030049-utc.log'
   at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset)
   at System.IO.Strategies.BufferedFileStreamStrategy.FlushWrite()
   at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
   at System.Diagnostics.TextWriterTraceListener.Flush()
   at System.Diagnostics.TraceSource.Flush()
   at GitHub.Runner.Common.TraceManager.Dispose(Boolean disposing)
   at GitHub.Runner.Common.TraceManager.Dispose()
   at GitHub.Runner.Common.HostContext.Dispose(Boolean disposing)
   at GitHub.Runner.Common.HostContext.Dispose()
   at GitHub.Runner.Worker.Program.Main(String[] args)

Re-run the job to check it.

@BohuTANG
Member

@sundy-li It seems some tests in this patch break the unit-test job. There is no space left on the CI runner, so the unit tests are skipped every time.

@BohuTANG
Member

BohuTANG commented May 18, 2022

Oops, it's not related to this patch, another PR has the same issue too:
https://github.com/datafuselabs/databend/actions/runs/2343011703

Ping @Xuanwo @everpcpc for help

let column: &StringColumn = Series::check_get(column)?;
let result: Vec<String> = column
    .iter()
    .map(|v| format!("{:?}", String::from_utf8_lossy(v)))
Member

This needs escaping, but we can do that later.
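
For context, the `{:?}` in the reviewed line uses Rust's Debug formatting, which wraps the string in double quotes and escapes with backslashes. If the value is headed for CSV output (an assumption; the snippet does not say which format it serves), the usual RFC 4180 convention is to quote only fields that need it and to double any embedded quotes. A minimal sketch, with hypothetical helper names `escape_csv_field` and `csv_row` that are not part of this PR:

```rust
/// Quote a CSV field only when needed, following RFC 4180 conventions:
/// a field containing a comma, double quote, CR, or LF is wrapped in
/// double quotes, and each embedded double quote is doubled.
pub fn escape_csv_field(raw: &[u8]) -> String {
    // Invalid UTF-8 is replaced lossily, as in the reviewed snippet.
    let text = String::from_utf8_lossy(raw);
    let needs_quoting = text
        .chars()
        .any(|c| matches!(c, ',' | '"' | '\n' | '\r'));
    if needs_quoting {
        format!("\"{}\"", text.replace('"', "\"\""))
    } else {
        text.into_owned()
    }
}

/// Escape each field and join them into one newline-terminated CSV record.
pub fn csv_row(fields: &[&[u8]]) -> String {
    let mut line = fields
        .iter()
        .map(|f| escape_csv_field(f))
        .collect::<Vec<_>>()
        .join(",");
    line.push('\n');
    line
}
```

Here `escape_csv_field(b"a,b")` returns `"a,b"` including the quotes, while a plain value such as `b"plain"` is returned unchanged.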

@Xuanwo
Member

Xuanwo commented May 18, 2022

> Oops, it's not related to this patch, another PR has the same issue too: https://github.com/datafuselabs/databend/actions/runs/2343011703
>
> Ping @Xuanwo @everpcpc for help

This may be related to our Rust cache; I have reset it to see whether that helps.

@@ -186,6 +186,7 @@ fn test_query() {
let mut mint = Mint::new("tests/it/testdata");
let mut file = mint.new_goldenfile("query.txt").unwrap();
let cases = &[
r#"select * from a limit 3 offset 4 format csv"#,
Member

@sundy-li where is the format clause supposed to go in this SQL?

Member Author

At the end of the SQL.
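
To make the placement concrete, here is a naive sketch that peels a trailing `format <name>` clause off a SQL string by looking at the last two whitespace-separated tokens. The function `split_trailing_format` is hypothetical and is not how the PR's parser works; it does no real tokenizing and is only meant to show where the clause sits.

```rust
/// Split a trailing `FORMAT <name>` clause off a SQL string.
/// Returns the SQL without the clause and the lowercased format name, if any.
/// Naive: it inspects only the last two whitespace-separated tokens.
pub fn split_trailing_format(sql: &str) -> (&str, Option<String>) {
    // Ignore trailing whitespace and semicolons.
    let trimmed = sql.trim_end().trim_end_matches(';').trim_end();

    // The candidate format name is the last token.
    let (rest, name) = match trimmed.rsplit_once(char::is_whitespace) {
        Some(parts) => parts,
        None => return (trimmed, None),
    };
    // The token before it must be the keyword FORMAT.
    let (body, keyword) = match rest.trim_end().rsplit_once(char::is_whitespace) {
        Some(parts) => parts,
        None => return (trimmed, None),
    };

    let name_ok = name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_');
    if keyword.eq_ignore_ascii_case("format") && name_ok {
        (body.trim_end(), Some(name.to_ascii_lowercase()))
    } else {
        (trimmed, None)
    }
}
```

Applied to the test case under review, `select * from a limit 3 offset 4 format csv`, the sketch returns the query up to `offset 4` together with the name `csv`; a query with no trailing clause comes back unchanged with `None`.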

@sundy-li
Member Author

The unit tests currently have some bugs; I'll fix them later.

@Xuanwo
Member

Xuanwo commented May 18, 2022

Is test_unit now able to run without hitting the "no space left" error?

@mergify mergify bot merged commit 3d5700c into databendlabs:main May 18, 2022
Labels
need-review, pr-feature ("this PR introduces a new feature to the codebase")

Successfully merging this pull request may close these issues.

Tracking support output format
5 participants