feat: build sst in stream way #747

Merged: 11 commits into apache:main on Mar 22, 2023

ShiKaiWi
Member

Which issue does this PR close?

Closes #486

Rationale for this change

Refer to #486

What changes are included in this PR?

  • Introduce AsyncArrowWriter in the parquet_ext crate.
  • Build SST files in a streaming fashion using AsyncArrowWriter.
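
The streaming idea can be illustrated with a minimal sketch. It is not the AsyncArrowWriter from this PR: `StreamingWriter`, `write_chunk`, and `max_buffer_size` are hypothetical names, the byte chunks stand in for encoded record batches, and a synchronous `std::io::Write` sink stands in for an async writer so the example stays dependency-free. It only shows the buffer-then-flush pattern that keeps memory bounded instead of holding the whole SST in memory.

```rust
use std::io::{self, Write};

/// Hypothetical streaming writer (not the PR's AsyncArrowWriter): buffers
/// encoded bytes and forwards them to the sink once the buffer reaches
/// `max_buffer_size`, so memory use stays bounded.
struct StreamingWriter<W: Write> {
    sink: W,
    buffer: Vec<u8>,
    max_buffer_size: usize,
    flush_count: usize,
}

impl<W: Write> StreamingWriter<W> {
    fn new(sink: W, max_buffer_size: usize) -> Self {
        Self {
            sink,
            buffer: Vec::with_capacity(max_buffer_size),
            max_buffer_size,
            flush_count: 0,
        }
    }

    /// Append one encoded chunk (a stand-in for an encoded record batch).
    fn write_chunk(&mut self, chunk: &[u8]) -> io::Result<()> {
        self.buffer.extend_from_slice(chunk);
        if self.buffer.len() >= self.max_buffer_size {
            self.flush_buffer()?;
        }
        Ok(())
    }

    /// Move the buffered bytes to the sink and reset the buffer.
    fn flush_buffer(&mut self) -> io::Result<()> {
        if self.buffer.is_empty() {
            return Ok(());
        }
        self.sink.write_all(&self.buffer)?;
        self.buffer.clear();
        self.flush_count += 1;
        Ok(())
    }

    /// Flush whatever is left, then hand back the sink and the flush count.
    fn close(mut self) -> io::Result<(W, usize)> {
        self.flush_buffer()?;
        self.sink.flush()?;
        Ok((self.sink, self.flush_count))
    }
}

fn main() -> io::Result<()> {
    // An in-memory Vec<u8> plays the role of the object store here.
    let mut writer = StreamingWriter::new(Vec::new(), 8);
    for chunk in ["aaaa", "bbbb", "cccc"] {
        writer.write_chunk(chunk.as_bytes())?;
    }
    let (written, flushes) = writer.close()?;
    println!("wrote {} bytes in {} flushes", written.len(), flushes);
    Ok(())
}
```

In this sketch the buffer never grows far beyond `max_buffer_size`; a real async variant would await the sink write at the same point where `flush_buffer` is called.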

Are there any user-facing changes?

None.

How is this change tested?

Existing tests.

@ShiKaiWi ShiKaiWi marked this pull request as draft March 16, 2023 10:09
@ShiKaiWi ShiKaiWi marked this pull request as ready for review March 17, 2023 06:13
@codecov-commenter

codecov-commenter commented Mar 17, 2023

Codecov Report

Merging #747 (df6fb63) into main (56bb7b0) will increase coverage by 0.12%.
The diff coverage is 77.48%.

❗ Current head df6fb63 differs from pull request most recent head eb9ecaf. Consider uploading reports for the commit eb9ecaf to get more accurate results

@@            Coverage Diff             @@
##             main     #747      +/-   ##
==========================================
+ Coverage   68.43%   68.56%   +0.12%     
==========================================
  Files         294      293       -1     
  Lines       45712    45628      -84     
==========================================
+ Hits        31284    31285       +1     
+ Misses      14428    14343      -85     
Impacted Files Coverage Δ
analytic_engine/src/instance/engine.rs 64.03% <ø> (ø)
analytic_engine/src/instance/mod.rs 83.33% <ø> (ø)
analytic_engine/src/sst/factory.rs 89.28% <ø> (ø)
analytic_engine/src/sst/writer.rs 92.85% <ø> (ø)
components/parquet_ext/src/lib.rs 100.00% <ø> (ø)
server/src/http.rs 0.00% <0.00%> (ø)
server/src/server.rs 0.00% <0.00%> (ø)
sql/src/frontend.rs 0.00% <0.00%> (ø)
sql/src/influxql/mod.rs 100.00% <ø> (+100.00%) ⬆️
sql/src/planner.rs 91.96% <0.00%> (-0.06%) ⬇️
... and 16 more

... and 1 file with indirect coverage changes

@jiacai2050 jiacai2050 self-requested a review March 17, 2023 06:50
@jiacai2050
Contributor

I think this feature should be tested against OSS.

Review threads (all outdated and resolved):
  • analytic_engine/src/sst/writer.rs (1 thread)
  • components/parquet_ext/src/async_arrow_writer.rs (1 thread)
  • analytic_engine/src/sst/parquet/writer.rs (3 threads)
Contributor

@jiacai2050 jiacai2050 left a comment

LGTM

@ShiKaiWi ShiKaiWi added this pull request to the merge queue Mar 22, 2023
@ShiKaiWi ShiKaiWi merged commit cba54b3 into apache:main Mar 22, 2023
chunshao90 pushed a commit to chunshao90/ceresdb that referenced this pull request May 15, 2023
* feat: support build sst in stream way

* use AsyncWrite for building sst procedure

* shutdown when close the async writer

* build sst in streaming way

* find the custom metadata

* better names

* add config for write sst max buffer size

* polish up some comments

* address CR

* fix license header

* use readable size for write_sst_max_buffer_size
@ShiKaiWi ShiKaiWi deleted the feat-build-sst-in-stream-way branch May 29, 2023 08:35
Development

Successfully merging this pull request may close these issues.

Build sst file in more resource friendly way
3 participants