Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add write to output stream #1468

Merged
merged 4 commits into from
Dec 16, 2024
Merged

Add write to output stream #1468

merged 4 commits into from
Dec 16, 2024

Conversation

snuyanzin
Copy link
Collaborator

@snuyanzin snuyanzin commented Dec 15, 2024

This is the next step after #1176 (some formats are already supporting streaming).

Now it would make sense to add ability to write data to output streams (in tests there is an example with files).

Currently supported for json, sql, csv, xml

This means for those formats it is possible to write huge amount of data into files or somewhere else (since streaming is happening in row basis keep in mind that the case when one row of data is not able to be kept in memory is not supported yet)

fyi: @eliasnogueira , I noticed you've mentioned something similar to huge files writing in one of the talks

@snuyanzin snuyanzin force-pushed the file branch 3 times, most recently from 6879c73 to 72268ce Compare December 15, 2024 14:50
@codecov-commenter
Copy link

codecov-commenter commented Dec 15, 2024

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 82.35294% with 3 lines in your changes missing coverage. Please review.

Project coverage is 92.32%. Comparing base (0be88ad) to head (f30ad74).

Files with missing lines Patch % Lines
...ava/net/datafaker/transformations/Transformer.java 75.00% 2 Missing ⚠️
...tafaker/transformations/JavaObjectTransformer.java 0.00% 1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1468      +/-   ##
============================================
+ Coverage     92.26%   92.32%   +0.05%     
- Complexity     3243     3251       +8     
============================================
  Files           324      324              
  Lines          6377     6387      +10     
  Branches        622      623       +1     
============================================
+ Hits           5884     5897      +13     
+ Misses          343      340       -3     
  Partials        150      150              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -111,13 +113,13 @@ public String generate(Schema<IN, ?> schema, int limit) {
@Override
public String getStartStream(Schema<IN, ?> schema) {
StringBuilder sb = new StringBuilder();
generateHeader(null, sb);
generateHeader(schema, sb, false);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suddenly it was an uncovered bug leading to NPE 🙈

@snuyanzin snuyanzin requested a review from kingthorin December 15, 2024 20:32
Comment on lines 58 to 63
return null;
}

@Override
public String getEndStream() {
throw new UnsupportedOperationException();
return null;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be better to return empty string?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's where I'm still not sure, empty string also might be ok here

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In both cases

L58 too

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

didn't get it, it has been already changed in previous commit
f30ad74
Or what exactly do you mean?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for some reason I was only seeing one changed. (Maybe I looked at a single commit vs the whole change set 🤷‍♂️)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no problem, thanks for the feedback

@snuyanzin snuyanzin requested a review from kingthorin December 16, 2024 07:57
@asolntsev asolntsev added this to the 2.4.3 milestone Dec 16, 2024
@snuyanzin snuyanzin merged commit b8151dd into datafaker-net:main Dec 16, 2024
10 checks passed
@eliasnogueira
Copy link
Contributor

eliasnogueira commented Dec 17, 2024

This is the next step after #1176 (some formats are already supporting streaming).

Now it would make sense to add ability to write data to output streams (in tests there is an example with files).

Currently supported for json, sql, csv, xml

This means for those formats it is possible to write huge amount of data into files or somewhere else (since streaming is happening in row basis keep in mind that the case when one row of data is not able to be kept in memory is not supported yet)

fyi: @eliasnogueira , I noticed you've mentioned something similar to huge files writing in one of the talks

This is awesome!
Thank you for letting me know.

In the examples I show I've implemented a simple writer. This feature will make a lot of difference in writing large amounts of data.
https://github.com/eliasnogueira/datafaker-java-examples/tree/main/src/test/java/com/eliasnogueira/datafaker/schema

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants