-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add more documentation on Diffy, Sampler usage #65
Conversation
Codecov Report
@@ Coverage Diff @@
## master #65 +/- ##
==========================================
- Coverage 70.88% 70.86% -0.02%
==========================================
Files 22 22
Lines 941 968 +27
Branches 123 122 -1
==========================================
+ Hits 667 686 +19
- Misses 274 282 +8
Continue to review full report at Codecov.
|
ratatool-diffy/diffy.md
Outdated
|
||
## Usage | ||
|
||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to discuss the output and how to read it as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's discussed a bit in the earlier section. Do you think it needs to be more detailed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah maybe give an example of how to read the output since I know we mentioned it's hard to parse potentially
ratatool-sampling/sampler.md
Outdated
|
||
# BigSampler | ||
|
||
BigSampler will run a [Scio](https://github.com/spotify/scio) pipeline sampling either Avro or BigQuery data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we note that it supports gs://
paths as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
(pct, args("input"), args("output"), args.list("fields"), | ||
args.optional("seed")) | ||
} catch { | ||
case e: Throwable => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to catch only NonFatal
here instead of everything
http://www.scala-lang.org/api/current/scala/util/control/NonFatal$.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly I think it will be NoSuchElementException
, if we wanted to be more granular we can just catch on that I think. Not sure what args.list with throw
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made a few high level comments looks good otherwise
#53 #34