Move ML lib data generator files to util/ #711
Thank you for your pull request. All automated tests for this request have passed.
I think this is a fine place to put this, but can we call it "LogisticRegressionDataGenerator" or something like that? Also, can we factor the part that produces the RDD into its own function within the LogisticRegression[Data]Generator object? I think having data generation functions available externally that don't rely on writing things to disk will be extremely useful, e.g. for unit tests. Separately, I think it should be pretty easy to set up such a function with an arbitrary input distribution: maybe it could take something like this: The last part is probably overkill, though.
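For readers following the thread, here is a minimal sketch of the kind of in-memory generator described above. It is not the code from this pull request and not the snippet the reviewer had in mind: the names `LabeledExample`, `LogisticRegressionDataSketch`, `generate`, and `sampleFeature` are hypothetical, and the sketch uses plain Scala collections instead of an RDD so that it runs without Spark.

```scala
import scala.util.Random

// Hypothetical container for one generated example (not the MLlib type).
final case class LabeledExample(label: Double, features: Array[Double])

// Hypothetical object name; the PR's actual generator may look different.
object LogisticRegressionDataSketch {

  // Generates examples in memory; nothing is written to disk.
  // `sampleFeature` is an assumed hook for plugging in an arbitrary
  // input distribution; it defaults to a standard Gaussian.
  def generate(
      nExamples: Int,
      weights: Array[Double],
      seed: Long = 42L,
      sampleFeature: Random => Double = _.nextGaussian()
  ): Seq[LabeledExample] = {
    val rnd = new Random(seed)
    Seq.fill(nExamples) {
      // Draw one feature vector from the chosen distribution.
      val x = Array.fill(weights.length)(sampleFeature(rnd))
      // Logistic model: P(label = 1) = 1 / (1 + exp(-w . x)).
      val margin = x.zip(weights).map { case (xi, wi) => xi * wi }.sum
      val p = 1.0 / (1.0 + math.exp(-margin))
      val label = if (rnd.nextDouble() < p) 1.0 else 0.0
      LabeledExample(label, x)
    }
  }
}
```

In a Spark unit test, the returned sequence could presumably be handed to `sc.parallelize(...)` to obtain an RDD, which keeps the generation logic itself free of any file I/O.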
Thanks for the comments. Renamed files to |
Thank you for submitting this pull request. Unfortunately, the automated tests for this request have failed.
FYI - the Jenkins build failed because it couldn't reach github.com.
The refactored code looks good to me, thanks Shivaram!
Thanks guys, merged this in.
Add `limit` transformation to `SchemaRDD`. Author: Takuya UESHIN <ueshin@happy-camper.st> Closes mesos#711 from ueshin/issues/SPARK-1778 and squashes the following commits: 33169df [Takuya UESHIN] Add 'limit' transformation to SchemaRDD.
The generator classes are out of place in the regression directory, so I've moved them to util to avoid creating another directory (say, data or something like that).
@mateiz, @etrain, @atalwalkar: any other ideas?