[SPARK-6053][MLLIB] support save/load in PySpark's ALS #4811
Test build #28056 has started for PR 4811 at commit
Test build #28056 has finished for PR 4811 at commit
Test PASSed.
I messed up by not passing sc to save/load. Is this patch going into 1.3? If not, I'll submit a separate patch fixing the documentation (which will conflict a little).
If we have a couple of days before RC2, this would be nice to have. We use the same API as in Scala/Java, and there is no real implementation in this PR. Having save/load would benefit many users.
Test build #28088 has started for PR 4811 at commit
@@ -220,6 +218,10 @@ predictions = model.predictAll(testdata).map(lambda r: ((r[0], r[1]), r[2]))
ratesAndPreds = ratings.map(lambda r: ((r[0], r[1]), r[2])).join(predictions)
MSE = ratesAndPreds.map(lambda r: (r[1][0] - r[1][1])**2).reduce(lambda x, y: x + y) / ratesAndPreds.count()
print("Mean Squared Error = " + str(MSE))

# Save and load model
model.save("myModelPath")
Add sc to save call.
Also import MatrixFactorizationModel
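For reference, a minimal sketch of what the two requested changes might look like together. The helper name `save_and_reload` is hypothetical and exists only to group the two calls; `sc` is assumed to be an active SparkContext and `model` a `MatrixFactorizationModel` returned by `ALS.train`. In the doc example itself this would presumably be just the import plus the two calls.

```python
def save_and_reload(sc, model, path):
    """Save a trained MatrixFactorizationModel, then load it back.

    Hypothetical helper for illustration only. `sc` is assumed to be the
    active SparkContext; both save and load take it as their first argument.
    """
    # The import the review asks for. It is done inside the function so that
    # defining the helper does not itself require Spark to be installed.
    from pyspark.mllib.recommendation import MatrixFactorizationModel

    # Pass sc explicitly, as requested in the review comment.
    model.save(sc, path)

    # Load the model back from the same path, again passing sc.
    return MatrixFactorizationModel.load(sc, path)
```

With a running context this would be called as `sameModel = save_and_reload(sc, model, "myModelPath")`. Saving to a path that already exists is expected to fail, so a fresh path is assumed here.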
Test build #28088 has finished for PR 4811 at commit
Test PASSed.
LGTM. I ran into a bug running the example, but it seems to be coming from elsewhere. It happens when calling train, though only intermittently:
I'll make a separate JIRA for it.
Made JIRA: [https://issues.apache.org/jira/browse/SPARK-6071]
Test build #28151 has started for PR 4811 at commit
Test build #28151 has finished for PR 4811 at commit
Test PASSed.
LGTM
A simple wrapper to save/load `MatrixFactorizationModel` in Python. jkbradley

Author: Xiangrui Meng <meng@databricks.com>

Closes #4811 from mengxr/SPARK-5991 and squashes the following commits:

f135dac [Xiangrui Meng] update save doc
57e5200 [Xiangrui Meng] address comments
06140a4 [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into SPARK-5991
282ec8d [Xiangrui Meng] support save/load in PySpark's ALS

(cherry picked from commit aedbbaa)
Signed-off-by: Xiangrui Meng <meng@databricks.com>

Merged into master and branch-1.3. Thanks!