
[WIP] Changes in sklearn wrappers for LDA and LSI models #1398

Merged

Conversation

chinmayapancholi13 (Contributor)

This PR makes the following changes in the sklearn wrappers for the LDA and LSI models (a sketch of the resulting pattern follows the list):

  • Doesn't train the model from within the wrappers' __init__ function (following the convention used by scikit-learn estimators)
  • Uses super() when calling the parent class's __init__ from the fit function
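
A minimal sketch of the resulting pattern, assuming a hypothetical SklLdaWrapper class with a reduced parameter set (illustrative, not the wrapper's actual code):

```python
from gensim import matutils, models
from sklearn.base import BaseEstimator, TransformerMixin


class SklLdaWrapper(models.LdaModel, BaseEstimator, TransformerMixin):
    """Illustrative sklearn-style wrapper around gensim's LdaModel."""

    def __init__(self, num_topics=100, id2word=None, passes=1):
        # Only store hyperparameters here; no training happens in
        # __init__, matching the scikit-learn estimator convention.
        self.num_topics = num_topics
        self.id2word = id2word
        self.passes = passes

    def fit(self, X, y=None):
        # Training is deferred to fit(): the corpus X is handed to the
        # parent gensim model's __init__ via super(), which trains it.
        super(SklLdaWrapper, self).__init__(
            corpus=X, num_topics=self.num_topics,
            id2word=self.id2word, passes=self.passes)
        return self

    def transform(self, docs):
        # Map each bag-of-words document to a dense topic-probability vector.
        return [matutils.sparse2full(self[doc], self.num_topics) for doc in docs]
```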

@@ -46,15 +46,6 @@ def __init__(
self.gamma_threshold = gamma_threshold
self.minimum_probability = minimum_probability
self.random_state = random_state
Contributor
Please remove the corpus parameter from the constructor (a corpus should be passed only to the fit* methods) in both models.
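
In other words, a corpus would be supplied only at fit time. A hypothetical usage sketch, reusing the illustrative SklLdaWrapper from above:

```python
from gensim.corpora import Dictionary

texts = [["human", "computer", "interaction"], ["graph", "trees"]]
dictionary = Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]

# The constructor receives only hyperparameters...
model = SklLdaWrapper(num_topics=2, id2word=dictionary)
# ...and the corpus is supplied at fit time instead.
model.fit(corpus)
```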

@menshikh-iv (Contributor)

The same as #1405

@chinmayapancholi13 changed the title from "Changes in sklearn wrappers for LDA and LSI models" to "[WIP] Changes in sklearn wrappers for LDA and LSI models" on Jun 15, 2017
transformed_approx = matutils.sparse2full(transformed, 2)  # better approximation
-       expected = [0.13, 0.87]
+       expected = [0.87, 0.0]
passed = numpy.allclose(sorted(transformed_approx), sorted(expected), atol=1e-1)
self.assertTrue(passed)
Contributor
This is often broken on Travis; please check what the reason for this is.

Contributor Author
I have made changes to solve this problem and this should now be resolved. :)

@@ -191,6 +181,22 @@ def testSetGetParams(self):
for key in param_dict.keys():
self.assertEqual(model_params[key], param_dict[key])

def testPersistence(self):
Contributor
Same as LdaSeq

Contributor Author
Thanks. Done.

score = text_lda.score(corpus, data.target)
text_lsi = Pipeline((('features', model,), ('classifier', clf)))
text_lsi.fit(corpus, data.target)
score = text_lsi.score(corpus, data.target)
self.assertGreater(score, 0.50)
Contributor
same as LdaSeq

Contributor Author
Thanks. Done.
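
For context, the snippet above scores the wrappers inside a scikit-learn Pipeline. A minimal runnable sketch of that pattern, again using the illustrative SklLdaWrapper and a toy dataset (both are assumptions, not the test's actual fixtures):

```python
from gensim.corpora import Dictionary
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy dataset: two classes of tiny documents.
texts = [["human", "computer", "interaction"],
         ["graph", "trees", "minors"],
         ["computer", "system", "interface"],
         ["graph", "minors", "survey"]]
targets = [0, 1, 0, 1]
dictionary = Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]

# The wrapper acts as the feature-extraction step of the pipeline.
model = SklLdaWrapper(num_topics=2, id2word=dictionary)
clf = LogisticRegression()
text_lda = Pipeline([('features', model), ('classifier', clf)])
text_lda.fit(corpus, targets)
score = text_lda.score(corpus, targets)  # mean accuracy on the training docs
```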

@menshikh-iv merged commit b989be7 into piskvorky:develop on Jun 20, 2017