
LDA hyperparameter fix: eta dimensionality and optimization #1024

Merged: 18 commits into piskvorky:develop, Nov 29, 2016

Conversation

@olavurmortensen (Contributor) commented Nov 18, 2016

I'm working on a problem with the parameter eta and the automatic learning of eta. For now, the unchanged version is kept in ldamodelold.py so that I can compare both versions. There is a notebook in docs/notebooks/lda_tests.ipynb with the tests I have been running.

The changes I've made improve convergence (the bound increases over iterations) when learning eta.

The problem

Basically, eta is supposed to be either

  • a scalar
  • a V vector

where V is the size of the vocabulary. In ldamodel, we instead have the options that eta is

  • a scalar
  • a K vector
  • a K x V matrix

where K is the number of topics. As we will see in a little bit, this causes some problems.
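
For concreteness, here is a minimal sketch of the shapes involved (`num_topics` and `num_terms` are illustrative values; the variable names mirror LdaModel's parameters):

```python
import numpy as np

num_topics = 10   # K, number of topics
num_terms = 5000  # V, vocabulary size

# What eta is supposed to be: one smoothing value per vocabulary word,
# shared across topics.
eta_scalar = 0.01                      # symmetric prior
eta_vector = np.full(num_terms, 0.01)  # shape (V,)

# What ldamodel accepted instead: a value per *topic*, which does not
# match the topic-word Dirichlet that eta parameterizes.
eta_per_topic = np.full(num_topics, 0.01)  # shape (K,) -- problematic
```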

Results so far

The figure below shows the bound (per-word bound) as a function of the number of passes over the data. It shows the bound both before and after the changes I made, for eta='symmetric' and for eta='auto'.

[Figure: per-word bound vs. number of passes, before and after the fix, for eta='symmetric' and eta='auto'.]

The figure shows that using an automatically learned eta actually decreased the bound before this fix.

I also noticed that if we set iterations=1 and eta='auto', the algorithm diverges. This is no longer a problem with this commit.

Asymmetric priors

I added a check that raises a ValueError if eta is equal to 'asymmetric'. While an explicit asymmetric prior supplied by the user makes sense (e.g. if the user wants to boost certain words, or has estimated the prior from data), there is no reason to use an arbitrarily initialized asymmetric prior. Furthermore, with the current method of initializing asymmetric priors, when eta is the same size as the vocabulary, I noticed some convergence problems.

As a side note on the topic of asymmetric priors, I'm not sure the current method is ideal. One could, for example, initialize from a gamma distribution, the same way lambda and gamma are initialized.
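
For illustration, a minimal sketch of what such an initialization could look like, assuming the Gamma(100, 1/100) parameters gensim uses when drawing the initial lambda (applying them to eta here is an assumption, not code from the PR):

```python
import numpy as np

rng = np.random.RandomState(0)
num_terms = 5000  # illustrative vocabulary size

# Hypothetical alternative: draw each entry of an asymmetric eta from
# Gamma(100, 1/100), as gensim does for its initial lambda. Entries are
# then close to 1 with small random variation.
eta_init = rng.gamma(100., 1. / 100., (num_terms,))
```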

@olavurmortensen (Contributor Author) commented Nov 18, 2016

There is one problem that remains. The symmetric priors are initialized as

```python
init_prior = np.asarray([1.0 / self.num_topics for i in xrange(prior_shape)])
```

instead of

```python
init_prior = np.asarray([1.0 / prior_shape for i in xrange(prior_shape)])
```

because if 1.0 / prior_shape is used, the optimization does not converge: when the prior is a small number (e.g. eta = 0.001), it diverges. This does not happen with e.g. alpha = 0.001, however.

While this works fine, I'm not sure if this way of initializing the priors is sound in general.

EDIT: Since Matt Hoffman's code (which Gensim's code is based on) uses eta = 1 / num_topics and alpha = 1 / num_topics, I gather we might as well do the same here (see Matt's code here).
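
For reference, a small sketch of the resulting defaults (illustrative sizes; this just restates the 1 / num_topics convention discussed above):

```python
import numpy as np

num_topics, num_terms = 10, 5000  # illustrative

# Following Matt Hoffman's onlineldavb: both priors default to
# 1 / num_topics, even though eta has length num_terms.
alpha_init = np.full(num_topics, 1.0 / num_topics)
eta_init = np.full(num_terms, 1.0 / num_topics)
```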

Commit (title truncated): …s,)'. Removed tests of asymmetric eta, and where eta has shape '(num_topics, num_terms)'.
@olavurmortensen (Contributor Author):

PR ready for review.

@tmylk @piskvorky

@hazelybell commented:

I agree that this should be changed. When I wrote the original auto_eta code, I didn't realize that gensim was using the smoothed LDA model, so I misinterpreted the variables, and my code doesn't make sense in the context of the smoothed LDA model.

@olavurmortensen (Contributor Author):

The failing unit tests are not caused by my code. There is more on that in issue #971.

@tmylk (Contributor) left a comment:


Please re-do the merge with develop to make sure keyedvectors get properly merged in.

```diff
@@ -0,0 +1,762 @@
+{
```
Contributor:

Please give a better name to the ipynb

Contributor Author:

Once we are ready to merge, I'll remove this file.

```diff
@@ -7,6 +7,7 @@
 from .coherencemodel import CoherenceModel
 from .hdpmodel import HdpModel
 from .ldamodel import LdaModel
+from .ldamodelold import LdaModelOld
```
Contributor:

when we merge, there is no need to keep the old code.

Contributor Author:

Once we are ready to merge, I'll remove this file.

```diff
@@ -84,7 +84,7 @@ def update_dir_prior(prior, N, logphat, rho):

     dprior = -(gradf - b) / q

-    if all(rho * dprior + prior > 0):
+    if (rho * dprior + prior > 0).all():
```
Contributor:

why this change?

Contributor Author:

The old way has never worked for me. Whenever I install Gensim from source I have to change this line. Can we just keep it this way?

Contributor:

what is the error that you get? which python version?

Contributor Author:

I just tested it, and I don't experience this problem any more. Something I've done must have fixed it for me. So I just changed it back.
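
As background on why the two spellings differ at all (an illustration, not part of the PR): Python's builtin all() iterates over the array, which happens to work for a 1-D boolean array but raises for 2-D arrays, whereas ndarray.all() reduces over every element regardless of shape:

```python
import numpy as np

cond_1d = np.array([0.1, 0.2]) > 0
print(all(cond_1d))    # True: builtin all() iterates over boolean scalars
print(cond_1d.all())   # True: numpy reduction, same result for 1-D input

cond_2d = np.ones((2, 3)) > 0
print(cond_2d.all())   # True: reduces over all six elements
# all(cond_2d) would raise ValueError: iteration yields whole rows, and
# the truth value of a multi-element row is ambiguous.
```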

```diff
@@ -301,17 +303,19 @@ def __init__(self, corpus=None, num_topics=100, id2word=None,
         self.minimum_phi_value = minimum_phi_value
         self.per_word_topics = per_word_topics

+        self.random_state = get_random_state(random_state)
```
Contributor:

why move this?

Contributor Author:

Oh, there was a reason, but it's not relevant any more. I'll just move it back to its old spot.

```diff
-        assert (self.eta.shape == (self.num_topics, 1) or self.eta.shape == (self.num_topics, self.num_terms)), (
-            "Invalid eta shape. Got shape %s, but expected (%d, 1) or (%d, %d)" %
-            (str(self.eta.shape), self.num_topics, self.num_topics, self.num_terms))
+        assert self.eta.shape == (self.num_terms,), "Invalid eta shape. Got shape %s, but expected (%d, )" % (str(self.eta.shape), self.num_terms)
```
Contributor:

the old format was ok

Contributor Author:

Ok, re-re-formatted :)

```diff
@@ -0,0 +1,1049 @@
+#!/usr/bin/env python
```
Contributor:

no need for this file in the merge.

Contributor Author:

Once we are ready to merge, I'll remove this file.

```diff
         model = self.class_(**kwargs)
         self.assertEqual(model.eta.shape, expected_shape)
-        self.assertTrue(np.allclose(model.eta, [[0.630602], [0.369398]]))
+        self.assertTrue(all(model.eta == np.array([0.5] * num_terms)))
```
Contributor:

where is the asymmetric eta exception test?

Contributor Author:

The "asymmetric" option should not be used for eta. It makes arbitrary words more likely apriori, which does not make sense. I have added an exception in the ldamodel.py file to make sure that eta is not equal to "asymmetric".

Contributor:

where do you test that the exception happens?
Please add an assertRaises test case.

Contributor Author:

It is tested here. I added an assertRaises in the unit tests.
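
A minimal sketch of what that test looks like, following the conventions of the surrounding test code (self.class_ is the model class under test, kwargs its constructor arguments; the exact test body may differ):

```python
kwargs['eta'] = 'asymmetric'
self.assertRaises(ValueError, self.class_, **kwargs)
```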

Contributor:

that is not a test, it is a check :) just kidding.


```python
        # all should raise an exception for being wrong shape
        kwargs['eta'] = testeta.reshape(tuple(reversed(testeta.shape)))
        self.assertRaises(AssertionError, self.class_, **kwargs)

        kwargs['eta'] = [0.3, 0.3, 0.3]
```
Contributor:

use num_terms+1 instead of just 3 elements

Contributor Author:

Done. Although just num_terms, not num_terms+1.

Contributor:

where is this change? line 213/196 is still the same


@olavurmortensen (Contributor Author) commented Nov 28, 2016

I re-introduced the K x V asymmetric prior. While a K vector eta is not correct, there is a special use case for the K x V prior.

As stated in the docstring for eta, with a K x V asymmetric prior the user may impose specific priors for each topic (e.g. to coerce a specific topic, or to get a "garbage collection" topic).

@tmylk can you read the new docstring, just to check that it's ok?

EDIT: @tmylk I made some unit tests for K x V eta (same as were there before).
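
For illustration, a minimal sketch of building such a K x V prior (the sizes and boosted word ids are made up; corpus and id2word are assumed to exist):

```python
import numpy as np
from gensim.models import LdaModel

num_topics, num_terms = 10, 5000  # illustrative sizes

# Start from a flat prior, then boost a few hand-picked word ids in
# topic 0, coercing that topic toward those words a priori.
eta = np.full((num_topics, num_terms), 0.01)
eta[0, [42, 137, 2048]] = 10.0  # hypothetical word ids to boost

# model = LdaModel(corpus=corpus, id2word=id2word,
#                  num_topics=num_topics, eta=eta)
```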

```diff
         self.assertRaises(AssertionError, self.class_, **kwargs)

-        kwargs['eta'] = [0.3, 0.3, 0.3]
+        kwargs['eta'] = [0.3] * num_terms
```
Contributor:

this is the same as line 188 above

Contributor Author:

Removed it. Don't know why I put that there.

```diff
@@ -210,15 +201,15 @@ def testEta(self):
         kwargs['eta'] = testeta.reshape(tuple(reversed(testeta.shape)))
         self.assertRaises(AssertionError, self.class_, **kwargs)

-        kwargs['eta'] = [0.3, 0.3, 0.3]
-        self.assertRaises(AssertionError, self.class_, **kwargs)
-
         kwargs['eta'] = [0.3]
         self.assertRaises(AssertionError, self.class_, **kwargs)
```
Contributor:

where did the assertRaises for a longer eta shape go?

Contributor Author:

I'll add an assertRaises where the shape of eta is num_terms+1.

@tmylk (Contributor) commented Nov 28, 2016

Let's clean it up and then it's ready to merge.
Please add a good explanation to CHANGELOG.md. I will publish your explanation of this breaking change together with the release notes.

@olavurmortensen (Contributor Author):

@tmylk should the explanation be under "0.13.5, 2016-11-12" in the changelog? Or where should I put it?

@olavurmortensen (Contributor Author):

@tmylk you said:

> Please re-do the merge with develop to make sure keyedvectors get properly merged in.

What do you mean? What happened?

@tmylk (Contributor) commented Nov 28, 2016

The explanation should be at the top of CHANGELOG.md.

@olavurmortensen (Contributor Author):

@tmylk CHANGELOG.md updated.

@tmylk tmylk merged commit 54871ba into piskvorky:develop Nov 29, 2016