-
-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LDAModel depracation warning (issue 494) #513
Conversation
@@ -367,7 +367,15 @@ def inference(self, chunk, collect_sstats=False): | |||
# Lee&Seung trick which speeds things up by an order of magnitude, compared | |||
# to Blei's original LDA-C code, cool!). | |||
for d, doc in enumerate(chunk): | |||
ids = [id for id, _ in doc] | |||
if isinstance(doc, numpy.ndarray): | |||
if len(doc) == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
len(array)
will return the length along the first axis (=number of rows). Is that always the case here, is it what we want?
Thanks Matti. We can do these ad-hoc fixes in various places, though I still think it's cleaner to solve this at the place of origin = where the float values are generated ( Also, what's the advantage of your |
If the I didn't consider the case of 1-d documents, the I thought I was being so clever. |
:D You are clever, just in this case I think simplicity and consistency are more important (it's not the bottleneck). |
…for calling .update with chunks_as_numpy=True/False.
Ok, so I decided to keep the check for the IDs being ints since I assume numpy will complain about it, pyro or no pyro. The default is now for the chunks to be not numpy, and there's an option on I had an issue with some tests failing because my distro wasn't picking up on the added keyword parameter to |
That looks like a nice solution! Thanks. If it later turns out this "make sure ids are ints" functionality is more widely used, we could factor it out into a separate function, something like Travis reports all tests as failing though:
|
Ah I see, the fails are coming from |
Thank you very much Matti, ready to merge! Can you add yourself + a short description of this change to CHANGELOG? |
I always forget the CHANGELOG |
LDAModel depracation warning (issue 494)
Great, thanks! |
Unfortunately LDA doesn't converge on Windows.
|
That's probably unrelated to this PR -- or do you see a connection? In general, on these "Windows failures": maybe the RNG works differently on Windows, which causes tests that pass on *NIX to fail on Windows? |
It was passing on Sunday and it's the only change I can find around it.
|
I decided to not call
utils.grouper
withas_numpy=False
and break the Pyro stuff as I thought we can work around that. So now there's just a check in.inference
that makes sure the indices areints
when a numpy array is passed in.This seemed like a better choice as it doesn't introduce any breaking changes unnecessarily.