Str to byte fix for DTM wrapper. #768
Note: ideally we would convert to bytes from the list itself, instead of converting to a string and then to bytes.
@tmylk, I can't convert directly from an integer to bytes while writing to the file; it has to go through a string first. Any ideas for a workaround?
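One possible workaround for the question above, as a minimal sketch: bytes %-formatting produces the ASCII line straight from the integer, without an explicit `str()` call. It works on Python 2 and on Python 3.5 or later, but not on Python 3.0 to 3.4. The name `num_slices` is made up for illustration and stands in for `len(self.time_slices)`.

```python
# Hypothetical stand-in for len(self.time_slices).
num_slices = 3

# Route 1: format as text, then encode (works on Python 2 and 3).
via_text = (str(num_slices) + "\n").encode("utf8")

# Route 2: bytes %-formatting (Python 2, and Python 3.5+).
via_bytes_format = b"%d\n" % num_slices

# Both routes yield the same ASCII bytes.
assert via_text == via_bytes_format
```

Either route gives the same bytes, so going through a string first costs nothing in correctness.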
@@ -173,9 +173,9 @@ def convert_input(self, corpus, time_slices):
         corpora.BleiCorpus.save_corpus(self.fcorpustxt(), corpus)

         with utils.smart_open(self.ftimeslices(), 'wb') as fout:
-            fout.write(six.u(str(len(self.time_slices)) + "\n"))
+            fout.write(six.u(utils.to_utf8(str(len(self.time_slices)) + "\n")))
I don't get this. Doesn't six.u return unicode? How does this work, @tmylk?
@bhargavvader we want to store binary strings, not "bytes" as such. We don't want to store numbers as raw bytes here. Integers are always representable in ASCII (a subset of UTF-8), so the bytes/unicode conversion is no problem.
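To illustrate the distinction drawn above, here is a small sketch contrasting the decimal-text form of an integer with a raw binary encoding. The value `count` and the 4-byte little-endian layout are assumptions picked only for the example.

```python
import struct

count = 12  # hypothetical value, e.g. a number of time slices

# Decimal text: one byte per digit. ASCII and UTF-8 agree for digits.
as_text = str(count).encode("ascii")
assert as_text == str(count).encode("utf8")

# Raw binary: a fixed-width machine integer (4-byte little-endian here).
# This is the "raw bytes" form the comment above says we do not want.
as_raw = struct.pack("<i", count)

assert as_text != as_raw
```

The text form stays readable in any editor, which is presumably why a line-oriented file wants it.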
@piskvorky, here it says that I understand we'd want to store it in binary strings, yeah. Edit: Put in a PR and removed
I don't understand how this version, which writes unicode to binary files, ever worked. It means our unit tests must be faulty or incomplete.
Were the changes pushed to PyPI? pip installs version 0.13.1, and the bug still persists.
@piskvorky, this is with respect to #698.
You were right, all that had to be done was to use utils.to_utf8 before writing to the file. Works fine with both Python 2 and 3.
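For anyone landing here later, a rough sketch of the pattern described above: encode to UTF-8 bytes, then write to a file opened in binary mode. The `to_utf8` helper below is a simplified, hypothetical stand-in for gensim's `utils.to_utf8`, not its real implementation, and `write_slice_count` is an invented name; plain `open` is used in place of `utils.smart_open` so the snippet runs without gensim.

```python
import os
import tempfile


def to_utf8(text):
    """Hypothetical stand-in for gensim's utils.to_utf8: return UTF-8 bytes."""
    if isinstance(text, bytes):
        return text  # already bytes (always the case for str on Python 2)
    return text.encode("utf8")


def write_slice_count(path, time_slices):
    """Write the number of time slices as one ASCII line to a binary file."""
    with open(path, "wb") as fout:
        fout.write(to_utf8(str(len(time_slices)) + "\n"))


# Round-trip through a temporary file to check what lands on disk.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    write_slice_count(path, [10, 20, 30])
    with open(path, "rb") as fin:
        written = fin.read()
finally:
    os.remove(path)

assert written == b"3\n"
```

On Python 3, passing the unencoded `str` to `fout.write` on a `'wb'` handle raises `TypeError`, which is why the encode step is needed there.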