MmCorpus file-like object support bug #1869
Comments
@menshikh-iv I am trying to fix this by replacing the "with" statement with the following code. Do you have any other suggestions for this?
Current code looks like this:
@menshikh-iv Using a "with" statement will close the file object anyway, so do you plan to use the nested try and finally blocks mentioned in that thread? For the indexed corpus, we need to support both a file object and a filename, right? So I guess for file-object support we can directly convert it into a numpy array.
@sj29-innovate about "with": yes, this is a problem in this case (because we shouldn't close the file if we didn't open it). For the indexed corpus: yes, we need both (but I'm not sure whether that is possible).
@menshikh-iv I believe we can support both a filename and a file-like object, with the difference that we won't be able to save the index to a file in the file-object case. This can create performance issues if any subclass needs to call "super", because the "__init__" function of "IndexedCorpus" will again have to generate a new indexed corpus from the object instead of using a previously saved one. Also, please have a look at this: I tried to fix the file-closing problem using nested try and finally blocks, as mentioned in that thread.

Current code:

    with utils.file_or_filename(self.input) as lines:
        BLOCK

Changes:

    mgr = utils.file_or_filename(self.input)
    exit = type(mgr).__exit__
    value = type(mgr).__enter__(mgr)
    exc = True
    try:
        try:
            lines = value
            BLOCK
        except:
            exc = False
            if not exit(mgr, *sys.exc_info()):
                raise
    finally:
        if isinstance(self.input, string_types):
            exit(mgr, None, None, None)

I have tested the changes as per your demonstration and they work fine (the file is not closed).
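One way to keep this ownership logic out of the reader itself is to move it into a small context manager, so that the calling code can keep a plain "with" statement. The following is only a minimal sketch under assumptions, not gensim's implementation: `open_if_path` and `count_lines` are hypothetical names, only the standard library is used, and any `str` is treated as a path while anything else is assumed to be an already-open file-like object.

```python
import contextlib
import io


@contextlib.contextmanager
def open_if_path(source, mode="rb"):
    """Yield a readable object and close it only if this function opened it.

    Hypothetical helper for illustration; not part of gensim.
    """
    if isinstance(source, str):
        # We opened this handle, so we are responsible for closing it,
        # even if the body of the `with` block raises.
        handle = open(source, mode)
        try:
            yield handle
        finally:
            handle.close()
    else:
        # The caller owns this object: hand it through and leave it open.
        yield source


def count_lines(source):
    """Stand-in for BLOCK: any code that iterates over the input lines."""
    with open_if_path(source) as lines:
        return sum(1 for _ in lines)


# Usage: a file-like object passed by the user stays open afterwards.
buf = io.BytesIO(b"1 2 3\n4 5 6\n")
print(count_lines(buf), buf.closed)
```

With this shape, the check for "did we open it?" lives in one place, and a filename is still closed deterministically when the block exits.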
Aha, so this issue is probably impossible to fix (but the current code looks scary, especially the formatting). Try to publish a PR and write tests (I'll look into it again on a concrete example).
Intro
We have some "weird" behavior if a user passes a file-like object to MmCorpus, based on this mailing list thread.
Demonstration
What happens
The file-like object is closed when we call MmReader; the problem is located here: https://github.com/RaRe-Technologies/gensim/blob/5342153eb4f4b02bb45bfa3951eef8250ac9f6b6/gensim/matutils.py#L1274
"with" automatically closes the file-like object when we go out of scope. This is OK if we opened the file ourselves, but we shouldn't close a file-like object passed in by the user.
Related PR #1867
UPD: another problem here is the call to IndexedCorpus.__init__, which doesn't support file-like objects at all.
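The closing behaviour described under "What happens" can be reproduced without gensim. This is a minimal sketch using only the standard library; `read_header` is a hypothetical stand-in for a reader that wraps its input in "with", not the actual MmReader code, and the sample Matrix Market text is made up for illustration.

```python
import io


def read_header(source):
    """Hypothetical reader that (incorrectly) takes ownership of `source`."""
    # Leaving the `with` block calls source.__exit__, which closes the stream,
    # even though this function did not open it.
    with source as lines:
        return next(iter(lines))


stream = io.StringIO(
    "%%MatrixMarket matrix coordinate real general\n2 2 1\n1 1 0.5\n"
)
header = read_header(stream)
print(header.strip())
print(stream.closed)  # the caller's object has been closed

try:
    stream.read()  # any further use by the caller now fails
except ValueError as err:
    print("second use fails:", err)
```

A caller who wanted to reuse the same object for a second pass has no way to do so after the first call returns.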