How to include space in the path to dictionary? #43

Closed
yuyuxing-wu opened this issue Apr 3, 2020 · 7 comments

Comments

@yuyuxing-wu

I want to include a space in the dictionary path and have tried the following:

import MeCab
parser=MeCab.Tagger('-d /opt/test1 test2/dictionary')
parser=MeCab.Tagger('-d /opt/test1\ test2/dictionary')
parser=MeCab.Tagger('-d "/opt/test1 test2/dictionary"')

All three of them fail with a RuntimeError.

Is there any way to include a space in the path?

@polm
Collaborator

polm commented Apr 3, 2020

This is not possible right now because the argument handling doesn't do full shell quoting; see #25 for details.

I haven't thought about this problem in a long time, but I'll take another look at whether there's a good way to handle it. It may be possible to just preparse the arguments with Python's shlex.
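To illustrate the shlex idea, here is a minimal sketch of how the three attempted argument strings would be split under POSIX shell rules. The helper name `split_tagger_args` is hypothetical and is not part of mecab-python3; the sketch only shows what Python's standard `shlex.split` does, not how the library actually handles its arguments.

```python
import shlex


def split_tagger_args(arg_string):
    """Split an option string into tokens using POSIX shell quoting rules.

    Hypothetical helper for illustration only; not mecab-python3 API.
    """
    return shlex.split(arg_string)


# Unquoted: the space splits the path into two separate tokens.
print(split_tagger_args('-d /opt/test1 test2/dictionary'))

# Backslash-escaped space: the path stays in one token.
print(split_tagger_args('-d /opt/test1\\ test2/dictionary'))

# Double-quoted: the path stays in one token and the quotes are removed.
print(split_tagger_args('-d "/opt/test1 test2/dictionary"'))
```

With this kind of preparsing, the escaped and quoted forms each yield a single path token, while the unquoted form is still split at the space.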

@yuyuxing-wu yuyuxing-wu changed the title How to include space in path to dictionary? How to include space in the path to dictionary? Apr 3, 2020
polm added a commit that referenced this issue Apr 3, 2020
Old argument handling couldn't handle quoted arguments. This handles
quoted arguments in Python and passes an array of strings to the C++
MeCab code.

Most of the complexity in this change is not in Python but in SWIG. Most
details are handled by a SWIG typedef converter.

This should fix #25 and #43.
@polm
Collaborator

polm commented Apr 3, 2020

So I took a look at it and this issue should be fixed in master. Unfortunately that broke Node-based parsing for some reason. Should be fixable at least, though can't say how long it'll take.

@polm
Collaborator

polm commented Apr 4, 2020

OK, fixed the issues, so this should be resolved in the latest release candidate.

Could you please install the latest version like below and tell me if it fixes your issue?

pip install mecab-python3==0.996.6rc2

@yuyuxing-wu
Author

Thanks!

I have tried the two patterns below and both of them work!

import MeCab
parser=MeCab.Tagger('-d /opt/test1\ test2 -u /opt/test1\ test2/user.dic')
parser=MeCab.Tagger('-d "/opt/test1 test2" -u "/opt/test1 test2/user.dic"')
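If the paths come from variables rather than literals, one option is to quote them programmatically. The sketch below assumes the option string is parsed with shell-style quoting rules, as the two working patterns in this thread suggest. `build_tagger_args` is a hypothetical helper, not part of mecab-python3, and the round trip is checked only against `shlex.split`, not against MeCab itself.

```python
import shlex


def build_tagger_args(dic_dir, user_dic=None):
    """Build a '-d ... -u ...' option string with shell-quoted paths.

    Hypothetical helper for illustration only; not mecab-python3 API.
    """
    parts = ["-d", shlex.quote(dic_dir)]
    if user_dic is not None:
        parts += ["-u", shlex.quote(user_dic)]
    return " ".join(parts)


args = build_tagger_args("/opt/test1 test2", "/opt/test1 test2/user.dic")
print(args)

# Round trip: splitting the string recovers each path as a single token.
print(shlex.split(args))
```

The resulting string could then be passed to `MeCab.Tagger(args)`, assuming a mecab-python3 version that includes the fix discussed here.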

@polm
Collaborator

polm commented Apr 6, 2020

OK, thanks for the confirmation. I'll release a new update with this change soon.

@yuyuxing-wu
Author

yuyuxing-wu commented Apr 6, 2020

FYI: After this update, escaped characters also need to be changed, as below.

Before:

'-F%M\\t%H\\n'

After:

'-F%M\t%H\n'

@polm
Collaborator

polm commented Jun 29, 2020

The fix for this is included in the 1.0 release today.

@polm polm closed this as completed Jun 29, 2020