💫 Improve "spacy download" and support for different installation prefixes #1456

Closed
honnibal opened this issue Oct 24, 2017 · 4 comments
Labels
enhancement Feature requests and improvements feat / cli Feature: Command-line interface help wanted (easy) Contributions welcome! (also suited for spaCy beginners) help wanted Contributions welcome! install Installation issues

Comments

@honnibal
Member

honnibal commented Oct 24, 2017

Currently, spacy download seems to struggle when Python is installed into a different prefix. See #1220.

Possible resolutions could include an argument to spacy download. Ideal resolution would be detecting the correct prefix, and installing there.

If you work on this, please pay attention to cross-platform and cross-version support. Try to use the pkg_resources module (it ships with setuptools rather than the standard library). The tools in spacy.compat might help with this. Using pathlib.Path for file-system manipulation is also helpful.
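
As a rough illustration of the prefix detection described above, the sketch below asks the running interpreter where it would install pure-Python packages. It uses only the standard library (sysconfig, sys, pathlib). The helper name current_install_target is hypothetical and is not part of spaCy.

```python
import sys
import sysconfig
from pathlib import Path


def current_install_target():
    """Return (prefix, site_packages) for the interpreter running this code.

    Hypothetical helper for illustration only; not part of spaCy.
    """
    prefix = Path(sys.prefix)
    # "purelib" is the directory where pure-Python packages get installed.
    site_packages = Path(sysconfig.get_paths()["purelib"])
    return prefix, site_packages


prefix, site_packages = current_install_target()
print("prefix:", prefix)
print("site-packages:", site_packages)
```

Because the lookup runs inside the interpreter that invoked it, it follows whatever prefix that interpreter was installed into, which is the property the issue asks for.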

@honnibal honnibal added enhancement Feature requests and improvements help wanted Contributions welcome! labels Oct 24, 2017
@ines ines added the install Installation issues label Oct 24, 2017
@ines
Member

ines commented Oct 24, 2017

Related issues: #924, #1284

Also related and could be included in the same enhancement to spacy download:

#1373: python -m spacy download should not overwrite an existing model of the same name

I think the solution should be as simple as appending the model name and version to the file URL in cli.download, e.g. #egg=en_core_web_sm-2.0.0a0 (see here). This should tell pip which package and version you're trying to download and make it behave like it does when you're downloading packages from PyPI.

Edit: Okay, looks like this wasn't so easy – specifying the version with #egg= doesn't seem to work. However, with only the package name, it does skip the download – but this doesn't take the version into account. But maybe I was doing something wrong...

One option could be to make this the default behaviour, and introduce a -U flag for upgrading only. But this would be a big change and not exactly intuitive (since all the user gets to see is the default pip response, which doesn't always make it clear whether a package was installed or whether all requirements were already satisfied.) The alternative would be to have spaCy perform the check – but I'm not sure I like that option, since it'd again require getting the installed distributions and performing hacky checks on them (which always introduces error potential).

But maybe that's what we have to do. If spaCy thinks you already have the model and version installed, spacy download exits and prints a warning, telling you to run spacy download -U to download and install regardless.
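
The check described above could look roughly like the following sketch. Two assumptions to flag: it uses importlib.metadata (standard library, Python 3.8+) in place of the pkg_resources approach mentioned earlier in the thread, and the function name needs_download and its upgrade parameter (standing in for the proposed -U flag) are hypothetical, not spaCy's API.

```python
from importlib import metadata


def needs_download(package, version, upgrade=False):
    """Decide whether a model package should be downloaded.

    Hypothetical helper: `upgrade` stands in for the proposed -U flag.
    """
    if upgrade:
        # The user explicitly asked to download and install regardless.
        return True
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        # Nothing installed under this name, so a download is needed.
        return True
    # Plain string comparison; this does not normalise version strings.
    return installed != version


if not needs_download("en_core_web_sm", "2.0.0"):
    print("Model already installed; pass the upgrade option to reinstall.")
```

The string comparison is deliberately naive. It treats any mismatch as "needs download", which is the kind of hacky check the comment above is wary of.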

@ines ines changed the title from "💫 Improve support for different installation prefixes in spacy download" to "💫 Improve "spacy download" and support for different installation prefixes" Oct 24, 2017
@ines ines added help wanted (easy) Contributions welcome! (also suited for spaCy beginners) 🌙 nightly Discussion and contributions related to nightly builds labels Oct 24, 2017
@jnothman
Contributor

jnothman commented Oct 26, 2017

FWIW, for me pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-1.2.0/en_core_web_sm-1.2.0.tar.gz#egg=en_core_web_sm-1.2.0 successfully avoids downloading when run a second time.

I'm a bit confused about @honnibal's description of #1220 as being an issue about Python being installed into a different prefix. download_model already ensures that the current Python interpreter and its PYTHONPATH are used (an alternative is to avoid subprocess and just use import pip; pip.commands.InstallCommand().main(args_list), assuming that API is stable). The problem in #1220 is that the user does not want to install into the default site-packages, particularly because it is not writable by that user. Forwarding args to pip is one option; another is asking the user to create a pip.conf containing [install]\nuser=true, because that might be useful to them in any case... Changed my mind on pip.conf, which may override virtualenvs...
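
To make the non-writable site-packages case concrete, here is a minimal sketch of how a tool might decide whether to suggest pip's --user option. This is an assumption about one possible approach, not what spaCy does. The helper names in_virtualenv and suggest_pip_args are hypothetical.

```python
import os
import sys
import sysconfig


def in_virtualenv():
    """Best-effort check for an active virtual environment."""
    # venv sets sys.base_prefix; older virtualenv releases set sys.real_prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base_prefix != sys.prefix


def suggest_pip_args():
    """Return extra pip arguments for the current environment (hypothetical)."""
    site_packages = sysconfig.get_paths()["purelib"]
    if in_virtualenv():
        # A user install is not appropriate inside a virtual environment.
        return []
    if os.access(site_packages, os.W_OK):
        # The default target is writable, so no extra options are needed.
        return []
    return ["--user"]


print(suggest_pip_args())
```

The virtualenv check comes first for the reason given above: forcing a user install globally, as a pip.conf entry would, can conflict with virtual environments.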

@parajain

parajain commented Feb 7, 2018

Facing the same issue:


  error: could not create '/usr/local/lib/python2.7/dist-packages/en_vectors_web_lg': Permission denied

    ----------------------------------------
Command "/usr/bin/python2.7 -u -c "import setuptools, tokenize;__file__='/tmp/pip-qn3PBj-build/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-g4hcUl-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-qn3PBj-build/

I tried

pip2 install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-1.2.0/en_core_web_sm-1.2.0.tar.gz#egg=en_core_web_sm-1.2.0

 Running setup.py clean for en-core-web-sm
  Complete output from command /usr/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-vXeKyY/en-core-web-sm/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" clean --all:
  Traceback (most recent call last):
    File "<string>", line 1, in <module>
  IOError: [Errno 2] No such file or directory: '/tmp/pip-build-vXeKyY/en-core-web-sm/setup.py'

@ines ines added the feat / cli Feature: Command-line interface label Mar 27, 2018
ines added a commit that referenced this issue May 20, 2018
Use #egg=model==version to allow pip to check for existing installations. The download is only started if no installation matching the package/version is found. Fixes a long-standing inconvenience.
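
A minimal sketch of how such a URL might be assembled is shown below. The release URL layout is assumed from the en_core_web_sm-1.2.0 link quoted earlier in this thread. BASE_URL and build_model_url are hypothetical names, not spaCy's actual implementation.

```python
# Assumed from the release URL quoted earlier in the thread.
BASE_URL = "https://github.com/explosion/spacy-models/releases/download"


def build_model_url(name, version):
    """Build a model archive URL with an #egg=name==version fragment.

    Hypothetical helper for illustration only.
    """
    release = "{}-{}".format(name, version)
    return "{base}/{release}/{release}.tar.gz#egg={name}=={version}".format(
        base=BASE_URL, release=release, name=name, version=version
    )


print(build_model_url("en_core_web_sm", "1.2.0"))
```
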
ines added a commit that referenced this issue May 20, 2018
Treat all additional arguments passed to the download command as pip options to allow user to customise the command. For example:

python -m spacy download en --user
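
The forwarding of extra arguments can be sketched as follows. This is an illustration under assumptions, not the code from the commit. build_pip_command is a hypothetical helper and the URL in the example is a placeholder. Running pip through sys.executable keeps the install tied to the interpreter that invoked the command.

```python
import sys


def build_pip_command(url, pip_args=()):
    """Build a pip install command for the current interpreter.

    Hypothetical helper: `pip_args` holds any extra options the user
    passed after the model name, forwarded to pip unchanged.
    """
    return [sys.executable, "-m", "pip", "install", *pip_args, url]


# Placeholder URL for illustration only.
cmd = build_pip_command("https://example.com/model.tar.gz", ["--user"])
print(cmd)
# To actually run it: import subprocess; subprocess.check_call(cmd)
```
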
@lock

lock bot commented Jun 19, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Jun 19, 2018

4 participants