Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OSError('[Errno 24] Too many open files') with pytest 3.3.0 #2970

Closed
4 tasks done
vphilippon opened this issue Nov 28, 2017 · 10 comments
Closed
4 tasks done

OSError('[Errno 24] Too many open files') with pytest 3.3.0 #2970

vphilippon opened this issue Nov 28, 2017 · 10 comments
Labels
status: needs information reporter needs to provide more information; can be closed after 2 or more weeks of inactivity

Comments

@vphilippon
Copy link

vphilippon commented Nov 28, 2017

After upgrading to pytest 3.3.0 (previously on 3.2.5), my unitests using pytest and GitPython are failing with:

Cmd('git') not found due to: OSError('[Errno 24] Too many open files')

More precisely, all of my tests pass up to a certain number (I don't have the exact number right now), and then all the following tests start failing due to that error.

Unfortunately, I cannot share the code and tests (private projects).

Here's a partial pip list, after removing a set of private packages:

amqp (1.4.9)
anyjson (0.3.3)
APScheduler (3.2.0)
asn1crypto (0.23.0)
attrs (17.3.0)
backports.shutil-get-terminal-size (1.0.0)
beautifulsoup4 (4.6.0)
billiard (3.3.0.23)
celery (3.1.18)
celery-with-redis (3.0)
certifi (2017.11.5)
cffi (1.11.2)
chardet (3.0.4)
check-manifest (0.36)
CherryPy (3.8.2)
click (6.7)
colander (1.0)
colorama (0.3.9)
cornice (1.0.0)
coverage (4.4.2)
cryptography (2.1.3)
decorator (4.1.2)
devpi-client (2.3.0)
devpi-common (3.2.0)
enum34 (1.0.4)
fastavro (0.14.11)
funcsigs (1.0.2)
functools32 (3.2.3.post2)
future (0.14.3)
futures (3.0.5)
gitdb2 (2.0.3)
GitPython (2.1.7)
graphviz (0.8.1)
hiredis (0.2.0)
idna (2.6)
ipaddress (1.0.18)
ipdb (0.10.3)
ipython (5.5.0)
ipython-genutils (0.2.0)
iso8601 (0.1.12)
Jinja2 (2.10)
jsonschema (2.6.0)
kafka-python (0.9.5)
kazoo (2.2.1)
kombu (3.0.37)
lxml (3.4.4)
MarkupSafe (1.0)
MiniMock (1.2.8)
mock (1.0.1)
msgpack-python (0.4.6)
MySQL-python (1.2.5)
nose (1.3.7)
nose-parameterized (0.3.5)
objgraph (3.1.2)
PasteDeploy (1.5.2)
pathlib2 (2.3.0)
pbr (1.10.0)
pickleshare (0.7.4)
pip (9.0.1)
pkginfo (1.4.1)
pluggy (0.6.0)
plumbum (1.6.4)
prompt-toolkit (1.0.15)
psutil (3.1.1)
py (1.5.2)
pycparser (2.18)
pycrypto (2.6.1)
pycurl (7.19.5.1)
PyDispatcher (2.0.5)
Pygments (2.2.0)
pymongo (3.5.1)
pyOpenSSL (17.4.0)
pypiwin32 (219)
pyramid (1.5.8)
pyreadline (2.0)
pytest (3.3.0)
pytest-mock (1.6.3)
python-dateutil (2.4.2)
python-memcached (1.57)
python-memcached-stats (0.1)
pytz (2017.3)
PyYAML (3.11)
pyzmq (14.7.0)
redis (2.10.6)
repoze.lru (0.6)
requests (2.18.4)
scandir (1.6)
setuptools (28.8.0)
setuptools-scm (1.15.6)
simplegeneric (0.8.1)
simplejson (3.8.2)
six (1.11.0)
smmap2 (2.0.3)
SQLAlchemy (1.0.8)
sqlalchemy-migrate (0.11.0)
sqlparse (0.2.4)
statsd (3.2.1)
stevedore (1.20.1)
tabulate (0.7.7)
Tempita (0.5.2)
tox (2.9.1)
traitlets (4.3.2)
translationstring (1.3)
tzlocal (1.4)
urllib3 (1.22)
validators (0.10.1)
venusian (1.1.0)
virtualenv (15.1.0)
waitress (1.1.0)
wcwidth (0.1.7)
web.py (0.37)
WebOb (1.7.4)
WebTest (2.0.29)
wheel (0.29.0)
win-unicode-console (0.5)
wincertstore (0.2)
wrapt (1.10.11)
xmltodict (0.11.0)
zope.deprecation (4.3.0)
zope.interface (4.4.3)

What I can tell is that those tests perform a lot of "git clone" operation into temp directories made using the tmpdir and tmpdir_factory fixtures.
The only peculiar thing I can think of about my use case is that I have some fixtures from the "per test" level using "session" level fixtures.

Here's an almost identical copy of the conftest.py:

import os

import git
import pytest

@pytest.fixture(scope="session")
def remote_repo(tmpdir_factory):
    tmpdir = tmpdir_factory.mktemp('repos')
    tmpdir.chdir()
    os.mkdir('remote')

    # Repo init (on master by default)
    repo = git.Repo.init('remote')
    # I simplified/obfuscated a part of the git operation here.
    # lots of repo operation like making commits, creating branch and making tag...
    repo.index.commit('master 1st commit')
    repo.index.commit('master 2nd commit')
    master = repo.heads.master

    master.checkout()
    releases_1_0 = repo.create_head('releases/1.0')
    releases_1_0.checkout()
    repo.index.commit('releases 1.0 1st commit')
    repo.index.commit('releases 1.0 2nd commit')
    repo.create_tag('v1.0.0', message='v1.0.0')

    master.checkout()
    merge_base = repo.merge_base(master, releases_1_0)
    repo.index.merge_tree(releases_1_0, base=merge_base)
    repo.index.commit('1st Merge releases/1.0 into master',
                      parent_commits=(master.commit, releases_1_0.commit))

    master.checkout()
    repo.index.commit('master 3rd commit')
    repo.index.commit('master 4th commit')

    return repo


@pytest.fixture(scope="session")
def _local_repo(remote_repo):
    return remote_repo.clone(os.path.abspath('./local'))


@pytest.fixture()
def local_repo_per_session(_local_repo):
    os.chdir(_local_repo.working_dir)
    return _local_repo


@pytest.fixture()
def local_repo_per_test(tmpdir, remote_repo):
    clone = remote_repo.clone(str(tmpdir.join('./local')))
    os.chdir(clone.working_dir)
    return clone

And then I have tests using local_repo_per_session or local_repo_per_test depending on if the test is only doing reads, or if it will change the repo given by the fixture.

It feels as if the file handler for per-test level fixtures aren't closed properly anymore, but that's only my guess.

You can ask for more details about the tests, I'll try to give as much information as I can / am allowed to.

Pytest version: 3.3.0
Python: 2.7 (32 bits version)
OS: Windows 7/10

  • Include a detailed description of the bug or suggestion
  • pip list of the virtual environment you are using
  • pytest and operating system versions
  • Minimal example if possible
@nicoddemus
Copy link
Member

Hi @vphilippon thanks for the detailed report.

Could you try to see which files are being kept open? For Windows 10 you can use Process Explorer to see which files are being kept open by a process. Actually it seems it doesn't provide that information, but it seems a custom script using psutil open_files() function might do the trick.

AFAIK tmpdir and tmpdir_factory don't keep any files open, they just create directories and the fixtures provide the path to the directory as their return value.

@nicoddemus nicoddemus added the status: needs information reporter needs to provide more information; can be closed after 2 or more weeks of inactivity label Nov 28, 2017
@vphilippon
Copy link
Author

Thanks for the quick reply, I'll check this tomorrow morning (EST)

@nicoddemus
Copy link
Member

Another thing to try: one new addition was the new logging plugin, try running the test suite with -p no:logging to see if that makes any difference.

@vphilippon
Copy link
Author

vphilippon commented Nov 29, 2017

Reporting back:

  • Running with -p no:logging did not resolve the issue.
  • Using Process Explorer, I noted that a lot of git.exe process with the command line git cat-file --batch-check were running and accumulating, until the OSError occured (on pytest 3.2.5, those process would stop along the way during the tests, not accumulating).
  • Using psutil, I can tell that the open_files() returns an empty list, so it's not really an issue of "opened file", but more of a "too many subprocess" issue. But I have some more testing to do here, I'll confirm this once I'm done Edit1: I completed my tests, I can confirm that this does return an empty list Edit2: Scratch that, even if I explicitly open a file, psutil.open_files() will still return an empty list. I'm doing something wrong here with my investigation.
    Edit3: I was using an old version of psutil that didn't work on Windows (10?), my bad. I'll get back on investigating this part.
  • The process leak only cause issues when running tests with the per-test fixture. That's not saying that the session-level fixture isn't keeping some unneeded process open, only that it really seems to be bound to the amount of git clone being performed, so running fewer git clone keep the leak at a reasonable level, if that makes sense.

Here's an example of what I get in with Process Explorer:
git-cat-file

@vphilippon
Copy link
Author

vphilippon commented Nov 29, 2017

Update on the psutil.open_files list:
On every test of a test suite run, the list of opened file is exactly:

popenfile(path='C:\\Windows\\SysWOW64\\en-US\\winnlsres.dll.mui', fd=-1)
popenfile(path='F:\\tmp\\tmpmyfgop', fd=-1)
popenfile(path='F:\\tmp\\tmprqrnxo', fd=-1)
popenfile(path='C:\\Windows\\SysWOW64\\en-US\\KernelBase.dll.mui', fd=-1)

with the same 2 temp files everytime. So nothing like something that would cause a "to many opened files" issue.
I get the same list with pytest 3.2.5.

@vphilippon
Copy link
Author

vphilippon commented Nov 29, 2017

The leaking process is not from the git clone, it's from another operation (see below).

I managed to make a reproducible case:
requirements.txt:

pytest==3.3.0
gitpython

conftest.py

import os

import git
import pytest


@pytest.fixture(scope="session")
def remote_repo(tmpdir_factory):
    tmpdir = tmpdir_factory.mktemp('repos')
    tmpdir.chdir()
    os.mkdir('remote')

    # Repo init (on master by default)
    repo = git.Repo.init('remote')
    repo.index.commit('master 1st commit')
    repo.index.commit('master 2nd commit')

    return repo


@pytest.fixture()
def local_repo_per_test(tmpdir, remote_repo):
    clone = remote_repo.clone(str(tmpdir.join('./local')))
    os.chdir(clone.working_dir)
    return clone

test.py

import pytest

@pytest.mark.parametrize('foo', ['bar',]*200)
def test_foobar(foo, local_repo_per_test):
    # This will cause the process leak.
    local_repo_per_test.head.reference = local_repo_per_test.commit('origin/master')

@nicoddemus
Copy link
Member

Thanks @vphilippon for following up!

I can reproduce the issue, and I see it is related to #2968 (comment): you are relying on the garbage collector to implicitly close the Repo object.

If you explicitly close it the process leak goes away:

@pytest.fixture(scope="session")
def remote_repo(tmpdir_factory):
    tmpdir = tmpdir_factory.mktemp('repos')
    tmpdir.chdir()
    os.mkdir('remote')

    # Repo init (on master by default)
    repo = git.Repo.init('remote')
    repo.index.commit('master 1st commit')
    repo.index.commit('master 2nd commit')
    yield repo
    repo.close()


@pytest.fixture()
def local_repo_per_test(tmpdir, remote_repo):
    clone = remote_repo.clone(str(tmpdir.join('./local')))
    os.chdir(clone.working_dir)
    yield clone
    clone.close()

This is definitely a bug in pytest that we intend to fix, but either way explicitly closing your resources is considered good-practice. 👍

@vphilippon
Copy link
Author

Well color me surprised, I didn't even knew the Repo object was a closeable resource.
As someone once said: "Learn to love your bugs, as you will learn from them."
Thanks a lot @nicoddemus! 👍
I'll go ahead and close this one as the root issue is already tracked.

@nicoddemus
Copy link
Member

Great, thanks @vphilippon for patience and understanding!

tonybaloney added a commit to tonybaloney/wily that referenced this issue Dec 14, 2018
…alstead metric typo which was leaving files unopened
@tonybaloney
Copy link
Contributor

Thank you for this well documented issue, I stumbled across the same thing 👍 @nicoddemus

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: needs information reporter needs to provide more information; can be closed after 2 or more weeks of inactivity
Projects
None yet
Development

No branches or pull requests

3 participants