Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

problem collecting files with spaces in filename (regressed in 2.9.2 version) #1706

Closed
ldiary opened this issue Jul 6, 2016 · 22 comments
Closed

Comments

@ldiary
Copy link

ldiary commented Jul 6, 2016

I have this code working in pytest 2.9.1 version:

>>> import os
>>> import pytest
>>> saved_path = os.getcwd()
>>> path = "testbooks\colo\Service_and_Order"
>>> os.chdir(os.path.join(saved_path, path))
>>> os.getcwd()
'C:\\Users\\ernesto.luzon\\tronline\\nboss-ci\\testrunner\\testbooks\\colo\\Service_and_Order'
>>> arguments = '-sv --collect-only --junitxml=616_testlog.xml "Provision Colo Service.ipynb"'
>>> pytest.main(args=arguments)
============================= test session starts =============================
platform win32 -- Python 3.5.1, pytest-2.9.1, py-1.4.31, pluggy-0.3.1 -- C:\Users\ernesto.luzon\tronline\pyenv351\Scripts\python.exe
cachedir: ..\..\.cache
rootdir: C:\Users\ernesto.luzon\tronline\nboss-ci\testrunner\testbooks, inifile: pytest.ini
plugins: testbook-0.0.5
collected 23 items
<Testbook 'Provision Colo Service'>
  <Teststep 'when_i_prepare_customer'>
  <Teststep 'when_i_prepare_contact'>
  <Teststep 'when_i_create_and_edit_room'>
  <Teststep 'when_i_create_and_edit_room'>
  <Teststep 'when_i_create_and_edit_rack'>
  <Teststep 'when_i_create_and_edit_cage'>
  <Teststep 'when_i_create_and_edit_sensor'>
  <Teststep 'when_i_create_and_edit_gate'>
  <Teststep 'when_i_prepare_account_manager'>
  <Teststep 'when_i_create_an_order'>
  <Teststep 'when_i_add_colo_service'>
  <Teststep 'when_i_submit_order'>
  <Teststep 'when_i_link_to_project'>
  <Teststep 'when_i_start_actioning'>
  <Teststep 'when_set_live_date'>
  <Teststep 'when_i_add_colo_cab'>
  <Teststep 'when_i_create_a_colo_permission_for_the_contact'>
  <Teststep 'when_i_activate_colo_cab'>
  <Teststep 'when_i_activate_service'>
  <Teststep 'when_i_complete_the_order'>
  <Teststep 'when_i_deploy_to_customer_portal'>
  <Teststep 'when_i_set_portal_password'>
  <Teststep 'when_i_access_colo_cab_from_customer_portal'>

 generated xml file: C:\Users\ernesto.luzon\tronline\nboss-ci\testrunner\testbooks\colo\Service_and_Order\616_testlog.xml
======================== no tests ran in 0.77 seconds =========================
0
>>> exit()

After the 2.9.2 release, the same code no longer works:

>>> import pytest
>>> import os
>>> saved_path = os.getcwd()
>>> path = "testbooks\colo\Service_and_Order"
>>> os.chdir(os.path.join(saved_path, path))
>>> os.getcwd()
'C:\\Users\\ernesto.luzon\\tronline\\nboss-ci\\testrunner\\testbooks\\colo\\Service_and_Order'
>>> arguments = '-sv --collect-only --junitxml=616_testlog.xml "Provision Colo Service.ipynb"'
>>> pytest.main(args=arguments)
============================= test session starts =============================
platform win32 -- Python 3.5.1, pytest-2.9.2, py-1.4.31, pluggy-0.3.1 -- C:\Users\ernesto.luzon\tronline\pyenv351\Scripts\python
cachedir: ..\..\.cache
rootdir: C:\Users\ernesto.luzon\tronline\nboss-ci\testrunner\testbooks, inifile: pytest.ini
plugins: testbook-0.0.5

 generated xml file: C:\Users\ernesto.luzon\tronline\nboss-ci\testrunner\testbooks\colo\Service_and_Order\616_testlog.xml
======================== no tests ran in 0.75 seconds =========================
ERROR: file not found: A
4
>>> exit()

Perhaps related to what the change log says:
"Fix win32 path issue when puttinging custom config file with absolute path in pytest.main("-c your_absolute_path")."
But I don't know yet which commit to look at and how to figure a workaround on this.

@RonnyPfannschmidt
Copy link
Member

at first glance there seems to be something missing, its looking for a file called "A" there
is the code for the plugin public?

how exactly do you invoke/use pytest?

@ldiary
Copy link
Author

ldiary commented Jul 6, 2016

The code to the plugin can be found here,
https://github.com/ldiary/pytest-testbook/blob/master/pytest_testbook/plugin.py
do you have any idea, how I can modify it to cope with the path changes in 2.9.2?

In order to limit the tests that pytest can discover, I need to change directory (mostly inside subdirectory) before invoking pytest. I'm invoking pytest programmatically using multiprocessing.

arguments = ' '.join(['-sv',
                      "--collect-only" if collect_only == "true" else "",
                      '--junitxml={}'.format(xmllog),
                      '"{}"'.format(testfile) if run_specific_scenario else ''])

process = multiprocessing.Process(name=run_name, target=multi_procesed, args=(arguments, path, run_id, xmllog))
process.start()
def multi_procesed(arguments, path, run_id, logfile):
    sys.stdout = OutputWriter(run_id)
    saved_path = os.getcwd()
    os.chdir(os.path.join(saved_path, path))
    exit_status = pytest.main(args=arguments)

@nicoddemus
Copy link
Member

The code to the plugin can be found here
...
I'm invoking pytest programmatically using multiprocessing.

The code you then post is not found on the link you posted... are you the plugin's author or an user?

But I don't know yet which commit to look at and how to figure a workaround on this.
Could you post a small, reproducible example? We could use that to try to pin point the commit which changed this behavior somehow. A simple workaround is just passing cwd as a parameter instead of changing the actual cwd invocation, but I agree this should be investigated further, as just changing the cwd it should work..

@ldiary
Copy link
Author

ldiary commented Jul 6, 2016

Thanks a lot for looking into this issue!

I'm the author of the plugin and I'd like to modify the plugin to cope with the new release of Pytest.

There's a sample Jupyter Notebook that I uploaded here:
https://github.com/ldiary/pytest-testbook/tree/master/tests/testbooks/colo/Service_and_Order

I then summarized the steps to reproduce the issue using the above Jupyter Notebook here:
https://github.com/ldiary/pytest-testbook/blob/master/tests/Issue1706_Reproduce.md

@nicoddemus
Copy link
Member

@ldiary sorry but I didn't have time to test this yet.

Meanwhile could you please test with pytest's feature branch? We did make some changes recently on how the rootdir was discovered, so this might have been fixed already.

@ldiary
Copy link
Author

ldiary commented Jul 8, 2016

I tried the latest features branch but it gave the same error:

============================= test session starts =============================
platform win32 -- Python 3.5.1, pytest-2.10.0.dev1, py-1.4.31, pluggy-0.3.1 -- C:\Users\ernesto.luzon\testbookenv\Scripts\python.exe
cachedir: ..\..\..\..\.cache
rootdir: C:\Users\ernesto.luzon\pytest-testbook, inifile:
plugins: testbook-0.0.5

 generated xml file: C:\Users\ernesto.luzon\pytest-testbook\tests\testbooks\colo\Service_and_Order\616_testlog.xml
======================== no tests ran in 0.82 seconds =========================
ERROR: file not found: "Provision Colo Service.ipynb"
4
>>>

@ldiary
Copy link
Author

ldiary commented Jul 8, 2016

I think I found the commit that introduced this regression:

Author: taschini <taschini@users.noreply.github.com>  2016-06-08 14:17:55
Committer: taschini <taschini@users.noreply.github.com>  2016-06-14 06:12:40
Parent: accd962c9f88dbd5b2b0eef6efe7bf6fe5444b29 (Fixed issue shadowing error when missing argument on teardown_method)
Child:  4d9e293b4d32fe20b7c5a5ef37ca190481bbfaf3 (Incorporated feedback (#1597).)
Branches: features, master, remotes/origin/features, remotes/origin/master, remotes/origin/reorganize-docs
Follows: 2.9.2
Precedes: 

    Ensure that a module within a namespace package can be found by --pyargs.

------------------------------- _pytest/main.py -------------------------------
index 4a6c087..40df132 100644
@@ -1,7 +1,5 @@
 """ core implementation of testing process: init, session, runtest loop. """
-import imp
 import os
-import re
 import sys

 import _pytest
@@ -25,8 +23,6 @@ EXIT_INTERNALERROR = 3
 EXIT_USAGEERROR = 4
 EXIT_NOTESTSCOLLECTED = 5

-name_re = re.compile("^[a-zA-Z_]\w*$")
-
 def pytest_addoption(parser):
     parser.addini("norecursedirs", "directory patterns to avoid for recursion",
         type="args", default=['.*', 'CVS', '_darcs', '{arch}', '*.egg'])
@@ -658,36 +654,29 @@ class Session(FSCollector):
         return True

     def _tryconvertpyarg(self, x):
-        mod = None
-        path = [os.path.abspath('.')] + sys.path
-        for name in x.split('.'):
-            # ignore anything that's not a proper name here
-            # else something like --pyargs will mess up '.'
-            # since imp.find_module will actually sometimes work for it
-            # but it's supposed to be considered a filesystem path
-            # not a package
-            if name_re.match(name) is None:
-                return x
-            try:
-                fd, mod, type_ = imp.find_module(name, path)
-            except ImportError:
-                return x
-            else:
-                if fd is not None:
-                    fd.close()
+        """Convert a dotted module name to path.

-            if type_[2] != imp.PKG_DIRECTORY:
-                path = [os.path.dirname(mod)]
-            else:
-                path = [mod]
-        return mod
+        """
+        import pkgutil
+        try:
+            loader = pkgutil.find_loader(x)
+        except ImportError:
+            return x
+        if loader is None:
+            return x
+        try:
+            path = loader.get_filename()
+        except:
+            path = loader.modules[x][0].co_filename
+        if loader.is_package(x):
+            path = os.path.dirname(path)
+        return path

     def _parsearg(self, arg):
         """ return (fspath, names) tuple after checking the file exists. """
-        arg = str(arg)
-        if self.config.option.pyargs:
-            arg = self._tryconvertpyarg(arg)
         parts = str(arg).split("::")
+        if self.config.option.pyargs:
+            parts[0] = self._tryconvertpyarg(parts[0])
         relpath = parts[0].replace("/", os.sep)
         path = self.config.invocation_dir.join(relpath, abs=True)
         if not path.check():

The error happens inside def _parsearg(self, arg):.

@nicoddemus
Copy link
Member

Thanks for digging. I don't have time to dig into it right now, pinging @taschini if he wants to join the discussion.

@ldiary
Copy link
Author

ldiary commented Jul 8, 2016

Okay, digging a little bit further, it turns out the code I mentioned above was not related (my apologies to @taschini). The path.check() of def _parsearg(self, arg) was failing because the value of arg was not correct.

When I checked the def parse() of Class Config in config.py

    def parse(self, args, addopts=True):
        # parse given cmdline arguments into this config object.
        assert not hasattr(self, 'args'), (
                "can only parse cmdline args at most once per Config object")
        self._origargs = args
        self.hook.pytest_addhooks.call_historic(
                                  kwargs=dict(pluginmanager=self.pluginmanager))
        self._preparse(args, addopts=addopts)
        # XXX deprecated hook:
        self.hook.pytest_cmdline_preparse(config=self, args=args)
        args = self._parser.parse_setoption(args, self.option, namespace=self.option)
        if not args:
            cwd = os.getcwd()
            if cwd == self.rootdir:
                args = self.getini('testpaths')
            if not args:
                args = [cwd]
        self.args = args

In Pytest 2.9.1,
self.args = ['Provision Colo Service.ipynb']

While in Pytest 2.9.2,
self.args = ['"Provision Colo Service.ipynb"']

There's now an unnecessary quotation marks in the parsed result of Pytest 2.9.2; hence the error, ERROR: file not found: "Provision Colo Service.ipynb"

@nicoddemus
Copy link
Member

Thanks for digging further.

Do you have any idea of who's adding the unnecessary quotation marks?

@ldiary
Copy link
Author

ldiary commented Jul 8, 2016

I think it's the other way around, 2.9.1 was able to remove the quotation marks while 2.9.2 did not bother to remove it during command line parsing. I had to pass the arguments in quotation marks just like below, because the filename has spaces in between.

arguments = '-sv --collect-only --junitxml=616_testlog.xml "Provision Colo Service.ipynb"'

@ldiary
Copy link
Author

ldiary commented Jul 8, 2016

Finally, I am now able to pinpoint the root cause.

In 2.9.1, we have the following line of code in def _prepareconfig(args=None, plugins=None): of config.py

args = shlex.split(args)

In 2.9.2, this changed into the following due to #1523 and #1566:

args = shlex.split(args, posix=sys.platform != "win32")

When I tested on Windows 7, Windows 8 and Windows 10 machines... the posix argument needs to be True in order to get rid of the quotation marks.

C:\Users\ernesto.luzon>python
Python 3.4.3 (v3.4.3:9b73f1c3e601, Feb 24 2015, 22:44:40) [MSC v.1600 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import shlex
>>> arguments = '-sv --collect-only --junitxml=616_testlog.xml "Provision Colo Service.ipynb"'
>>> shlex.split(arguments, posix=False)
['-sv', '--collect-only', '--junitxml=616_testlog.xml', '"Provision Colo Service.ipynb"']
>>> shlex.split(arguments, posix=True)
['-sv', '--collect-only', '--junitxml=616_testlog.xml', 'Provision Colo Service.ipynb']
>>>

Using posix=False on win32 systems, files that have spaces in their filenames can no longer be parsed correctly and thus results to "file not found" error. Should we revert it back to posix=True (the default) or should we just add additional code that will perform additional cleaning for files with spaces in their filenames?

@nicoddemus
Copy link
Member

Hey @ldiary, thanks for pinpointing the cause!

TBH I'm not sure anymore... I made a quick test here as well, and posix=True seems to be the right approach on Windows (although posix=True on Windows does not make sense):

Python 3.4.4 |Continuum Analytics, Inc.| (default, Jun 15 2016, 15:41:58) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import shlex
>>> arguments = r'-c d:\\foo\\bar.py -f "file with spaces.txt"'
>>> print(arguments)
-c d:\\foo\\bar.py -f "file with spaces.txt"
>>> shlex.split(arguments, posix=True)
['-c', 'd:\\foo\\bar.py', '-f', 'file with spaces.txt']
>>> shlex.split(arguments, posix=False)
['-c', 'd:\\\\foo\\\\bar.py', '-f', '"file with spaces.txt"']
>>>

But as commented here in #1554, apparently that was causing one of the tests on Windows to fail:

    def __init__(self, path, data=None):
        self.path = str(path) # convenience
        if data is None:
>           f = open(self.path)
E           IOError: [Errno 2] No such file or directory: 'c:usersappveyorappdatalocaltemp1pytest-of-appveyorpytest-0testdirtest_absolute_win32_path0custom.cfg

On this case passing posix=False seemed to fix the issue. 😦

@nicoddemus
Copy link
Member

Docs for the posix argument:

The posix argument defines the operational mode: when posix is not true (default), the shlex instance will operate in compatibility mode. When operating in POSIX mode, shlex will try to be as close as possible to the POSIX shell parsing rules

@ldiary
Copy link
Author

ldiary commented Jul 9, 2016

It looks like we might need to consider the shlex.whitespace_split as previously mentioned by Rony on #1523 . I'll try to test other options later today.

@RonnyPfannschmidt
Copy link
Member

we should also consider going back on allowing the string, its demonstrating to be utterly error-prone
and it would be a good chance to do this for 3.0

the problems we see clearly show that there is no safe portable way to do this,
unless we implement it our self and its not that hard to make a proper correct list

we should consider leaving some conveniences out simply because correct implementation for us is riddled with a few days of work dragged over weeks due to finding platform incompatibilities one after another

the more this piles up problems, the more im thinking that we want an api to call into pytest that can pass options properly pythonic instead of mushed strings

@nicoddemus
Copy link
Member

nicoddemus commented Jul 9, 2016

we should also consider going back on allowing the string, its demonstrating to be utterly error-prone
and it would be a good chance to do this for 3.0

Hmm actually I was thinking the same, it seems like a good intended minor convenience that proved to be more complicated/error prone than initially thought. We could then make an error passing strings to pytest.main.

@ldiary how do you feel about this?

the more this piles up problems, the more im thinking that we want an api to call into pytest that can pass options properly pythonic instead of mushed strings

That's seems like a good idea... something like:

retcode = pytest.main(args=['path/to/tests', '-k', 'match', 'x'])

Something:

outcome = pytest.run(paths=['path/to/tests'], k='match', exitfirst=True)
outcome.tests # list of "test nodes"
outcome.success # bool
outcome.failure # bool
# etc

This would make easy to integrate pytest into scripts which need to call pytest and actually do something with the results.

@nicoddemus
Copy link
Member

The last suggestion seems also simple to implement on top of the existing mechanisms

@ldiary
Copy link
Author

ldiary commented Jul 11, 2016

@nicoddemus if we have pytest.run that would be super cool.

The actual use cases I have right now are:

outcome = pytest.run(cwd='directory_name', files=['a bc.ipynb', 'subdirectory/cd.ipynb'], exitfirst=True)
outcome = pytest.run(cwd='directory_name', run-all=True, exitfirst=True)
outcome = pytest.run(cwd='directory_name', exclude=['a12.ipynb', 'subdirectory/b24.ipynb'], exitfirst=True)

The cwd option is very important and it serves as a namespace between multiple tests. The files, run-all, and exclude are nice to have and I'll just try to create a plugin if they will not be implemented.

I always encounter a problem using -k "a or b or c or d" when the list of expression to match becomes too long.

@ldiary ldiary changed the title cwd after os.chdir() is not honoured during test collection (regressed in 2.9.2 version) problem collecting files with spaces in filename (regressed in 2.9.2 version) Jul 11, 2016
@nicoddemus
Copy link
Member

@ldiary I see, but can you change your code to pass a list of strings instead of a single string? This way we can remove support for passing a single string to pytest.main, and later on we can add pytest.run or similar.

@ldiary
Copy link
Author

ldiary commented Jul 13, 2016

@nicoddemus I've modified my code to pass a list of strings as you've suggested. And it looks like it is also the perfect workaround for this shlex problem.

I'm happy to close this issue now (assuming you can re-open it should you wish to).

@nicoddemus
Copy link
Member

Thanks! This form is deprecated now. 😁

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants