This repository has been archived by the owner on Jun 21, 2022. It is now read-only.

Name clash between pandas.py & pandas (python 2 on SWAN) #294

Closed
stderr-enst opened this issue Jun 25, 2019 · 4 comments · Fixed by #295

Comments

@stderr-enst

stderr-enst commented Jun 25, 2019

I am using uproot on CERN's SWAN, which hosts a Jupyter notebook with a Python 2 kernel. uproot 3.7.0 was installed via pip install --user uproot on the SWAN node, which essentially puts the Python module into a temporary home dir on the network storage system EOS.

Copy-pasting a section from uproot's examples in Binder

import uproot
for df in uproot.pandas.iterate("http://scikit-hep.org/uproot/examples/Zmumu.root", "events", "p[xyz]1", entrysteps=500):
    print(df[:3])

into the notebook, resulted in the following error:

AttributeError: 'module' object has no attribute 'DataFrame'

Luckily the Internet is big enough that just throwing the error message into Google led me to a useful Stack Overflow discussion. Renaming ~/.local/lib/python2.7/site-packages/uproot/pandas.py to something else and, in __init__.py, replacing

from uproot import pandas

with

from uproot import somethingelse as pandas

really solved this problem.

I am not sure if this is just some kind of edge case or if I'm doing something silly without knowing it, but I thought I'd raise this, since other uproot users might run into this as well.
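
For anyone hitting the same AttributeError, a quick way to confirm a name clash is to check which file an import actually resolved to. The following is only a minimal sketch: `describe_module` is a hypothetical helper, not part of uproot, and the stdlib `json` module stands in for pandas so the snippet runs without extra dependencies.

```python
import importlib


def describe_module(name, attribute):
    """Report where `name` was loaded from and whether it has `attribute`."""
    module = importlib.import_module(name)
    return {
        "name": module.__name__,
        # A shadowing file shows up here as an unexpected path.
        "file": getattr(module, "__file__", None),
        "has_attribute": hasattr(module, attribute),
    }


# `json` / `dumps` stand in for `pandas` / `DataFrame`.
info = describe_module("json", "dumps")
print(info["file"])
print(info["has_attribute"])
```

If the reported file sits inside another package's directory rather than the library's own installation, the import has been shadowed.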

edit:
Ah, I just thought of something else.
In order to load the user's Python modules on SWAN, you have to start the instance with a script that modifies the environment like this:

#! /bin/env sh
export PYTHONPATH=$CERNBOX_HOME/.local/lib/python2.7/site-packages:$PYTHONPATH

Is this bad practice in some sense?

@chrisburr
Member

The bug is caused by Python 2 supporting implicit relative imports (see here for details). These were removed in Python 3 because they can cause difficult-to-understand bugs. The "new" behaviour can be enabled for a single file in Python 2.5+ by adding this as the first non-comment line:

from __future__ import absolute_import

I'll make a PR to add this to every file.
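
To illustrate the effect, here is a rough sketch rather than the actual uproot layout. It builds a throwaway package (the name `demo_pkg` is made up) containing a submodule called `json.py`, with the stdlib `json` standing in for pandas. With the `__future__` line in place, the `import json` inside that submodule resolves to the top-level module instead of to the submodule itself. Python 3 always behaves this way, so the line only changes anything on Python 2.

```python
from __future__ import absolute_import  # only changes behaviour on Python 2

import importlib
import os
import sys
import tempfile
import textwrap

# Build a throwaway package whose submodule shares its name with a
# top-level module (stdlib `json` stands in for pandas here).
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")  # hypothetical package name
os.mkdir(pkg_dir)

with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")

with open(os.path.join(pkg_dir, "json.py"), "w") as f:
    f.write(textwrap.dedent('''\
        from __future__ import absolute_import

        import json  # top-level json, not demo_pkg.json

        RESOLVED_NAME = json.__name__
        HAS_DUMPS = hasattr(json, "dumps")
    '''))

# Make the throwaway package importable and load the clashing submodule.
sys.path.insert(0, root)
shadow = importlib.import_module("demo_pkg.json")

print("%s %s" % (shadow.RESOLVED_NAME, shadow.HAS_DUMPS))
```

Without the `__future__` line, Python 2 would try the sibling `demo_pkg/json.py` first, so the module would end up importing itself and the attribute lookup would fail, much like the `DataFrame` error above.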

Using PYTHONPATH is generally bad practice and it's only really intended for quick hacks as it can have unexpected side effects (like this bug, which comes from SWAN's use of PYTHONPATH).
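
As a rough illustration of why PYTHONPATH can have side effects, the sketch below starts a child interpreter with PYTHONPATH set and looks at where that directory ends up in `sys.path`. The temporary directory is a made-up stand-in for the SWAN path, and the exact position can vary with the Python version and site configuration.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical stand-in for $CERNBOX_HOME/.local/lib/python2.7/site-packages.
extra_dir = tempfile.mkdtemp()

# Launch a child interpreter with PYTHONPATH pointing at that directory
# and have it print its sys.path, one entry per line.
child_env = dict(os.environ, PYTHONPATH=extra_dir)
child_code = "import sys; print('\\n'.join(sys.path))"
output = subprocess.check_output([sys.executable, "-c", child_code], env=child_env)

# Resolve symlinks on both sides so the comparison is robust.
child_paths = [os.path.realpath(p) for p in output.decode().splitlines()]
position = child_paths.index(os.path.realpath(extra_dir))

print("PYTHONPATH entry is at sys.path index %d of %d" % (position, len(child_paths)))
```

Entries from PYTHONPATH are normally placed near the front of `sys.path`, ahead of the regular site-packages directories, so anything found there takes precedence over the standard installation.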

@jpivarski
Member

@mhedges I guess we forgot to test Python 2. Probably the cleanest way to deal with it is to introduce a prefix on all the modules that call external libraries, like ext_pandas and ext_pyarrow. In uproot, I put all of these in a _connect submodule and gave everything an underscore as a prefix, but those were private, internal functions. We want these to be publicly accessible. Normally, I wouldn't like the ext_ prefix because it's unguessable, but everything in the public API is exposed at the top level (awkward.*), so it's not a problem for finding entry points, only for finding implementations.

I'll do this soon.

@jpivarski
Member

I was totally confused, thinking this was a bug report on awkward (so it's not relevant for your extension, @mhedges). In awkward, we could have changed pandas.py → ext_pandas.py and arrow → ext_pyarrow.py with some downstream changes to account for that, but in uproot, we really need the name to be pandas.py so that uproot.pandas.iterate has the form that it does (for symmetry with uproot.iterate and ttree.pandas.*). So @chrisburr's solution in PR #295 is really the only way to do it: to make Python 2 and 3 behave the same way (the Python 3 way). I'll close this and merge PR #295.

@stderr-enst
Author

Thanks a lot for the quick fix, you two! The new release from PyPI works like a charm.
