Fix iteration over DataFrames and provide more interfaces #264
…rame; tree.df and tree.iterate are good shortcuts; uproot.pandas.iterate doesn't work yet
Hi Jim, I just manually updated my uproot version to 3.4.16 following the master branch, and I'm running into this error now when using uproot.iterate:
I notice you mentioned something about Int64Index in the PR note, but I'm not sure how exactly this might relate? Thanks!
I chose the Does a new version of Pandas fix it? Even if it does, I'm going to want to support more than just the latest version...
This method was introduced in Pandas 0.24.0. The new property is
Cool, I'm trying to install pandas 0.24.2 (current one is 0.23.4). It's probably a good idea to have the latest pandas anyway.
Cool, that worked! I think you were faster than me and I ended up getting 3.4.17, so it works with pandas 0.24.2. I can also try with my other conda env that still has pandas 0.23.4 to see if 3.4.17 works as well.
Running another version through is also a good test of Travis. It looks like the system is back up, mostly. Jobs start in decent time now, but "solving environment" for conda in Python 2.7 now takes longer than Travis is willing to wait. I'm thinking this issue is unrelated to the Travis outage; it just happened at the same time. I don't like not being able to deploy whenever I want. (Grumble.)
I feel you! But apart from the conda delay it looks like this was a major Travis outage. Anyway, I tested 3.4.17 with 0.23.4 and it seems to work as well! I also asked Alexx, who manages a conda package in the LPC, to update uproot when the latest release gets deployed, so others don't have to install their own env. Thanks a lot for your help and quick problem-solving!
This PR fixes #263 and provides new methods and functions:

- `tree.pandas.iterate` is like `tree.pandas.df` in that it sets some Pandas-friendly defaults, but on `tree.iterate`, rather than `tree.arrays`.
- `uproot.pandas.iterate` sets those Pandas-friendly defaults on `uproot.iterate`.

Various bugs were fixed. For instance, `tree.iterate` really wasn't Pandas-ready: it had fallen considerably behind `tree.arrays`, but fortunately most of the Pandas-specific stuff is in a function call now, so no code duplication was needed to get `tree.iterate` up to speed and now it will inherit any future updates.

Also, the `globalentrystart` setting in `uproot.iterate` was broken for Pandas DataFrames, both for `MultiIndex` and for `RangeIndex`. The original code seemed to be expecting `Int64Index` for some reason.

All in all, the combination of iterate + Pandas was just out of date with respect to the changes that have been more fully tested on arrays + Pandas.