Update to pandas 1.0.4 prior to release #33

Closed
hgrecco opened this issue Jun 20, 2020 · 19 comments

Comments

@hgrecco
Owner

hgrecco commented Jun 20, 2020

The current status of the tests (run on my computer) is:

  • failed: 63
  • passed: 265
  • ignored: 2
@hgrecco
Owner Author

hgrecco commented Jun 20, 2020

Now

  • failed: 62
  • passed: 266
  • ignored: 2

But I'm not sure whether all of them should pass or some are expected to fail.

@znicholls
Contributor

@andrewgsavage I think you did most of the heavy lifting, any thoughts?

@hgrecco
Owner Author

hgrecco commented Jun 21, 2020

Update:

  • failed: 51
  • passed: 301
  • ignored: 29

Things that are failing that I think should not:

  • reverse binary ops (i.e. radd)
  • compare ops

@hgrecco
Owner Author

hgrecco commented Jun 21, 2020

Update:

  • failed: 36
  • passed: 316
  • ignored: 29

@hgrecco
Owner Author

hgrecco commented Jun 21, 2020

Update:

  • failed: 23
  • passed: 329
  • ignored: 29

@andrewgsavage
Collaborator

My last look at it was here.
#29

I also had trouble with the TestMethods.test_where_series test
pandas-dev/pandas#33825

@znicholls
Contributor

znicholls commented Jun 21, 2020

#29

Looks nasty. @hgrecco how would you feel about xfailing the remaining tests and doing a 0.1.0dev release just to reserve the name on PyPI and conda? Yes, it's full of bugs, but I can't see a fast fix for #29, given that you have to somehow remove the __iter__ property on Quantity depending on whether the magnitude is an array or not (or a ScalarQuantity class has to be distinguished from an ArrayQuantity class or something...)

@hgrecco
Owner Author

hgrecco commented Jun 22, 2020

Update:

  • failed: 21
  • passed: 331
  • ignored: 29

@hgrecco
Owner Author

hgrecco commented Jun 22, 2020

@znicholls I actually agree. Adoption will help to find and fix bugs. And 21 failing and 331 passing is not so bad for a 0.1

@hgrecco
Owner Author

hgrecco commented Jun 22, 2020

somehow remove the __iter__ property on Quantity depending on whether the magnitude is an array or not (or a ScalarQuantity class has to be distinguished from an ArrayQuantity class or something...)

This discussion is one of the reasons preventing me from releasing Pint 1.0.

Looking at the errors in pint-pandas, I found that many of the problems arise from a wrong result in is_list_like (likely related to exactly what you mentioned). But ndarray gets it right:

>>> import numpy as np
>>> x = np.zeros(3)
>>> y = np.asarray(3)
>>> type(x)
<class 'numpy.ndarray'>
>>> type(y)
<class 'numpy.ndarray'>
>>> x
array([0., 0., 0.])
>>> y
array(3)
>>> len(x)
3
>>> len(y)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: len() of unsized object
>>> iter(x)
<iterator object at 0x7f8831837af0>
>>> iter(y)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: iteration over a 0-d array
>>> from pandas._libs import lib
>>> lib.is_list_like(x)
True
>>> lib.is_list_like(y)
False

Can we copy from it?

@znicholls
Contributor

Can we copy from it?

Here 'it' being ndarray?

@hgrecco
Owner Author

hgrecco commented Jun 22, 2020

Yes. I am trying to see how it is done. It is weird because Quantity raises TypeError (like ndarray) upon calling __iter__ or __len__
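
One likely piece of the puzzle, sketched with the standard library only: isinstance(obj, abc.Iterable) checks whether the class defines __iter__, not whether calling it succeeds. ToyQuantity below is a hypothetical stand-in and not Pint's actual Quantity class; it only mimics the "delegate iteration to the magnitude" behaviour described in this thread.

```python
from collections.abc import Iterable


class ToyQuantity:
    """Hypothetical stand-in for a Quantity; NOT Pint's real class."""

    def __init__(self, magnitude):
        self.magnitude = magnitude

    def __iter__(self):
        # Delegates to the magnitude, so this raises TypeError for scalars.
        return iter(self.magnitude)


scalar = ToyQuantity(3)
vector = ToyQuantity([1, 2, 3])

# The ABC check only looks for an __iter__ method on the class,
# so both objects are reported as iterable.
scalar_is_iterable = isinstance(scalar, Iterable)
vector_is_iterable = isinstance(vector, Iterable)

# Actually iterating over the scalar still fails.
try:
    iter(scalar)
    scalar_iter_raises = False
except TypeError:
    scalar_iter_raises = True

print(scalar_is_iterable, vector_is_iterable, scalar_iter_raises)
```

So raising TypeError inside __iter__ is not enough to make an object look non-iterable to an isinstance check against abc.Iterable.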

@hgrecco
Owner Author

hgrecco commented Jun 22, 2020

is_list_like is defined here

    return (
        isinstance(obj, abc.Iterable)
        # we do not count strings/unicode/bytes as list-like
        and not isinstance(obj, (str, bytes))
        # exclude zero-dimensional numpy arrays, effectively scalars
        and not (util.is_array(obj) and obj.ndim == 0)
        # exclude sets if allow_sets is False
        and not (allow_sets is False and isinstance(obj, abc.Set))
    )

@znicholls
Contributor

obj.ndim == 0

looks like that'll be the key

@znicholls
Contributor

znicholls commented Jun 22, 2020

util.is_array(obj)

I think is_array is defined here https://github.com/pandas-dev/pandas/blob/f89ce8c079c43df99ff84ce8755fb8ea3915e30c/pandas/_libs/tslibs/util.pxd#L154 so it always returns False for a Pint object, because a Pint object is not a numpy array. That means the check and not (util.is_array(obj) and obj.ndim == 0) never excludes Pint quantities (or any non-numpy array for that matter), even when they are zero-dimensional.
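
A rough pure-Python sketch of that reasoning, for illustration only. ToyArray, ToyQuantity, is_array and is_list_like_sketch are all hypothetical stand-ins (for numpy.ndarray, a Pint Quantity, util.is_array and pandas' is_list_like respectively); only the boolean expression is copied from the snippet quoted earlier in this thread.

```python
from collections import abc


class ToyArray:
    """Hypothetical stand-in for numpy.ndarray (only ndim and __iter__)."""

    def __init__(self, data, ndim):
        self.data = data
        self.ndim = ndim

    def __iter__(self):
        if self.ndim == 0:
            raise TypeError("iteration over a 0-d array")
        return iter(self.data)


class ToyQuantity:
    """Hypothetical stand-in for a Quantity wrapping a ToyArray."""

    def __init__(self, magnitude):
        self.magnitude = magnitude

    @property
    def ndim(self):
        return self.magnitude.ndim

    def __iter__(self):
        return iter(self.magnitude)


def is_array(obj):
    # Stand-in for util.is_array: true only for the array type itself.
    return isinstance(obj, ToyArray)


def is_list_like_sketch(obj, allow_sets=True):
    # Same boolean structure as the quoted is_list_like body.
    return (
        isinstance(obj, abc.Iterable)
        and not isinstance(obj, (str, bytes))
        and not (is_array(obj) and obj.ndim == 0)
        and not (allow_sets is False and isinstance(obj, abc.Set))
    )


zero_d = ToyArray(3, ndim=0)
results = {
    "0-d array": is_list_like_sketch(zero_d),
    "1-d array": is_list_like_sketch(ToyArray([0.0, 0.0, 0.0], ndim=1)),
    "0-d quantity": is_list_like_sketch(ToyQuantity(zero_d)),
}
print(results)
```

In this sketch the 0-d array is rejected by the ndim clause, while the 0-d quantity wrapping the same array slips through because the is_array stand-in does not recognise the wrapper.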

@hgrecco
Owner Author

hgrecco commented Jun 22, 2020

That is indeed the case! Tomorrow I will make a release of pint-pandas and open an issue in Pint

@znicholls
Contributor

Hey @hgrecco, just wanted to see if that release ended up happening?

@hgrecco
Owner Author

hgrecco commented Jul 1, 2020

We release on Friday no matter what, but this looks very promising: hgrecco/pint#1125

@hgrecco
Owner Author

hgrecco commented Jul 1, 2020

I did it today because, after several tries, I think that hgrecco/pint#1125 won't work


3 participants