Awkward v2 update #620

kkothari2001 · 2022-06-22T06:15:27Z

No description provided.

kkothari2001 · 2022-06-22T18:11:22Z

The above commit removes all instances of awkward.layout and replaces them with awkward.contents and awkward.index. Classes that were separate in akv1 but are combined in akv2, e.g. ListOffsetArray64, have been replaced by their corresponding akv2 classes.

@jpivarski There are 6 places in uproot.interpretation.library where I need some help deciding what to do. I have marked all of these locations with # PLEASE SEE: ...
2 of them were of the form cls = getattr(awkward.layout, form["class"]). I'm not sure whether awkward.layout should become awkward.contents or awkward.index there. Based on some minor context from other functions, I went with awkward.contents for now. Please suggest. For the other 4, some tests are failing because we aren't initialising the RecordArray class correctly. Changing keys to fields didn't seem to work. Please suggest.
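
For the two getattr lookups, one option is to try awkward.contents first and fall back to awkward.index, rather than committing to a single namespace. The snippet below is only a sketch of that idea: resolve_layout_class is a hypothetical helper (not part of uproot or awkward), and SimpleNamespace objects stand in for the two awkward submodules so that it runs without awkward installed.

```python
from types import SimpleNamespace


def resolve_layout_class(name, namespaces):
    """Return the first attribute called `name` found in `namespaces`.

    Hypothetical helper for illustration; not part of uproot or awkward.
    """
    for namespace in namespaces:
        cls = getattr(namespace, name, None)
        if cls is not None:
            return cls
    raise AttributeError(f"no layout class named {name!r}")


# Stand-ins for awkward.contents and awkward.index (assumption: real code
# would pass those two modules here instead).
fake_contents = SimpleNamespace(ListOffsetArray=type("ListOffsetArray", (), {}))
fake_index = SimpleNamespace(Index64=type("Index64", (), {}))

form = {"class": "ListOffsetArray"}
cls = resolve_layout_class(form["class"], [fake_contents, fake_index])
print(cls.__name__)
```

With the real modules, the call would presumably look like resolve_layout_class(form["class"], [awkward.contents, awkward.index]); whether the fallback to awkward.index is ever needed for a form's "class" entry is exactly the open question above.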

Aryan's work had a few instances of awkward.layout, so I changed those too. I'll revert them before merging if those changes shouldn't have been made, or if they could interfere with his work.

Lastly, as discussed, the typeparser was added to v2. The release hasn't made it to PyPI at the time of writing, so I ran pip install -U git+https://github.com/scikit-hep/awkward@main to test it out, which gave me the correct 1.9.0rc6 version. Despite that, the number of failing tests only went from 57 to 52. A lot of tests are failing due to TypeError: object of type 'awkward._ext.ArrayType' has no len(), which is raised somewhere down the stack trace of the newly added from_datashape() function.

Essentially, this had the same effect as importing ak_v1 inline and using the v1 parser: the same tests fail, and the same tests are resolved (the 5 that were failing earlier). Please have a look at the last few tests in the CI for the error.
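
To double-check which awkward build an environment actually picked up after installing from main, the installed distribution version can be read from package metadata. This is a minimal sketch using only the standard library (Python 3.8+); installed_version is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of `dist_name`, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "awkward" is the distribution name on PyPI; this prints None if it is
# not installed in the current environment.
print("awkward:", installed_version("awkward"))
```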

kkothari2001 · 2022-06-22T18:39:44Z

I found 7 major ways in which the tests are failing:

1. awkward.operations.convert.to_list() throws TypeError: use ak._v2.operations.convert.to_list for v2 arrays (for now)
Edit: I just realised this is because the test imports ak_v1 and then uses ak.to_list(). We can just remove these tests for now.

2. awkward.operations.describe.type() throws

E           TypeError: unrecognized array type: <Array [{x: 1.1, y: 1, z: 97}, ..., {...}] type='5 * {x: float64, y: int32,...'>
E
E           (https://github.com/scikit-hep/awkward-1.0/blob/1.9.0rc6/src/awkward/operations/describe.py#L191)

I believe this is because somewhere in the tests we might be using ak_v1 arrays, and that is conflicting.

3. tests/test_0034-generic-objects-in-ttrees.py is particularly troublesome because it compares 2 JSON strings, one of which contains ak_v1-specific classes like ListOffsetArray64. So all the tests here fail somewhere in _v2.types.numpytype.py.

4. Another error is AttributeError: type object 'Form' has no attribute 'fromjson', for which I'm sure there is a good replacement. Please suggest.

5. An isinstance check against awkward.highlevel.Array fails:

E       AssertionError: assert False
E        +  where False = isinstance(<Array [{i8: -15, f8: -14.9}, ..., {...}] type='420 * {i8: int64, f8: float64}'>, <class 'awkward.highlevel.Array'>)
E        +    where <class 'awkward.highlevel.Array'> = <module 'awkward' from '/home/kmk/uprootall/uproot5/venv/lib/python3.8/site-packages/awkward/__init__.py'>.Array

I'm not sure what this is, but you provided a few notes on the highlevel submodule. I'll go through those and some of the code, and get back to this test.

6. The changed API for RecordArray is causing issues, as mentioned in the comment above: TypeError: __init__() got an unexpected keyword argument 'keys'

7. Lastly, the various TypeErrors raised while parsing through the from_datashape() function.
Examples are:
TypeError: not a NumPy dtype or an Awkward datashape: int32
TypeError: not a NumPy dtype or an Awkward datashape: {"x": float64, "y": 3 * float64}, etc.

Somewhere along the stack trace, the typeparser code in awkward is trying to use PrimitiveType and RecordType, which, as you explained, were replaced with the corresponding NumPy types.
Examples of errors thrown are:
TypeError: object of type 'awkward._ext.RecordType' has no len()
TypeError: object of type 'awkward._ext.PrimitiveType' has no len()
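
For the JSON comparison in tests/test_0034-generic-objects-in-ttrees.py, one possible workaround is to compare the parsed forms after dropping the akv1 width suffixes from class names, instead of comparing raw strings. This is only a sketch under assumptions: strip_width_suffix and normalize_form are hypothetical helpers, the two form strings are made up for illustration, and real akv1 and akv2 forms may differ in more than their class names.

```python
import json
import re

# Matches akv1 content class names that carry an index-width suffix,
# e.g. ListOffsetArray64, IndexedArrayU32, UnionArray8_64.
_WIDTH_SUFFIX = re.compile(r"^(\w*?Array)(?:8_)?U?(?:32|64)$")


def strip_width_suffix(class_name):
    """Map e.g. 'ListOffsetArray64' to 'ListOffsetArray'; leave others alone."""
    match = _WIDTH_SUFFIX.match(class_name)
    return match.group(1) if match else class_name


def normalize_form(node):
    """Recursively rewrite every "class" entry of a parsed form."""
    if isinstance(node, dict):
        return {
            key: strip_width_suffix(value)
            if key == "class" and isinstance(value, str)
            else normalize_form(value)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [normalize_form(item) for item in node]
    return node


# Made-up forms for illustration only.
v1_json = '{"class": "ListOffsetArray64", "offsets": "i64", "content": "float64"}'
v2_json = '{"class": "ListOffsetArray", "offsets": "i64", "content": "float64"}'

forms_match = normalize_form(json.loads(v1_json)) == normalize_form(json.loads(v2_json))
print(forms_match)
```

Comparing parsed dictionaries also avoids spurious failures from key order or whitespace in the JSON text.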

jpivarski · 2022-06-22T20:17:43Z