Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Record type is not correct for downcasting awkward vectors #412

Closed
nsmith- opened this issue Jan 24, 2024 · 5 comments · Fixed by #413
Closed

Record type is not correct for downcasting awkward vectors #412

nsmith- opened this issue Jan 24, 2024 · 5 comments · Fixed by #413
Labels
bug The problem described is something that must be fixed

Comments

@nsmith-
Copy link
Member

nsmith- commented Jan 24, 2024

Vector Version

1.1.1.post1

Python Version

3.10.12

OS / Environment

OS/x

Describe the bug

import awkward as ak
import vector.backends.awkward

a = ak.zip(
    {
        "x": [10.0, 20.0, 30.0],
        "y": [-10.0, 20.0, 30.0],
        "z": [5.0, 10.0, 15.0],
        "t": [16.0, 31.0, 46.0],
    },
    with_name="Momentum4D",
    behavior=vector.backends.awkward.behavior,
)
b = ak.zip(
    {
        "x": [-10.0, 20.0, -30.0],
        "y": [-10.0, -20.0, 30.0],
        "z": [5.0, -10.0, 15.0],
    },
    with_name="Momentum3D",
    behavior=vector.backends.awkward.behavior,
)
c = ak.zip(
    {"x": [-10.0, 13.0, 15.0], "y": [12.0, -4.0, 41.0]},
    with_name="Momentum2D",
    behavior=vector.backends.awkward.behavior,
)

print("a+b")
print(a+b)
print(ak.parameters(a+b))
print("b+a")
print(b+a)
print(ak.parameters(b+a))

returns

a+b
[{x: 0, y: -20, z: 10}, {x: 40, y: 0, z: 0}, {x: 0, y: 60, z: 30}]
{'__record__': 'Momentum4D'}
b+a
[{x: 0, y: -20, z: 10}, {x: 40, y: 0, z: 0}, {x: 0, y: 60, z: 30}]
{'__record__': 'Momentum3D'}

So, although the fields available are the ones expected, the record type is incorrect in the first case. Similar issues are present for a+c and b+c.

Any additional but relevant log output

No response

@nsmith- nsmith- added the bug (unverified) The problem described would be a bug, but needs to be triaged label Jan 24, 2024
@nsmith-
Copy link
Member Author

nsmith- commented Jan 24, 2024

One significant consequence is that checking equality a+c == c+a raises a type error:

TypeError: <MomentumArray4D [{x: 0, y: 2}, {...}, {x: 45, ...}] type='3 * Momentum4D[x...'> and <MomentumArray2D [{x: 0, y: 2}, {...}, {x: 45, ...}] type='3 * Momentum2D[x...'> do not have the same dimension

@Saransh-cpp
Copy link
Member

Saransh-cpp commented Jan 24, 2024

Thanks for reporting this! This definitely is a bug, but we should decide what behavior we consider a bug. For instance, the other 2 backends preserve the t field while performing the calculations -

In [1] import vector

In [2]: vector.obj(x=1, y=2, z=3, t=4) + vector.obj(x=1, y=2, z=3)
Out[2]: VectorObject4D(x=2, y=4, z=6, t=4)

In [3]: vector.obj(x=1, y=2, z=3, t=4) - vector.obj(x=1, y=2, z=3)
Out[3]: VectorObject4D(x=0, y=0, z=0, t=4)

In [4]: - vector.obj(x=1, y=2, z=3, t=4) + vector.obj(x=1, y=2, z=3)
Out[4]: VectorObject4D(x=0, y=0, z=0, t=-4)

In [5]: v = vector.VectorNumpy4D(
    ...:     {
    ...:         "x": [1.1, 1.2, 1.3, 1.4, 1.5],
    ...:         "y": [2.1, 2.2, 2.3, 2.4, 2.5],
    ...:         "z": [3.1, 3.2, 3.3, 3.4, 3.5],
    ...:         "t": [4.1, 4.2, 4.3, 4.4, 4.5],
    ...:     }
    ...: )

In [6]: v2 = vector.VectorNumpy3D(
    ...:     {
    ...:         "x": [1.1, 1.2, 1.3, 1.4, 1.5],
    ...:         "y": [2.1, 2.2, 2.3, 2.4, 2.5],
    ...:         "z": [3.1, 3.2, 3.3, 3.4, 3.5],
    ...:     }
    ...: )

In [7]: v - v2
Out[7]: 
VectorNumpy4D([(0., 0., 0., 4.1), (0., 0., 0., 4.2), (0., 0., 0., 4.3),
               (0., 0., 0., 4.4), (0., 0., 0., 4.5)],
              dtype=[('x', '<f8'), ('y', '<f8'), ('z', '<f8'), ('t', '<f8')])

In [8]: v2 -v
Out[8]: 
VectorNumpy3D([(0., 0., 0.), (0., 0., 0.), (0., 0., 0.), (0., 0., 0.),
               (0., 0., 0.)], dtype=[('x', '<f8'), ('y', '<f8'), ('z', '<f8')])

In [9]: v - v2
Out[9]: 
VectorNumpy4D([(0., 0., 0., 4.1), (0., 0., 0., 4.2), (0., 0., 0., 4.3),
               (0., 0., 0., 4.4), (0., 0., 0., 4.5)],
              dtype=[('x', '<f8'), ('y', '<f8'), ('z', '<f8'), ('t', '<f8')])

In [10]: - v + v2
Out[10]: 
VectorNumpy4D([(0., 0., 0., -4.1), (0., 0., 0., -4.2), (0., 0., 0., -4.3),
               (0., 0., 0., -4.4), (0., 0., 0., -4.5)],
              dtype=[('x', '<f8'), ('y', '<f8'), ('z', '<f8'), ('t', '<f8')])

I think the bug here is that the t field is missing from Momentum4D vectors in your example. Changing the nature of promotion / demotion of vectors should maybe be a new feature?

Tagging @jpivarski here for further discussion.

@Saransh-cpp Saransh-cpp added bug The problem described is something that must be fixed and removed bug (unverified) The problem described would be a bug, but needs to be triaged labels Jan 24, 2024
@nsmith-
Copy link
Member Author

nsmith- commented Jan 24, 2024

It's more than that, the other backends also have the same infix operator issue:

>>> vector.obj(x=1, y=2, z=3, t=4) + vector.obj(x=1, y=2, z=3)
VectorObject4D(x=2, y=4, z=6, t=4)
>>> vector.obj(x=1, y=2, z=3) + vector.obj(x=1, y=2, z=3, t=4)
VectorObject3D(x=2, y=4, z=6)

you get a different type depending on which is first.

@nsmith-
Copy link
Member Author

nsmith- commented Jan 24, 2024

My preference is to not have t in either output in the above example, i.e. both should return VectorObject3D(x=2, y=4, z=6)

@jpivarski
Copy link
Member

I also think that it should drop t in all backends.

Namely, if an operation takes more than one vector as input and would return a vector of the same type as output (e.g. T + T → T), but the two inputs differ in type (e.g. T1 + T2), then the output should be the more restrictive of the two (upcasting).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug The problem described is something that must be fixed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants