Dimensional issue of subclassed object in nested branches #465
Comments
My understanding of your problem is that (a) you expect multidimensional arrays for On (a), I would expect a one-dimensional array in each event. The On (b), I can see the problem: the extra values are physically in the ROOT file, but the length of your
>>> f['E']['Evt']['mc_trks']['mc_trks.usr'].array(uproot.asdebug)[0]
array([ 64, 0, 0, 126, 0, 9, 0, 0, 0, 4, 63, 168, 238,
40, 103, 39, 86, 134, 63, 174, 33, 16, 27, 0, 54, 135,
64, 8, 0, 0, 0, 0, 0, 0, 64, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 64, 79, 158, 226, 235, 28,
67, 45, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
dtype=uint8)
The first 6 bytes (
>>> f['E']['Evt']['mc_trks']['mc_trks.usr'].interpretation
asjagged(asdtype('>f8'), 10)
just skips over the first 10 bytes, expecting the rest to be meaningful data. In all the cases I've seen up to this point, it has been. In your case, there's junk padding after the meaningful data. So ROOT has more degrees of freedom than previously thought and we have to actually check the 4-byte size in every STL vector jagged array. I'm going to see about adding a parameter for that.
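As a rough sketch of what checking the 4-byte size would mean for the event above (this is not uproot's implementation): the snippet assumes a 6-byte header followed by a 4-byte big-endian element count, which is inferred from the 10 bytes the interpretation skips, and reads only that many '>f8' values. The helper name read_sized_vector is hypothetical. The bytes are copied from the asdebug output, with the 76 trailing zeros written as bytes(76).

```python
import struct

# Bytes of the first event, copied from the asdebug output above.
raw = bytes([
    64, 0, 0, 126, 0, 9, 0, 0, 0, 4, 63, 168, 238,
    40, 103, 39, 86, 134, 63, 174, 33, 16, 27, 0, 54, 135,
    64, 8, 0, 0, 0, 0, 0, 0, 64, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 1, 64, 79, 158, 226, 235, 28,
    67, 45,
]) + bytes(76)  # the remaining entries in the dump are all zero


def read_sized_vector(buf, header_bytes=6):
    """Read one vector of big-endian doubles, honouring its 4-byte size.

    Assumption: `header_bytes` of header, then a 4-byte big-endian count,
    then `count` doubles. Anything after that is left unread.
    """
    (count,) = struct.unpack_from(">i", buf, header_bytes)
    start = header_bytes + 4
    values = struct.unpack_from(">%dd" % count, buf, start)
    return list(values), start + 8 * count


# Skipping 10 bytes and reading everything that is left as '>f8':
naive = struct.unpack(">%dd" % ((len(raw) - 10) // 8), raw[10:])

# Reading only as many doubles as the size field announces:
values, end = read_sized_vector(raw)

print(len(raw), len(naive))   # 130 15
print(len(values), end)       # 4 42
print(values[2:])             # [3.0, 2.0]
```

Skipping a fixed 10 bytes yields 15 values for this event, while honouring the size field yields 4 and stops at byte 42.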
…ve unused/padding/junk in each event after the vector's serialized data).
Sorry for the weird issue title, we can change it later, I could not find a better one 😉
I need some help parsing a fairly simple data structure. It derives from a class which has:
These two vectors are used to store arbitrary data, so that e.g.
usr_names = ["bx", "by", "ichan", "cc"]
and the corresponding values are at the specific indices. So if you need to look up the value for "by", you look up its index and then access usr[idx].
So far so good, it works for "one dimensional" (flat) branches. Here, the event class Evt derives from that AAObject and the branch Evt simply contains a flat vector of Evt instances (3 of them), and each event contains 17 "usr entries":
The problem appears with classes which have instances in nested branches. This is the case, for example, for the Trk class, which also derives from AAObject and is part of the Evt branch. Each Evt entry has a variable number of Trks, as seen here (just the relevant parts):
The Trk class itself is also quite straightforward and consists of some attributes:
The file usr-nested contains a few events and every event contains multiple Trk instances. Only the first two Trk entries should have some entries in the usr* attributes: the first one by, bx, ichan and cc, the second one only energy_lost_in_can. However, the arrays I get back from uproot are all one-dimensional per event. It seems that only the first entry is extracted, and the length of the arrays also seems a bit "random".
This is what I get:
What I expect is that the following line returns a nested list, where the first nested list has a length of 4, the second a length of 1, and all others are empty. However, I get 15 single entries, of which the first 4 correspond to the first track (and the values are OK):
The array elements apart from the first 4 values seem to be random memory bits.
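To make the expected shape concrete, here is a minimal sketch in plain Python lists (not uproot output). The names and lengths follow the description above: 21 tracks, the first with 4 usr entries, the second with 1, the rest empty. The numeric values are hypothetical placeholders and the helper lookup is invented for illustration.

```python
# Expected per-track structure for the first event (21 tracks).
# Names and lengths follow the issue text; the values are placeholders.
usr_names = [["by", "bx", "ichan", "cc"], ["energy_lost_in_can"]]
usr_names += [[] for _ in range(19)]

usr = [[0.1, 0.2, 3.0, 2.0], [63.0]]
usr += [[] for _ in range(19)]


def lookup(track, name):
    """Return the usr value stored under `name` for one track, or None."""
    names = usr_names[track]
    if name not in names:
        return None
    # The value sits at the same index as its name.
    return usr[track][names.index(name)]


print([len(v) for v in usr][:4])        # [4, 1, 0, 0]
print(lookup(0, "by"))                  # 0.1
print(lookup(1, "energy_lost_in_can"))  # 63.0
print(lookup(2, "by"))                  # None
```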
I tried to figure out why the data is parsed incorrectly but have failed so far. I also did not find the word energy_lost_in_can in the usr_fields (converted to string etc.), so I guess it is lost somewhere in the low-level parsing in uproot.
This is what the ROOT-based library spits out for the first event (all 21 tracks, the first one with 4 usr entries, the second with 1 and every other with no entries):
Do you have any idea, or is this a known issue with nested vectors?
I attached both files in case you want to have a look.
usr.zip