This repository has been archived by the owner on Jun 21, 2022. It is now read-only.

dtype byteswap #434

Closed
8me opened this issue Jan 10, 2020 · 3 comments

@8me

8me commented Jan 10, 2020

I have some trouble with the dtype in the KM3NeT_TIMESLICE format (see for example #433). The data can be read using

import uproot

f = uproot.open("file.root")
tree = f[b'KM3NET_TIMESLICE_L2'][b'KM3NETDAQ::JDAQTimeslice']
superframes = tree[b'vector<KM3NETDAQ::JDAQSuperFrame>']
hits_buffer = superframes[b'vector<KM3NETDAQ::JDAQSuperFrame>.buffer'].lazyarray(
    uproot.asjagged(
        uproot.astable(
            uproot.asdtype([("pmt", "u1"), ("tdc", "u4"), ("tot", "u1")])),
        skipbytes=6),
    basketcache=uproot.cache.ThreadSafeArrayCache(23*1024**2))

(taken from #433 ... thanks @tamasgal ;).

I tried modifying this code to use the dtypes "<u4" and ">u4" for the "tdc" field in order to invert the byte order. Unfortunately, neither change had any effect on the values returned.

example.zip

@jpivarski
Member

uproot.asdtype takes two arguments: the first is the dtype it interprets the file as having, and the second (optional, and usually not written) is the dtype it presents to Python. In most cases, the second argument is a native-endian version of the first, so given a single dtype, the asdtype constructor makes what you've given it big-endian and then makes a native-endian copy for the second argument.

It sounds like you have a special case where you need to subvert that. You can do so by passing two arguments with the endianness of your choice.
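To make the distinction between the "file" dtype and the "presented" dtype concrete, here is a minimal NumPy-only sketch. It does not call uproot; the six bytes in `raw` are made up for illustration, laid out like one (pmt, tdc, tot) record, and the variable names are hypothetical.

```python
import numpy as np

# Hypothetical raw record: pmt (1 byte), tdc (4 bytes), tot (1 byte).
raw = bytes([3, 0x50, 0x6C, 0x07, 0x00, 25])

# Two ways of interpreting the same bytes: tdc as little- or big-endian.
little = np.dtype([("pmt", "u1"), ("tdc", "<u4"), ("tot", "u1")])
big = np.dtype([("pmt", "u1"), ("tdc", ">u4"), ("tot", "u1")])

as_little = np.frombuffer(raw, dtype=little)
as_big = np.frombuffer(raw, dtype=big)

print(as_little["tdc"][0])  # 486480
print(as_big["tdc"][0])     # 1349256960

# Converting to native byte order keeps the value and only changes
# how it is stored in memory.
native = as_little["tdc"].astype("=u4")
print(native[0])            # 486480
```

The first dtype decides which number the bytes mean; the second only decides how that number is stored once it reaches Python.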

@tamasgal
Contributor

tamasgal commented Jan 11, 2020

Thanks Jim. The confusion came from the assumption that one can pass the dtypes as a list. After a quick look at the asdtype constructor, I realised that it type-checks fromdtype, and if it's not a numpy.dtype or a string_type type definition, it forces big endian. That of course makes sense, since ROOT is always big endian.
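As a rough illustration of why a per-field "<u4" can get lost, the NumPy-only sketch below shows what forcing big-endian does to a structured dtype. This is not uproot's constructor code; it assumes only that the forcing is comparable to NumPy's `dtype.newbyteorder(">")`, and the variable names are hypothetical.

```python
import numpy as np

# The field list from the session, with tdc explicitly little-endian.
spec = [("pmt", "u1"), ("tdc", "<u4"), ("tot", "u1")]

as_given = np.dtype(spec)
# Forcing big-endian applies to every field, overriding the "<" on tdc.
forced = as_given.newbyteorder(">")

print(as_given["tdc"].str)  # <u4
print(forced["tdc"].str)    # >u4
```

Single-byte fields such as "u1" have no byte order, so only the "tdc" field is affected.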

So if I pass in a numpy.dtype instance, it works fine (see below). I think that's good enough and not important to be "fixed", but I added a PR (#435) so you can decide.

In [1]: import uproot

In [2]: import numpy as np

In [3]: fromdtype = [("pmt", "u1"), ("tdc", "<u4"), ("tot", "u1")]

In [4]: todtype = [("pmt", "u1"), ("tdc", ">u4"), ("tot", "u1")]

In [5]: f = uproot.open("file.root")
   ...: tree = f[b'KM3NET_TIMESLICE_L1'][b'KM3NETDAQ::JDAQTimeslice']
   ...: superframes = tree[b'vector<KM3NETDAQ::JDAQSuperFrame>']
   ...: hits_buffer = superframes[
   ...: b'vector<KM3NETDAQ::JDAQSuperFrame>.buffer'].lazyarray(
   ...:     uproot.asjagged(uproot.astable(
   ...:         uproot.asdtype(fromdtype, todtype)), skipbytes=6))

In [6]: hits_buffer['tdc']
Out[6]: <ChunkedArray [[1349256960 1517029120 2379418112 ... 3408848901 771419397 1039854853] [4169531904 176620032 2551121408 ... 3021399557 3259299077 3125081349] [1036452096 1355219200 1311639040 ... 1999826949 4087149829 228455685]] at 0x00011454ea90>

In [7]: fromdtype = np.dtype([("pmt", "u1"), ("tdc", "<u4"), ("tot", "u1")])

In [8]: todtype = [("pmt", "u1"), ("tdc", ">u4"), ("tot", "u1")]

In [9]: f = uproot.open("file.root")
   ...: tree = f[b'KM3NET_TIMESLICE_L1'][b'KM3NETDAQ::JDAQTimeslice']
   ...: superframes = tree[b'vector<KM3NETDAQ::JDAQSuperFrame>']
   ...: hits_buffer = superframes[
   ...: b'vector<KM3NETDAQ::JDAQSuperFrame>.buffer'].lazyarray(
   ...:     uproot.asjagged(uproot.astable(
   ...:         uproot.asdtype(fromdtype, todtype)), skipbytes=6))

In [10]: hits_buffer['tdc']
Out[10]: <ChunkedArray [[486480 486490 709517 ... 99102411 99482157 99482173] [165624 165642 397208 ... 98965172 99960002 99959994] [116541 116560 405070 ... 99627639 99982579 99982605]] at 0x000114da7750>
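A quick sanity check shows that the values in Out[6] are exactly the values in Out[10] with their four bytes reversed. The sketch below uses only the Python standard library; `swap_u4` is a hypothetical helper, not part of uproot or NumPy.

```python
def swap_u4(value):
    """Reverse the byte order of an unsigned 32-bit integer."""
    return int.from_bytes(value.to_bytes(4, "little"), "big")

# First three tdc values from Out[10] (fromdtype passed as numpy.dtype).
good = [486480, 486490, 709517]

swapped = [swap_u4(v) for v in good]
print(swapped)  # [1349256960, 1517029120, 2379418112]
```

These match the first three values of Out[6], so the list-based fromdtype was read with the opposite byte order for "tdc".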

@jpivarski
Member

As I said on the PR, this was an overlooked case and I'll approve it when you give me the go-ahead. (I don't want to merge something when you might be adding more to it that I don't know about.) Since your PR fixes this issue, you can close it whenever you want.
