In LHCb we've started including some file metadata as JSON embedded in ROOT files as strings. Reading this with PyROOT results in a cppyy object which isn't a string but can be trivially converted. With uproot I can't figure out how to get the data of the string at all. Maybe this should be a bug report or feature request, as I think it would be reasonable to expect uproot to just return a string.

from pathlib import Path
import uproot
import ROOT
fn = "https://cburr.web.cern.ch/string-example.root"
f1 = ROOT.TFile.Open(fn)
obj1 = f1.Get("FileSummaryRecord")
print(f"{type(obj1)=} {isinstance(obj1, str)=} {str(obj1)=}")
# type(obj1)=<class cppyy.gbl.std.string at 0x29c722270>
# isinstance(obj1, str)=False
# str(obj1)='{"LumiCounter.eventsByRun":{"counts":{},"empty":true,"type":"LumiEventCounter"},"guid":"5FE9437E-D958-11EE-AB88-3CECEF1070AC"}'
f2 = uproot.open(fn)
obj2 = f2["FileSummaryRecord"]
print(f"{type(obj2)=} {isinstance(obj2, str)=} {str(obj2)=}")
# type(obj2)=<class 'uproot.dynamic.Unknown_string'>
# isinstance(obj2, str)=False
# str(obj2)='<Unknown string at 0x000112d8b990>'
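
For reference, once the record is available as a Python str (via str(obj1) in the PyROOT case), turning it into a dict is a single json.loads call. A minimal sketch, using the string printed by PyROOT above as a stand-in for the converted object; parse_summary is a hypothetical helper name, not part of ROOT or uproot:

```python
import json


def parse_summary(obj) -> dict:
    """Convert a string-like object (e.g. a cppyy std::string) to str, then parse it as JSON."""
    return json.loads(str(obj))


# Stand-in for str(obj1); copied from the PyROOT output above.
raw = (
    '{"LumiCounter.eventsByRun":{"counts":{},"empty":true,"type":"LumiEventCounter"},'
    '"guid":"5FE9437E-D958-11EE-AB88-3CECEF1070AC"}'
)

summary = parse_summary(raw)
print(summary["guid"])
```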
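Something that could make quick work of this, if you can change the LHCb format, is to save a TObjString instead of a std::string. We use TObjString for a lot of debugging because it's such a simple type.

I didn't know that non-TObjects could be written directly to TDirectories. Perhaps STL collections are an exception? Or maybe just std::string?

If you get any instance of uproot.model.UnknownClass, then Uproot has already given up trying to interpret it. Uproot returns this object instead of raising an error because someone might want to indiscriminately try to read everything from a file for which almost all of the objects are readable.

I took a look at this object:

>>> import uproot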
>>> file = uproot.open("https://cburr.web.cern.ch/string-example.root")
>>> chunk, cursor = file.key("FileSummaryRecord").get_uncompressed_chunk_cursor()
>>> cursor.debug(chunk)
--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+-
126 123  34  76 117 109 105  67 111 117 110 116 101 114  46 101 118 101 110 116
  ~   {   "   L   u   m   i   C   o   u   n   t   e   r   .   e   v   e   n   t
--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+-
115  66 121  82 117 110  34  58 123  34  99 111 117 110 116 115  34  58 123 125
  s   B   y   R   u   n   "   :   {   "   c   o   u   n   t   s   "   :   {   }
--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+-
 44  34 101 109 112 116 121  34  58 116 114 117 101  44  34 116 121 112 101  34
  ,   "   e   m   p   t   y   "   :   t   r   u   e   ,   "   t   y   p   e   "
--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+-
 58  34  76 117 109 105  69 118 101 110 116  67 111 117 110 116 101 114  34 125
  :   "   L   u   m   i   E   v   e   n   t   C   o   u   n   t   e   r   "   }
--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+-
 44  34 103 117 105 100  34  58  34  53  70  69  57  52  51  55  69  45  68  57
  ,   "   g   u   i   d   "   :   "   5   F   E   9   4   3   7   E   -   D   9
--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+-
 53  56  45  49  49  69  69  45  65  66  56  56  45  51  67  69  67  69  70  49
  5   8   -   1   1   E   E   -   A   B   8   8   -   3   C   E   C   E   F   1
--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+-
 48  55  48  65  67  34 125
  0   7   0   A   C   "   }

It's a very simple serialization: just a 1-byte size followed by the string data. I'll bet that if the size of the string is larger than 254 bytes, the first byte would be 255 followed by a 4-byte size, and then the string data. (That's how I'm tempted to hard-code this as a special case type for reading, especially if only Following up in #1160.
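
The layout described above can be sketched in plain Python, independent of Uproot. This is only an illustration under stated assumptions: the 255 escape is the guess made above rather than something confirmed by this dump (the example string is 126 bytes, so only the 1-byte form is exercised), the 4-byte size is assumed to be big-endian, and decode_root_string is a hypothetical name, not an Uproot function.

```python
import struct


def decode_root_string(data: bytes, offset: int = 0) -> tuple[str, int]:
    """Decode one length-prefixed string; return (text, offset just past the string)."""
    length = data[offset]  # short form: a single size byte
    offset += 1
    if length == 255:
        # Assumed long form: 255 marks that a 4-byte size follows.
        # Big-endian byte order is an assumption of this sketch.
        (length,) = struct.unpack_from(">I", data, offset)
        offset += 4
    end = offset + length
    if end > len(data):
        raise ValueError("buffer ends before the declared string length")
    return data[offset:end].decode("utf-8"), end


# Short form: size byte followed by the payload.
payload = b'{"guid":"5FE9437E-D958-11EE-AB88-3CECEF1070AC"}'
short_record = bytes([len(payload)]) + payload
text, next_offset = decode_root_string(short_record)

# Assumed long form: 255, then a 4-byte size, then the payload.
long_payload = b"a" * 300
long_record = b"\xff" + struct.pack(">I", len(long_payload)) + long_payload
long_text, long_next = decode_root_string(long_record)
```

Returning the offset just past the string lets a caller continue reading whatever follows in the same buffer.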
Thank you @jpivarski, this is very nice. As you have seen, we have for a while been recording uncalibrated luminosity information in a dict stored as a string under a "file summary record", and we will soon be adding more metadata to the produced files. With this enhancement users will be able to use ROOT or uproot :-).