-
Notifications
You must be signed in to change notification settings - Fork 175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Instrument] Add method for loading multiple instruments data at once #6869
Conversation
Blocked by #6868 |
c70d566
to
ed6eb24
Compare
There is some minor linting errors to fix: FILE: ...nner/work/Loris/Loris/php/libraries/NDB_BVL_Instrument.class.incFOUND 2 ERRORS AFFECTING 2 LINES2142 | ERROR | [x] Equals sign not aligned with surrounding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested and works perfectly.
@driusan will you be fixing this one ? |
Eventually. |
NDB_BVL_Instrument has an.. unusual.. way of dealing with instrument data. Rather than having a class method for getting the data, there's a static method which takes an instrument as a parameter. The result is that data can not be cached in the object and needs to be continuously re-loaded from the database every time if it's needed multiple times in the same request. (For instance, calling getFieldValue for different fields in a script.) This adds a normal getter and caches the results in the object, deprecating the old static loadInstanceData and its inexplicable behaviour.
There is currently no mechanism to load the data from multiple instrument instances at the same time efficiently and work with them in PHP. (ie. to work the the data from multiple commentIDs for a single instrument type in batch in a script) other than directly querying SQL. However, this has other drawbacks such as the you only know whether the instrument data should be loaded from the flag.Data column or a separate table once the instrument is instantiated. This adds a bulkLoadInstanceData function which will load the data from multiple commentIDs at the same time in a single SQL query and return an array of \NDB_BVL_Instruments with the data loaded. In my non-scientific benchmarking where I loaded all the data from a single instrument type on my sandbox multiple times, and printed the value of a single field this (unsurprisingly) resulted in significant performance gains. The naive way where factory is called multiple times took approximately 12 seconds: ``` foreach($commentIDs as $cid) { $inst = \NDB_BVL_Instrument::factory('adi_r_proband', $cid); print $inst->getFieldValue('informant_relation'); } ``` The slightly smarter way where we avoid the overhead of re-parsing the instrument took approximately 1.2 seconds: ``` $inst = \NDB_BVL_Instrument::factory('adi_r_proband'); foreach($commentIDs as $cid) { // note: must hack NDB_BVL_Instrument to make instanceData // public for this to work. $inst->instanceData = null; $inst->commentID = $cid; print $inst->getFieldValue('informant_relation'); } ``` The new function took approximately 0.2s: ``` $inst = \NDB_BVL_Instrument::factory('adi_r_proband'); $data = $inst->bulkLoadInstanceData($commentIDs); foreach($data as $i) { print $i->getFieldValue('informant_relation'); } ```
Co-authored-by: Rida Abou-Haidar <ridz1208@users.noreply.github.com>
e0b387f
to
300c8ae
Compare
I've rebased this and fixed the issue in the long thread. I didn't retest this, whoever tests this should ensure that when bulkLoadInstanceData is called on a CommentID that has a null Data column for a JSON instrument, calling |
@laemtl I just saw that you tested/approved this in November.. can you re-test the case in my comment above since the behaviour has changed? |
@driusan I get an SQL error if $commentIDs is empty:
Same with null (this case may be fine)
They are corner cases but it may worth considering returning null if such value is passed. |
@driusan Everything else works perfectly. Approving. |
…aces#6869) There is currently no mechanism to load the data from multiple instrument instances at the same time efficiently and work with them in PHP. (ie. to work the the data from multiple commentIDs for a single instrument type in batch in a script) other than directly querying SQL. However, this has other drawbacks such as the you only know whether the instrument data should be loaded from the flag.Data column or a separate table once the instrument is instantiated. This adds a bulkLoadInstanceData function which will load the data from multiple commentIDs at the same time in a single SQL query and return an array of \NDB_BVL_Instruments with the data loaded. In my non-scientific benchmarking where I loaded all the data from a single instrument type on my sandbox multiple times, and printed the value of a single field this (unsurprisingly) resulted in significant performance gains. The naive way where factory is called multiple times took approximately 12 seconds: ``` foreach($commentIDs as $cid) { $inst = \NDB_BVL_Instrument::factory('adi_r_proband', $cid); print $inst->getFieldValue('informant_relation'); } ``` The slightly smarter way where we avoid the overhead of re-parsing the instrument took approximately 1.2 seconds: ``` $inst = \NDB_BVL_Instrument::factory('adi_r_proband'); foreach($commentIDs as $cid) { // note: must hack NDB_BVL_Instrument to make instanceData // public for this to work. $inst->instanceData = null; $inst->commentID = $cid; print $inst->getFieldValue('informant_relation'); } ``` The new function took approximately 0.2s: ``` $inst = \NDB_BVL_Instrument::factory('adi_r_proband'); $data = $inst->bulkLoadInstanceData($commentIDs); foreach($data as $i) { print $i->getFieldValue('informant_relation'); } ```
There is currently no mechanism to load the data from multiple
instrument instances at the same time efficiently and work with them
in PHP. (ie. to work the the data from multiple commentIDs for a single
instrument type in batch in a script) other than directly querying SQL.
However, this has other drawbacks such as the fact that you only know
whether the instrument data should be loaded from the flag.Data column
or a separate table once the instrument is instantiated.
This adds a bulkLoadInstanceData function which will load the data from
multiple commentIDs at the same time in a single SQL query and return an
array of \NDB_BVL_Instruments with the data loaded.
In my non-scientific benchmarking where I loaded all the data from a single
instrument type on my sandbox multiple times, and printed the value of a single
field this (unsurprisingly) resulted in significant performance gains.
The naive way where factory is called multiple times took approximately
12 seconds:
The slightly smarter way where we avoid the overhead of re-parsing the
instrument took approximately 1.2 seconds:
The new function took approximately 0.2s: