Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Instrument] Add method for loading multiple instruments data at once #6869

Merged
merged 7 commits into from
Apr 12, 2021

Conversation

driusan
Copy link
Collaborator

@driusan driusan commented Jul 27, 2020

There is currently no mechanism to load the data from multiple
instrument instances at the same time efficiently and work with them
in PHP. (ie. to work the the data from multiple commentIDs for a single
instrument type in batch in a script) other than directly querying SQL.
However, this has other drawbacks such as the fact that you only know
whether the instrument data should be loaded from the flag.Data column
or a separate table once the instrument is instantiated.

This adds a bulkLoadInstanceData function which will load the data from
multiple commentIDs at the same time in a single SQL query and return an
array of \NDB_BVL_Instruments with the data loaded.

In my non-scientific benchmarking where I loaded all the data from a single
instrument type on my sandbox multiple times, and printed the value of a single
field this (unsurprisingly) resulted in significant performance gains.

The naive way where factory is called multiple times took approximately
12 seconds:

foreach($commentIDs as $cid) {
    $inst = \NDB_BVL_Instrument::factory('adi_r_proband', $cid);
    print $inst->getFieldValue('informant_relation');
}

The slightly smarter way where we avoid the overhead of re-parsing the
instrument took approximately 1.2 seconds:

$inst = \NDB_BVL_Instrument::factory('adi_r_proband');
foreach($commentIDs as $cid) {
    // note: must hack NDB_BVL_Instrument to make instanceData
    // public for this to work.
    $inst->instanceData = null;
    $inst->commentID = $cid;
    print $inst->getFieldValue('informant_relation');
}

The new function took approximately 0.2s:

$inst = \NDB_BVL_Instrument::factory('adi_r_proband');
$data = $inst->bulkLoadInstanceData($commentIDs);
foreach($data as $i) {
    print $i->getFieldValue('informant_relation');
}

@driusan driusan added State: Blocked PR or issue awaiting an external event such as the merge or another PR to proceed Category: Feature PR or issue that aims to introduce a new feature Meta PR does something that organizes, upgrades, or manages the functionality of the codebase labels Jul 27, 2020
@driusan
Copy link
Collaborator Author

driusan commented Jul 27, 2020

Blocked by #6868

@laemtl laemtl self-assigned this Nov 20, 2020
@laemtl
Copy link
Contributor

laemtl commented Nov 20, 2020

Waiting for #6868 to be merged and this PR rebased before testing, as some changes were made in #6868 which can possibly affect this PR.

@driusan driusan force-pushed the bulkLoadInstanceData branch from c70d566 to ed6eb24 Compare November 20, 2020 20:59
@driusan
Copy link
Collaborator Author

driusan commented Nov 20, 2020

@laemtl #6868 is merged and this is rebased

@laemtl laemtl removed the State: Blocked PR or issue awaiting an external event such as the merge or another PR to proceed label Nov 20, 2020
@laemtl
Copy link
Contributor

laemtl commented Nov 20, 2020

There is some minor linting errors to fix:

FILE: ...nner/work/Loris/Loris/php/libraries/NDB_BVL_Instrument.class.inc

FOUND 2 ERRORS AFFECTING 2 LINES

2142 | ERROR | [x] Equals sign not aligned with surrounding
| | assignments; expected 11 spaces but found 1 space
2148 | ERROR | [x] Equals sign not aligned with surrounding
| | assignments; expected 11 spaces but found 1 space

Copy link
Contributor

@laemtl laemtl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested and works perfectly.

php/libraries/NDB_BVL_Instrument.class.inc Outdated Show resolved Hide resolved
php/libraries/NDB_BVL_Instrument.class.inc Show resolved Hide resolved
php/libraries/NDB_BVL_Instrument.class.inc Outdated Show resolved Hide resolved
@driusan driusan added the State: Needs work PR awaiting additional work by the author to proceed label Dec 7, 2020
@ridz1208
Copy link
Collaborator

@driusan will you be fixing this one ?

@driusan
Copy link
Collaborator Author

driusan commented Mar 16, 2021

Eventually.

driusan and others added 5 commits March 17, 2021 10:54
NDB_BVL_Instrument has an.. unusual.. way of dealing with instrument
data. Rather than having a class method for getting the data, there's
a static method which takes an instrument as a parameter. The result
is that data can not be cached in the object and needs to be continuously
re-loaded from the database every time if it's needed multiple times in
the same request. (For instance, calling getFieldValue for different fields
in a script.)

This adds a normal getter and caches the results in the object, deprecating
the old static loadInstanceData and its inexplicable behaviour.
There is currently no mechanism to load the data from multiple
instrument instances at the same time efficiently and work with them
in PHP. (ie. to work the the data from multiple commentIDs for a single
instrument type in batch in a script) other than directly querying SQL.
However, this has other drawbacks such as the you only know whether the
instrument data should be loaded from the flag.Data column or a separate
table once the instrument is instantiated.

This adds a bulkLoadInstanceData function which will load the data from
multiple commentIDs at the same time in a single SQL query and return an
array of \NDB_BVL_Instruments with the data loaded.

In my non-scientific benchmarking where I loaded all the data from a single
instrument type on my sandbox multiple times, and printed the value of a single
field this (unsurprisingly) resulted in significant performance gains.

The naive way where factory is called multiple times took approximately
12 seconds:

```
foreach($commentIDs as $cid) {
    $inst = \NDB_BVL_Instrument::factory('adi_r_proband', $cid);
    print $inst->getFieldValue('informant_relation');
}
```

The slightly smarter way where we avoid the overhead of re-parsing the
instrument took approximately 1.2 seconds:

```
$inst = \NDB_BVL_Instrument::factory('adi_r_proband');
foreach($commentIDs as $cid) {
    // note: must hack NDB_BVL_Instrument to make instanceData
    // public for this to work.
    $inst->instanceData = null;
    $inst->commentID = $cid;
    print $inst->getFieldValue('informant_relation');
}
```

The new function took approximately 0.2s:

```
$inst = \NDB_BVL_Instrument::factory('adi_r_proband');
$data = $inst->bulkLoadInstanceData($commentIDs);
foreach($data as $i) {
    print $i->getFieldValue('informant_relation');
}
```
Co-authored-by: Rida Abou-Haidar <ridz1208@users.noreply.github.com>
@driusan driusan force-pushed the bulkLoadInstanceData branch from e0b387f to 300c8ae Compare March 17, 2021 14:55
@driusan
Copy link
Collaborator Author

driusan commented Mar 17, 2021

I've rebased this and fixed the issue in the long thread.

I didn't retest this, whoever tests this should ensure that when bulkLoadInstanceData is called on a CommentID that has a null Data column for a JSON instrument, calling getInstanceData on the instance returned doesn't re-query the database.

@driusan
Copy link
Collaborator Author

driusan commented Mar 17, 2021

@laemtl I just saw that you tested/approved this in November.. can you re-test the case in my comment above since the behaviour has changed?

@driusan driusan removed the State: Needs work PR awaiting additional work by the author to proceed label Mar 17, 2021
@ridz1208 ridz1208 removed the Passed manual tests PR has been successfully tested by at least one peer label Mar 17, 2021
@laemtl
Copy link
Contributor

laemtl commented Mar 30, 2021

@driusan I get an SQL error if $commentIDs is empty:

$commentIDs = [];
$inst = \NDB_BVL_Instrument::factory('bmi');
$data = $inst->bulkLoadInstanceData($commentIDs);
foreach($data as $i) {
    error_log($i->getFieldValue('weight_lbs'));
}
[Tue Mar 30 18:14:22.877181 2021] [php7:notice] [pid 9760] [client ::1:39160] Could not execute SELECT * FROM bmi WHERE CommentID IN (). Stack trace#0 /var/www/html/loris-core/php/libraries/Database.class.inc(745): Database->execute()\n#1 /var/www/html/loris-core/php/libraries/NDB_BVL_Instrument.class.inc(2225): Database->pselect()\n#2 /var/www/html/loris-core/modules/dashboard/php/dashboard.class.inc(121): NDB_BVL_Instrument->bulkLoadInstanceData()\n#3 /var/www/html/loris-core/php/libraries/Module.class.inc(275): LORIS\\dashboard\\Dashboard->__construct()\n#4 /var/www/html/loris-core/php/libraries/Module.class.inc(342): Module->loadPage()\n#5 /var/www/html/loris-core/src/Middleware/ResponseGenerator.php(50): Module->handle()\n#6 /var/www/html/loris-core/src/Middleware/AuthMiddleware.php(63): LORIS\\Middleware\\ResponseGenerator->process()\n#7 /var/www/html/loris-core/src/Router/ModuleRouter.php(74): LORIS\\Middleware\\AuthMiddleware->process()\n#8 /var/www/html/loris-core/src/Middleware/ExceptionHandlingMiddleware.php(35): LORIS\\Router\\ModuleRouter->handle()\n#9 /var/www/html/loris-core/src/Router/BaseRouter.php(112): LORIS\\Middleware\\ExceptionHandlingMiddleware->process()\n#10 /var/www/html/loris-core/src/Middleware/ResponseGenerator.php(50): LORIS\\Router\\BaseRouter->handle()\n#11 /var/www/html/loris-core/src/Middleware/ContentLength.php(52): LORIS\\Middleware\\ResponseGenerator->process()\n#12 /var/www/html/loris-core/htdocs/index.php(48): LORIS\\Middleware\\ContentLength->process()\n#13 {main}, referer: https://localhost/
[Tue Mar 30 18:14:22.877319 2021] [php7:notice] [pid 9760] [client ::1:39160] You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ')' at line 1, referer: https://localhost/

Same with null (this case may be fine)

[Tue Mar 30 18:14:22.877319 2021] [php7:notice] [pid 9760] [client ::1:39160] You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ')' at line 1, referer: https://localhost/
[Tue Mar 30 18:17:57.226043 2021] [php7:notice] [pid 28504] [client ::1:39320] PHP Fatal error:  Uncaught TypeError: Argument 1 passed to NDB_BVL_Instrument::bulkLoadInstanceData() must be iterable, null given, called in /var/www/html/loris-core/modules/dashboard/php/dashboard.class.inc on line 121 and defined in /var/www/html/loris-core/php/libraries/NDB_BVL_Instrument.class.inc:2183\nStack trace:\n#0 /var/www/html/loris-core/modules/dashboard/php/dashboard.class.inc(121): NDB_BVL_Instrument->bulkLoadInstanceData()\n#1 /var/www/html/loris-core/php/libraries/Module.class.inc(275): LORIS\\dashboard\\Dashboard->__construct()\n#2 /var/www/html/loris-core/php/libraries/Module.class.inc(342): Module->loadPage()\n#3 /var/www/html/loris-core/src/Middleware/ResponseGenerator.php(50): Module->handle()\n#4 /var/www/html/loris-core/src/Middleware/AuthMiddleware.php(63): LORIS\\Middleware\\ResponseGenerator->process()\n#5 /var/www/html/loris-core/src/Router/ModuleRouter.php(74): LORIS\\Middleware\\AuthMiddleware->process()\n#6 /var/www/html/loris-core/src/Middleware/ExceptionHandlingMiddleware.php(35): LORIS\\Router\\ModuleRouter-> in /var/www/html/loris-core/php/libraries/NDB_BVL_Instrument.class.inc on line 2183, referer: https://localhost/

They are corner cases but it may worth considering returning null if such value is passed.

@laemtl laemtl added the Passed manual tests PR has been successfully tested by at least one peer label Mar 30, 2021
@laemtl
Copy link
Contributor

laemtl commented Mar 30, 2021

@driusan Everything else works perfectly. Approving.

@driusan driusan merged commit 8885c8f into aces:main Apr 12, 2021
AlexandraLivadas pushed a commit to AlexandraLivadas/Loris that referenced this pull request Jun 29, 2021
…aces#6869)

There is currently no mechanism to load the data from multiple
instrument instances at the same time efficiently and work with them
in PHP. (ie. to work the the data from multiple commentIDs for a single
instrument type in batch in a script) other than directly querying SQL.
However, this has other drawbacks such as the you only know whether the
instrument data should be loaded from the flag.Data column or a separate
table once the instrument is instantiated.

This adds a bulkLoadInstanceData function which will load the data from
multiple commentIDs at the same time in a single SQL query and return an
array of \NDB_BVL_Instruments with the data loaded.

In my non-scientific benchmarking where I loaded all the data from a single
instrument type on my sandbox multiple times, and printed the value of a single
field this (unsurprisingly) resulted in significant performance gains.

The naive way where factory is called multiple times took approximately
12 seconds:

```
foreach($commentIDs as $cid) {
    $inst = \NDB_BVL_Instrument::factory('adi_r_proband', $cid);
    print $inst->getFieldValue('informant_relation');
}
```

The slightly smarter way where we avoid the overhead of re-parsing the
instrument took approximately 1.2 seconds:

```
$inst = \NDB_BVL_Instrument::factory('adi_r_proband');
foreach($commentIDs as $cid) {
    // note: must hack NDB_BVL_Instrument to make instanceData
    // public for this to work.
    $inst->instanceData = null;
    $inst->commentID = $cid;
    print $inst->getFieldValue('informant_relation');
}
```

The new function took approximately 0.2s:

```
$inst = \NDB_BVL_Instrument::factory('adi_r_proband');
$data = $inst->bulkLoadInstanceData($commentIDs);
foreach($data as $i) {
    print $i->getFieldValue('informant_relation');
}
```
@ridz1208 ridz1208 added this to the 24.0.0 milestone Aug 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Category: Feature PR or issue that aims to introduce a new feature Meta PR does something that organizes, upgrades, or manages the functionality of the codebase Passed manual tests PR has been successfully tested by at least one peer
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants