-
Notifications
You must be signed in to change notification settings - Fork 11.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[5.6] Optimize query builder's pluck() method #23482
[5.6] Optimize query builder's pluck() method #23482
Conversation
What's the status of fetch code? It seems as if it cannot be changed in full-stack Laravel, whereas it can be changed when using illuminate/database as a stand-alone package (via So please tell me which is the correct way forward:
Even if the second option is preferred, taking care of this change in the tests may be worthwhile as well, as this would make the mocks more realistic (because full-stack Laravel always returns objects). |
* @param callable $callback | ||
* @return mixed | ||
*/ | ||
protected function onceWithColumns($columns, $callback) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the correct location for the helper method? Or should it be placed below get()
?
Thank you for your work on this. Much appreciated. Ah ah yes, fetch mode has been removed, see notably #17557. Still, users can modify the fetch mode, so currently your code would break with "array fetch modes". Off the top of my head:
|
Also, keep in mind the (as a reminder: at some time, laravel was using |
foreach ($queryResult as $row) { | ||
$itemValue = $row->$column; | ||
|
||
if (is_null($key)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move the is_null($key)
function call outside of the loop.
(replacing with $key === null
would be fine too, but you know the laravel convention...)
Can you provide some benchmarks? I believe this kind of changes does not get merged without some kind of benchmarks. |
I will gladly do that once I get some official buy-in. ;) |
07201ac
to
f6894c0
Compare
Okay, I refactored a bit, tests are working. BenchmarkTable with 10885 rows, 1000 iterations So, almost 2x speedup. |
} else { | ||
$results[$row->$key] = $row->$column; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move the is_null()
function call outside of the loop:
if (is_null($key)) {
foreach ($queryResult as $row) {
$results[] = $row->$column;
}
} else {
foreach ($queryResult as $row) {
$results[$row->$key] = $row->$column;
}
}
It's a bit longer, but saving thousands of function calls is worth it, and the code is still readable, maybe even a bit better.
} else { | ||
$results[$row[$key]] = $row[$column]; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same is_null()
unlooping as in pluckFromObjectColumn()
.
|
||
return tap($callback(), function () use ($original) { | ||
$this->columns = $original; | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this one is "Laravel style", but sorry it's just incredibly painful to understand, especially if you are not used to this voodoo helper. Can't we just stay with the regular and simple lines:
$results = $callback();
$this->columns = $original;
return $results;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you're right. (I'm used to it from Ruby land, where it's far more readable.) I've advocated against the overuse of tap
in Laravel in the past myself, whoops. Thanks for the feedback.
Strong +1 for this. Sure there is a bit of added code, but it's not that much, and it's still very understandable. The DB Also, would you rerun your benchmark with the unlooped |
This allows us to reimplement pluck() without calling get(), while avoiding the duplication of a lot of code. As a next step, pluck() can be re-implemented without relying on the (comparatively expensive) same method on the Collection class.
This prepares for the final step of removing (without replacement) the expensive parts of its implementation (from which, I suspect, the code will get shorter again).
Because we don't need all the features of data_get() - such as dot access - there is no need to use the function, which can become expensive when doing this with many, many rows in a loop.
f6894c0
to
5e444b7
Compare
@vlakoff I've taken care of your feedback, thank you! Updated BenchmarkTable with 10885 rows, 1000 iterations |
@franzliedke @vlakoff |
Comparing this bench with the previous one, the |
@carusogabriel Indeed. If it were up to me, I would definitely use this. But the Laravel codebase uses |
Well, since we're not doing it in a loop, this type of micro-optimization won't have any measurable effect anyway. |
@vlakoff Maybe is some topic we can put at @franzliedke Isn't micro, we reduce from 3 opcodes to 1, see:
|
The difference in number of opcodes is because of (what I think is called) constant folding. An optimization that is performed by PHP because
|
Totally like the PR. But shouldn't we be cautious and rather target master/5.7 with this? |
|
||
$result = $callback(); | ||
|
||
$this->columns = $original; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be wrapped in a try
/finally
block? We want to make sure we always reset them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wasn't before my refactoring, so I don't want to change this here...
So… it seems to be ready. Anything remaining to do? |
Replying to mfn:
The change is supposed to be transparent ;-) If ever it breaks something, postponing the change to 5.7 wouldn't have made it better. In such a scenario, we would have to fix it asap or revert the change if not quickly fixable. Would be the same for any branch. edit: Applying it to 5.7, it would have some "beta testing" phase, indeed. But as it's not changing the API, I still think it should be applied the soonest (and fixed the soonest, in worst case scenario). |
Optimizing pluck (on our own fork) on a project resulted in a previous ~2-minute process optimizing to around 20 seconds. Glad to see someone has also encountered this as well and raised attention to it. |
Thanks. Sorry for the delay 😄 |
Proposed by @vlakoff in laravel/ideas#98 and yesterday acknowledged by Taylor...
This refactors the
pluck
method (and, to stay DRY, theget
method) ofIlluminate\Database\Query\Builder
to no longer rely onobject_get
internally. As reported in the aforementioned issue, this can become a bottleneck when looping over a large number of results.I've inlined
Collection::pluck
and thenArr::pluck
, subsequently throwing away any overly generic code that is not necessary in this use case - because we know the data type of query results. Also, we don't need the other features ofobject_get
, such as dot notation and default values.