
[5.6] Faster implementation for Collection::mapToDictionary #22774

Merged: 4 commits into laravel:5.6 on Jan 14, 2018

Conversation

@Lapayo (Contributor) commented Jan 13, 2018

I just ran into huge performance problems when eager loading several thousand rows (~20k in my case).

I tracked it down to the Collection::mapToDictionary method, which turns out to be very slow when dealing with this amount of data.

I rewrote the method without map-reduce and it is much faster now; a sketch of the idea is shown after the benchmark below. Let me know what you think!

The following benchmark was taken on an i7-6700 machine. (Benchmark script attached).
benchmap.zip

time for 100 (current): 5.4161071777344E-5
time for 20k (current): 17.208889961243
time for 25k (current): 35.498338937759
time for 100 (new): 2.5948047637939E-5
time for 20k (new): 0.012021064758301
time for 25k (new): 0.016914844512939
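
(For illustration, a single-pass rewrite along these lines might look roughly like the sketch below. This is not necessarily the exact code from the PR's commits; it assumes the usual Illuminate\Support\Collection internals such as each() and new static().)

```php
// Sketch only: build the dictionary in one pass with ->each() instead of
// rebuilding an accumulator on every reduce step.
public function mapToDictionary(callable $callback)
{
    $dictionary = [];

    $this->each(function ($item, $key) use ($callback, &$dictionary) {
        $pair = $callback($item, $key);

        $pairKey = key($pair);
        $pairValue = reset($pair);

        // Appending auto-creates the bucket, so no isset() check is needed.
        $dictionary[$pairKey][] = $pairValue;
    });

    return new static($dictionary);
}
```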

@thecrypticace (Contributor)

Building arrays with reduce is quadratic, so this isn't surprising. 👍

I'd suggest using a foreach loop if performance is the ultimate goal here. On a million items it ranges between 4x and 15x faster.

In the typical case, using foreach in mapToDictionary on a million items runs in 0.8s–4.1s for me (on PHP 7.1 and 7.2), whereas using $collection->each runs in 12s–17s.

(All benchmarks on Homestead came in under 1s for a million items with foreach; on macOS it's ~4s.)
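
(A plain-foreach variant, roughly what was eventually merged according to the comment further down, could look like the sketch below; the exact merged code may differ.)

```php
// Sketch only: iterate the underlying items array directly, avoiding the
// per-item closure call that ->each() adds.
public function mapToDictionary(callable $callback)
{
    $dictionary = [];

    foreach ($this->items as $key => $item) {
        $pair = $callback($item, $key);

        $pairKey = key($pair);
        $pairValue = reset($pair);

        $dictionary[$pairKey][] = $pairValue;
    }

    return new static($dictionary);
}
```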

@mfn (Contributor) commented Jan 13, 2018

> I'd suggest using a foreach loop if performance is the ultimate goal here.

I'd like to see this too. @Lapayo, can you benchmark with a plain foreach?

@Lapayo (Contributor, Author) commented Jan 13, 2018

I just did; it's indeed a lot faster, depending on the use case. (y)
If the plain version is more likely to be merged, I can commit that one. I wasn't sure whether ->each() might be preferred.

time for 20k (plain): 0.0091784229278564
time for 20k (->each): 0.011384341001511
time for 100k (plain): 0.05624255490303
time for 100k (->each): 0.1215734539032
benchplainvseach.zip
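
(For readers without the attachment: a minimal harness along these lines is sketched below. It is hypothetical, not the attached script; it assumes illuminate/support is installed, and the generated rows and the callback are made up for illustration.)

```php
<?php

require __DIR__.'/vendor/autoload.php';

use Illuminate\Support\Collection;

$n = 20000;

// Generate $n fake rows spread across 10 groups.
$rows = Collection::times($n, function ($i) {
    return ['group' => $i % 10, 'value' => $i];
});

$start = microtime(true);

// Time a single mapToDictionary() pass over the rows.
$dictionary = $rows->mapToDictionary(function ($row) {
    return [$row['group'] => $row['value']];
});

printf("time for %d: %s\n", $n, microtime(true) - $start);
```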

@mfn (Contributor) commented Jan 13, 2018

@Lapayo very much appreciated!

I can't comment on the likelihood, but I would definitely favor the plain foreach variant.

Collections are by definition "things of many", so almost any operation has to loop in some way, and this is a hot code path: even the tiniest slowdown accumulates on larger sets, as you've demonstrated.

@GrahamCampbell GrahamCampbell changed the title Faster implementation for Collection::mapToDictionary [5.6] Faster implementation for Collection::mapToDictionary Jan 14, 2018
@GrahamCampbell GrahamCampbell changed the title [5.6] Faster implementation for Collection::mapToDictionary [5.7] Faster implementation for Collection::mapToDictionary Jan 14, 2018
@deleugpn (Contributor)

I wouldn't go as far as to say that the second level of improvement (plain foreach over ->each) is that important. Imagine 100k items of actual Eloquent records: you'd need a lot of memory available to avoid memory exhaustion, which would push you towards chunking instead.
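
(For context, the chunking alternative mentioned here typically looks something like the following sketch; User is a made-up model used purely for illustration.)

```php
use App\User;

// Process records in fixed-size batches instead of materializing
// 100k Eloquent models in a single collection.
User::query()->chunk(1000, function ($users) {
    foreach ($users as $user) {
        // handle each record; only 1000 models are in memory at a time
    }
});
```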

@taylorotwell taylorotwell changed the base branch from master to 5.6 January 14, 2018 14:56
@taylorotwell taylorotwell merged commit d8b2eb1 into laravel:5.6 Jan 14, 2018
@taylorotwell (Member)

Used foreach. Thanks.

@GrahamCampbell GrahamCampbell changed the title [5.7] Faster implementation for Collection::mapToDictionary [5.6] Faster implementation for Collection::mapToDictionary Jan 14, 2018
@mfn (Contributor) commented Jan 14, 2018

@deleugpn

> I wouldn't go as far as to say that the second level of improvement (plain foreach over ->each) is that important. Imagine 100k items of actual Eloquent records: you'd need a lot of memory available to avoid memory exhaustion, which would push you towards chunking instead.

I see where you're coming from, but I think you have to look beyond that.

This is a PR for the base Support\Collection, not Eloquent\Collection, so 100k non-Eloquent items may well make sense for some, too.
Sure, at some point, if performance is such an issue, you'd probably forgo using such a wrapper class entirely.

But my take is: having identified a hot code path and found a solution which a) is 100% backwards compatible and b) has much better performance characteristics is nothing to disagree with; quite the contrary.

@gabrielkoerich (Contributor)

Why not 5.5?

@tnorthcutt

+1, would love to see this merged into 5.5 as well.
