Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Iterating with ForEach over Immutable collections is very slow #29092

Open
adamsitnik opened this issue Mar 27, 2019 · 12 comments
Open

Iterating with ForEach over Immutable collections is very slow #29092

adamsitnik opened this issue Mar 27, 2019 · 12 comments
Labels
area-System.Collections help wanted [up-for-grabs] Good issue for external contributors tenet-performance Performance related issue
Milestone

Comments

@adamsitnik
Copy link
Member

How to run the benchmarks:

git clone https://github.com/dotnet/performance.git
# if you have .NET Core 3.0 installed
dotnet run -c Release -f netcoreapp3.0 -p .\performance\src\benchmarks\micro\MicroBenchmarks.csproj --filter *IterateForEach<* --join
# if you don't have .NET Core 3.0 installed
py .\performance\scripts\benchmarks_ci.py -f netcoreapp3.0 --filter *IterateForEach<* --bdn-arguments="--join true"
Type Method Size Mean
IterateForEach<Int32> Array 512 222.1 ns
IterateForEach<String> Array 512 216.8 ns
IterateForEach<Int32> Span 512 213.7 ns
IterateForEach<String> Span 512 230.2 ns
IterateForEach<Int32> ReadOnlySpan 512 210.6 ns
IterateForEach<String> ReadOnlySpan 512 227.8 ns
IterateForEach<Int32> IEnumerable 512 2,247.3 ns
IterateForEach<String> IEnumerable 512 2,465.8 ns
IterateForEach<Int32> List 512 1,125.1 ns
IterateForEach<String> List 512 2,401.7 ns
IterateForEach<Int32> LinkedList 512 1,972.2 ns
IterateForEach<String> LinkedList 512 3,284.3 ns
IterateForEach<Int32> HashSet 512 1,565.0 ns
IterateForEach<String> HashSet 512 2,344.5 ns
IterateForEach<Int32> Dictionary 512 3,432.4 ns
IterateForEach<String> Dictionary 512 3,159.9 ns
IterateForEach<Int32> Queue 512 1,539.8 ns
IterateForEach<String> Queue 512 3,834.0 ns
IterateForEach<Int32> Stack 512 1,806.3 ns
IterateForEach<String> Stack 512 3,876.8 ns
IterateForEach<Int32> SortedList 512 5,771.9 ns
IterateForEach<String> SortedList 512 7,221.3 ns
IterateForEach<Int32> SortedSet 512 8,825.5 ns
IterateForEach<String> SortedSet 512 9,210.3 ns
IterateForEach<Int32> SortedDictionary 512 9,784.2 ns
IterateForEach<String> SortedDictionary 512 13,343.0 ns
IterateForEach<Int32> ConcurrentDictionary 512 15,821.2 ns
IterateForEach<String> ConcurrentDictionary 512 21,915.9 ns
IterateForEach<Int32> ConcurrentQueue 512 4,228.2 ns
IterateForEach<String> ConcurrentQueue 512 5,546.4 ns
IterateForEach<Int32> ConcurrentStack 512 3,286.0 ns
IterateForEach<String> ConcurrentStack 512 4,345.6 ns
IterateForEach<Int32> ConcurrentBag 512 2,682.9 ns
IterateForEach<String> ConcurrentBag 512 4,924.7 ns
IterateForEach<Int32> ImmutableArray 512 1,137.9 ns
IterateForEach<String> ImmutableArray 512 1,137.7 ns
IterateForEach<Int32> ImmutableDictionary 512 57,456.2 ns
IterateForEach<String> ImmutableDictionary 512 88,192.5 ns
IterateForEach<Int32> ImmutableHashSet 512 66,958.3 ns
IterateForEach<String> ImmutableHashSet 512 107,369.1 ns
IterateForEach<Int32> ImmutableList 512 25,530.7 ns
IterateForEach<String> ImmutableList 512 45,287.9 ns
IterateForEach<Int32> ImmutableQueue 512 4,073.7 ns
IterateForEach<String> ImmutableQueue 512 4,510.7 ns
IterateForEach<Int32> ImmutableStack 512 3,891.0 ns
IterateForEach<String> ImmutableStack 512 4,181.2 ns
IterateForEach<Int32> ImmutableSortedDictionary 512 27,616.2 ns
IterateForEach<String> ImmutableSortedDictionary 512 44,664.0 ns
IterateForEach<Int32> ImmutableSortedSet 512 27,055.9 ns
IterateForEach<String> ImmutableSortedSet 512 45,142.9 ns

Full docs for the new benchmarking workflow: https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-corefx.md

@danmoseley
Copy link
Member

Do we allow benchmarks in the perf repo for corefxlab types? (I'm curious where DictionarySlim fits in this)

It's interesting that enumerating HashSet is signficantly faster than Dictionary. And the size is sufficiently small, that I would think (?) the better locality (smaller Entries) would not be relevant.

@adamsitnik
Copy link
Member Author

Do we allow benchmarks in the perf repo for corefxlab types?

It would prolong the time it takes to run all the benchmarks. Also, the main focus in the perf repo is on the things we ship.

But we have a BDN based benchmarking solution in corefxlab which can be used to compare the perf https://github.com/dotnet/corefxlab/tree/master/tests/Benchmarks

@yahorsi
Copy link

yahorsi commented Mar 29, 2019

Wait, iterating ofer Array is SO faster than over List? >5x times faster for Int's and omg 10 times for strings? Hell List is the base collection used basically 90% of the time, e.g. when we query database and neeed results. If List is Array based why it is so slower. Don't you think this is actually bigger issue than Immutable* ? I beleive List is used and iterated in the 100% of .NET programs unlike Immutable* collections

@huoyaoyuan
Copy link
Member

@yahorsi List iterator must check that you haven't modified the list during the loop. Only rust-style immutable/mutable ownership checking can solve this gracefully. C++ chooses for speed but lacks safety.

@PiotrKowalski93
Copy link

It seems like interesting topic for me. Can work on it? I mean, I want to be a contributor but i have never worked on OpenSource. I can work in cooperation too :)

@Wraith2
Copy link
Contributor

Wraith2 commented Jun 1, 2019

@PiotrKowalski93 no permission needed just go for it. The code standard here are high so be prepared to be able to justify and measure everything but improvements are always welcome.

@PiotrKowalski93
Copy link

Great, I wanted to debug the code but I hit the wall. Build went great. I wanted to see how the code is working in System.Collections.Immutable so I created .sln with simple console app in which I create ImmutableList and iterate through it.

Separately when I open solution System.Collections.Immutable.sln i can build code etc. but when I add System.Collections.Immutable.csproj to my "testing" .sln i see:

The TargetFramework value '' was not recognized. It may be misspelled. If not, then the TargetFrameworkIdentifier and/or TargetFrameworkVersion properties must be specified explicitly.

And indeed it is not specified and in properties of csproj i see empty list in TargetFramework property.

How can Debug this specific dll?

@Wraith2
Copy link
Contributor

Wraith2 commented Jun 7, 2019

the build system relies on a lot of complex logic so you won't be able to easily get a project that needs it to compile without it. I'd suggest you take the code logic from the collections you want to debug and add that code directly to your project then debug it that way.

@PiotrKowalski93
Copy link

@Wraith2 Thank you, fair point, I will do that

@msftgits msftgits transferred this issue from dotnet/corefx Feb 1, 2020
@msftgits msftgits added this to the Future milestone Feb 1, 2020
@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020
@layomia layomia removed the untriaged New issue has not been triaged by the area owner label Jun 24, 2020
@simonthum
Copy link

simonthum commented Mar 20, 2021

Hi, I just saw this issue after investigating observed slowness of immutable collection iteration. I used benchmarkdotnet, I share the numbers below, but the first part (no parallel prefix) is similar to the opening table. I only use Int32 collections.

What really worries me is how much slower the parallel code (explicit Task.Run with #CPUs) seems to be.

I would have expected this to show the benefit of concurrent/immutable collections as there is no locking, but instead I seem to be looking at some obscure bottleneck. I am confident that real world usage will be less impacted, but apparently there is a lot of room for improvement.

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18363.1198 (1909/November2018Update/19H2)
Intel Core i5-6500 CPU 3.20GHz (Skylake), 1 CPU, 4 logical and 4 physical cores
.NET Core SDK=5.0.201
  [Host]     : .NET Core 5.0.4 (CoreCLR 5.0.421.11614, CoreFX 5.0.421.11614), X64 RyuJIT
  Job-CMPGYI : .NET Core 5.0.4 (CoreCLR 5.0.421.11614, CoreFX 5.0.421.11614), X64 RyuJIT

InvocationCount=1  UnrollFactor=1  
Method Mean Error StdDev Median Ratio RatioSD
'enumeration of plain list' 0.2561 ns 0.0057 ns 0.0162 ns 0.2500 ns 4.06 1.06
'enumeration of immutable list' 8.7836 ns 0.1629 ns 0.3060 ns 8.7750 ns 137.01 30.59
'enumeration of immutable array' 0.0679 ns 0.0062 ns 0.0177 ns 0.0700 ns 1.00 0.00
'enumeration of concurrent bag' 0.9049 ns 0.0285 ns 0.0790 ns 0.8700 ns 14.36 4.01
'enumeration of async locked list' 0.3947 ns 0.0192 ns 0.0536 ns 0.3800 ns 6.23 1.67
'enumeration of sync locked list' 0.2616 ns 0.0064 ns 0.0180 ns 0.2600 ns 4.13 1.09
'inline enumeration of async locked list' 4.2055 ns 0.2361 ns 0.6660 ns 4.0950 ns 66.13 17.81
'parallel enumeration of async locked list' 3.9957 ns 0.2384 ns 0.6916 ns 3.8600 ns 62.32 18.28
'parallel enumeration of sync locked list' 2.9224 ns 0.1702 ns 0.4855 ns 2.8150 ns 46.06 14.57
'parallel enumeration of immutable list (non-volatile)' 13.7143 ns 1.2159 ns 3.5660 ns 11.7200 ns 211.85 77.57
'parallel enumeration of immutable array (non-volatile)' 2.7011 ns 0.1569 ns 0.4425 ns 2.6450 ns 42.74 12.49
'parallel enumeration of concurrent bag' 4.2211 ns 0.2558 ns 0.7338 ns 3.9900 ns 65.77 18.50

@eiriktsarpalis
Copy link
Member

Most immutable collections use class enumerators, so for example the difference in performance between for example T[] and ImmutableArray<T> is to be expected.

The real outliers here seem to be immutable dictionaries and sets: without having profiled the benchmarks, the enumerator implementations seem to be allocating nested enumerator objects for each bucket within the data structure. I'm sure we could optimize this somewhat.

@simonthum
Copy link

@eiriktsarpalis Now that you highlighted it, sharing iteration logic between immutable and mutable code also seems likely to impact performance:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.Collections help wanted [up-for-grabs] Good issue for external contributors tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests