CompareExchange_long benchmark sometimes reports very long execution time on x86 #1497

Open
adamsitnik opened this issue Sep 2, 2020 · 3 comments

@adamsitnik
Member

Looking at the data I got for x86, the CompareExchange_long benchmark typically reports a time of around 10 ns, but sometimes it is as much as 100x higher.

It's either a very weird hardware issue or a BenchmarkDotNet bug, possibly: dotnet/BenchmarkDotNet#837

The problem is that I can't reproduce it locally, so if anyone ever hits this problem, please copy-paste the contents of the BDN .log file here.

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter System.Threading.Tests.Perf_Interlocked.CompareExchange_long --architecture x86
@AndyAyersMS
Member

Here's a repro. The issue is a lock cmpxchg8b whose 8-byte operand splits a cache line, so it requires a particular data alignment.

With the code below, there is a 50/50 chance that one of the long fields ends up straddling a cache line boundary.

using System.Threading;
using BenchmarkDotNet.Attributes;

public class X
{
     int x;
     long l1;
     long l2;
     long l3;
     long l4;
     long l5;
     long l6;
     long l7;
     long l8;
     
     [Benchmark]
     public long CompareExchange_long1() => Interlocked.CompareExchange(ref l1, 1, 0);
     
     [Benchmark]
     public long CompareExchange_long2() => Interlocked.CompareExchange(ref l2, 1, 0);
     
     [Benchmark]
     public long CompareExchange_long3() => Interlocked.CompareExchange(ref l3, 1, 0);
     
     [Benchmark]
     public long CompareExchange_long4() => Interlocked.CompareExchange(ref l4, 1, 0);
     
     [Benchmark]
     public long CompareExchange_long5() => Interlocked.CompareExchange(ref l5, 1, 0);
     
     [Benchmark]
     public long CompareExchange_long6() => Interlocked.CompareExchange(ref l6, 1, 0);
     
     [Benchmark]
     public long CompareExchange_long7() => Interlocked.CompareExchange(ref l7, 1, 0);
     
     [Benchmark]
     public long CompareExchange_long8() => Interlocked.CompareExchange(ref l8, 1, 0);
}
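
To make the 50/50 claim concrete, here is a rough sketch of the address arithmetic. It is only an illustration, not what the runtime does: it assumes 64-byte cache lines and that the eight long fields sit contiguously in memory, and `CacheLineMath`, `Straddles`, and `CountSplitLongs` are made-up names.

```csharp
public static class CacheLineMath
{
    // Assumption: 64-byte cache lines.
    private const ulong CacheLineSize = 64;

    // True when the byte range [address, address + size) crosses a cache line boundary.
    public static bool Straddles(ulong address, int size) =>
        (address / CacheLineSize) != ((address + (ulong)size - 1) / CacheLineSize);

    // Counts how many of `count` contiguous 8-byte fields, starting at
    // firstFieldAddress, cross a cache line boundary.
    public static int CountSplitLongs(ulong firstFieldAddress, int count)
    {
        int splits = 0;
        for (int i = 0; i < count; i++)
        {
            ulong fieldAddress = firstFieldAddress + (ulong)(i * 8);
            if (Straddles(fieldAddress, sizeof(long)))
            {
                splits++;
            }
        }
        return splits;
    }
}
```

Eight contiguous longs cover exactly 64 bytes. If the first one starts at an address that is a multiple of 8 (say 0x1008), `CountSplitLongs(0x1008, 8)` finds no split. If it starts at an address that is 4 modulo 8 (say 0x100C), exactly one of the eight fields lands at offset 60 within a line and gets split. Which case you get depends on where the object happens to be allocated.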

Sample data

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19041.572 (2004/?/20H1)
Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=5.0.100-rc.2.20479.15
  [Host]     : .NET Core 5.0.0 (CoreCLR 5.0.20.47505, CoreFX 5.0.20.47505), X86 RyuJIT
  DefaultJob : .NET Core 5.0.0 (CoreCLR 5.0.20.47505, CoreFX 5.0.20.47505), X86 RyuJIT
| Method                | Mean         | Error      | StdDev     | Median       |
|---------------------- |-------------:|-----------:|-----------:|-------------:|
| CompareExchange_long1 |     9.083 ns |  0.5248 ns |  0.5615 ns |     9.009 ns |
| CompareExchange_long2 |    11.070 ns |  0.4912 ns |  0.4102 ns |    11.115 ns |
| CompareExchange_long3 |    11.995 ns |  0.5210 ns |  0.9785 ns |    11.782 ns |
| CompareExchange_long4 |    11.636 ns |  0.5162 ns |  0.6891 ns |    11.585 ns |
| CompareExchange_long5 | 1,273.176 ns | 25.7278 ns | 71.7188 ns | 1,254.676 ns |
| CompareExchange_long6 |    10.818 ns |  0.7369 ns |  2.1380 ns |     9.710 ns |
| CompareExchange_long7 |     9.389 ns |  0.2530 ns |  0.2243 ns |     9.346 ns |
| CompareExchange_long8 |     9.397 ns |  0.3290 ns |  0.3078 ns |     9.449 ns |

@AndyAyersMS
Member

This also explains why we don't see this on x64: classes there are 8-byte aligned, so an 8-byte field never crosses a cache line boundary. On x86, classes are only 4-byte aligned.
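
A small sketch of why the alignment matters, under stated assumptions: 64-byte cache lines, and a long field whose address is always a multiple of the class alignment. `AlignmentMath` and its members are hypothetical names used only for this illustration.

```csharp
public static class AlignmentMath
{
    // Assumption: 64-byte cache lines and an 8-byte (long) field.
    private const int CacheLineSize = 64;
    private const int FieldSize = 8;

    // Number of distinct starting offsets within a cache line for a given alignment.
    public static int TotalPlacements(int alignment) => CacheLineSize / alignment;

    // Number of those starting offsets at which an 8-byte field would
    // run past the end of the cache line.
    public static int SplitPlacements(int alignment)
    {
        int splits = 0;
        for (int offset = 0; offset < CacheLineSize; offset += alignment)
        {
            if (offset + FieldSize > CacheLineSize)
            {
                splits++;
            }
        }
        return splits;
    }
}
```

With 8-byte alignment, none of the 8 possible placements splits the field. With 4-byte alignment, 1 of the 16 possible placements (offset 60) does, which is the case the slow benchmark result corresponds to.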

@danmoseley
Member

cc @kunalspathak


3 participants