WorkerThreadStart volatile read+cmpxchg loop #6516

benaadams · 2016-07-28T22:37:28Z

Currently ThreadPoolMgr::WorkerThreadStart and ThreadPoolMgr::MaybeAddWorkingWorker use FastInterlockCompareExchangeLong to do the initial read, coupled with a FastInterlockCompareExchangeLong loop to do the set.

Current Code:

counts = WorkerCounter.GetCleanCounts(); 
// return FastInterlockCompareExchangeLong(&counts.AsLongLong,0,0);
while (true)
{
    newCounts = counts;
    // adjust newCounts to wanted values

    Counts oldCounts = WorkerCounter.CompareExchangeCounts(newCounts, counts);

    if (oldCounts == counts)
        break;

    counts = oldCounts;
}

It looks like the lock cmpxchg "write read" causes a pipeline stall. As it's then followed by a lock cmpxchg loop to set the value, it should be able to use a volatile read to get the initial value, as that will then be validated by the set.

Changed code:

counts = WorkerCounter.DangerousGetDirtyCounts(); 
// return VolatileLoad(&counts.AsLongLong);
while (true)
{
    newCounts = counts;
    // adjust newCounts to wanted values

    Counts oldCounts = WorkerCounter.CompareExchangeCounts(newCounts, counts);

    if (oldCounts == counts)
        break;

    counts = oldCounts;
}

Also changed GetCleanCounts to use the VolatileLoad on x64.
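
The pattern above can be sketched in portable C++ as follows. This is only an illustration under stated assumptions, not the CoreCLR code: std::atomic stands in roughly for VolatileLoad and FastInterlockCompareExchangeLong, and the names g_counts, Pack, WorkingOf, ActiveOf and IncrementWorking are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical packed counter: low 32 bits = working, high 32 bits = active.
std::atomic<uint64_t> g_counts{0};

inline uint32_t WorkingOf(uint64_t c) { return static_cast<uint32_t>(c); }
inline uint32_t ActiveOf(uint64_t c)  { return static_cast<uint32_t>(c >> 32); }
inline uint64_t Pack(uint32_t working, uint32_t active) {
    return (static_cast<uint64_t>(active) << 32) | working;
}

// Increments the "working" half and returns the packed value that was stored.
uint64_t IncrementWorking() {
    // Cheap initial read; it may be stale, but the CAS below validates it.
    uint64_t counts = g_counts.load(std::memory_order_relaxed);
    while (true) {
        // Adjust newCounts to the wanted values.
        uint64_t newCounts = Pack(WorkingOf(counts) + 1, ActiveOf(counts));

        // On failure, compare_exchange_weak writes the value it actually
        // found into `counts`, so the next iteration retries from there.
        if (g_counts.compare_exchange_weak(counts, newCounts,
                                           std::memory_order_acq_rel,
                                           std::memory_order_relaxed)) {
            return newCounts;
        }
    }
}
```

In the uncontended case this performs one ordinary load and one interlocked operation, rather than two interlocked operations back to back.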

Before, WorkerThreadStart is the 2nd most expensive function and MaybeAddWorkingWorker is 10th for multicore contested QUWI.

After (3 commits), WorkerThreadStart drops to 17th and the rest of the native functions are also pushed out of the top hot spots.

Resolves #6132
Resolves #6476

Questions:

Is _WIN64 the correct ifdef to use for x64? (Or _TARGET_64BIT_ or AMD64 etc)

/cc @kouvel @jkotas @stephentoub

benaadams · 2016-07-29T01:45:36Z

With last commit (3) it pushes the native functions out of the top function hotspots so more "work" is being done in the managed stack.

benaadams · 2016-07-29T13:19:40Z

Still getting some false sharing

benaadams · 2016-07-29T17:04:53Z

jkotas · 2016-07-29T21:13:51Z

@benaadams Thank you for fine tuning thread pool scalability. I would like @kouvel to take a look at this since he has been looking into similar scalability issues. He is on vacation, but he will be back next week.

benaadams · 2016-07-29T22:38:00Z

Last commit needs a triple check that the logic is right. I believe it to be correct, but it may be wrong.

It changes 3x Volatile<T> to normal vars +VolatileStore & VolatileLoad.

More significantly, it changes what happens when they are read and set as a group. It uses 2x regular = and a single VolatileX to create a memory barrier, rather than using one for each item.

The two spots are:

Store fence comes first which I believe matches VolatileStore in volatile.h

and:

Load fence comes last which I believe matches VolatileLoad in volatile.h

benaadams · 2016-07-29T23:12:12Z

Also, VTune is still reporting false sharing for static variable access, which doesn't seem to be eliminated by ordering the definitions with static padding blocks between them. Will try a static POD struct with padded member items and see if that works instead.
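
A minimal sketch of the padded-struct idea, assuming a 64-byte cache line. The struct and member names here are hypothetical and are not taken from the CoreCLR sources.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLine = 64;  // assumed cache line size

// Each hot counter gets its own cache line, so a write to one does not
// invalidate the line that holds the other.
struct PaddedCounters {
    alignas(kCacheLine) std::atomic<int32_t> numRequestsPending{0};
    alignas(kCacheLine) std::atomic<int32_t> numThreadsActive{0};
};

static_assert(alignof(PaddedCounters) == kCacheLine,
              "struct must start on a cache line boundary");
static_assert(sizeof(PaddedCounters) == 2 * kCacheLine,
              "each member should occupy its own cache line");

// Distance in bytes between the two members of one instance.
std::size_t MemberDistance(const PaddedCounters& c) {
    return static_cast<std::size_t>(
        reinterpret_cast<const char*>(&c.numThreadsActive) -
        reinterpret_cast<const char*>(&c.numRequestsPending));
}
```

Grouping the members in one aligned struct makes the layout explicit, whereas the relative placement of separate static variables is up to the compiler and linker.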

benaadams · 2016-07-31T13:25:06Z

src/vm/threadpoolrequest.cpp


-    LONG count = (LONG) s_appDomainIndexList.GetCount();


Does this need another read? It's not an atomic read+update

It looks like s_appDomainIndexList.GetCount() can increase but never decrease. I think the extra call to GetCount() here doesn't help much. If s_ADHint is initially -1 or 0, and a new app domain is created during the loop above, it would have a chance to set s_ADHint to the new app domain's index upon exiting, and that app domain would get a chance to perform its work next time. If s_ADHint started with any other value (the common case), it would skip over the new app domain anyway. I think it's fine to remove the extra call to GetCount().

benaadams · 2016-07-31T17:38:23Z

These changes mostly affect "twilight" threading, where there is enough fine grained work to keep the system busy (enqueue and dequeue both pulse the threadpool), yet not enough work to keep the threads busy, so there is "waiting for work" churn; they don't much affect top end throughput.

It has a bigger effect on 8xHT than on 4x regular. HT takes a general perf dive, it just takes less of one with this change; however, it is faster on slower items with more work (e.g. Tasks vs QUWI), which is also where HT starts to perform better.

Before (on 8xHT; some variance due to GC)

Operations per second on 8 Cores
                                                                        Parallelism
                             Serial          2x         16x         64x        512x
QUWI No Queues              2.800 M     4.920 M     6.661 M     8.127 M     9.391 M
- Depth    2                4.683 M     5.043 M     6.357 M     7.335 M     8.307 M
- Depth   16                7.289 M     7.139 M     7.031 M     7.103 M     7.086 M
- Depth   64                7.406 M     7.359 M     7.250 M     7.162 M     7.251 M
- Depth  512                7.441 M     7.307 M     7.223 M     7.154 M     7.231 M

SubTask Chain Return      589.459 k   929.141 k     5.572 M     5.446 M     5.542 M
- Depth    2              648.375 k     1.021 M     6.026 M     6.106 M     6.139 M
- Depth   16              684.114 k     1.077 M     6.637 M     6.957 M     7.049 M
- Depth   64              710.492 k     1.189 M     6.715 M     7.025 M     7.181 M
- Depth  512              737.985 k     1.120 M     6.890 M     7.169 M     7.201 M

SubTask Chain Awaited     472.738 k   788.315 k     3.381 M     3.537 M     3.594 M
- Depth    2              495.108 k   899.553 k     3.700 M     3.789 M     3.868 M
- Depth   16              539.167 k     1.168 M     3.854 M     4.094 M     4.271 M
- Depth   64              591.786 k     1.257 M     3.920 M     4.217 M     4.300 M
- Depth  512              675.243 k     1.231 M     3.439 M     3.217 M     3.577 M

SubTask Fanout Awaited    273.797 k   493.430 k     2.146 M     2.210 M     2.220 M
- Depth    2              425.957 k   949.842 k     2.597 M     2.704 M     2.738 M
- Depth   16                1.307 M     2.601 M     3.526 M     3.586 M     3.491 M
- Depth   64                1.839 M     2.996 M     3.635 M     3.672 M     3.562 M
- Depth  512                2.080 M     3.138 M     3.663 M     3.684 M     3.758 M

Continuation Chain        189.059 k   306.420 k     2.163 M     2.228 M     2.326 M
- Depth    2              307.732 k   477.830 k     3.281 M     3.867 M     3.861 M
- Depth   16              614.255 k     1.020 M     5.864 M     8.765 M     8.813 M
- Depth   64              703.114 k     1.268 M     7.130 M    10.154 M    10.323 M
- Depth  512              760.961 k     1.386 M    10.458 M    10.248 M    10.198 M

Continuation Fanout       164.989 k   274.129 k     1.634 M     1.725 M     1.745 M
- Depth    2              243.712 k   437.674 k     2.472 M     2.700 M     2.676 M
- Depth   16                1.201 M     2.668 M     5.792 M     5.784 M     5.748 M
- Depth   64                1.751 M     4.635 M     6.688 M     6.679 M     6.635 M
- Depth  512                1.574 M     4.580 M     6.916 M     6.948 M     6.916 M

Yield Chain Awaited       680.350 k     1.124 M     5.049 M     5.248 M     6.380 M
- Depth    2              853.439 k     1.694 M     4.185 M     5.232 M     7.100 M
- Depth   16                1.182 M     2.380 M     6.021 M     7.499 M     7.575 M
- Depth   64                1.197 M     2.783 M     7.480 M     7.242 M     7.213 M
- Depth  512                1.160 M     2.661 M     6.807 M     6.963 M     6.086 M

Async Chain Awaited       559.638 k   835.947 k     4.721 M     4.708 M     4.662 M
- Depth    2                1.243 M     1.486 M     6.847 M     6.803 M     6.728 M
- Depth   16                2.249 M     3.231 M    10.810 M    10.930 M    10.887 M
- Depth   64                2.754 M     3.873 M    11.644 M    11.669 M    11.649 M
- Depth  512                2.449 M     4.774 M    10.912 M    10.772 M    11.710 M

Async Chain Return        560.923 k   867.999 k     4.695 M     4.679 M     4.626 M
- Depth    2                1.122 M     1.733 M     9.148 M     9.297 M     9.198 M
- Depth   16                8.928 M    13.636 M    69.268 M    73.932 M    68.111 M
- Depth   64               34.377 M    57.176 M   280.009 M   273.212 M   222.535 M
- Depth  512              269.277 M   392.267 M     1.527 B     1.412 B   746.743 M

Sync Chain Awaited         34.313 M    65.060 M   147.869 M   146.696 M   142.954 M
- Depth    2               36.458 M    70.458 M   146.170 M   148.597 M   144.891 M
- Depth   16               23.122 M    43.002 M   114.361 M   113.983 M   110.910 M
- Depth   64               21.009 M    39.894 M   107.280 M   106.560 M   102.989 M
- Depth  512               20.466 M    40.276 M    97.061 M    97.623 M    94.193 M

CachedTask Chain Await     29.260 M    55.867 M   130.331 M   130.237 M   125.066 M
- Depth    2               32.716 M    65.154 M   141.308 M   140.204 M   133.480 M
- Depth   16               22.634 M    44.378 M   114.640 M   113.413 M   109.924 M
- Depth   64               21.083 M    41.588 M   105.911 M   106.424 M   102.389 M
- Depth  512               20.161 M    39.999 M    96.826 M    97.542 M    94.137 M

CachedTask Chain Check    167.803 M   304.673 M   671.389 M   667.067 M   555.802 M
- Depth    2              190.419 M   318.879 M   755.741 M   750.871 M   605.399 M
- Depth   16              200.909 M   360.787 M   896.311 M   840.286 M   665.070 M
- Depth   64              100.044 M   197.845 M   585.234 M   580.273 M   487.465 M
- Depth  512               88.279 M   168.067 M   485.182 M   531.172 M   453.725 M

CachedTask Chain Return   188.896 M   334.423 M   682.010 M   674.187 M   581.909 M
- Depth    2              296.936 M   542.809 M     1.276 B     1.186 B   843.911 M
- Depth   16              785.521 M     1.346 B     3.251 B     2.859 B     1.251 B
- Depth   64                1.134 B     1.833 B     4.158 B     3.576 B     1.110 B
- Depth  512                1.577 B     2.267 B     4.788 B     4.293 B     1.100 B

QUWI Local Queues           3.060 M     4.806 M     8.251 M     8.681 M     9.518 M
- Depth    2                4.632 M     4.352 M     6.148 M     7.613 M     7.818 M
- Depth   16                7.180 M     7.072 M     7.020 M     7.116 M     7.121 M
- Depth   64                7.400 M     7.344 M     7.260 M     7.291 M     7.282 M
- Depth  512                7.398 M     7.295 M     7.296 M     7.267 M     7.243 M

After

Operations per second on 8 Cores
                                                                        Parallelism
                             Serial          2x         16x         64x        512x
QUWI No Queues              3.826 M     4.880 M     6.510 M     8.928 M     9.460 M
- Depth    2                4.619 M     4.625 M     6.452 M     7.793 M     7.807 M
- Depth   16                7.240 M     7.108 M     7.038 M     7.098 M     7.054 M
- Depth   64                7.428 M     7.293 M     7.250 M     7.281 M     7.223 M
- Depth  512                7.587 M     7.342 M     7.283 M     7.284 M     7.244 M

SubTask Chain Return      608.552 k   865.269 k     5.443 M     5.713 M     5.744 M
- Depth    2              689.674 k     1.004 M     6.079 M     6.196 M     6.287 M
- Depth   16              667.785 k     1.120 M     5.334 M     6.830 M     6.713 M
- Depth   64              684.408 k     1.076 M     6.525 M     6.822 M     6.897 M
- Depth  512              649.393 k   989.665 k     7.064 M     7.079 M     7.110 M

SubTask Chain Awaited     499.936 k   724.529 k     3.587 M     3.669 M     3.681 M
- Depth    2              497.207 k   857.612 k     3.663 M     3.764 M     3.867 M
- Depth   16              543.215 k     1.125 M     4.183 M     4.221 M     4.308 M
- Depth   64              594.584 k     1.195 M     4.094 M     4.256 M     4.272 M
- Depth  512              680.809 k     1.141 M     3.127 M     3.258 M     3.582 M

SubTask Fanout Awaited    286.650 k   519.578 k     2.275 M     2.264 M     2.269 M
- Depth    2              433.175 k   987.226 k     2.657 M     2.728 M     2.770 M
- Depth   16                1.334 M     2.519 M     3.494 M     3.568 M     3.471 M
- Depth   64                1.775 M     2.939 M     3.582 M     3.612 M     3.531 M
- Depth  512                2.047 M     3.082 M     3.586 M     3.587 M     3.652 M

Continuation Chain        200.000 k   302.937 k     2.219 M     2.300 M     2.310 M
- Depth    2              319.767 k   469.465 k     3.831 M     3.644 M     3.799 M
- Depth   16              664.492 k   938.928 k     8.722 M     8.736 M     8.672 M
- Depth   64              776.596 k     1.082 M    10.111 M    10.102 M    10.005 M
- Depth  512              828.501 k     1.198 M    10.506 M     9.430 M    10.258 M

Continuation Fanout       168.369 k   274.199 k     1.627 M     1.678 M     1.684 M
- Depth    2              218.214 k   414.465 k     2.544 M     2.562 M     2.635 M
- Depth   16                1.303 M     2.468 M     5.692 M     5.639 M     5.647 M
- Depth   64                1.964 M     4.629 M     6.445 M     6.442 M     6.390 M
- Depth  512                1.935 M     4.700 M     6.475 M     6.675 M     6.429 M

Yield Chain Awaited       725.388 k     1.236 M     5.097 M     5.176 M     6.193 M
- Depth    2              866.275 k     1.837 M     4.364 M     5.302 M     6.964 M
- Depth   16                1.329 M     2.312 M     6.025 M     7.358 M     7.419 M
- Depth   64                1.326 M     2.997 M     7.338 M     7.188 M     7.097 M
- Depth  512                1.362 M     3.057 M     6.597 M     6.878 M     5.991 M

Async Chain Awaited       545.851 k   803.757 k     4.422 M     4.455 M     4.509 M
- Depth    2                1.206 M     1.466 M     6.386 M     6.648 M     6.568 M
- Depth   16                2.247 M     3.206 M    10.515 M    10.773 M    10.900 M
- Depth   64                2.802 M     3.865 M    11.694 M    11.731 M    11.677 M
- Depth  512                2.476 M     4.723 M    10.794 M    10.734 M    11.617 M

Async Chain Return        553.119 k   783.503 k     4.738 M     4.763 M     4.675 M
- Depth    2                1.108 M     1.581 M     9.587 M     9.315 M     9.337 M
- Depth   16                8.767 M    13.599 M    75.642 M    73.879 M    68.638 M
- Depth   64               33.785 M    30.217 M   277.134 M   272.148 M   222.550 M
- Depth  512              296.630 M   399.269 M     1.627 B     1.423 B   720.493 M

Sync Chain Awaited         35.359 M    67.688 M   150.735 M   148.840 M   143.647 M
- Depth    2               36.056 M    71.934 M   147.102 M   146.794 M   144.095 M
- Depth   16               23.019 M    46.335 M   112.907 M   112.545 M   110.190 M
- Depth   64               21.374 M    42.189 M    99.402 M   104.070 M   102.108 M
- Depth  512               20.429 M    38.056 M    96.802 M    97.843 M    93.141 M

CachedTask Chain Await     32.043 M    57.911 M   129.447 M   129.120 M   124.923 M
- Depth    2               33.137 M    63.677 M   138.845 M   140.853 M   135.170 M
- Depth   16               22.579 M    45.287 M   110.994 M   108.911 M   109.477 M
- Depth   64               20.996 M    39.917 M    98.982 M   104.753 M   101.393 M
- Depth  512               20.121 M    40.112 M    94.603 M    97.553 M    93.199 M

CachedTask Chain Check    168.168 M   300.156 M   688.874 M   665.661 M   552.976 M
- Depth    2              185.312 M   314.978 M   772.899 M   739.997 M   596.921 M
- Depth   16              194.375 M   376.714 M   897.385 M   826.979 M   649.643 M
- Depth   64              100.109 M   198.033 M   568.704 M   579.068 M   475.726 M
- Depth  512               88.636 M   176.177 M   532.196 M   531.980 M   439.492 M

CachedTask Chain Return   193.780 M   338.050 M   657.233 M   703.384 M   575.496 M
- Depth    2              290.024 M   544.760 M     1.276 B     1.161 B   818.944 M
- Depth   16              790.853 M     1.332 B     3.002 B     2.955 B     1.274 B
- Depth   64                1.127 B     1.923 B     4.145 B     3.729 B     1.257 B
- Depth  512                1.578 B     2.171 B     4.599 B     3.895 B     1.240 B

QUWI Local Queues           4.528 M     4.902 M     6.816 M     8.679 M     9.410 M
- Depth    2                4.622 M     4.469 M     6.292 M     7.161 M     8.231 M
- Depth   16                7.240 M     7.095 M     7.017 M     7.162 M     7.121 M
- Depth   64                7.408 M     7.300 M     7.218 M     7.236 M     7.270 M
- Depth  512                7.451 M     7.390 M     7.252 M     7.244 M     7.201 M

Most significant changes are top left corners, which are single producer multiple consumer.

kouvel · 2016-08-09T01:12:28Z

src/vm/comthreadpool.cpp

@@ -266,15 +267,15 @@ FCIMPL0(FC_BOOL_RET, ThreadPoolNative::NotifyRequestComplete)
    {
        HELPER_METHOD_FRAME_BEGIN_RET_0();

+        if (needReset)


What's the purpose of this change?

It gave better throughput. I'm not 100% convinced by it, as I was moving things around that were showing up as hot, so it is an experimental change rather than one with a well thought out reason.

If I was to guess, it's because needReset is set before shouldAdjustWorkers, so its result is available earlier in the pipeline?

Could probably get the same effect by swapping bool needReset = and bool shouldAdjustWorkers = except shouldAdjustWorkers is memory fenced - so I'm not sure they'd swap.

I don't see any dependencies on the order, so it's probably fine. But moving pThread->InternalReset() up also moves it into the 'tal' lock which may or may not make any difference. I would prefer to keep this as before until this perf boost is understood better.

How significant is the increase in throughput?

Wasn't significant for throughput, just improved the CPI a little; will revert.

kouvel · 2016-08-09T01:26:37Z

Sorry for the delay on the review, and thanks for doing this, those are some nice improvements!

Is _WIN64 the correct ifdef to use for x64? (Or TARGET_64BIT or AMD64 etc)

_WIN64 is intended to cover all 64-bit platforms (currently amd64 and arm64), see CMakeLists.txt in the root.

It looks like the lock cmpxchg "write read" causes a pipeline stall. As it's then followed by a lock cmpxchg loop to set the value, it should be able to use a volatile read to get the initial value, as that will then be validated by the set.

I'm still wondering why there is a pipeline stall before the loop and not inside the loop, thoughts? What you have is still better, but just curious.

kouvel · 2016-08-09T01:49:51Z

src/vm/win32threadpool.h

+            // VolatileLoad x64 bit read is atomic
+            return DangerousGetDirtyCounts();
+#else // !_WIN64
+            // VolatileLoad may result in torn read
            LIMITED_METHOD_CONTRACT;


This contract should be at the top of the function for all architectures

benaadams · 2016-08-09T01:57:57Z

I'm still wondering why there is a pipeline stall before the loop and not inside the loop, thoughts?

There is if it loops (contested), but not if it doesn't. It's the two lock cmpxchg immediately after each other, where the second is read-dependent on the first.

So currently when not contested it still has a dependency as if it was contested:

lock cmpxchg (22 clocks) -> lock cmpxchg (22 clocks)

This change is to break the first dependency

mov (< 2 clocks) -> lock cmpxchg (22 clocks)

Obviously if it gets a dirty read or it is contested then it will hit the same dependency e.g.

mov (< 2 clocks) -> lock cmpxchg (22 clocks) -> lock cmpxchg (22 clocks)

but not in the common case.

I assume it's a historic behaviour: as it's a 64-bit field, it uses the 64-bit cmpxchg on x86 to achieve an atomic read, whereas on x64 mov is an atomic read so it isn't necessary.

Also, for the other places where I've changed GetCleanCounts->DangerousGetDirtyCounts, it's because they are immediately followed by a lock cmpxchg loop, so even on x86, if it got a torn read, the lock cmpxchg would reject it. It's a safe optimization that doesn't need the slower defensive read.

Interestingly the other/IOCP threadpool already does this and it uses the same data structure for the counts.

Use GetCleanCounts where result is used directly and DangerousGetDirtyCounts when used as part of compare exchange loop
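
To illustrate why a dirty or torn initial read is harmless in front of a compare-exchange loop, here is a small sketch. It is an assumption-labelled example rather than the thread pool code: AddFromGuess is a hypothetical helper and std::atomic stands in for the interlocked primitive.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Adds `delta` to `target`, starting from a caller-supplied (possibly wrong)
// guess of the current value. Returns the number of CAS attempts needed.
int AddFromGuess(std::atomic<uint64_t>& target, uint64_t guess, uint64_t delta) {
    int attempts = 0;
    uint64_t expected = guess;
    while (true) {
        ++attempts;
        uint64_t desired = expected + delta;
        // A wrong `expected` makes the CAS fail without modifying `target`;
        // the value actually found is written back into `expected`.
        if (target.compare_exchange_strong(expected, desired)) {
            return attempts;
        }
    }
}
```

With a single thread, a correct guess succeeds on the first attempt, while a bad guess (for example one whose upper half is stale, as a torn 32-bit read pair could produce) costs exactly one extra attempt and never stores a value derived from the bad guess.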

benaadams · 2016-08-09T20:53:05Z

Feedback incorporated. Still have a question on the memory barrier.

kouvel · 2016-08-09T22:08:31Z

lock cmpxchg (22 clocks) -> lock cmpxchg (22 clocks)
mov (< 2 clocks) -> lock cmpxchg (22 clocks)

I see, makes sense.

I assume it's a historic behaviour: as it's a 64-bit field, it uses the 64-bit cmpxchg on x86 to achieve an atomic read, whereas on x64 mov is an atomic read so it isn't necessary.

Makes sense as well, thanks

benaadams · 2016-08-09T23:27:23Z

@kouvel is the last commit what you meant?

kouvel · 2016-08-09T23:48:27Z

I meant like this:

        DWORD priorTime = PriorCompletedWorkRequestsTime; 
        MemoryBarrier(); // read fresh value for NextCompletedWorkRequestsTime below
        DWORD requiredInterval = NextCompletedWorkRequestsTime - priorTime;

And:

        PriorCompletedWorkRequests = totalNumCompletions;
        NextCompletedWorkRequestsTime = currentTicks + ThreadAdjustmentInterval;
        MemoryBarrier(); // flush previous writes (especially NextCompletedWorkRequestsTime)
        PriorCompletedWorkRequestsTime = currentTicks;
        CurrentSampleStartTime = endTime;
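
The same ordering can be sketched in portable C++, assuming std::atomic_thread_fence as a stand-in for MemoryBarrier() and relaxed atomics in place of the plain DWORD fields. SampleTimes, Publish and RequiredInterval are hypothetical names used only for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

struct SampleTimes {
    std::atomic<uint32_t> priorTime{0};  // plays PriorCompletedWorkRequestsTime
    std::atomic<uint32_t> nextTime{0};   // plays NextCompletedWorkRequestsTime
};

// Writer: store nextTime, fence, then store priorTime.
void Publish(SampleTimes& s, uint32_t currentTicks, uint32_t interval) {
    s.nextTime.store(currentTicks + interval, std::memory_order_relaxed);
    // Makes the nextTime store visible before the priorTime store.
    std::atomic_thread_fence(std::memory_order_seq_cst);
    s.priorTime.store(currentTicks, std::memory_order_relaxed);
}

// Reader: load priorTime, fence, then load nextTime.
uint32_t RequiredInterval(const SampleTimes& s) {
    uint32_t prior = s.priorTime.load(std::memory_order_relaxed);
    // Ensures nextTime is read after priorTime, so a reader that sees the
    // new priorTime also sees a nextTime at least as new.
    std::atomic_thread_fence(std::memory_order_seq_cst);
    uint32_t next = s.nextTime.load(std::memory_order_relaxed);
    return next - prior;
}
```

One fence per group replaces a fence per field: the writer's fence sits before its last store and the reader's fence sits after its first load, which is the pairing the two snippets above rely on.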

kouvel · 2016-08-10T00:16:47Z

LGTM, thanks!

benaadams · 2016-08-12T23:07:41Z

@dotnet-bot retest Linux ARM Emulator Cross Debug Build

benaadams · 2016-08-12T23:25:25Z

No space left on device :(

- MSVC seems to require alignment specification to be on the declaration as well as the definition - Ignore warning about padding parent struct due to __declspec(align()), as that is intentional Original change PR: dotnet#6516 [tfs-changeset: 1622589]

WorkerThreadStart volatile read+cmpxchg loop Commit migrated from dotnet/coreclr@a521502

- MSVC seems to require alignment specification to be on the declaration as well as the definition - Ignore warning about padding parent struct due to __declspec(align()), as that is intentional Original change PR: dotnet/coreclr#6516 [tfs-changeset: 1622589] Commit migrated from dotnet/coreclr@b6be3a0

dnfclas added the cla-already-signed label Jul 28, 2016

benaadams force-pushed the WorkerThreadStart branch 2 times, most recently from b41a828 to 7d72132 Compare July 29, 2016 01:26

benaadams force-pushed the WorkerThreadStart branch 2 times, most recently from 22850f1 to b8b6462 Compare July 29, 2016 13:18

benaadams force-pushed the WorkerThreadStart branch 2 times, most recently from 0a70101 to 7387b82 Compare July 29, 2016 17:25

benaadams force-pushed the WorkerThreadStart branch 2 times, most recently from 3f8f9dd to 9e5fbcd Compare July 31, 2016 13:10

benaadams reviewed Jul 31, 2016

benaadams force-pushed the WorkerThreadStart branch from 9e5fbcd to f96f4b4 Compare July 31, 2016 17:28

kouvel reviewed Aug 9, 2016

WorkerThreadStart volatile read+cmpxchg loop

bfef881

benaadams added 5 commits August 9, 2016 21:00

GetCleanCounts to Volatile read on x64

2de9a69

Use GetCleanCounts where result is used directly and DangerousGetDirtyCounts when used as part of compare exchange loop

Reduce false sharing in ManagedPerAppDomainTPCount

a033207

Adjust fences and add padding

b8220e4

Align to reduce false sharing

cd0d600

PR feedback

2e0332a

benaadams force-pushed the WorkerThreadStart branch from f96f4b4 to 2e0332a Compare August 9, 2016 20:52

Insert MemoryBarrier revert Comthreadpool

519dcde

Fix MemoryBarrier

a0597da

benaadams mentioned this pull request Aug 12, 2016

[Wip] Improve Threadpool QUWI throughput #5943

Closed

kouvel merged commit a521502 into dotnet:master Aug 15, 2016

benaadams deleted the WorkerThreadStart branch August 16, 2016 02:45

picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022

Merge pull request dotnet/coreclr#6516 from benaadams/WorkerThreadStart

c066292

WorkerThreadStart volatile read+cmpxchg loop Commit migrated from dotnet/coreclr@a521502
