Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sync: add a Map to replace RWLock+map usage #18177

Closed
bcmills opened this issue Dec 3, 2016 · 49 comments
Closed

sync: add a Map to replace RWLock+map usage #18177

bcmills opened this issue Dec 3, 2016 · 49 comments

Comments

@bcmills
Copy link
Contributor

bcmills commented Dec 3, 2016

Per #17973, RWMutex has some scaling issues. One option is to fix RWMutex, but for maps with high read:write ratios we can often do better with a carefully-managed atomic.Value anyway. The standard library contains many such maps, especially in the reflect package.

The actual details of such a map are a bit subtle to implement correctly and efficiently. We should provide a Map implementation in the standard library. Since the implementation may need to use both sync and sync/atomic and the former depends on the latter, it should either go in sync or in a new, separate sync subpackage.

(Presumably this should be targeted to 1.9 due to the 1.8 freeze, but I'd like to add a draft API to x/sync in the meantime.)

@bradfitz bradfitz added this to the Proposal milestone Dec 3, 2016
@bradfitz
Copy link
Contributor

bradfitz commented Dec 3, 2016

The standard answer (https://golang.org/doc/faq#x_in_std) is always to prototype it elsewhere first.

So x/sync sounds good to start.

@rsc
Copy link
Contributor

rsc commented Dec 5, 2016

Yes, having such a thing would be nice. There are no details here but I think everyone agrees this comes up repeatedly and would be well served by a library in a standard place. When your code is ready please send a CL.

@rsc rsc modified the milestones: Go1.9Early, Proposal Dec 5, 2016
@rsc rsc changed the title proposal: sync: add a Map to replace RWLock+map usage in the standard library sync: add a Map to replace RWLock+map usage Dec 5, 2016
@josharian
Copy link
Contributor

https://go-review.googlesource.com/c/33912/ has a prototype implementation and API discussion.

Here are some concrete use cases I know of, for use in API discussion:

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/33912 mentions this issue.

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/36717 mentions this issue.

gopherbot pushed a commit that referenced this issue Feb 10, 2017
The other expvar tests are already parallelized, and this will help to
measure the impact of potential implementations for #18177.

updates #18177

Change-Id: I0f4f1a16a0285556cbcc8339855b6459af412675
Reviewed-on: https://go-review.googlesource.com/36717
Reviewed-by: Russ Cox <rsc@golang.org>
@gopherbot
Copy link
Contributor

CL https://golang.org/cl/36719 mentions this issue.

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/36722 mentions this issue.

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/36723 mentions this issue.

gopherbot pushed a commit that referenced this issue Feb 10, 2017
bradfitz noted in change 36717 that the new behavior was no longer
comparable with the old.  This change restores comparable behavior
for -cpu=1.

BenchmarkMapAddSame                 909           909           +0.00%
BenchmarkMapAddSame-6               1309          262           -79.98%
BenchmarkMapAddDifferent            2856          3030          +6.09%
BenchmarkMapAddDifferent-6          3803          581           -84.72%

updates #18177

Change-Id: Ifaff5a1f48be92002d86c296220313b7efdc81d6
Reviewed-on: https://go-review.googlesource.com/36723
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
@gopherbot
Copy link
Contributor

CL https://golang.org/cl/36724 mentions this issue.

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/36810 mentions this issue.

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/36831 mentions this issue.

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/36959 mentions this issue.

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/36964 mentions this issue.

gopherbot pushed a commit that referenced this issue Feb 14, 2017
Add a benchmark for setting a String value, which we may
want to treat differently from Int or Float due to the need to support
Add methods for the latter.

Update tests to use only the exported API instead of making (fragile)
assumptions about unexported fields.

The existing Map benchmarks construct a new Map for each iteration, which
focuses the benchmark results on the initial allocation costs for the
Map and its entries. This change adds variants of the benchmarks which
use a long-lived map in order to measure steady-state performance for
Map updates on existing keys.

Updates #18177

Change-Id: I62c920991d17d5898c592446af382cd5c04c528a
Reviewed-on: https://go-review.googlesource.com/36959
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@gopherbot
Copy link
Contributor

CL https://golang.org/cl/36617 mentions this issue.

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/41991 mentions this issue.

gopherbot pushed a commit that referenced this issue Apr 28, 2017
This simplifies the code a bit and provides a modest speedup for
Marshal with many CPUs.

updates #17973
updates #18177

name          old time/op    new time/op    delta
Marshal         15.8µs ± 1%    15.9µs ± 1%   +0.67%  (p=0.021 n=8+7)
Marshal-6       5.76µs ±11%    5.17µs ± 2%  -10.36%  (p=0.002 n=8+8)
Marshal-48      9.88µs ± 5%    7.31µs ± 6%  -26.04%  (p=0.000 n=8+8)
Unmarshal       44.7µs ± 3%    45.1µs ± 5%     ~     (p=0.645 n=8+8)
Unmarshal-6     12.1µs ± 7%    11.8µs ± 8%     ~     (p=0.442 n=8+8)
Unmarshal-48    18.7µs ± 3%    18.2µs ± 4%     ~     (p=0.054 n=7+8)

name          old alloc/op   new alloc/op   delta
Marshal         5.78kB ± 0%    5.78kB ± 0%     ~     (all equal)
Marshal-6       5.78kB ± 0%    5.78kB ± 0%     ~     (all equal)
Marshal-48      5.78kB ± 0%    5.78kB ± 0%     ~     (all equal)
Unmarshal       8.58kB ± 0%    8.58kB ± 0%     ~     (all equal)
Unmarshal-6     8.58kB ± 0%    8.58kB ± 0%     ~     (all equal)
Unmarshal-48    8.58kB ± 0%    8.58kB ± 0%     ~     (p=1.000 n=8+8)

name          old allocs/op  new allocs/op  delta
Marshal           23.0 ± 0%      23.0 ± 0%     ~     (all equal)
Marshal-6         23.0 ± 0%      23.0 ± 0%     ~     (all equal)
Marshal-48        23.0 ± 0%      23.0 ± 0%     ~     (all equal)
Unmarshal          189 ± 0%       189 ± 0%     ~     (all equal)
Unmarshal-6        189 ± 0%       189 ± 0%     ~     (all equal)
Unmarshal-48       189 ± 0%       189 ± 0%     ~     (all equal)

https://perf.golang.org/search?q=upload:20170427.5

Change-Id: I4ee95a99540d3e4e47e056fff18357efd2cd340a
Reviewed-on: https://go-review.googlesource.com/41991
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@gopherbot
Copy link
Contributor

CL https://golang.org/cl/42110 mentions this issue.

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/42112 mentions this issue.

gopherbot pushed a commit that referenced this issue Apr 28, 2017
This provides a moderate speedup for encoding when using many CPU cores.

name                    old time/op    new time/op    delta
CodeEncoder               14.1ms ±10%    13.5ms ± 4%      ~     (p=0.867 n=8+7)
CodeEncoder-6             2.58ms ± 8%    2.72ms ± 6%      ~     (p=0.065 n=8+8)
CodeEncoder-48             629µs ± 1%     629µs ± 1%      ~     (p=0.867 n=8+7)
CodeMarshal               14.9ms ± 5%    14.9ms ± 5%      ~     (p=0.721 n=8+8)
CodeMarshal-6             3.28ms ±11%    3.24ms ±12%      ~     (p=0.798 n=8+8)
CodeMarshal-48             739µs ± 1%     745µs ± 2%      ~     (p=0.328 n=8+8)
CodeDecoder               49.7ms ± 4%    49.2ms ± 4%      ~     (p=0.463 n=7+8)
CodeDecoder-6             10.1ms ± 8%    10.4ms ± 3%      ~     (p=0.232 n=7+8)
CodeDecoder-48            2.60ms ± 3%    2.61ms ± 2%      ~     (p=1.000 n=8+8)
DecoderStream              352ns ± 5%     344ns ± 4%      ~     (p=0.077 n=8+8)
DecoderStream-6            485ns ± 8%     503ns ± 6%      ~     (p=0.123 n=8+8)
DecoderStream-48           522ns ± 7%     520ns ± 5%      ~     (p=0.959 n=8+8)
CodeUnmarshal             52.2ms ± 5%    54.4ms ±18%      ~     (p=0.955 n=7+8)
CodeUnmarshal-6           12.4ms ± 6%    12.3ms ± 6%      ~     (p=0.878 n=8+8)
CodeUnmarshal-48          3.46ms ± 7%    3.40ms ± 9%      ~     (p=0.442 n=8+8)
CodeUnmarshalReuse        48.9ms ± 6%    50.3ms ± 7%      ~     (p=0.279 n=8+8)
CodeUnmarshalReuse-6      10.3ms ±11%    10.3ms ±10%      ~     (p=0.959 n=8+8)
CodeUnmarshalReuse-48     2.68ms ± 3%    2.67ms ± 4%      ~     (p=0.878 n=8+8)
UnmarshalString            476ns ± 7%     474ns ± 7%      ~     (p=0.644 n=8+8)
UnmarshalString-6          164ns ± 9%     160ns ±10%      ~     (p=0.556 n=8+8)
UnmarshalString-48         181ns ± 0%     177ns ± 2%    -2.36%  (p=0.001 n=7+7)
UnmarshalFloat64           414ns ± 4%     418ns ± 4%      ~     (p=0.382 n=8+8)
UnmarshalFloat64-6         147ns ± 9%     143ns ±16%      ~     (p=0.457 n=8+8)
UnmarshalFloat64-48        176ns ± 2%     174ns ± 2%      ~     (p=0.118 n=8+8)
UnmarshalInt64             369ns ± 4%     354ns ± 1%    -3.85%  (p=0.005 n=8+7)
UnmarshalInt64-6           132ns ±11%     132ns ±10%      ~     (p=0.982 n=8+8)
UnmarshalInt64-48          177ns ± 3%     174ns ± 2%    -1.84%  (p=0.028 n=8+7)
Issue10335                 540ns ± 5%     535ns ± 0%      ~     (p=0.330 n=7+7)
Issue10335-6               159ns ± 8%     164ns ± 8%      ~     (p=0.246 n=8+8)
Issue10335-48              186ns ± 1%     182ns ± 2%    -1.89%  (p=0.010 n=8+8)
Unmapped                  1.74µs ± 2%    1.76µs ± 6%      ~     (p=0.181 n=6+8)
Unmapped-6                 414ns ± 5%     402ns ±10%      ~     (p=0.244 n=7+8)
Unmapped-48                226ns ± 2%     224ns ± 2%      ~     (p=0.144 n=7+8)
NumberIsValid             20.1ns ± 4%    19.7ns ± 3%      ~     (p=0.204 n=8+8)
NumberIsValid-6           20.4ns ± 8%    22.2ns ±16%      ~     (p=0.129 n=7+8)
NumberIsValid-48          23.1ns ±12%    23.8ns ± 8%      ~     (p=0.104 n=8+8)
NumberIsValidRegexp        629ns ± 5%     622ns ± 0%      ~     (p=0.148 n=7+7)
NumberIsValidRegexp-6      757ns ± 2%     725ns ±14%      ~     (p=0.351 n=8+7)
NumberIsValidRegexp-48     757ns ± 2%     723ns ±13%      ~     (p=0.521 n=8+8)
SkipValue                 13.2ms ± 9%    13.3ms ± 1%      ~     (p=0.130 n=8+8)
SkipValue-6               15.1ms ±10%    14.8ms ± 2%      ~     (p=0.397 n=7+8)
SkipValue-48              13.9ms ±12%    14.3ms ± 1%      ~     (p=0.694 n=8+7)
EncoderEncode              433ns ± 4%     410ns ± 3%    -5.48%  (p=0.001 n=8+8)
EncoderEncode-6            221ns ±15%      75ns ± 5%   -66.15%  (p=0.000 n=7+8)
EncoderEncode-48           161ns ± 4%      19ns ± 7%   -88.29%  (p=0.000 n=7+8)

name                    old speed      new speed      delta
CodeEncoder              139MB/s ±10%   144MB/s ± 4%      ~     (p=0.844 n=8+7)
CodeEncoder-6            756MB/s ± 8%   714MB/s ± 6%      ~     (p=0.065 n=8+8)
CodeEncoder-48          3.08GB/s ± 1%  3.09GB/s ± 1%      ~     (p=0.867 n=8+7)
CodeMarshal              130MB/s ± 5%   130MB/s ± 5%      ~     (p=0.721 n=8+8)
CodeMarshal-6            594MB/s ±10%   601MB/s ±11%      ~     (p=0.798 n=8+8)
CodeMarshal-48          2.62GB/s ± 1%  2.60GB/s ± 2%      ~     (p=0.328 n=8+8)
CodeDecoder             39.0MB/s ± 4%  39.5MB/s ± 4%      ~     (p=0.463 n=7+8)
CodeDecoder-6            189MB/s ±13%   187MB/s ± 3%      ~     (p=0.505 n=8+8)
CodeDecoder-48           746MB/s ± 2%   745MB/s ± 2%      ~     (p=1.000 n=8+8)
CodeUnmarshal           37.2MB/s ± 5%  35.9MB/s ±16%      ~     (p=0.955 n=7+8)
CodeUnmarshal-6          157MB/s ± 6%   158MB/s ± 6%      ~     (p=0.878 n=8+8)
CodeUnmarshal-48         561MB/s ± 7%   572MB/s ±10%      ~     (p=0.442 n=8+8)
SkipValue                141MB/s ±10%   139MB/s ± 1%      ~     (p=0.130 n=8+8)
SkipValue-6              131MB/s ± 3%   133MB/s ± 2%      ~     (p=0.662 n=6+8)
SkipValue-48             138MB/s ±11%   132MB/s ± 1%      ~     (p=0.281 n=8+7)

name                    old alloc/op   new alloc/op   delta
CodeEncoder               45.9kB ± 0%    45.9kB ± 0%    -0.02%  (p=0.002 n=7+8)
CodeEncoder-6             55.1kB ± 0%    55.1kB ± 0%    -0.01%  (p=0.002 n=7+8)
CodeEncoder-48             110kB ± 0%     110kB ± 0%    -0.00%  (p=0.030 n=7+8)
CodeMarshal               4.59MB ± 0%    4.59MB ± 0%    -0.00%  (p=0.000 n=8+8)
CodeMarshal-6             4.59MB ± 0%    4.59MB ± 0%    -0.00%  (p=0.000 n=8+8)
CodeMarshal-48            4.59MB ± 0%    4.59MB ± 0%    -0.00%  (p=0.001 n=7+8)
CodeDecoder               2.28MB ± 5%    2.21MB ± 0%      ~     (p=0.257 n=8+7)
CodeDecoder-6             2.43MB ±11%    2.51MB ± 0%      ~     (p=0.473 n=8+8)
CodeDecoder-48            2.93MB ± 0%    2.93MB ± 0%      ~     (p=0.554 n=7+8)
DecoderStream              16.0B ± 0%     16.0B ± 0%      ~     (all equal)
DecoderStream-6            16.0B ± 0%     16.0B ± 0%      ~     (all equal)
DecoderStream-48           16.0B ± 0%     16.0B ± 0%      ~     (all equal)
CodeUnmarshal             3.28MB ± 0%    3.28MB ± 0%      ~     (p=1.000 n=7+7)
CodeUnmarshal-6           3.28MB ± 0%    3.28MB ± 0%      ~     (p=0.593 n=8+8)
CodeUnmarshal-48          3.28MB ± 0%    3.28MB ± 0%      ~     (p=0.670 n=8+8)
CodeUnmarshalReuse        1.87MB ± 0%    1.88MB ± 1%    +0.48%  (p=0.011 n=7+8)
CodeUnmarshalReuse-6      1.90MB ± 1%    1.90MB ± 1%      ~     (p=0.589 n=8+8)
CodeUnmarshalReuse-48     1.96MB ± 0%    1.96MB ± 0%    +0.00%  (p=0.002 n=7+8)
UnmarshalString             304B ± 0%      304B ± 0%      ~     (all equal)
UnmarshalString-6           304B ± 0%      304B ± 0%      ~     (all equal)
UnmarshalString-48          304B ± 0%      304B ± 0%      ~     (all equal)
UnmarshalFloat64            292B ± 0%      292B ± 0%      ~     (all equal)
UnmarshalFloat64-6          292B ± 0%      292B ± 0%      ~     (all equal)
UnmarshalFloat64-48         292B ± 0%      292B ± 0%      ~     (all equal)
UnmarshalInt64              289B ± 0%      289B ± 0%      ~     (all equal)
UnmarshalInt64-6            289B ± 0%      289B ± 0%      ~     (all equal)
UnmarshalInt64-48           289B ± 0%      289B ± 0%      ~     (all equal)
Issue10335                  312B ± 0%      312B ± 0%      ~     (all equal)
Issue10335-6                312B ± 0%      312B ± 0%      ~     (all equal)
Issue10335-48               312B ± 0%      312B ± 0%      ~     (all equal)
Unmapped                    344B ± 0%      344B ± 0%      ~     (all equal)
Unmapped-6                  344B ± 0%      344B ± 0%      ~     (all equal)
Unmapped-48                 344B ± 0%      344B ± 0%      ~     (all equal)
NumberIsValid              0.00B          0.00B           ~     (all equal)
NumberIsValid-6            0.00B          0.00B           ~     (all equal)
NumberIsValid-48           0.00B          0.00B           ~     (all equal)
NumberIsValidRegexp        0.00B          0.00B           ~     (all equal)
NumberIsValidRegexp-6      0.00B          0.00B           ~     (all equal)
NumberIsValidRegexp-48     0.00B          0.00B           ~     (all equal)
SkipValue                  0.00B          0.00B           ~     (all equal)
SkipValue-6                0.00B          0.00B           ~     (all equal)
SkipValue-48              15.0B ±167%      0.0B           ~     (p=0.200 n=8+8)
EncoderEncode              8.00B ± 0%     0.00B       -100.00%  (p=0.000 n=8+8)
EncoderEncode-6            8.00B ± 0%     0.00B       -100.00%  (p=0.000 n=8+8)
EncoderEncode-48           8.00B ± 0%     0.00B       -100.00%  (p=0.000 n=8+8)

name                    old allocs/op  new allocs/op  delta
CodeEncoder                 1.00 ± 0%      0.00       -100.00%  (p=0.000 n=8+8)
CodeEncoder-6               1.00 ± 0%      0.00       -100.00%  (p=0.000 n=8+8)
CodeEncoder-48              1.00 ± 0%      0.00       -100.00%  (p=0.000 n=8+8)
CodeMarshal                 17.0 ± 0%      16.0 ± 0%    -5.88%  (p=0.000 n=8+8)
CodeMarshal-6               17.0 ± 0%      16.0 ± 0%    -5.88%  (p=0.000 n=8+8)
CodeMarshal-48              17.0 ± 0%      16.0 ± 0%    -5.88%  (p=0.000 n=8+8)
CodeDecoder                89.6k ± 0%     89.5k ± 0%      ~     (p=0.154 n=8+7)
CodeDecoder-6              89.8k ± 0%     89.9k ± 0%      ~     (p=0.467 n=8+8)
CodeDecoder-48             90.5k ± 0%     90.5k ± 0%      ~     (p=0.533 n=8+7)
DecoderStream               2.00 ± 0%      2.00 ± 0%      ~     (all equal)
DecoderStream-6             2.00 ± 0%      2.00 ± 0%      ~     (all equal)
DecoderStream-48            2.00 ± 0%      2.00 ± 0%      ~     (all equal)
CodeUnmarshal               105k ± 0%      105k ± 0%      ~     (all equal)
CodeUnmarshal-6             105k ± 0%      105k ± 0%      ~     (all equal)
CodeUnmarshal-48            105k ± 0%      105k ± 0%      ~     (all equal)
CodeUnmarshalReuse         89.5k ± 0%     89.6k ± 0%      ~     (p=0.246 n=7+8)
CodeUnmarshalReuse-6       89.8k ± 0%     89.8k ± 0%      ~     (p=1.000 n=8+8)
CodeUnmarshalReuse-48      90.5k ± 0%     90.5k ± 0%      ~     (all equal)
UnmarshalString             2.00 ± 0%      2.00 ± 0%      ~     (all equal)
UnmarshalString-6           2.00 ± 0%      2.00 ± 0%      ~     (all equal)
UnmarshalString-48          2.00 ± 0%      2.00 ± 0%      ~     (all equal)
UnmarshalFloat64            2.00 ± 0%      2.00 ± 0%      ~     (all equal)
UnmarshalFloat64-6          2.00 ± 0%      2.00 ± 0%      ~     (all equal)
UnmarshalFloat64-48         2.00 ± 0%      2.00 ± 0%      ~     (all equal)
UnmarshalInt64              2.00 ± 0%      2.00 ± 0%      ~     (all equal)
UnmarshalInt64-6            2.00 ± 0%      2.00 ± 0%      ~     (all equal)
UnmarshalInt64-48           2.00 ± 0%      2.00 ± 0%      ~     (all equal)
Issue10335                  3.00 ± 0%      3.00 ± 0%      ~     (all equal)
Issue10335-6                3.00 ± 0%      3.00 ± 0%      ~     (all equal)
Issue10335-48               3.00 ± 0%      3.00 ± 0%      ~     (all equal)
Unmapped                    4.00 ± 0%      4.00 ± 0%      ~     (all equal)
Unmapped-6                  4.00 ± 0%      4.00 ± 0%      ~     (all equal)
Unmapped-48                 4.00 ± 0%      4.00 ± 0%      ~     (all equal)
NumberIsValid               0.00           0.00           ~     (all equal)
NumberIsValid-6             0.00           0.00           ~     (all equal)
NumberIsValid-48            0.00           0.00           ~     (all equal)
NumberIsValidRegexp         0.00           0.00           ~     (all equal)
NumberIsValidRegexp-6       0.00           0.00           ~     (all equal)
NumberIsValidRegexp-48      0.00           0.00           ~     (all equal)
SkipValue                   0.00           0.00           ~     (all equal)
SkipValue-6                 0.00           0.00           ~     (all equal)
SkipValue-48                0.00           0.00           ~     (all equal)
EncoderEncode               1.00 ± 0%      0.00       -100.00%  (p=0.000 n=8+8)
EncoderEncode-6             1.00 ± 0%      0.00       -100.00%  (p=0.000 n=8+8)
EncoderEncode-48            1.00 ± 0%      0.00       -100.00%  (p=0.000 n=8+8)

https://perf.golang.org/search?q=upload:20170427.2

updates #17973
updates #18177

Change-Id: I5881c7a2bfad1766e6aa3444bb630883e0be467b
Reviewed-on: https://go-review.googlesource.com/41931
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
gopherbot pushed a commit that referenced this issue Apr 28, 2017
This has no measurable impact on performance, but somewhat simplifies
the code.

updates #18177

name                  old time/op    new time/op    delta
EndToEnd                54.3µs ±10%    55.7µs ±12%    ~     (p=0.505 n=8+8)
EndToEnd-6              31.4µs ± 9%    32.7µs ± 6%    ~     (p=0.130 n=8+8)
EndToEnd-48             25.5µs ±12%    26.4µs ± 6%    ~     (p=0.195 n=8+8)
EndToEndHTTP            53.7µs ± 8%    51.2µs ±15%    ~     (p=0.463 n=7+8)
EndToEndHTTP-6          30.9µs ±18%    31.2µs ±14%    ~     (p=0.959 n=8+8)
EndToEndHTTP-48         24.9µs ±11%    25.7µs ± 6%    ~     (p=0.382 n=8+8)
EndToEndAsync           23.6µs ± 7%    24.2µs ± 6%    ~     (p=0.383 n=7+7)
EndToEndAsync-6         21.0µs ±23%    22.0µs ±20%    ~     (p=0.574 n=8+8)
EndToEndAsync-48        22.8µs ±16%    23.3µs ±13%    ~     (p=0.721 n=8+8)
EndToEndAsyncHTTP       25.8µs ± 7%    24.7µs ±14%    ~     (p=0.161 n=8+8)
EndToEndAsyncHTTP-6     22.1µs ±19%    22.6µs ±12%    ~     (p=0.645 n=8+8)
EndToEndAsyncHTTP-48    22.9µs ±13%    22.1µs ±20%    ~     (p=0.574 n=8+8)

name                  old alloc/op   new alloc/op   delta
EndToEnd                  320B ± 0%      321B ± 0%    ~     (p=1.000 n=8+8)
EndToEnd-6                320B ± 0%      321B ± 0%  +0.20%  (p=0.037 n=8+7)
EndToEnd-48               326B ± 0%      326B ± 0%    ~     (p=0.124 n=8+8)
EndToEndHTTP              320B ± 0%      320B ± 0%    ~     (all equal)
EndToEndHTTP-6            320B ± 0%      321B ± 0%    ~     (p=0.077 n=8+8)
EndToEndHTTP-48           324B ± 0%      324B ± 0%    ~     (p=1.000 n=8+8)
EndToEndAsync             227B ± 0%      227B ± 0%    ~     (p=0.154 n=8+7)
EndToEndAsync-6           226B ± 0%      226B ± 0%    ~     (all equal)
EndToEndAsync-48          230B ± 1%      229B ± 1%    ~     (p=0.072 n=8+8)
EndToEndAsyncHTTP         227B ± 0%      227B ± 0%    ~     (all equal)
EndToEndAsyncHTTP-6       226B ± 0%      226B ± 0%    ~     (p=0.400 n=8+7)
EndToEndAsyncHTTP-48      228B ± 0%      228B ± 0%    ~     (p=0.949 n=8+6)

name                  old allocs/op  new allocs/op  delta
EndToEnd                  9.00 ± 0%      9.00 ± 0%    ~     (all equal)
EndToEnd-6                9.00 ± 0%      9.00 ± 0%    ~     (all equal)
EndToEnd-48               9.00 ± 0%      9.00 ± 0%    ~     (all equal)
EndToEndHTTP              9.00 ± 0%      9.00 ± 0%    ~     (all equal)
EndToEndHTTP-6            9.00 ± 0%      9.00 ± 0%    ~     (all equal)
EndToEndHTTP-48           9.00 ± 0%      9.00 ± 0%    ~     (all equal)
EndToEndAsync             8.00 ± 0%      8.00 ± 0%    ~     (all equal)
EndToEndAsync-6           8.00 ± 0%      8.00 ± 0%    ~     (all equal)
EndToEndAsync-48          8.00 ± 0%      8.00 ± 0%    ~     (all equal)
EndToEndAsyncHTTP         8.00 ± 0%      8.00 ± 0%    ~     (all equal)
EndToEndAsyncHTTP-6       8.00 ± 0%      8.00 ± 0%    ~     (all equal)
EndToEndAsyncHTTP-48      8.00 ± 0%      8.00 ± 0%    ~     (all equal)

https://perf.golang.org/search?q=upload:20170428.2

Change-Id: I8ef7f71a7602302aa78c144327270dfce9211539
Reviewed-on: https://go-review.googlesource.com/42112
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
gopherbot pushed a commit that referenced this issue Apr 28, 2017
This provides a significant speedup for TypeByExtension and
ExtensionsByType when using many CPU cores.

updates #17973
updates #18177

name                                          old time/op    new time/op    delta
QEncodeWord                                      526ns ± 3%     525ns ± 3%     ~     (p=0.990 n=15+28)
QEncodeWord-6                                    945ns ± 7%     913ns ±20%     ~     (p=0.220 n=14+28)
QEncodeWord-48                                  1.02µs ± 2%    1.00µs ± 6%   -2.22%  (p=0.036 n=13+27)
QDecodeWord                                      311ns ±18%     323ns ±20%     ~     (p=0.107 n=16+28)
QDecodeWord-6                                    595ns ±12%     612ns ±11%     ~     (p=0.093 n=15+27)
QDecodeWord-48                                   592ns ± 6%     606ns ± 8%   +2.39%  (p=0.045 n=16+26)
QDecodeHeader                                    389ns ± 4%     394ns ± 8%     ~     (p=0.161 n=12+26)
QDecodeHeader-6                                  685ns ±12%     674ns ±20%     ~     (p=0.773 n=14+27)
QDecodeHeader-48                                 658ns ±13%     669ns ±14%     ~     (p=0.457 n=16+28)
TypeByExtension/.html                           77.4ns ±15%    55.5ns ±13%  -28.35%  (p=0.000 n=8+8)
TypeByExtension/.html-6                          263ns ± 9%      10ns ±21%  -96.29%  (p=0.000 n=8+8)
TypeByExtension/.html-48                         175ns ± 5%       2ns ±16%  -98.88%  (p=0.000 n=8+8)
TypeByExtension/.HTML                            113ns ± 6%      97ns ± 6%  -14.37%  (p=0.000 n=8+8)
TypeByExtension/.HTML-6                          273ns ± 7%      17ns ± 4%  -93.93%  (p=0.000 n=7+8)
TypeByExtension/.HTML-48                         175ns ± 4%       4ns ± 4%  -97.73%  (p=0.000 n=8+8)
TypeByExtension/.unused                          116ns ± 4%      90ns ± 4%  -22.89%  (p=0.001 n=7+7)
TypeByExtension/.unused-6                        262ns ± 5%      15ns ± 4%  -94.17%  (p=0.000 n=8+8)
TypeByExtension/.unused-48                       176ns ± 4%       3ns ±10%  -98.10%  (p=0.000 n=8+8)
ExtensionsByType/text/html                       630ns ± 5%     522ns ± 5%  -17.19%  (p=0.000 n=8+7)
ExtensionsByType/text/html-6                     314ns ±20%     136ns ± 6%  -56.80%  (p=0.000 n=8+8)
ExtensionsByType/text/html-48                    298ns ± 4%     104ns ± 6%  -65.06%  (p=0.000 n=8+8)
ExtensionsByType/text/html;_charset=utf-8       1.12µs ± 3%    1.05µs ± 7%   -6.19%  (p=0.004 n=8+7)
ExtensionsByType/text/html;_charset=utf-8-6      402ns ±11%     307ns ± 4%  -23.77%  (p=0.000 n=8+8)
ExtensionsByType/text/html;_charset=utf-8-48     422ns ± 3%     309ns ± 4%  -26.86%  (p=0.000 n=8+8)
ExtensionsByType/application/octet-stream        810ns ± 2%     747ns ± 5%   -7.74%  (p=0.000 n=8+8)
ExtensionsByType/application/octet-stream-6      289ns ± 9%     185ns ± 8%  -36.15%  (p=0.000 n=7+8)
ExtensionsByType/application/octet-stream-48     267ns ± 6%      94ns ± 2%  -64.91%  (p=0.000 n=8+7)

name                                          old alloc/op   new alloc/op   delta
QEncodeWord                                      48.0B ± 0%     48.0B ± 0%     ~     (all equal)
QEncodeWord-6                                    48.0B ± 0%     48.0B ± 0%     ~     (all equal)
QEncodeWord-48                                   48.0B ± 0%     48.0B ± 0%     ~     (all equal)
QDecodeWord                                      48.0B ± 0%     48.0B ± 0%     ~     (all equal)
QDecodeWord-6                                    48.0B ± 0%     48.0B ± 0%     ~     (all equal)
QDecodeWord-48                                   48.0B ± 0%     48.0B ± 0%     ~     (all equal)
QDecodeHeader                                    48.0B ± 0%     48.0B ± 0%     ~     (all equal)
QDecodeHeader-6                                  48.0B ± 0%     48.0B ± 0%     ~     (all equal)
QDecodeHeader-48                                 48.0B ± 0%     48.0B ± 0%     ~     (all equal)
TypeByExtension/.html                            0.00B          0.00B          ~     (all equal)
TypeByExtension/.html-6                          0.00B          0.00B          ~     (all equal)
TypeByExtension/.html-48                         0.00B          0.00B          ~     (all equal)
TypeByExtension/.HTML                            0.00B          0.00B          ~     (all equal)
TypeByExtension/.HTML-6                          0.00B          0.00B          ~     (all equal)
TypeByExtension/.HTML-48                         0.00B          0.00B          ~     (all equal)
TypeByExtension/.unused                          0.00B          0.00B          ~     (all equal)
TypeByExtension/.unused-6                        0.00B          0.00B          ~     (all equal)
TypeByExtension/.unused-48                       0.00B          0.00B          ~     (all equal)
ExtensionsByType/text/html                        192B ± 0%      176B ± 0%   -8.33%  (p=0.000 n=8+8)
ExtensionsByType/text/html-6                      192B ± 0%      176B ± 0%   -8.33%  (p=0.000 n=8+8)
ExtensionsByType/text/html-48                     192B ± 0%      176B ± 0%   -8.33%  (p=0.000 n=8+8)
ExtensionsByType/text/html;_charset=utf-8         480B ± 0%      464B ± 0%   -3.33%  (p=0.000 n=8+8)
ExtensionsByType/text/html;_charset=utf-8-6       480B ± 0%      464B ± 0%   -3.33%  (p=0.000 n=8+8)
ExtensionsByType/text/html;_charset=utf-8-48      480B ± 0%      464B ± 0%   -3.33%  (p=0.000 n=8+8)
ExtensionsByType/application/octet-stream         160B ± 0%      160B ± 0%     ~     (all equal)
ExtensionsByType/application/octet-stream-6       160B ± 0%      160B ± 0%     ~     (all equal)
ExtensionsByType/application/octet-stream-48      160B ± 0%      160B ± 0%     ~     (all equal)

name                                          old allocs/op  new allocs/op  delta
QEncodeWord                                       1.00 ± 0%      1.00 ± 0%     ~     (all equal)
QEncodeWord-6                                     1.00 ± 0%      1.00 ± 0%     ~     (all equal)
QEncodeWord-48                                    1.00 ± 0%      1.00 ± 0%     ~     (all equal)
QDecodeWord                                       2.00 ± 0%      2.00 ± 0%     ~     (all equal)
QDecodeWord-6                                     2.00 ± 0%      2.00 ± 0%     ~     (all equal)
QDecodeWord-48                                    2.00 ± 0%      2.00 ± 0%     ~     (all equal)
QDecodeHeader                                     2.00 ± 0%      2.00 ± 0%     ~     (all equal)
QDecodeHeader-6                                   2.00 ± 0%      2.00 ± 0%     ~     (all equal)
QDecodeHeader-48                                  2.00 ± 0%      2.00 ± 0%     ~     (all equal)
TypeByExtension/.html                             0.00           0.00          ~     (all equal)
TypeByExtension/.html-6                           0.00           0.00          ~     (all equal)
TypeByExtension/.html-48                          0.00           0.00          ~     (all equal)
TypeByExtension/.HTML                             0.00           0.00          ~     (all equal)
TypeByExtension/.HTML-6                           0.00           0.00          ~     (all equal)
TypeByExtension/.HTML-48                          0.00           0.00          ~     (all equal)
TypeByExtension/.unused                           0.00           0.00          ~     (all equal)
TypeByExtension/.unused-6                         0.00           0.00          ~     (all equal)
TypeByExtension/.unused-48                        0.00           0.00          ~     (all equal)
ExtensionsByType/text/html                        3.00 ± 0%      3.00 ± 0%     ~     (all equal)
ExtensionsByType/text/html-6                      3.00 ± 0%      3.00 ± 0%     ~     (all equal)
ExtensionsByType/text/html-48                     3.00 ± 0%      3.00 ± 0%     ~     (all equal)
ExtensionsByType/text/html;_charset=utf-8         4.00 ± 0%      4.00 ± 0%     ~     (all equal)
ExtensionsByType/text/html;_charset=utf-8-6       4.00 ± 0%      4.00 ± 0%     ~     (all equal)
ExtensionsByType/text/html;_charset=utf-8-48      4.00 ± 0%      4.00 ± 0%     ~     (all equal)
ExtensionsByType/application/octet-stream         2.00 ± 0%      2.00 ± 0%     ~     (all equal)
ExtensionsByType/application/octet-stream-6       2.00 ± 0%      2.00 ± 0%     ~     (all equal)
ExtensionsByType/application/octet-stream-48      2.00 ± 0%      2.00 ± 0%     ~     (all equal)

https://perf.golang.org/search?q=upload:20170427.4

Change-Id: I35438be087ad6eb3d5da9119b395723ea5babaf6
Reviewed-on: https://go-review.googlesource.com/41990
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
gopherbot pushed a commit that referenced this issue Apr 28, 2017
Int and Float already used atomics.

When many goroutines on many CPUs concurrently update a StringSet or a
Map with different keys per goroutine, this change results in dramatic
steady-state speedups.

This change does add some overhead for single-CPU and ephemeral maps.
I believe that is mostly due to an increase in allocations per call
(to pack the map keys and values into interface{} values that may
escape into the heap). With better inlining and/or escape analysis,
the single-CPU penalty may decline somewhat.

There are still two RWMutexes in the package: one for the keys in the
global "vars" map, and one for the keys in individual Map variables.

Those RWMutexes could also be eliminated, but avoiding excessive
allocations when adding new keys would require care. The remaining
RWMutexes are only acquired in Do functions, which I believe are not
typically on the fast path.

updates #17973
updates #18177

name             old time/op    new time/op    delta
StringSet          65.9ns ± 8%    55.7ns ± 1%   -15.46%  (p=0.000 n=8+7)
StringSet-6         416ns ±22%     127ns ±19%   -69.37%  (p=0.000 n=8+8)
StringSet-48        309ns ± 8%      94ns ± 3%   -69.43%  (p=0.001 n=7+7)

name             old alloc/op   new alloc/op   delta
StringSet           0.00B         16.00B ± 0%     +Inf%  (p=0.000 n=8+8)
StringSet-6         0.00B         16.00B ± 0%     +Inf%  (p=0.000 n=8+8)
StringSet-48        0.00B         16.00B ± 0%     +Inf%  (p=0.000 n=8+8)

name             old allocs/op  new allocs/op  delta
StringSet            0.00           1.00 ± 0%     +Inf%  (p=0.000 n=8+8)
StringSet-6          0.00           1.00 ± 0%     +Inf%  (p=0.000 n=8+8)
StringSet-48         0.00           1.00 ± 0%     +Inf%  (p=0.000 n=8+8)

https://perf.golang.org/search?q=upload:20170427.3

name                           old time/op    new time/op    delta
IntAdd                           5.64ns ± 3%    5.58ns ± 1%      ~     (p=0.185 n=8+8)
IntAdd-6                         18.6ns ±32%    21.4ns ±21%      ~     (p=0.078 n=8+8)
IntAdd-48                        19.6ns ±13%    20.6ns ±19%      ~     (p=0.702 n=8+8)
IntSet                           5.50ns ± 1%    5.48ns ± 0%      ~     (p=0.222 n=7+8)
IntSet-6                         18.5ns ±16%    20.4ns ±30%      ~     (p=0.314 n=8+8)
IntSet-48                        19.7ns ±12%    20.4ns ±16%      ~     (p=0.522 n=8+8)
FloatAdd                         14.5ns ± 1%    14.6ns ± 2%      ~     (p=0.237 n=7+8)
FloatAdd-6                       69.9ns ±13%    68.4ns ± 7%      ~     (p=0.557 n=7+7)
FloatAdd-48                       110ns ± 9%     109ns ± 6%      ~     (p=0.667 n=8+8)
FloatSet                         7.62ns ± 3%    7.64ns ± 5%      ~     (p=0.939 n=8+8)
FloatSet-6                       20.7ns ±22%    21.0ns ±23%      ~     (p=0.959 n=8+8)
FloatSet-48                      20.4ns ±24%    20.8ns ±19%      ~     (p=0.899 n=8+8)
MapSet                           88.1ns ±15%   200.9ns ± 7%  +128.11%  (p=0.000 n=8+8)
MapSet-6                          453ns ±12%     202ns ± 8%   -55.43%  (p=0.000 n=8+8)
MapSet-48                         432ns ±12%     240ns ±15%   -44.49%  (p=0.000 n=8+8)
MapSetDifferent                   349ns ± 1%     876ns ± 2%  +151.08%  (p=0.001 n=6+7)
MapSetDifferent-6                1.74µs ±32%    0.25µs ±17%   -85.71%  (p=0.000 n=8+8)
MapSetDifferent-48               1.77µs ±10%    0.14µs ± 2%   -91.84%  (p=0.000 n=8+8)
MapSetString                     88.1ns ± 7%   205.3ns ± 5%  +132.98%  (p=0.001 n=7+7)
MapSetString-6                    438ns ±30%     205ns ± 9%   -53.15%  (p=0.000 n=8+8)
MapSetString-48                   419ns ±14%     241ns ±15%   -42.39%  (p=0.000 n=8+8)
MapAddSame                        686ns ± 9%    1010ns ± 5%   +47.41%  (p=0.000 n=8+8)
MapAddSame-6                      238ns ±10%     300ns ±11%   +26.22%  (p=0.000 n=8+8)
MapAddSame-48                     366ns ± 4%     483ns ± 3%   +32.06%  (p=0.000 n=8+8)
MapAddDifferent                  1.96µs ± 4%    3.24µs ± 6%   +65.58%  (p=0.000 n=8+8)
MapAddDifferent-6                 553ns ± 3%     948ns ± 8%   +71.43%  (p=0.000 n=7+8)
MapAddDifferent-48                548ns ± 4%    1242ns ±10%  +126.81%  (p=0.000 n=8+8)
MapAddSameSteadyState            31.5ns ± 7%    41.7ns ± 6%   +32.61%  (p=0.000 n=8+8)
MapAddSameSteadyState-6           239ns ± 7%     101ns ±30%   -57.53%  (p=0.000 n=7+8)
MapAddSameSteadyState-48          152ns ± 4%      85ns ±13%   -43.84%  (p=0.000 n=8+7)
MapAddDifferentSteadyState        151ns ± 5%     177ns ± 1%   +17.32%  (p=0.001 n=8+6)
MapAddDifferentSteadyState-6      861ns ±15%      62ns ±23%   -92.85%  (p=0.000 n=8+8)
MapAddDifferentSteadyState-48     617ns ± 2%      20ns ±14%   -96.75%  (p=0.000 n=8+8)
RealworldExpvarUsage             4.33µs ± 4%    4.48µs ± 6%      ~     (p=0.336 n=8+7)
RealworldExpvarUsage-6           2.12µs ±20%    2.28µs ±10%      ~     (p=0.228 n=8+6)
RealworldExpvarUsage-48          1.23µs ±19%    1.36µs ±16%      ~     (p=0.152 n=7+8)

name                           old alloc/op   new alloc/op   delta
IntAdd                            0.00B          0.00B           ~     (all equal)
IntAdd-6                          0.00B          0.00B           ~     (all equal)
IntAdd-48                         0.00B          0.00B           ~     (all equal)
IntSet                            0.00B          0.00B           ~     (all equal)
IntSet-6                          0.00B          0.00B           ~     (all equal)
IntSet-48                         0.00B          0.00B           ~     (all equal)
FloatAdd                          0.00B          0.00B           ~     (all equal)
FloatAdd-6                        0.00B          0.00B           ~     (all equal)
FloatAdd-48                       0.00B          0.00B           ~     (all equal)
FloatSet                          0.00B          0.00B           ~     (all equal)
FloatSet-6                        0.00B          0.00B           ~     (all equal)
FloatSet-48                       0.00B          0.00B           ~     (all equal)
MapSet                            0.00B         48.00B ± 0%     +Inf%  (p=0.000 n=8+8)
MapSet-6                          0.00B         48.00B ± 0%     +Inf%  (p=0.000 n=8+8)
MapSet-48                         0.00B         48.00B ± 0%     +Inf%  (p=0.000 n=8+8)
MapSetDifferent                   0.00B        192.00B ± 0%     +Inf%  (p=0.000 n=8+8)
MapSetDifferent-6                 0.00B        192.00B ± 0%     +Inf%  (p=0.000 n=8+8)
MapSetDifferent-48                0.00B        192.00B ± 0%     +Inf%  (p=0.000 n=8+8)
MapSetString                      0.00B         48.00B ± 0%     +Inf%  (p=0.000 n=8+8)
MapSetString-6                    0.00B         48.00B ± 0%     +Inf%  (p=0.000 n=8+8)
MapSetString-48                   0.00B         48.00B ± 0%     +Inf%  (p=0.000 n=8+8)
MapAddSame                         456B ± 0%      480B ± 0%    +5.26%  (p=0.000 n=8+8)
MapAddSame-6                       456B ± 0%      480B ± 0%    +5.26%  (p=0.000 n=8+8)
MapAddSame-48                      456B ± 0%      480B ± 0%    +5.26%  (p=0.000 n=8+8)
MapAddDifferent                    672B ± 0%     1088B ± 0%   +61.90%  (p=0.000 n=8+8)
MapAddDifferent-6                  672B ± 0%     1088B ± 0%   +61.90%  (p=0.000 n=8+8)
MapAddDifferent-48                 672B ± 0%     1088B ± 0%   +61.90%  (p=0.000 n=8+8)
MapAddSameSteadyState             0.00B          0.00B           ~     (all equal)
MapAddSameSteadyState-6           0.00B          0.00B           ~     (all equal)
MapAddSameSteadyState-48          0.00B          0.00B           ~     (all equal)
MapAddDifferentSteadyState        0.00B          0.00B           ~     (all equal)
MapAddDifferentSteadyState-6      0.00B          0.00B           ~     (all equal)
MapAddDifferentSteadyState-48     0.00B          0.00B           ~     (all equal)
RealworldExpvarUsage              0.00B          0.00B           ~     (all equal)
RealworldExpvarUsage-6            0.00B          0.00B           ~     (all equal)
RealworldExpvarUsage-48           0.00B          0.00B           ~     (all equal)

name                           old allocs/op  new allocs/op  delta
IntAdd                             0.00           0.00           ~     (all equal)
IntAdd-6                           0.00           0.00           ~     (all equal)
IntAdd-48                          0.00           0.00           ~     (all equal)
IntSet                             0.00           0.00           ~     (all equal)
IntSet-6                           0.00           0.00           ~     (all equal)
IntSet-48                          0.00           0.00           ~     (all equal)
FloatAdd                           0.00           0.00           ~     (all equal)
FloatAdd-6                         0.00           0.00           ~     (all equal)
FloatAdd-48                        0.00           0.00           ~     (all equal)
FloatSet                           0.00           0.00           ~     (all equal)
FloatSet-6                         0.00           0.00           ~     (all equal)
FloatSet-48                        0.00           0.00           ~     (all equal)
MapSet                             0.00           3.00 ± 0%     +Inf%  (p=0.000 n=8+8)
MapSet-6                           0.00           3.00 ± 0%     +Inf%  (p=0.000 n=8+8)
MapSet-48                          0.00           3.00 ± 0%     +Inf%  (p=0.000 n=8+8)
MapSetDifferent                    0.00          12.00 ± 0%     +Inf%  (p=0.000 n=8+8)
MapSetDifferent-6                  0.00          12.00 ± 0%     +Inf%  (p=0.000 n=8+8)
MapSetDifferent-48                 0.00          12.00 ± 0%     +Inf%  (p=0.000 n=8+8)
MapSetString                       0.00           3.00 ± 0%     +Inf%  (p=0.000 n=8+8)
MapSetString-6                     0.00           3.00 ± 0%     +Inf%  (p=0.000 n=8+8)
MapSetString-48                    0.00           3.00 ± 0%     +Inf%  (p=0.000 n=8+8)
MapAddSame                         6.00 ± 0%     11.00 ± 0%   +83.33%  (p=0.000 n=8+8)
MapAddSame-6                       6.00 ± 0%     11.00 ± 0%   +83.33%  (p=0.000 n=8+8)
MapAddSame-48                      6.00 ± 0%     11.00 ± 0%   +83.33%  (p=0.000 n=8+8)
MapAddDifferent                    14.0 ± 0%      31.0 ± 0%  +121.43%  (p=0.000 n=8+8)
MapAddDifferent-6                  14.0 ± 0%      31.0 ± 0%  +121.43%  (p=0.000 n=8+8)
MapAddDifferent-48                 14.0 ± 0%      31.0 ± 0%  +121.43%  (p=0.000 n=8+8)
MapAddSameSteadyState              0.00           0.00           ~     (all equal)
MapAddSameSteadyState-6            0.00           0.00           ~     (all equal)
MapAddSameSteadyState-48           0.00           0.00           ~     (all equal)
MapAddDifferentSteadyState         0.00           0.00           ~     (all equal)
MapAddDifferentSteadyState-6       0.00           0.00           ~     (all equal)
MapAddDifferentSteadyState-48      0.00           0.00           ~     (all equal)
RealworldExpvarUsage               0.00           0.00           ~     (all equal)
RealworldExpvarUsage-6             0.00           0.00           ~     (all equal)
RealworldExpvarUsage-48            0.00           0.00           ~     (all equal)

https://perf.golang.org/search?q=upload:20170427.1

Change-Id: I388b2e8a3cadb84fc1418af8acfc27338f799273
Reviewed-on: https://go-review.googlesource.com/41930
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@gopherbot
Copy link
Contributor

CL https://golang.org/cl/42113 mentions this issue.

gopherbot pushed a commit that referenced this issue Apr 29, 2017
This change replaces the compressors and decompressors maps with
instances of sync.Map, eliminating the need for Mutex locking in
NewReader and NewWriter.

The impact for encoding large payloads is miniscule, but as the
payload size decreases, the reduction in setup costs becomes
measurable.

updates #17973
updates #18177

name                        old time/op    new time/op    delta
CompressedZipGarbage          13.6ms ± 3%    13.8ms ± 4%    ~     (p=0.275 n=14+16)
CompressedZipGarbage-6        2.81ms ±10%    2.80ms ± 9%    ~     (p=0.616 n=16+16)
CompressedZipGarbage-48        606µs ± 4%     600µs ± 3%    ~     (p=0.110 n=16+15)
Zip64Test                     88.7ms ± 5%    87.5ms ± 5%    ~     (p=0.150 n=14+14)
Zip64Test-6                   88.6ms ± 8%    94.5ms ±13%    ~     (p=0.070 n=14+16)
Zip64Test-48                   102ms ±19%     101ms ±19%    ~     (p=0.599 n=16+15)
Zip64TestSizes/4096           21.7µs ±10%    23.0µs ± 2%    ~     (p=0.076 n=14+12)
Zip64TestSizes/4096-6         7.58µs ±13%    7.49µs ±18%    ~     (p=0.752 n=16+16)
Zip64TestSizes/4096-48        19.5µs ± 8%    18.0µs ± 4%  -7.74%  (p=0.000 n=16+15)
Zip64TestSizes/1048576        1.36ms ± 9%    1.40ms ± 8%  +2.79%  (p=0.029 n=24+25)
Zip64TestSizes/1048576-6       262µs ±11%     260µs ±10%    ~     (p=0.506 n=24+24)
Zip64TestSizes/1048576-48      120µs ± 7%     116µs ± 7%  -3.05%  (p=0.006 n=24+25)
Zip64TestSizes/67108864       86.8ms ± 6%    85.1ms ± 5%    ~     (p=0.149 n=14+17)
Zip64TestSizes/67108864-6     15.9ms ± 2%    16.1ms ± 6%    ~     (p=0.279 n=14+17)
Zip64TestSizes/67108864-48    4.51ms ± 5%    4.53ms ± 4%    ~     (p=0.766 n=15+17)

name                        old alloc/op   new alloc/op   delta
CompressedZipGarbage          5.63kB ± 0%    5.63kB ± 0%    ~     (all equal)
CompressedZipGarbage-6        15.4kB ± 0%    15.4kB ± 0%    ~     (all equal)
CompressedZipGarbage-48       25.5kB ± 3%    25.6kB ± 2%    ~     (p=0.450 n=16+16)
Zip64Test                     20.0kB ± 0%    20.0kB ± 0%    ~     (p=0.060 n=16+13)
Zip64Test-6                   20.0kB ± 0%    20.0kB ± 0%    ~     (p=0.136 n=16+14)
Zip64Test-48                  20.0kB ± 0%    20.0kB ± 0%    ~     (p=1.000 n=16+16)
Zip64TestSizes/4096           20.0kB ± 0%    20.0kB ± 0%    ~     (all equal)
Zip64TestSizes/4096-6         20.0kB ± 0%    20.0kB ± 0%    ~     (all equal)
Zip64TestSizes/4096-48        20.0kB ± 0%    20.0kB ± 0%  -0.00%  (p=0.002 n=16+13)
Zip64TestSizes/1048576        20.0kB ± 0%    20.0kB ± 0%    ~     (all equal)
Zip64TestSizes/1048576-6      20.0kB ± 0%    20.0kB ± 0%    ~     (all equal)
Zip64TestSizes/1048576-48     20.1kB ± 0%    20.1kB ± 0%    ~     (p=0.775 n=24+25)
Zip64TestSizes/67108864       20.0kB ± 0%    20.0kB ± 0%    ~     (all equal)
Zip64TestSizes/67108864-6     20.0kB ± 0%    20.0kB ± 0%    ~     (p=0.272 n=16+17)
Zip64TestSizes/67108864-48    20.1kB ± 0%    20.1kB ± 0%    ~     (p=0.098 n=14+15)

name                        old allocs/op  new allocs/op  delta
CompressedZipGarbage            44.0 ± 0%      44.0 ± 0%    ~     (all equal)
CompressedZipGarbage-6          44.0 ± 0%      44.0 ± 0%    ~     (all equal)
CompressedZipGarbage-48         44.0 ± 0%      44.0 ± 0%    ~     (all equal)
Zip64Test                       53.0 ± 0%      53.0 ± 0%    ~     (all equal)
Zip64Test-6                     53.0 ± 0%      53.0 ± 0%    ~     (all equal)
Zip64Test-48                    53.0 ± 0%      53.0 ± 0%    ~     (all equal)
Zip64TestSizes/4096             53.0 ± 0%      53.0 ± 0%    ~     (all equal)
Zip64TestSizes/4096-6           53.0 ± 0%      53.0 ± 0%    ~     (all equal)
Zip64TestSizes/4096-48          53.0 ± 0%      53.0 ± 0%    ~     (all equal)
Zip64TestSizes/1048576          53.0 ± 0%      53.0 ± 0%    ~     (all equal)
Zip64TestSizes/1048576-6        53.0 ± 0%      53.0 ± 0%    ~     (all equal)
Zip64TestSizes/1048576-48       53.0 ± 0%      53.0 ± 0%    ~     (all equal)
Zip64TestSizes/67108864         53.0 ± 0%      53.0 ± 0%    ~     (all equal)
Zip64TestSizes/67108864-6       53.0 ± 0%      53.0 ± 0%    ~     (all equal)
Zip64TestSizes/67108864-48      53.0 ± 0%      53.0 ± 0%    ~     (all equal)

https://perf.golang.org/search?q=upload:20170428.4

Change-Id: Idb7bec091a210aba833066f8d083d66e27788286
Reviewed-on: https://go-review.googlesource.com/42113
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@bradfitz
Copy link
Contributor

bradfitz commented May 3, 2017

This happened. Closing.

CLs can continue to reference this, but no reason to keep this open.

@bradfitz bradfitz closed this as completed May 3, 2017
@tmm1
Copy link
Contributor

tmm1 commented May 3, 2017

Very happy to see this land in stdlib.

I wonder though if Delete() should return the deleted value from the Map? Am I missing some simpler way to atomically pull an entry out of the map to operate on it?

@bcmills
Copy link
Contributor Author

bcmills commented May 4, 2017

There is currently no way to atomically remove and return a value from the map. There wasn't a use-case for it in the standard library, and it would make the synchronization a bit trickier. If you have a concrete use for it, though, we should by all means consider it.

(Also note that the delete operation for built-in maps doesn't return the deleted element: when possible, I was aiming for consistency with the built-in map API.)

@tmm1
Copy link
Contributor

tmm1 commented May 4, 2017

If you have a concrete use for it, though, we should by all means consider it.

In my case I need to cleanup some resources associated with the object in the map. Currently I have the following, but to make it atomic I would need to add a lock or sync.Once inside my Cleanup() method:

v, ok := streams.Load(key)
if ok {
  v.(*Stream).Cleanup()
  streams.Delete(key)
}

Something like this would be preferable:

v, ok := streams.Delete(key)
if ok {
  v.(*Stream).Cleanup()
}

@bcmills
Copy link
Contributor Author

bcmills commented May 4, 2017

I've seen that use-case before.

On the other hand, having Delete return the value would potentially constrain future optimization of sync.Map. The current implementation is reasonably scalable, but it's also very simple. I do not pretend to claim that it is optimal.

I suspect we would want to address this use-case with a separate method. In particular, I think a CompareAndDelete would allow you to do your cleanup without a separate lock:

v, ok := streams.Load(key)
if ok && streams.CompareAndDelete(key, v) {
  v.(*Stream).Cleanup()
}

Since we're after the Go 1.9 feature freeze, let's consider that as a possible API extension for 1.10.

@tmm1
Copy link
Contributor

tmm1 commented May 4, 2017

I like the CompareAndDelete proposal. The additional method would prevent any backwards compat issues.

I'm still on go 1.8, but I imported x/sync/syncmap and it's made my application code significantly cleaner and simpler. Thanks again for this great new type.

I'm also using your lazyLoad technique above to create entries in my map, and I have to say it's quite an elegant implementation. I would echo @akrylysov's sentiment for an API like this in stdlib, as there is quite a lot of boilerplate required at the moment. Maybe a new wrapper type like sync.LazyMap?

@seh
Copy link
Contributor

seh commented May 5, 2017

@bcmills, it looks like Load followed by the hypothetical CompareAndDelete still requires two lookup operations into the map. I understood @tmm1's original request to imply seeking the key's position in the map once, and taking action once there in right place.

Is the problem that you can't atomically remove an entry from that position and capture the value corresponding to the key?

@bcmills
Copy link
Contributor Author

bcmills commented May 9, 2017

it looks like Load followed by the hypothetical CompareAndDelete still requires two lookup operations into the map.

Indeed, although a Sufficiently Clever Compiler† could convert them to a single lookup as long as there are no other reads that could introduce a happens-before relationship with some other atomic store.

I understood @tmm1's original request to imply seeking the key's position in the map once, and taking action once there in right place.

The main point of sync.Map is to reduce cross-CPU contention, and it only really needs to be in the standard library at all so that we can use it to reduce contention elsewhere in the standard library. Reducing the number of atomic operations is only a secondary goal. Even with the two atomic operations, CompareAndDelete would avoid contention for @tmm1's use-case.

That said, if you're deleting from the map at all, your use-case is a bit different from the ones in the standard library and you'll likely get better performance from a different data structure, which does not necessarily need to live in the Go standard library.

That's not to say that we can't consider other possible optimizations to the API, of course. A Remove that combines the sequence of Load and CompareAndDelete into a single operation would be worth adding to sync.Map if the use-case turns out to be common. (It's probably worth filing a separate issue to track that use-case, since the immediate goal of this issue, "replace sync.RWMutex usage in the standard library", has been addressed.)


† Constructing a Sufficiently Clever Compiler for this usage is left as an exercise for the reader.

@liaoliaopro
Copy link

no equivalent method of len(map)?

@tarcieri
Copy link

Is this going to ship without any type safety? interface{} everywhere?

It seems like that will make it very hard to add type safety retroactively. What's the plan if Go ever does get generics? Add a new type that uses them and deprecate this?

@bcmills
Copy link
Contributor Author

bcmills commented Jul 20, 2017

Is this going to ship without any type safety? interface{} everywhere?

Run-time type safety, but not compile-time.

It seems like that will make it very hard to add type safety retroactively. What's the plan if Go ever does get generics? Add a new type that uses them and deprecate this?

Presumably, yes.
(See also https://blog.golang.org/toward-go2.)

@pciet
Copy link
Contributor

pciet commented Dec 9, 2017

I brought up having this implementation as a built-in type in the generics discussion for Go 2 due to lack of compile-time type safety: #15292

@golang golang locked and limited conversation to collaborators Dec 9, 2018
@rsc rsc unassigned bcmills Jun 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests