Use non-pointer receiver for Marshal and Size #155

Closed
hectorj opened this issue May 11, 2016 · 12 comments

Comments

@hectorj
Contributor

hectorj commented May 11, 2016

I see from this part of the code that the generator prefers pointer receivers for structs with more than 3 fields and for arrays.

This seems unnecessary (these methods do not modify the data they operate on, so they do not require a pointer) and is inconvenient (I'd like to be able to Marshal my structs without taking their address, which possibly increases the generated garbage).

Folks at Easyjson accepted my PR after checking benchmarks. Would you accept something similar for the msgp generator?

Or is there some benchmark showing that the use of a pointer receiver actually improves performance?

@philhofer
Member

This has been tried before, and it demonstrated worse performance. See #72.

You shouldn't have to address your locals to call a method; in the expression a.foo(), foo can take a as a pointer receiver. Conversely, if you're addressing the struct in order to cast it to an interface, it's already being boxed, so it will live on the heap regardless of the method receiver. In any case, escape analysis should rarely (perhaps never?) conclude that the MarshalMsg or Size methods cause the receiver to escape.
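The first point above can be sketched in a few lines. This is a minimal illustration only: `Thing` and its `Size` method are hypothetical stand-ins, not msgp's generated code. It shows that an addressable local can call a pointer-receiver method directly, because the compiler takes the address implicitly.

```go
package main

import "fmt"

// Thing is a hypothetical struct with more than 3 fields, standing in
// for a type that would get pointer-receiver methods.
type Thing struct {
	A, B, C, D int64
}

// Size has a pointer receiver. The body is illustrative only:
// four int64 fields at 8 bytes each, not a real MessagePack size.
func (t *Thing) Size() int {
	return 4 * 8
}

func main() {
	var t Thing // a plain local value, not a pointer

	// t is addressable, so t.Size() is shorthand for (&t).Size().
	fmt.Println(t.Size())
	fmt.Println((&t).Size())
}
```

The shorthand only applies to addressable operands such as variables. A call like `Thing{}.Size()` on a composite literal does not compile with a pointer receiver.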

@hectorj
Contributor Author

hectorj commented May 12, 2016

Your point about escape analysis seems right.
So it is mostly about convenience: at some place in my code I take an i interface{}, and later have a type switch checking if i is msgp.Encodable.

With pointer receivers, if I pass my struct by value, it is not. I have to pass a pointer to my struct, which doesn't fit my needs.

About performance, I see @zond said:
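A small sketch of that situation, assuming a hypothetical `Encodable` interface in place of `msgp.Encodable` and two made-up types that differ only in receiver kind. A value stored in an `interface{}` is not addressable, so only the pointer form carries the pointer-receiver method.

```go
package main

import "fmt"

// Encodable is a hypothetical stand-in for an interface like msgp.Encodable.
type Encodable interface {
	Encode() string
}

// PtrRecv declares Encode on the pointer receiver.
type PtrRecv struct{ N int }

func (p *PtrRecv) Encode() string { return fmt.Sprintf("ptr:%d", p.N) }

// ValRecv declares Encode on the value receiver.
type ValRecv struct{ N int }

func (v ValRecv) Encode() string { return fmt.Sprintf("val:%d", v.N) }

// isEncodable mirrors the type switch described above.
func isEncodable(i interface{}) bool {
	switch i.(type) {
	case Encodable:
		return true
	default:
		return false
	}
}

func main() {
	fmt.Println(isEncodable(PtrRecv{N: 1}))  // false: method set of PtrRecv is empty
	fmt.Println(isEncodable(&PtrRecv{N: 1})) // true
	fmt.Println(isEncodable(ValRecv{N: 1}))  // true
	fmt.Println(isEncodable(&ValRecv{N: 1})) // true: *ValRecv includes value methods
}
```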

Never mind, I just made it all work with my branch, and my first trivial timing of the new performance showed worse performance.

But when I run the benchmarks on master and on my branch, performance does not seem to be affected. (I only get some small variance going both ways, which was expected):

$ go version
go version go1.5.3 linux/amd64
# master
$ git rev-parse HEAD
cf4d6d402b01d9b359f52fc88be0f582402177c0
$ go install ./...
$ go generate ./...
======== MessagePack Code Generator =======
>>> Input: "defs_test.go"
>>> Wrote and formatted "defgen_test.go"
$ go test -v -cpu=2 ./... -bench .
# [All tests pass]
PASS
BenchmarkLocate-2               20000000            97.8 ns/op   531.83 MB/s           0 B/op          0 allocs/op
BenchmarkReadWriteFloat32-2     20000000            80.0 ns/op
BenchmarkReadWriteFloat64-2     20000000            82.3 ns/op
BenchmarkUnmarshalAsJSON-2       1000000          1698 ns/op      93.59 MB/s          16 B/op          1 allocs/op
BenchmarkCopyToJSON-2            1000000          1989 ns/op      79.92 MB/s          48 B/op          1 allocs/op
BenchmarkStdlibJSON-2             200000          5783 ns/op      29.40 MB/s         920 B/op         36 allocs/op
BenchmarkReadMapHeaderBytes-2   200000000            8.32 ns/op  360.43 MB/s           0 B/op          0 allocs/op
BenchmarkReadArrayHeaderBytes-2 200000000            7.61 ns/op  394.27 MB/s           0 B/op          0 allocs/op
BenchmarkReadNilByte-2          1000000000           2.72 ns/op  367.11 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat64Bytes-2     200000000            9.20 ns/op  978.47 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat32Bytes-2     300000000            5.27 ns/op  948.08 MB/s           0 B/op          0 allocs/op
BenchmarkReadBoolBytes-2        200000000            6.24 ns/op  160.23 MB/s           0 B/op          0 allocs/op
BenchmarkReadTimeBytes-2        100000000           16.3 ns/op   920.73 MB/s           0 B/op          0 allocs/op
BenchmarkSkipBytes-2            10000000           164 ns/op     908.50 MB/s           0 B/op          0 allocs/op
BenchmarkReadMapHeader-2        100000000           20.9 ns/op    95.71 MB/s           0 B/op          0 allocs/op
BenchmarkReadArrayHeader-2      100000000           20.6 ns/op    96.94 MB/s           0 B/op          0 allocs/op
BenchmarkReadNil-2              100000000           16.9 ns/op    59.09 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat64-2          50000000            25.6 ns/op   351.97 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat32-2          100000000           22.2 ns/op   225.27 MB/s           0 B/op          0 allocs/op
BenchmarkReadInt64-2            100000000           24.8 ns/op   161.34 MB/s           0 B/op          0 allocs/op
BenchmarkReadUint64-2           50000000            24.8 ns/op    80.63 MB/s           0 B/op          0 allocs/op
BenchmarkRead16Bytes-2          30000000            40.1 ns/op   448.85 MB/s           0 B/op          0 allocs/op
BenchmarkRead256Bytes-2         20000000           100 ns/op    2574.03 MB/s           0 B/op          0 allocs/op
BenchmarkRead2048Bytes-2         3000000           510 ns/op    4014.73 MB/s           0 B/op          0 allocs/op
BenchmarkRead16StringAsBytes-2  30000000            40.2 ns/op   422.87 MB/s           0 B/op          0 allocs/op
BenchmarkRead256StringAsBytes-2 20000000           107 ns/op    2418.58 MB/s           0 B/op          0 allocs/op
BenchmarkRead16String-2         10000000           127 ns/op     133.53 MB/s          16 B/op          1 allocs/op
BenchmarkRead256String-2         5000000           266 ns/op     971.61 MB/s         256 B/op          1 allocs/op
BenchmarkReadComplex64-2        50000000            31.5 ns/op   317.30 MB/s           0 B/op          0 allocs/op
BenchmarkReadComplex128-2       50000000            38.7 ns/op   464.71 MB/s           0 B/op          0 allocs/op
BenchmarkReadTime-2             30000000            38.9 ns/op   385.45 MB/s           0 B/op          0 allocs/op
BenchmarkSkip-2                  5000000           413 ns/op     360.76 MB/s           0 B/op          0 allocs/op
BenchmarkAppendMapHeader-2      200000000            8.02 ns/op        0 B/op          0 allocs/op
BenchmarkAppendArrayHeader-2    200000000            7.92 ns/op        0 B/op          0 allocs/op
BenchmarkAppendFloat64-2        100000000           12.2 ns/op   736.77 MB/s           0 B/op          0 allocs/op
BenchmarkAppendFloat32-2        100000000           10.4 ns/op   479.46 MB/s           0 B/op          0 allocs/op
BenchmarkAppendInt64-2          100000000           18.6 ns/op         0 B/op          0 allocs/op
BenchmarkAppendUint64-2         100000000           19.3 ns/op         0 B/op          0 allocs/op
BenchmarkAppend16Bytes-2        100000000           18.2 ns/op  1156.18 MB/s           0 B/op          0 allocs/op
BenchmarkAppend256Bytes-2       50000000            27.2 ns/op  9584.15 MB/s           0 B/op          0 allocs/op
BenchmarkAppend2048Bytes-2      10000000           109 ns/op    18743.55 MB/s          0 B/op          0 allocs/op
BenchmarkAppend16String-2       100000000           16.7 ns/op  1254.41 MB/s           0 B/op          0 allocs/op
BenchmarkAppend256String-2      50000000            28.5 ns/op  9154.87 MB/s           0 B/op          0 allocs/op
BenchmarkAppend2048String-2     20000000            81.2 ns/op  25293.70 MB/s          0 B/op          0 allocs/op
BenchmarkAppendBool-2           300000000            3.60 ns/op  277.61 MB/s           0 B/op          0 allocs/op
BenchmarkAppendTime-2           50000000            24.2 ns/op   620.99 MB/s           0 B/op          0 allocs/op
BenchmarkWriteMapHeader-2       200000000            8.64 ns/op        0 B/op          0 allocs/op
BenchmarkWriteArrayHeader-2     200000000            8.97 ns/op        0 B/op          0 allocs/op
BenchmarkWriteFloat64-2         100000000           14.4 ns/op   624.26 MB/s           0 B/op          0 allocs/op
BenchmarkWriteFloat32-2         100000000           12.2 ns/op   408.34 MB/s           0 B/op          0 allocs/op
BenchmarkWriteInt64-2           100000000           13.8 ns/op   652.07 MB/s           0 B/op          0 allocs/op
BenchmarkWriteUint64-2          100000000           13.9 ns/op   648.52 MB/s           0 B/op          0 allocs/op
BenchmarkWrite16Bytes-2         50000000            23.5 ns/op         0 B/op          0 allocs/op
BenchmarkWrite256Bytes-2        50000000            33.1 ns/op         0 B/op          0 allocs/op
BenchmarkWrite2048Bytes-2       20000000            86.8 ns/op         0 B/op          0 allocs/op
BenchmarkWriteTime-2            50000000            24.6 ns/op   610.03 MB/s           0 B/op          0 allocs/op
BenchmarkWriteReadFile-2         2000000           766 ns/op     105.71 MB/s
ok      github.com/tinylib/msgp/msgp    374.252s
# PR 156
$ git checkout avoid-pointers-receivers 
Switched to branch 'avoid-pointers-receivers'
Your branch is up-to-date with 'fork/avoid-pointers-receivers'.
$ git rev-parse HEAD
4416ec38a88dcd4b55b36ff34d92950d684edc1f
$ go install ./...
$ go generate ./...
======== MessagePack Code Generator =======
>>> Input: "defs_test.go"
>>> Wrote and formatted "defgen_test.go"
$ go test -v -cpu=2 ./... -bench .
# [All tests pass]
PASS
BenchmarkLocate-2               20000000            97.9 ns/op   531.23 MB/s           0 B/op          0 allocs/op
BenchmarkReadWriteFloat32-2     20000000            85.8 ns/op
BenchmarkReadWriteFloat64-2     20000000            81.5 ns/op
BenchmarkUnmarshalAsJSON-2       1000000          1891 ns/op      84.05 MB/s          16 B/op          1 allocs/op
BenchmarkCopyToJSON-2            1000000          2206 ns/op      72.05 MB/s          48 B/op          1 allocs/op
BenchmarkStdlibJSON-2             200000          7715 ns/op      22.03 MB/s         920 B/op         36 allocs/op
BenchmarkReadMapHeaderBytes-2   200000000            8.41 ns/op  356.77 MB/s           0 B/op          0 allocs/op
BenchmarkReadArrayHeaderBytes-2 200000000            7.77 ns/op  386.24 MB/s           0 B/op          0 allocs/op
BenchmarkReadNilByte-2          500000000            2.95 ns/op  339.48 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat64Bytes-2     200000000            9.36 ns/op  961.76 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat32Bytes-2     300000000            5.46 ns/op  916.55 MB/s           0 B/op          0 allocs/op
BenchmarkReadBoolBytes-2        200000000            6.66 ns/op  150.10 MB/s           0 B/op          0 allocs/op
BenchmarkReadTimeBytes-2        100000000           16.1 ns/op   930.48 MB/s           0 B/op          0 allocs/op
BenchmarkSkipBytes-2            10000000           171 ns/op     868.05 MB/s           0 B/op          0 allocs/op
BenchmarkReadMapHeader-2        100000000           20.3 ns/op    98.31 MB/s           0 B/op          0 allocs/op
BenchmarkReadArrayHeader-2      100000000           20.6 ns/op    96.97 MB/s           0 B/op          0 allocs/op
BenchmarkReadNil-2              100000000           16.6 ns/op    60.16 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat64-2          50000000            27.5 ns/op   327.17 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat32-2          50000000            24.0 ns/op   208.24 MB/s           0 B/op          0 allocs/op
BenchmarkReadInt64-2            50000000            35.9 ns/op   111.41 MB/s           0 B/op          0 allocs/op
BenchmarkReadUint64-2           50000000            27.7 ns/op    72.26 MB/s           0 B/op          0 allocs/op
BenchmarkRead16Bytes-2          30000000            49.9 ns/op   360.56 MB/s           0 B/op          0 allocs/op
BenchmarkRead256Bytes-2         10000000           137 ns/op    1878.40 MB/s           0 B/op          0 allocs/op
BenchmarkRead2048Bytes-2         2000000           547 ns/op    3743.05 MB/s           0 B/op          0 allocs/op
BenchmarkRead16StringAsBytes-2  30000000            46.2 ns/op   368.24 MB/s           0 B/op          0 allocs/op
BenchmarkRead256StringAsBytes-2 10000000           124 ns/op    2075.45 MB/s           0 B/op          0 allocs/op
BenchmarkRead16String-2         10000000           110 ns/op     154.20 MB/s          16 B/op          1 allocs/op
BenchmarkRead256String-2         5000000           256 ns/op    1010.92 MB/s         256 B/op          1 allocs/op
BenchmarkReadComplex64-2        50000000            37.3 ns/op   267.92 MB/s           0 B/op          0 allocs/op
BenchmarkReadComplex128-2       30000000            49.4 ns/op   364.74 MB/s           0 B/op          0 allocs/op
BenchmarkReadTime-2             50000000            40.7 ns/op   368.44 MB/s           0 B/op          0 allocs/op
BenchmarkSkip-2                  3000000           398 ns/op     374.23 MB/s           0 B/op          0 allocs/op
BenchmarkAppendMapHeader-2      200000000            7.87 ns/op        0 B/op          0 allocs/op
BenchmarkAppendArrayHeader-2    200000000            7.82 ns/op        0 B/op          0 allocs/op
BenchmarkAppendFloat64-2        100000000           11.7 ns/op   768.19 MB/s           0 B/op          0 allocs/op
BenchmarkAppendFloat32-2        200000000            9.69 ns/op  515.96 MB/s           0 B/op          0 allocs/op
BenchmarkAppendInt64-2          100000000           20.2 ns/op         0 B/op          0 allocs/op
BenchmarkAppendUint64-2         100000000           18.9 ns/op         0 B/op          0 allocs/op
BenchmarkAppend16Bytes-2        100000000           19.2 ns/op  1093.52 MB/s           0 B/op          0 allocs/op
BenchmarkAppend256Bytes-2       50000000            27.5 ns/op  9494.18 MB/s           0 B/op          0 allocs/op
BenchmarkAppend2048Bytes-2      20000000           109 ns/op    18832.78 MB/s          0 B/op          0 allocs/op
BenchmarkAppend16String-2       100000000           16.5 ns/op  1275.48 MB/s           0 B/op          0 allocs/op
BenchmarkAppend256String-2      50000000            26.6 ns/op  9803.78 MB/s           0 B/op          0 allocs/op
BenchmarkAppend2048String-2     20000000            84.7 ns/op  24227.11 MB/s          0 B/op          0 allocs/op
BenchmarkAppendBool-2           300000000            3.59 ns/op  278.51 MB/s           0 B/op          0 allocs/op
BenchmarkAppendTime-2           50000000            23.9 ns/op   626.37 MB/s           0 B/op          0 allocs/op
BenchmarkWriteMapHeader-2       200000000            8.52 ns/op        0 B/op          0 allocs/op
BenchmarkWriteArrayHeader-2     200000000            8.88 ns/op        0 B/op          0 allocs/op
BenchmarkWriteFloat64-2         100000000           13.9 ns/op   645.32 MB/s           0 B/op          0 allocs/op
BenchmarkWriteFloat32-2         100000000           11.3 ns/op   443.03 MB/s           0 B/op          0 allocs/op
BenchmarkWriteInt64-2           100000000           14.4 ns/op   624.81 MB/s           0 B/op          0 allocs/op
BenchmarkWriteUint64-2          100000000           13.1 ns/op   686.05 MB/s           0 B/op          0 allocs/op
BenchmarkWrite16Bytes-2         100000000           22.0 ns/op         0 B/op          0 allocs/op
BenchmarkWrite256Bytes-2        50000000            31.8 ns/op         0 B/op          0 allocs/op
BenchmarkWrite2048Bytes-2       20000000            84.9 ns/op         0 B/op          0 allocs/op
BenchmarkWriteTime-2            50000000            28.0 ns/op   535.39 MB/s           0 B/op          0 allocs/op
BenchmarkWriteReadFile-2         2000000           730 ns/op     110.88 MB/s
ok      github.com/tinylib/msgp/msgp    300.308s

@hectorj
Contributor Author

hectorj commented May 12, 2016

(To show that the small differences between my two benchmark runs are just variance, I did a second run with #156:)

PASS
BenchmarkLocate-2               20000000            97.2 ns/op   534.81 MB/s           0 B/op          0 allocs/op
BenchmarkReadWriteFloat32-2     20000000            80.6 ns/op
BenchmarkReadWriteFloat64-2     20000000            81.2 ns/op
BenchmarkUnmarshalAsJSON-2       1000000          1702 ns/op      93.37 MB/s          16 B/op          1 allocs/op
BenchmarkCopyToJSON-2            1000000          2000 ns/op      79.47 MB/s          48 B/op          1 allocs/op
BenchmarkStdlibJSON-2             200000          5663 ns/op      30.02 MB/s         920 B/op         36 allocs/op
BenchmarkReadMapHeaderBytes-2   200000000            7.80 ns/op  384.40 MB/s           0 B/op          0 allocs/op
BenchmarkReadArrayHeaderBytes-2 200000000            8.34 ns/op  359.69 MB/s           0 B/op          0 allocs/op
BenchmarkReadNilByte-2          1000000000           2.84 ns/op  352.63 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat64Bytes-2     200000000            9.26 ns/op  971.95 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat32Bytes-2     300000000            5.34 ns/op  936.68 MB/s           0 B/op          0 allocs/op
BenchmarkReadBoolBytes-2        200000000            6.70 ns/op  149.27 MB/s           0 B/op          0 allocs/op
BenchmarkReadTimeBytes-2        100000000           16.5 ns/op   908.66 MB/s           0 B/op          0 allocs/op
BenchmarkSkipBytes-2            10000000           168 ns/op     884.07 MB/s           0 B/op          0 allocs/op
BenchmarkReadMapHeader-2        100000000           21.2 ns/op    94.43 MB/s           0 B/op          0 allocs/op
BenchmarkReadArrayHeader-2      100000000           20.8 ns/op    96.05 MB/s           0 B/op          0 allocs/op
BenchmarkReadNil-2              100000000           17.1 ns/op    58.40 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat64-2          50000000            35.4 ns/op   254.51 MB/s           0 B/op          0 allocs/op
BenchmarkReadFloat32-2          50000000            35.3 ns/op   141.69 MB/s           0 B/op          0 allocs/op
BenchmarkReadInt64-2            50000000            23.2 ns/op   172.10 MB/s           0 B/op          0 allocs/op
BenchmarkReadUint64-2           100000000           25.0 ns/op    80.02 MB/s           0 B/op          0 allocs/op
BenchmarkRead16Bytes-2          20000000            52.1 ns/op   345.16 MB/s           0 B/op          0 allocs/op
BenchmarkRead256Bytes-2         20000000           113 ns/op    2271.98 MB/s           0 B/op          0 allocs/op
BenchmarkRead2048Bytes-2         3000000           468 ns/op    4379.13 MB/s           0 B/op          0 allocs/op
BenchmarkRead16StringAsBytes-2  30000000            46.2 ns/op   367.59 MB/s           0 B/op          0 allocs/op
BenchmarkRead256StringAsBytes-2 20000000           135 ns/op    1913.54 MB/s           0 B/op          0 allocs/op
BenchmarkRead16String-2         10000000           119 ns/op     142.13 MB/s          16 B/op          1 allocs/op
BenchmarkRead256String-2         5000000           312 ns/op     827.54 MB/s         256 B/op          1 allocs/op
BenchmarkReadComplex64-2        50000000            37.4 ns/op   267.07 MB/s           0 B/op          0 allocs/op
BenchmarkReadComplex128-2       50000000            42.5 ns/op   423.25 MB/s           0 B/op          0 allocs/op
BenchmarkReadTime-2             30000000            41.4 ns/op   362.65 MB/s           0 B/op          0 allocs/op
BenchmarkSkip-2                  3000000           435 ns/op     341.87 MB/s           0 B/op          0 allocs/op
BenchmarkAppendMapHeader-2      100000000           11.0 ns/op         0 B/op          0 allocs/op
BenchmarkAppendArrayHeader-2    100000000           11.7 ns/op         0 B/op          0 allocs/op
BenchmarkAppendFloat64-2        100000000           19.0 ns/op   472.99 MB/s           0 B/op          0 allocs/op
BenchmarkAppendFloat32-2        100000000           15.4 ns/op   325.56 MB/s           0 B/op          0 allocs/op
BenchmarkAppendInt64-2          50000000            33.7 ns/op         0 B/op          0 allocs/op
BenchmarkAppendUint64-2         50000000            29.4 ns/op         0 B/op          0 allocs/op
BenchmarkAppend16Bytes-2        100000000           21.1 ns/op   993.56 MB/s           0 B/op          0 allocs/op
BenchmarkAppend256Bytes-2       50000000            27.7 ns/op  9420.12 MB/s           0 B/op          0 allocs/op
BenchmarkAppend2048Bytes-2      20000000           110 ns/op    18523.14 MB/s          0 B/op          0 allocs/op
BenchmarkAppend16String-2       100000000           25.3 ns/op   829.57 MB/s           0 B/op          0 allocs/op
BenchmarkAppend256String-2      50000000            33.1 ns/op  7886.48 MB/s           0 B/op          0 allocs/op
BenchmarkAppend2048String-2     20000000            98.0 ns/op  20952.67 MB/s          0 B/op          0 allocs/op
BenchmarkAppendBool-2           300000000            4.31 ns/op  231.91 MB/s           0 B/op          0 allocs/op
BenchmarkAppendTime-2           50000000            26.1 ns/op   574.90 MB/s           0 B/op          0 allocs/op
BenchmarkWriteMapHeader-2       200000000            9.63 ns/op        0 B/op          0 allocs/op
BenchmarkWriteArrayHeader-2     200000000            9.30 ns/op        0 B/op          0 allocs/op
BenchmarkWriteFloat64-2         100000000           15.2 ns/op   593.26 MB/s           0 B/op          0 allocs/op
BenchmarkWriteFloat32-2         100000000           11.7 ns/op   427.58 MB/s           0 B/op          0 allocs/op
BenchmarkWriteInt64-2           100000000           15.0 ns/op   601.99 MB/s           0 B/op          0 allocs/op
BenchmarkWriteUint64-2          100000000           14.6 ns/op   615.61 MB/s           0 B/op          0 allocs/op
BenchmarkWrite16Bytes-2         100000000           23.5 ns/op         0 B/op          0 allocs/op
BenchmarkWrite256Bytes-2        50000000            36.7 ns/op         0 B/op          0 allocs/op
BenchmarkWrite2048Bytes-2       20000000            92.0 ns/op         0 B/op          0 allocs/op
BenchmarkWriteTime-2            50000000            29.4 ns/op   509.40 MB/s           0 B/op          0 allocs/op
BenchmarkWriteReadFile-2         2000000           822 ns/op      98.48 MB/s

@philhofer
Member

You need to benchmark the code in ./_generated explicitly; the go tool
will ignore it otherwise. All of those benchmarks are for library support
code, not generated methods.

@philhofer
Member

You'll also take a perf hit when you turn a value into interface{}
because it will have to be both copied and boxed.
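A rough way to observe this, sketched with a hypothetical `Thing` type and a package-level `sink` so the boxed value escapes. It counts allocations with `testing.AllocsPerRun`; the exact numbers may vary by Go version, but boxing the struct value is expected to allocate a copy while boxing a pointer to an existing value is not.

```go
package main

import (
	"fmt"
	"testing"
)

// Thing is a hypothetical multi-field struct.
type Thing struct {
	A, B, C, D int64
}

var (
	thing Thing       // existing value that gets boxed
	sink  interface{} // package-level sink forces the boxed value to escape
)

// boxValueAllocs reports average allocations when storing a copy of
// thing in an interface{}. The field update keeps the value non-constant.
func boxValueAllocs() float64 {
	return testing.AllocsPerRun(1000, func() {
		thing.A++
		sink = thing // copies the struct into freshly allocated memory
	})
}

// boxPointerAllocs reports average allocations when storing &thing;
// the interface then only holds a pointer to the existing value.
func boxPointerAllocs() float64 {
	return testing.AllocsPerRun(1000, func() {
		thing.A++
		sink = &thing
	})
}

func main() {
	fmt.Println("allocs/op boxing value:  ", boxValueAllocs())
	fmt.Println("allocs/op boxing pointer:", boxPointerAllocs())
}
```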

@hectorj
Contributor Author

hectorj commented May 12, 2016

Ok, the benchmarks in _generated/ indeed show a difference. One more allocation for:

  • BenchmarkEncodeBlock
  • BenchmarkMarshalMsgThings
  • BenchmarkAppendMsgThings
  • BenchmarkEncodeThings
  • BenchmarkEncodeX

I'll see if I can improve that

@zond

zond commented May 12, 2016

To be clear, my benchmark was of my own code, using my fork of msgp.

I was unable to compare my code with the fork vs my code with mainline msgp since I was unable to get my code working with mainline.

I didn't think to benchmark msgp in the fork on its own vs mainline.

@hectorj
Contributor Author

hectorj commented May 12, 2016

That's making me wonder if I missed something with mailru/easyjson#15 or if some difference in the implementations makes it efficient with easyjson but not with msgp...

Gotta do some digging

@zond

zond commented May 12, 2016

I'm not sure I understand what you mean, but just to make sure I'll clarify even more :)

I benchmarked my own code using https://github.com/vmihailenco/msgpack vs my own code using a fork of msgp that just added shims to some more types.

This benchmark showed the unreasonable result that the code became slower with msgp.

This was unreasonable because msgpack uses reflection, while msgp uses generated, hard-coded coders.

This made me give up and forget all about it.

TL;DR

I don't believe msgp is slower than msgpack, and I don't necessarily think more indirection via pointers or shims in msgp would make things meaningfully slower.

@hectorj
Contributor Author

hectorj commented May 12, 2016

@zond: oh, thanks for the clarification. To clarify too, my last comment was not about your observations but about the results I get from my own benchmark runs.

The non-pointer-receiver way is less efficient for msgp, but it did not seem to be for easyjson (which does something similar to msgp, just for faster JSON marshaling/unmarshaling).

@zond

zond commented May 12, 2016

@hectorj Ah, thanks!

Weird, please explain what caused it if you find out :)

@hectorj
Contributor Author

hectorj commented May 16, 2016

Closing for now, as I haven't been able to produce code with the same features and performance using non-pointer receivers, and I don't have enough time to keep trying.

Thanks all for your input.
