Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add benchmarks for UTF16 decoding #34435

Merged
merged 8 commits into from
Feb 23, 2023

Conversation

valeriyvan
Copy link
Contributor

Adds benchmarks for UTF16 decoding

Partially resolves SR-8905

@valeriyvan valeriyvan force-pushed the StringDecodeUTF16Benchmark branch from 169e18d to b5c14fd Compare October 26, 2020 17:02
Copy link
Collaborator

@xwu xwu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@swift-ci Please benchmark

benchmark/single-source/UTF16Decode.swift Show resolved Hide resolved
benchmark/single-source/UTF16Decode.swift Show resolved Hide resolved
benchmark/single-source/UTF16Decode.swift Show resolved Hide resolved
benchmark/single-source/UTF16Decode.swift Show resolved Hide resolved
@valeriyvan valeriyvan requested a review from xwu October 26, 2020 20:50
@xwu
Copy link
Collaborator

xwu commented Oct 26, 2020

@swift-ci Please benchmark

@swift-ci
Copy link
Contributor

Performance: -O

Regression OLD NEW DELTA RATIO
SuffixAnyCollection 11 12 +9.1% 0.92x (?)
AngryPhonebook 350 378 +8.0% 0.93x (?)
 
Improvement OLD NEW DELTA RATIO
String.data.LargeUnicode 112 103 -8.0% 1.09x (?)
ObjectiveCBridgeStubToNSStringRef 122 114 -6.6% 1.07x (?)
 
Added MIN MAX MEAN MAX_RSS
UTF16Decode 191 193 192
UTF16Decode_InitDecoding 38417 39094 38857
UTF16Decode_InitDecoding_ascii 147873 151859 150462
UTF16Decode_InitFromCustom_contiguous 7511 7694 7600
UTF16Decode_InitFromCustom_contiguous_ascii 29035 29072 29054
UTF16Decode_InitFromCustom_noncontiguous 7749 7827 7796
UTF16Decode_InitFromCustom_noncontiguous_ascii 29974 29983 29980
UTF16Decode_InitFromData 319 322 321
UTF16Decode_InitFromData_ascii 1408 1483 1446
UTF16Decode_InitFromData_ascii_as_ascii 777 783 780

Code size: -O

Performance: -Osize

Improvement OLD NEW DELTA RATIO
UTF8Decode_InitFromData_ascii_as_ascii 772 630 -18.4% 1.23x (?)
 
Added MIN MAX MEAN MAX_RSS
UTF16Decode 197 200 198
UTF16Decode_InitDecoding 37922 38203 38041
UTF16Decode_InitDecoding_ascii 147469 147713 147624
UTF16Decode_InitFromCustom_contiguous 7700 7723 7711
UTF16Decode_InitFromCustom_contiguous_ascii 29380 29416 29402
UTF16Decode_InitFromCustom_noncontiguous 7948 8051 7983
UTF16Decode_InitFromCustom_noncontiguous_ascii 30597 30739 30644
UTF16Decode_InitFromData 251 256 254
UTF16Decode_InitFromData_ascii 1204 1267 1226
UTF16Decode_InitFromData_ascii_as_ascii 653 655 654

Code size: -Osize

Performance: -Onone

Improvement OLD NEW DELTA RATIO
DataAppendArray 6800 5800 -14.7% 1.17x (?)
 
Added MIN MAX MEAN MAX_RSS
UTF16Decode 38668 38973 38794
UTF16Decode_InitDecoding 38391 39237 38708
UTF16Decode_InitDecoding_ascii 148147 149508 148981
UTF16Decode_InitFromCustom_contiguous 31290 32075 31730
UTF16Decode_InitFromCustom_contiguous_ascii 120759 121332 121068
UTF16Decode_InitFromCustom_noncontiguous 31522 32079 31867
UTF16Decode_InitFromCustom_noncontiguous_ascii 121736 122049 121931
UTF16Decode_InitFromData 308 331 318
UTF16Decode_InitFromData_ascii 1404 1472 1440
UTF16Decode_InitFromData_ascii_as_ascii 916 917 916

Code size: -swiftlibs

Benchmark Check Report
⛔️🔤 UTF16Decode_InitFromCustom_contiguous_ascii name doesn`t conform to benchmark naming convention.
See http://bit.ly/BenchmarkNaming
⛔️🔤 UTF16Decode_InitFromCustom_contiguous_ascii name is 43 characters long.
Benchmark name should not be longer than 40 characters.
⛔️⏱ UTF16Decode_InitFromCustom_contiguous_ascii execution took at least 28839 μs.
Decrease the workload of UTF16Decode_InitFromCustom_contiguous_ascii by a factor of 32 (100), to be less than 1000 μs.
⛔️🔤 UTF16Decode_InitDecoding name doesn`t conform to benchmark naming convention.
See http://bit.ly/BenchmarkNaming
⛔️⏱ UTF16Decode_InitDecoding execution took at least 37711 μs.
Decrease the workload of UTF16Decode_InitDecoding by a factor of 64 (100), to be less than 1000 μs.
⛔️🔤 UTF16Decode_InitFromData name doesn`t conform to benchmark naming convention.
See http://bit.ly/BenchmarkNaming
⛔️🔤 UTF16Decode_InitFromCustom_noncontiguous_ascii name doesn`t conform to benchmark naming convention.
See http://bit.ly/BenchmarkNaming
⛔️🔤 UTF16Decode_InitFromCustom_noncontiguous_ascii name is 46 characters long.
Benchmark name should not be longer than 40 characters.
⛔️⏱ UTF16Decode_InitFromCustom_noncontiguous_ascii execution took at least 29787 μs.
Decrease the workload of UTF16Decode_InitFromCustom_noncontiguous_ascii by a factor of 32 (100), to be less than 1000 μs.
⛔️🔤 UTF16Decode_InitFromCustom_noncontiguous name doesn`t conform to benchmark naming convention.
See http://bit.ly/BenchmarkNaming
⚠️ UTF16Decode_InitFromCustom_noncontiguous execution took at least 7714 μs.
Decrease the workload of UTF16Decode_InitFromCustom_noncontiguous by a factor of 8 (10), to be less than 1000 μs.
⛔️🔤 UTF16Decode_InitFromData_ascii name doesn`t conform to benchmark naming convention.
See http://bit.ly/BenchmarkNaming
⛔️⏱ UTF16Decode_InitFromData_ascii has setup overhead of 256 μs (19.2%).
Move initialization of benchmark data to the setUpFunction registered in BenchmarkInfo.
⚠️ UTF16Decode_InitFromData_ascii execution took at least 1077 μs (excluding the setup overhead).
Decrease the workload of UTF16Decode_InitFromData_ascii by a factor of 2 (10), to be less than 1000 μs.
⛔️🔤 UTF16Decode_InitDecoding_ascii name doesn`t conform to benchmark naming convention.
See http://bit.ly/BenchmarkNaming
⛔️⏱ UTF16Decode_InitDecoding_ascii execution took at least 145108 μs.
Decrease the workload of UTF16Decode_InitDecoding_ascii by a factor of 256 (1000), to be less than 1000 μs.
⛔️🔤 UTF16Decode_InitFromData_ascii_as_ascii name doesn`t conform to benchmark naming convention.
See http://bit.ly/BenchmarkNaming
⛔️⏱ UTF16Decode_InitFromData_ascii_as_ascii has setup overhead of 100 μs (13.3%).
Move initialization of benchmark data to the setUpFunction registered in BenchmarkInfo.
⛔️🔤 UTF16Decode_InitFromCustom_contiguous name doesn`t conform to benchmark naming convention.
See http://bit.ly/BenchmarkNaming
⚠️ UTF16Decode_InitFromCustom_contiguous execution took at least 7483 μs.
Decrease the workload of UTF16Decode_InitFromCustom_contiguous by a factor of 8 (10), to be less than 1000 μs.
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

Copy link
Collaborator

@xwu xwu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The suggestions on how to modify the benchmarks are pretty helpful.

@valeriyvan
Copy link
Contributor Author

valeriyvan commented Oct 27, 2020

The suggestions on how to modify the benchmarks are pretty helpful.

Suggestions absolutely make sense. But. Benchmarks which were added are copycats of UTF8 benchmarks. They use same names. They use same test strings. I think, it's very interesting to see that UTF8Decode_InitDecoding takes 195μs and UTF16Decode_InitDecoding takes 84358μs which is several orders of magnitude slower. If you agree comparison of these benchmarks is important for future, tests in this PR should follow names from UTF8Decode benchmark. For the same reason benchmarks shouldn't be optimised to be executed less times. Otherwise we lose base of comparison to UTF8. I am going to add UTF32Decode as well, as separate benchmark. So we will have overview of speed of all used UTF conversions.

What do you think of it, @xwu? Should performance team (@eeckstein) share their opinion on this?

@xwu
Copy link
Collaborator

xwu commented Oct 27, 2020

No, those are legacy names. Benchmarks now have a different naming scheme, and all new benchmarks should adhere to them. The workload and setup warnings should be adhered to as well precisely because it is important to have reliable benchmarks that can be compared over time.

@eeckstein
Copy link
Contributor

Is it possible to reduce the number of new benchmarks to a set which contains a representative coverage of the algorithms/code which it should test?
I know, it's just a copy of the UTF8 benchmark file (but also for this we should think of reducing the number of benchmarks).

If we end up adding benchmarks for all combinations of all language features, the benchmark runtime just gets out of bounds.

Note: it's possible to add the ".skip" benchmark tag, which excludes a benchmark from the regular run, but still enables someone to run it locally.

@valeriyvan valeriyvan force-pushed the StringDecodeUTF16Benchmark branch from 816e43d to 71e8288 Compare February 14, 2023 11:32
@valeriyvan
Copy link
Contributor Author

Is it possible to reduce the number of new benchmarks to a set which contains a representative coverage of the algorithms/code which it should test? I know, it's just a copy of the UTF8 benchmark file (but also for this we should think of reducing the number of benchmarks).

If we end up adding benchmarks for all combinations of all language features, the benchmark runtime just gets out of bounds.

Note: it's possible to add the ".skip" benchmark tag, which excludes a benchmark from the regular run, but still enables someone to run it locally.

done

@valeriyvan valeriyvan requested a review from xwu February 15, 2023 12:07
@valeriyvan valeriyvan changed the title Adds benchmarks for UTF16 decoding Add benchmarks for UTF16 decoding Feb 15, 2023
@valeriyvan
Copy link
Contributor Author

ping

@eeckstein
Copy link
Contributor

@swift-ci Please benchmark

@eeckstein
Copy link
Contributor

⛔️⏱ | UTF16Decode.initFromData has setup overhead of 9 μs (5.6%).
Move initialization of benchmark data to the setUpFunction registered in BenchmarkInfo

It would be good to force initialization of the global variables in the setUpFunction, e.g. add blackHole(allStringsData) to the setup function(s)

@valeriyvan
Copy link
Contributor Author

⛔️⏱ | UTF16Decode.initFromData has setup overhead of 9 μs (5.6%). Move initialization of benchmark data to the setUpFunction registered in BenchmarkInfo

It would be good to force initialization of the global variables in the setUpFunction, e.g. add blackHole(allStringsData) to the setup function(s)

done

@eeckstein
Copy link
Contributor

@swift-ci benchmark

@eeckstein
Copy link
Contributor

@swift-ci smoke test

@eeckstein
Copy link
Contributor

@swift-ci smoke test linux

@valeriyvan
Copy link
Contributor Author

@swift-ci smoke test linux

Should I bother about failed linux test?

@AnthonyLatsis
Copy link
Collaborator

@swift-ci please smoke test Linux

@eeckstein eeckstein merged commit 9d5dd75 into swiftlang:main Feb 23, 2023
@eeckstein
Copy link
Contributor

@valeriyvan The linux failure was unrelated. Thanks for the contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants