Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Rollup merge of rust-lang#119468 - notriddle:notriddle/compression, r=GuillaumeGomez rustdoc-search: tighter encoding for f index Depends on rust-lang#119457 Two optimizations for the function signature search: * Instead of using JSON arrays, like `[1,20]`, it uses VLQ hex with no commas, like `[aAd]`. * This also adds backrefs: if you have more than one function with exactly the same signature, it'll not only store it once, it'll *decode* it once, and store in the typeIdMap only once. Based partially on discussions on zulip: https://rust-lang.zulipchat.com/#narrow/stream/266220-t-rustdoc/topic/search.20index.20size Performance ----------- https://notriddle.com/rustdoc-html-demo-8/compression-perf-v2/index.html ### memory/time profiler output (for more details, consult the above link) <table> <thead><tr><th>benchmark<th>before<th>after</tr></thead> <tbody> <tr><th>arti<td> ``` user: 002.789 s sys: 000.390 s wall: 002.096 s child_RSS_high: 440796 KiB group_mem_high: 414924 KiB ``` </td><td> ``` user: 002.295 s sys: 000.278 s wall: 001.738 s child_RSS_high: 314588 KiB group_mem_high: 285220 KiB ``` </td></tr><tr><th>cortex-m<td> ``` user: 000.127 s sys: 000.030 s wall: 000.134 s child_RSS_high: 60264 KiB group_mem_high: 23824 KiB ``` </td><td> ``` user: 000.136 s sys: 000.038 s wall: 000.137 s child_RSS_high: 59204 KiB group_mem_high: 22712 KiB ``` </td></tr><tr><th>sqlx<td> ``` user: 000.887 s sys: 000.118 s wall: 000.592 s child_RSS_high: 190408 KiB group_mem_high: 157804 KiB ``` </td><td> ``` user: 000.798 s sys: 000.101 s wall: 000.525 s child_RSS_high: 159292 KiB group_mem_high: 126292 KiB ``` </td></tr><tr><th>stm32f4<td> ``` user: 013.884 s sys: 005.399 s wall: 013.149 s child_RSS_high: 1942244 KiB group_mem_high: 1954916 KiB ``` </td><td> ``` user: 006.128 s sys: 003.297 s wall: 007.994 s child_RSS_high: 1038108 KiB group_mem_high: 1023900 KiB ``` </td></tr><tr><th>ripgrep<td> ``` user: 000.441 s sys: 000.063 s wall: 000.264 s child_RSS_high: 109180 KiB group_mem_high: 74272 KiB ``` </td><td> ``` user: 000.408 s sys: 000.044 s wall: 000.238 s child_RSS_high: 101488 KiB group_mem_high: 66000 KiB ``` </td></tr></tbody></table> Size change ----------- standard library without gzip: ```console $ du -bs search-index-old.js search-index-new.js 4976370 search-index-old.js 4404391 search-index-new.js ``` ((4976370-4404391)/4404391)*100% = 12.9% with gzip: ```console $ du -hs search-index-old.js.gz search-index-new.js.gz 520K search-index-old.js.gz 504K search-index-new.js.gz $ du -bs search-index-old.js.gz search-index-new.js.gz 522092 search-index-old.js.gz 507654 search-index-new.js.gz ``` ((522092-507654)/507654)*100% = 2.8% Benchmarks are similarly shrunk. Without gzip: ```console $ du -hs tmp/{arti,cortex-m,sqlx,stm32f4,ripgrep}/toolchain_{old,new}/doc/search-index.js 10555067 tmp/arti/toolchain_old/doc/search-index.js 8921236 tmp/arti/toolchain_new/doc/search-index.js 77018 tmp/cortex-m/toolchain_old/doc/search-index.js 66676 tmp/cortex-m/toolchain_new/doc/search-index.js 2876330 tmp/sqlx/toolchain_old/doc/search-index.js 2436812 tmp/sqlx/toolchain_new/doc/search-index.js 63632890 tmp/stm32f4/toolchain_old/doc/search-index.js 52337438 tmp/stm32f4/toolchain_new/doc/search-index.js 631150 tmp/ripgrep/toolchain_old/doc/search-index.js 541646 tmp/ripgrep/toolchain_new/doc/search-index.js ``` With gzip: ```console $ du -bs tmp/{arti,cortex-m,sqlx,stm32f4,ripgrep}/toolchain_{old,new}/doc/search-index.js.gz 1618852 tmp/arti/toolchain_old/doc/search-index.js.gz 1582007 tmp/arti/toolchain_new/doc/search-index.js.gz 16109 tmp/cortex-m/toolchain_old/doc/search-index.js.gz 15831 tmp/cortex-m/toolchain_new/doc/search-index.js.gz 422257 tmp/sqlx/toolchain_old/doc/search-index.js.gz 411507 tmp/sqlx/toolchain_new/doc/search-index.js.gz 4454761 tmp/stm32f4/toolchain_old/doc/search-index.js.gz 4334924 tmp/stm32f4/toolchain_new/doc/search-index.js.gz 98312 tmp/ripgrep/toolchain_old/doc/search-index.js.gz 96864 tmp/ripgrep/toolchain_new/doc/search-index.js.gz $ du -hs tmp/{arti,cortex-m,sqlx,stm32f4,ripgrep}/toolchain_{old,new}/doc/search-index.j s.gz 1.6M tmp/arti/toolchain_old/doc/search-index.js.gz 1.6M tmp/arti/toolchain_new/doc/search-index.js.gz 24K tmp/cortex-m/toolchain_old/doc/search-index.js.gz 24K tmp/cortex-m/toolchain_new/doc/search-index.js.gz 424K tmp/sqlx/toolchain_old/doc/search-index.js.gz 412K tmp/sqlx/toolchain_new/doc/search-index.js.gz 4.3M tmp/stm32f4/toolchain_old/doc/search-index.js.gz 4.2M tmp/stm32f4/toolchain_new/doc/search-index.js.gz 108K tmp/ripgrep/toolchain_old/doc/search-index.js.gz 104K tmp/ripgrep/toolchain_new/doc/search-index.js.gz ```
- Loading branch information