Regression in v5 - Suboptimal packing due to incorrect assumed length of escape sequence #85

Siorki · 2018-10-09T20:35:12Z

Benchmark "2014 - Flappy Dragon" run with different revisions of RegPack :

v4.0.1 : 984b crushed, 994b packed (character class)
v5.0.1 : 984b crushed, 998b packed (character class)

Digging further, the crusher phase differs: v5 shaves a few bytes off the unpacking routine by using ES6 (see PR #53), but ends up with a longer string.

v4.0.1 : string 904b, unpacking routine 80b. Tokens include _
v5.0.1 : string 911b, unpacking routine 73b.


xem · 2018-10-09T20:38:04Z

(I thought you used my entry as a benchmark test :o https://js1k.com/2014-dragons/demo/1704 )

Siorki · 2018-10-09T21:03:58Z

Right, I keep uncompressed versions of the top 3 (or so) for each edition of js1k and use them as benchmarks, to illustrate progress among revisions of RegPack :

http://siorki.github.io/benchmarks.html

Siorki · 2018-10-13T21:24:40Z

There are two independent reasons for this difference :

The fix for #57, present since v5.0.0, which prevents _ from being (incorrectly) renamed to A inside a string. Using _ as an extra token was worth 2 bytes, yet it was definitely a bug. The sample packed with v4 would decode incorrectly, yielding a glitch in the dragon sprite.
Those 2 bytes will not be regained => won't fix.

The fix for #65 introduces a bias in the crusher. The output is correct (meaning it unpacks to the initial code) but suboptimal as far as compression is concerned.
The sample code contains a string with several escaped \ (thus represented as z="\\").
As RegPack stores the code in a (compressed) string, it also escapes all \, turning the already escaped backslash into G='z="\\\\"'.
v4 performs the escaping before running the crusher. Initial sequences of \\ are escaped as \\\\, counted as 4 bytes and packed as such.
v5 performs the escaping after running the crusher, because of #65. However, the crusher then counts \\ as 2 bytes, although it will take up 4 bytes once escaped, and therefore does not consider it worth replacing with a token.
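To illustrate the length mismatch described above, here is a minimal sketch. The helper name escapeBackslashes is hypothetical and is not taken from RegPack; it only mimics the doubling of \ that happens when code is stored inside a string literal.

```javascript
// Hypothetical helper (not RegPack's code): doubles every backslash,
// as must be done before embedding code in a string literal.
function escapeBackslashes(code) {
  return code.replace(/\\/g, "\\\\");
}

// The pattern \\ as it appears in the sample code: two backslash characters.
const pattern = "\\\\"; // JS literal holding 2 backslashes

// Length seen by a crusher that runs before escaping (the v5 order).
const rawLength = pattern.length;

// Length the pattern really occupies in the packed string, after escaping.
const escapedLength = escapeBackslashes(pattern).length;

console.log(rawLength, escapedLength);
```

A crusher that weighs the pattern by rawLength underestimates the savings from replacing it with a single-character token, which matches the bias reported here.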

Proposed solution : when computing string length in the crusher, count 2 instead of 1 for each \.

Siorki · 2018-10-15T20:44:04Z

Added a method getEscapedByteLength(), which counts 2 bytes for each \ character.
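A rough sketch of what such a method could look like follows. Only the name getEscapedByteLength comes from this issue; the body, the getByteLength helper and the use of TextEncoder for UTF-8 byte counting are assumptions, and the actual code in commit 4428a61 may differ.

```javascript
// Assumed helper: UTF-8 byte length of a string.
// RegPack may compute byte length differently.
function getByteLength(text) {
  return new TextEncoder().encode(text).length;
}

// Sketch of the fix: each backslash counts as 2 bytes, because it will be
// doubled when the crushed code is escaped into the packed string.
function getEscapedByteLength(text) {
  const backslashCount = (text.match(/\\/g) || []).length;
  return getByteLength(text) + backslashCount;
}

// The sample z="\\" holds 6 characters, 2 of them backslashes.
const sample = 'z="\\\\"';
console.log(getByteLength(sample), getEscapedByteLength(sample));
```

With this measure, the crusher sees the same cost for \\ as v4 did when it escaped before crushing, without moving the escaping step back.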

Flappy Dragon down to 980b crushed, 994b packed.

4428a61

Siorki self-assigned this Oct 9, 2018

Siorki added this to the 5.0.2 milestone Oct 9, 2018

Siorki changed the title ~~Size regression on benchmark "Flappy Dragon" between v4 and v5~~ Regression in v5 - Suboptimal packing due to incorrect assumed length of escape sequence Oct 13, 2018

Siorki added the bug and regression labels Oct 13, 2018

Siorki added a commit that referenced this issue Oct 15, 2018

#85 : in the crusher, count each \ as 2 bytes

4428a61

Siorki closed this as completed Oct 15, 2018
