Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression in v5 - Suboptimal packing due to incorrect assumed length of escape sequence #85

Closed
Siorki opened this issue Oct 9, 2018 · 4 comments
Assignees
Milestone

Comments

@Siorki
Copy link
Owner

Siorki commented Oct 9, 2018

Benchmark "2014 - Flappy Dragon" run with different revisions of RegPack :

  • v4.0.1 : 984b crushed, 994b packed (character class)
  • v5.0.1 : 984b crushed, 998b packed (character class)

Digging further, the crusher phase differs, v5 shaves a few bytes off the unpacking routine using ES6 (see PR #53)

  • v4.0.1 : string 904b, unpacking routine 80b. Tokens include _
  • v5.0.1 : string 911b, unpacking routine 73b.
@Siorki Siorki self-assigned this Oct 9, 2018
@Siorki Siorki added this to the 5.0.2 milestone Oct 9, 2018
@xem
Copy link

xem commented Oct 9, 2018

(I thought you used my entry as a benchmark test :o https://js1k.com/2014-dragons/demo/1704 )

@Siorki
Copy link
Owner Author

Siorki commented Oct 9, 2018

Right, I keep uncompressed versions of the top 3 (or so) for each edition of js1k and use them as benchmarks, to illustrate progress among revisions of RegPack :

http://siorki.github.io/benchmarks.html

@Siorki
Copy link
Owner Author

Siorki commented Oct 13, 2018

There are two independent reasons for this difference :

The fix for #57, present since v5.0.0, which prevents _ from being (incorrectly) renamed to A inside a string. Using _ as an extra token was worth 2 bytes, yet it was definitely a bug. The sample packed with v4 would decode incorrectly, yielding a glitch in the dragon sprite.
Those 2 bytes will not be regained => won't fix.

The fix for #65 introduces a bias in the crusher. The output is correct (meaning it unpacks to the initial code) but suboptimal as far as compression is concerned.
The sample code contains a string with several escaped \ (thus represented as z="\\").
As RegPack stores the code in a (compressed) string, it also escapes all \, turning the already escaped backslash into G='z="\\\\"'.
v4 performs the escaping before running the crusher. Initial sequences of \\ are escaped as \\\\, counted as 4 bytes and packed as such.
v5 performs the escaping after running the crusher, because of #65. However, \\ is counted as 2 bytes, and therefore not considered worth replacing with a token.

Proposed solution : when computing string length in the crusher, count 2 instead of 1 for each \.

@Siorki Siorki changed the title Size regression on benchmark "Flappy Dragon" between v4 and v5 Regression in v5 - Suboptimal packing due to incorrect assumed length of escape sequence Oct 13, 2018
@Siorki
Copy link
Owner Author

Siorki commented Oct 15, 2018

Added a method getEscapedByteLength(), counts 2 for character \

Flappy Dragon down to 980b crushed, 994b packed.

4428a61

@Siorki Siorki closed this as completed Oct 15, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants