Packing
RegPack uses a dictionary-based lossless compression algorithm to reduce statistical redundancy in the input code. An unpacking routine is then appended to the output; when run, it decompresses the output and retrieves the original code.
Packing consists of several modules run in sequence, each building on the output of the previous one. All intermediate results are valid compressed code, and the shortest of them is kept as the final result. The compression algorithm runs first, then several schemes are attempted to reduce the size of the unpacking routine.
Creates the dictionary, replacing substrings with tokens, and appends the unpacking routine to the compressed string.
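The dictionary step can be illustrated with a minimal sketch. It is not RegPack's actual code: the helper names `packWithDictionary` and `unpackWithDictionary`, the choice of substrings, and the layout (each dictionary entry stored after a final occurrence of its token) are assumptions made for illustration only.

```javascript
// Hypothetical helper (not part of RegPack): replaces each listed substring
// with a one-character token and stores the substring once at the end.
function packWithDictionary(input, substrings, tokens) {
  let packed = input;
  const used = [];
  substrings.forEach((substring, index) => {
    const token = tokens[index];
    // A token must not already occur in the text, or unpacking would be ambiguous.
    if (packed.includes(token)) {
      throw new Error("token already present in input: " + token);
    }
    // Replace every occurrence, then append token + substring as the dictionary entry.
    packed = packed.split(substring).join(token) + token + substring;
    used.unshift(token); // tokens are unpacked in reverse order of packing
  });
  return { packed, tokens: used.join("") };
}

// Hypothetical helper: undoes packWithDictionary, one token at a time.
function unpackWithDictionary(packed, tokens) {
  let text = packed;
  for (const token of tokens) {
    const parts = text.split(token);
    const substring = parts.pop(); // the dictionary entry follows the last token
    text = parts.join(substring);
  }
  return text;
}

const demo = packWithDictionary("alert(1);alert(2);alert(3)", ["alert("], ["~"]);
console.log(demo.packed); // ~1);~2);~3)~alert(
console.log(unpackWithDictionary(demo.packed, demo.tokens)); // alert(1);alert(2);alert(3)
```

In this sketch the list of tokens travels alongside the packed string; in a real packer it would be embedded in the unpacking routine, which is why the later modules try to describe the tokens more compactly.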
Rearranges the tokens to use consecutive characters. The matching regex, defining the tokens as a character class, is included in the unpacking routine.
Rearranges the tokens again, this time to use ranges for the complement, so that the tokens are defined using a negated character class (a character class starting with a caret ^).
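To show why consecutive tokens help, here is a hedged sketch of regex-driven unpacking. The helpers `tokensToCharClass` and `unpackWithRegex` are hypothetical names, and the string layout (each dictionary entry placed before the first occurrence of its token, with entries containing no tokens themselves) is an assumption chosen to keep the example small; RegPack's generated unpacker may differ.

```javascript
// Hypothetical helper: builds a character class from a set of token characters,
// collapsing runs of three or more consecutive characters into a range.
function tokensToCharClass(tokens) {
  const codes = [...new Set(tokens)]
    .map(ch => ch.charCodeAt(0))
    .sort((a, b) => a - b);
  const escape = code => {
    const ch = String.fromCharCode(code);
    // Characters with a special meaning inside a character class are escaped.
    return "\\]^-".includes(ch) ? "\\" + ch : ch;
  };
  let body = "";
  for (let i = 0; i < codes.length; ) {
    let j = i;
    while (j + 1 < codes.length && codes[j + 1] === codes[j] + 1) j++;
    body += j - i >= 2
      ? escape(codes[i]) + "-" + escape(codes[j])
      : codes.slice(i, j + 1).map(escape).join("");
    i = j + 1;
  }
  return "[" + body + "]";
}

// Hypothetical helper: repeatedly finds the first remaining token and expands it.
// `tokenClass` must be a non-global RegExp so that exec() always scans from the start.
function unpackWithRegex(packed, tokenClass) {
  let text = packed;
  let match;
  while ((match = tokenClass.exec(text)) !== null) {
    const parts = text.split(match[0]);
    const substring = parts.shift(); // the dictionary entry precedes the first token
    text = parts.join(substring);
  }
  return text;
}

console.log(tokensToCharClass("DBCA")); // [A-D]
const tokenClass = new RegExp(tokensToCharClass("AB"));
console.log(unpackWithRegex(");Balert(AA1BA2BA3)", tokenClass)); // alert(1);alert(2);alert(3)
```

A range such as `[A-D]` costs the same five characters however many tokens it covers, whereas an explicit token list grows with every token. The same loop works with a negated class, since `unpackWithRegex` accepts any non-global regular expression.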
Once the packing operation is complete, RegPack GUI shows several results:
- "Preprocessed" is the result after all the preprocessing modules are run
- "Crushed" is the result produced by the crusher
- "RegPack'ed" is the shortest result of the two regex modules

If the preprocessing stage produces several branches, only the one with the best compression (shortest output) is presented.
The entry point is RegPack.cmdRegPack(input, options), which returns a string containing the best result (shortest output) among the preprocessor stage, the crusher and the regex modules.
The class RegPack features the crusher and regex modules.
The main method is RegPack.runPacker(), which first invokes the preprocessor, then iterates over all the branches it created, running the packer on each of them and returning an array containing all results.
RegPack.cmdRegPack() is a wrapper function that extracts and returns the best compression from the different results. It serves as the main entry point when RegPack is used from Node.js.
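The selection performed by the wrapper can be sketched as follows. This is an illustration only: `pickShortest` is a hypothetical name, and the result shape (objects with `stage` and `output` fields) is an assumption, not the structure actually returned by RegPack.runPacker(). Length is measured here in characters, which is also an assumption.

```javascript
// Hypothetical helper: returns the candidate with the shortest output.
// On a tie, the earliest candidate in the array is kept.
function pickShortest(results) {
  if (results.length === 0) {
    throw new Error("no results to choose from");
  }
  return results.reduce((best, candidate) =>
    candidate.output.length < best.output.length ? candidate : best
  );
}

// Made-up outputs standing in for the preprocessor, crusher and regex stages.
const candidates = [
  { stage: "preprocessed", output: "abcdefgh" },
  { stage: "crushed", output: "abcdef" },
  { stage: "regex", output: "abcd" }
];
console.log(pickShortest(candidates).stage); // regex
```

Comparing every stage, rather than only the last one, matters because a later module does not always shorten the output: on small inputs the preprocessed or crushed result can remain the shortest.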