Skip to content

Module hash contexts

Siorki edited this page Mar 5, 2017 · 6 revisions

Hash contexts

Principle

The technique of creating shorter aliases for methods of a GraphicContext2D was initially developed by Marijn Haverbeke for his winning js1k 2010 entry Bouncing Beholder. Immediately after the context is created, a loop iterates on its methods, using a hash function to create new, shorter method names that will point to the original ones. This is equivalent to creating new function pointers with the same values, that can be called instead of the original ones.

Since contexts have quite long method names, the resulting code gets more concise, even though there is an overhead cost to add the loop in the first place. The original hashing function was i[0]+(""|i[6]), taking the first and - if it existed - the seventh character of the method name, then afterwards one could use c.cR() instead of c.createRadialGradient(). The drawback is that it works on methods only : while c.cR() points to the original method (think of it as a function pointer), c.fy is a value, not a pointer, which is thus not linked to the original c.fillStyle.

The module elaborates on this principle, choosing among a library of hashing functions the one yielding the shortest result. It also performs its duty on GraphicContext2D, GraphicContextGL and also AudioContext.

c=canvas.getContext("2d");
c.fillStyle="hsl(256,25%,"+Math.max(0,100-t)+"%";
c.fillRect(0,0,canvas.width,canvas.height);
c.fillStyle = "#fff";
c.save();
c.translate(a.width/2, a.height/2);
c.rotate(Math.cos(t/50));
r = c.createRadialGradient(0, 0, 256, 0, t, 25);

gets shortened to :

for(i in c=canvas.getContext("2d"))c[i[0]+i[6]]=c[i];
c.fillStyle="hsl(256,25%,"+Math.max(0,100-t)+"%";
c.fc(0,0,canvas.width,canvas.height);
c.fillStyle = "#fff";
c.save();
c.ta(a.width/2, a.height/2);
c.rotate(Math.cos(t/50));
r = c.cR(0, 0, 256, 0, t, 25);

It also features an improvement designed by @xem and @subzey to hash both methods and properties. Instead of pointing to the original method, the alias contains its name as a string. Accessing the original is achieved using the object member operator. With the same hashing function as above, c.fy contains "fillStyle", and is evaluated using c[c.fy]. Our initial example gets shortened to :

for(i in c=canvas.getContext("2d"))c[i[0]+i[6]]=i;
c[c.fy]="hsl(256,25%,"+Math.max(0,100-t)+"%";
c[c.fc](0,0,canvas.width,canvas.height);
c[c.fy] = "#fff";
c.save();
c[c.ta](a.width/2, a.height/2);
c.rotate(Math.cos(t/50));
r = c[c.cR](0, 0, 256, 0, t, 25);

The module creates multiple branches as it tries several algorithms on one context. Packing will be performed on all of them, keeping only the shortest one. No heuristic is known that would determine which branch would pack best, without running the packer itself.

For GraphicContext2D, three branches are created :

  • unchanged (no hashing) : c.createRadialGradient()
  • hash methods only : c.cR()
  • hash methods and properties : c[c.cR]()

For GraphicContextGL, four branches are created :

  • unchanged (no hashing) : g.depthFunc(g.LEQUAL)
  • hash methods only : g.dF(g.LEQUAL),
  • hash methods and replace GL constants by their value : g.dF(515)
  • hash methods and properties : g[g.dF](g[g.LL])

For AudioContext, two branches are created since they contain no properties

  • unchanged (no hashing) : d.createGain()
  • hash methods only : d.cG()

If the input code contains several types of contexts, the number of branches will be multiplied accordingly to explore all combinations. For both 2D and Audio, that is 6 branches. Multiple instances of a type of context share the same hashing function and will not create more branches than a single context.

Inner clockwork

Identifying contexts

The module first identifies the contexts inside the code. Options can be set to ignore - and therefore leave unchanged - each type of context:

  • 2D contexts declaration are recognized by looking for the string getContext("2d")
  • GL contexts declaration are recognized by looking for the strings getContext("webgl"), getContext("experimental-webgl"), or a combination thereof.
  • AudioContext declaration is a bit more complex. Both new AudioContext and new webkitAudioContext are recognized. Code attempting to create both and using whichever returns a value is commonly seen, so combinations using | are also accepted.
  • Tagged template literal syntax such as getContext`2d` is also supported (see #63)

For all identified contexts, the name of the variables containing them are stored. Some competitions such as js1k provide a shim, which is a web page with a pre-defined environment where the code will be run. The environment may already contain a context, which is added to the list if the option contextVariableName is set.

The different algorithms are then run on 2D contexts, WebGL contexts and AudioContexts, in that order. Any disabled or missing type is ignored. Branches are created at each step, one branch for each algorithm.

Hashing algorithm description

Hashing is performed on all contexts of the same type at once. The same hash function will be used for all of them, so that the can be intialized in the same loop.

First, all methods from those contexts actually used in the input code are listed. (Methods and properties in the second algorithm). Then, all hash functions are tested against this subset as described below. The byte gain obtained by replacing name by their hashes is then measured, subtracting the length of the initial loop. Methods called multiple times are only counted once, as it is assumed that duplicates will be flattened out by the packer.

The function with the best gain is kept, and the replacement performed on the code. The initialization loop is then inserted at the declaration of the last context, or at the beginning (PackerData.initialDeclarationOffset) if using only a context from the shim. The module looks for other for loops and reuses the most common variable name, to benefit from a possible compression in the packer stage.

Replacement for method-only hashing

Assuming contexts c1 and c2 defined earlier, the loop replaces c=a.getContext('2d') with :

for (i in c=a.getContext('2d'))c[hash(i)]=c1[hash(i)]=c2[hash(i)]=c[i]

Replacement for method and property hashing

The loop replaces c=a.getContext('2d') with :

for (i in c=a.getContext('2d'))c[hash(i)]=i

Even if there are multiple contexts, only one will contain the shortcuts. Other ones will still be able to use them, for instance c1[c.cR](). The selected context is the one with the shortest name, and may not be the one the loop is iterating on : a construct such as for (i in c1=a.getContext('2d'))c[hash(i)]=i is still valid.

Hash functions

A brute force algorithm determines the optimal hash function, testing all combinations of parameters and functions.

  • x : all values in [0-2], as the shortest method is 3 characters long : arc()
  • y : all values in [0-20]
  • w is a string, containing the name of the context method or property being shortened / hashed

Tested functions are :

 w[x]
 w[x]+w[y]          // w[y] may be undefined, see below
 w[x]+w.length
 w[x]+w[w.length-1]
 w[x]+[w[y]]        // [w[y]] can be coerced to an empty string, see below
 w[0]+w[x]+[w[y]]   // x in [0-20], y in [3-20]
 w[1]+w[x]+[w[y]]   // x in [0-20], y in [3-20]
 w[2]+w[x]+[w[y]]   // x in [0-20], y in [3-20]
 w[0]+[w[x]]+[w[y]] // x and y in [3-20]
 w.substr(x,3)

w[x] returns the x.th character of the string w (the method name) if x is lower than the string length, or undefined if beyond. As a consequence, the hash function w[0]+w[6] will turn "arc" into "aundefined". A safeguard reverts to the original function name if the hashed one is longer.

Hash functions such as w[x]+[w[y]] (notice the extra square brackets around the second member) work around this behavior, as [undefined] is coerced to an empty string. w[0]+[w[6]] will thus turn "arc" into "a".

Collisions happen when two different methods or properties are hashed to the same string. The hashing loop will overwrite the string, matching it to the latest method in order. However, the order of methods upon iterating the context is different among browsers, so it is unsafe to assume which method will come after the other one. In case of a collision, the original method names are therefore kept.

Context references - compatibility with Node.js

When the module is run inside a browser, an instance of a context of the same type is used as a reference to list all the existing methods and properties. To maintain a browser-agnostic environment and detect a collision that could happen within another browser, the following methods are added to the list, for 2D browsers only (see also issue #20).
As of today, the methods are the same amongst browsers for WebGL and Audio contexts.

// c is a 2D context
c.ellipse()                   // method in Chrome only
c.getContextAttributes()      // method in Chrome only
c.imageSmoothingEnabled       // property in Chrome only
c.mozCurrentTransform         // property in FF only
c.mozCurrentTransformInverse  // property in FF only
c.mozDash                     // property in FF only
c.mozDashOffset               // property in FF only
c.mozFillRule                 // property in FF only
c.mozImageSmoothingEnabled    // property in FF only
c.mozTextStyle                // property in FF only
c.webkitImageSmoothingEnabled // property in Webkit-based browsers only
c.drawImageFromRect           // method in Chrome up to v40		

Node.js does not come with built-in implementation of 2D, WebGL nor Audio contexts. To ensure a similar behavior with the browser environment, the methods and properties of the different contexts are hardcoded into the implementation. This has the downside of requiring an update on every major context API change in the browser.

Options

Parameter name Type Default Effect
hash2DContext boolean false enable or disable hashing for 2D contexts
hashWebGLContext boolean false enable or disable hashing for WebGL contexts
hashAudioContext boolean false enable or disable hashing for AudioContexts
contextType integer 0 Type of preset context, 0 for 2D, 1 for WebGL
contextVariableName boolean or String false variable containing the preset context, or false

Contexts may be defined outside of the input code, as is done by the js1k shim (c for 2D or g for WebGL). Define the preset to let the module know of the context.

Implementation

The module has three entry points, one for each context type :

  • Shapeshifter.preprocess2DContext() performs method and property hashing on 2D contexts
  • Shapeshifter.preprocessWebGLContext() performs method and property hashing on WebGL contexts, and replaces GL constants
  • Shapeshifter.preprocessAudioContext() performs method hashing on AudioContexts Those methods are called in sequence by Shapeshifter.preprocess(), providing there is at least one context of the matching type in the input code. All possible combinations are explored and kept as as many instances of PackerData.

In addition to the entry points, it features three internal methods :

  • ShapeShifter.renameObjectMethods() hashes methods only
  • ShapeShifter.hashObjectProperties() hashes methods and properties
  • ShapeShifter.replaceWebGLconstants() replaces constants by their numerical values in a WebGL context

The class ContextDescriptor is in charge of providing a list of methods and properties for 2D, WebGL and Audio contexts. The module features two distinct implementations :

  • one in ContextDescriptor_browser.js, included as a js source from regPack.html, using the browser environment,
  • one in ContextDescriptor_node.js, included by a require statement, for the Node.js environment.