Skip to content

Version 1.0 Optimization Notes

metafloor edited this page Aug 29, 2016 · 2 revisions

The following is an ad hoc listing of version 1.0 cross-compiled code and the potential to optimize using post-processing passes.

push-if-pop pattern

This pattern is seen when an if operator is emitted. Because of the execution branch, all trace state must be flushed prior to the if-conditional. This leads to the following pattern:

var _2Y = $get($1.encs, $1.j); /*2570*/
$k[$j++] = _2Y; /*2570*/
if ($eq($type(_2Y), "stringtype")) { /*2570*/
   var _2b = $get($k[--$j], 0); /*2570*/
   $k[$j++] = _2b; /*2570*/
}

The $j++ on the second line inhibits var-elimination, which is actually beneficial here. We can deduce that the push-expression on the second line can be eliminated and the _2Y temp-variable can be directly substituted for the $k[--$j] pop-expression on the forth line. This leads to the code:

var _2Y = $get($1.encs, $1.j); /*2570*/
if ($eq($type(_2Y), "stringtype")) { /*2570*/
   var _2b = $get(_2Y, 0); /*2570*/
   $k[$j++] = _2b; /*2570*/
}

That removes significant operand stack side-effects, allowing a var-elimination pass to further reduce code:

var _2Y = $get($1.encs, $1.j); /*2570*/
if ($eq($type(_2Y), "stringtype")) { /*2570*/
   $k[$j++] = $get(_2Y, 0); /*2570*/
}

A similar optimization is possible with ifelse branches:

var _5p = $k[--$j]; /*2696*/
$k[$j++] = _5p; /*2700*/
if ($ne($type(_5p), "arraytype")) { /*2699*/
   var _5t = $get($1.setc, $k[--$j]); /*2697*/
   $k[$j++] = _5t; /*2697*/
} else { /*2699*/
   $aload($k[--$j]); /*2699*/
   var _5v = $k[--$j]; /*2699*/
   var _5w = $k[--$j]; /*2699*/
   $k[$j++] = (_5v - 48) + ((_5w - 48) * 10); /*2699*/
} /*2699*/
$put($1.cws, $1.j, $k[--$j]); /*2701*/

The optimization must be constrained so that no operand-stack modifications are made between the push-expression and the pop-expressions in both branches. The aload operator is safe here as it executes after the pop. The above code will also benefit from the ifelse-push-pop optimization described below. Along with a final var-elimination pass, the above would be reduced to:

var _5p = $k[--$j]; /*2696*/
if ($ne($type(_5p), "arraytype")) { /*2699*/
   var __T = $get($1.setc, _5p); /*2697*/
} else { /*2699*/
   $aload(_5p); /*2699*/
   var _5v = $k[--$j]; /*2699*/
   var _5w = $k[--$j]; /*2699*/
   var __T = (_5v - 48) + ((_5w - 48) * 10); /*2699*/
} /*2699*/
$put($1.cws, $1.j, __T); /*2701*/

This optimization is both a performance and code-size win.

aload-to-vars

aload followed by a series of pops is a common pattern used to load a small parameter set onto the stack and then into temp-variables.

$aload(_5p); /*2699*/
var _5v = $k[--$j]; /*2699*/
var _5w = $k[--$j]; /*2699*/

With proper constraints and array-index tracking, we could replace the aload with the direct var assignments:

   var _5v = _5p[1]; /*2699*/
   var _5w = _5p[0]; /*2699*/
   var __T = (_5v - 48) + ((_5w - 48) * 10); /*2699*/

And that would reduce to a single line of code after var-elimination....

Similar pattern with forall:

$forall($geti($1.tab174, $1.i + 1, 7)); /*6262*/
$1.cte = $k[--$j]; /*6263*/
$1.cto = $k[--$j]; /*6263*/
$1.cmwe = $k[--$j]; /*6264*/
$1.cmwo = $k[--$j]; /*6264*/
$1.cele = $k[--$j]; /*6265*/
$1.celo = $k[--$j]; /*6265*/
$1.cgs = $k[--$j]; /*6266*/

Can we safely optimize these patterns given that we do not know state of the stack after the final pop-to-assignment? Would likely require instrumentation feedback to ensure safety.

ifelse-push-pop pattern

Common pattern seen due to the trace state flush when exiting a branch:

if ($1.ea) { /*2654*/
   $k[$j++] = $1.numSA; /*2654*/
} else { /*2654*/
   $k[$j++] = $1.numEA; /*2654*/
} /*2654*/
var _50 = $get($k[--$j], $1.i); /*2654*/

This optimization will look at the last line of both the if and else branches. If both contain a push-expression followed by a subsequent line with a pop-expression, a temp variable can be substituted for the push/pop sequences:

if ($1.ea) { /*2654*/
   var __T = $1.numSA; /*2654*/
} else { /*2654*/
   var __T = $1.numEA; /*2654*/
} /*2654*/
var _50 = $get(__T, $1.i); /*2654*/

Code size is not significantly reduced but it provides better opportunity for the js-engine to optimize.

ident-mark-def pattern

Implemented v1.0.5+ (2016-06-16)

Common code seen for building arrays on the operand stack:

$k[$j++] = "msgtmp"; /*2657*/
$k[$j++] = Infinity; /*2657*/
$aload($1.msgtmp); /*2657*/
$k[$j++] = $1.fn4; /*2657*/
var _56 = $a(); /*2657*/
$1[$k[--$j]] = _56; /*2657*/

Due to postscript's stack architecture, when defining a local dictionary entry, the name of the entry (msgtmp here) is pushed onto the stack long before it is actually used. We can safely optimize this pattern when:

  1. An identifier is pushed immediately prior to a stack-marker push (the Infinity).
  2. A subsequent construct-array call (the $a()) or construct-dictionary call ($d()). This call must be at the same block level as the ident-push and marker-push.
  3. An immediate assignment of the temp-variable to the local dictionary $1, with a pop-expression to retrieve the identifier.

Those constraints would result in:

$k[$j++] = Infinity; /*2657*/
$aload($1.msgtmp); /*2657*/
$k[$j++] = $1.fn4; /*2657*/
var _56 = $a(); /*2657*/
$1.msgtmp = _56; /*2657*/

A subsequent var-elimination pass would eliminate the _56 temp-variable and assign the constructor return directly, which was not possible before due to the pop-expression in the $1 dictionary expression.

This is primarily a code-size optimization.

unused variables pattern

Unused variables are created due to a series of exch, pop, copy and/or roll operators followed by a trace stack flush:

$astore($a($counttomark())); /*15204*/
var _BZ = $k[--$j]; /*15204*/
var _Ba = $k[--$j]; /*15204*/
$k[$j++] = _BZ; /*15204*/

The _Ba variable is unused and can be eliminated as:

$astore($a($counttomark())); /*15204*/
var _BZ = $k[--$j]; /*15204*/
$j--;
$k[$j++] = _BZ; /*15204*/

A further optimization could eliminate the $j-- followed by the $j++, leaving just:

$astore($a($counttomark())); /*15204*/
var _BZ = $k[--$j]; /*15204*/
$k[$j-1] = _BZ; /*15204*/

That optimization could be rolled into the pop-push pattern below.

Another unused variable example:

$astore($a($counttomark() - 1)); /*12357*/
var _6y = $k[--$j]; /*12357*/
var _6z = $k[--$j]; /*12357*/
var _70 = $k[--$j]; /*12357*/
$put($1.rowbits, $1.i, _6y); /*12358*/

Both the _6z and _70 are unused and could be eliminated as:

$astore($a($counttomark() - 1)); /*12357*/
var _6y = $k[--$j]; /*12357*/
$j-=2;
$put($1.rowbits, $1.i, _6y); /*12358*/

This optimization would need to duplicate the pop-merging logic in psc.js.

pop-push pattern

This is often seen inside loops where the code pops the loop value, then pushes an altered value.

$forall($1.row, function() { /*6349*/
   var _IE = $k[--$j]; /*6349*/
   $k[$j++] = 1 - _IE; /*6349*/
}) /*6349*/

This could be handled by var-elimination but there are too many operand stack side-effects to be safe to implement in that pass. The temp variable can be anywhere in the expression but must occur only once. The optimized code will be:

$forall($1.row, function() { /*6349*/
   $k[$j-1] = 1 - $k[$j-1]; /*6349*/
}) /*6349*/

multi-push pattern

$k[$j++] = $1.pixs; /*13754*/
$k[$j++] = $1.rwid * $1.i; /*13754*/
$k[$j++] = Infinity; /*13754*/
$k[$j++] = 1; /*13748*/
$k[$j++] = 1; /*13748*/
$k[$j++] = 1; /*13748*/
$k[$j++] = 1; /*13748*/
$k[$j++] = 1; /*13748*/
$k[$j++] = 1; /*13748*/
$k[$j++] = 1; /*13748*/
$k[$j++] = 1; /*13748*/
$k[$j++] = 0; /*13748*/
$k[$j++] = 1; /*13748*/
$k[$j++] = 0; /*13748*/
$k[$j++] = 1; /*13748*/
$k[$j++] = 0; /*13748*/
$k[$j++] = 1; /*13748*/
$k[$j++] = 0; /*13748*/
$k[$j++] = 0; /*13748*/
$k[$j++] = 0; /*13748*/
$k[$j++] = $1.lcw; /*13748*/
$k[$j++] = $1.i % 3; /*13748*/

No side-effects allowed in the expressions; only simple expressions, no function calls. JavaScript ensures left-to-right order of evaluation in function calls, so no change in evaluation semantics.

Replace with:

$k.splice($j, 0, <expression-list>);
$j += 22;

Or maybe an internal method:

$kcopy(<expression-list>);

Need a heuristic for when to employ. Five lines or more?

Primarily a code-size optimization.