You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
So I've wrote yet another type predicate (guard) generator for TypeScript. Cool, cool, cool. What's next?
As the Predicate Generator is all about trust, type safety and correctness, why not to go the extra mile and also generate a test suite for the resulting type predicate function right next to the predicate itself!
Before you say it, I tried AI make it all for me first, but the results were unstable though still ok.
Why
Ok, we're generating tests for a type predicate function. Where to start? Let's first take a look at a simple predicate function.
typeMyNumber=numberfunctionisMyNumber(value: unknown): value is MyNumber{if(typeofvalue!=="number"){returnfalse}valuesatisfiesMyNumberreturntrue}
You can see that this function uses a typeof operator to check if value is of the primitive type number. In JS/TS all numbers are of this type. Then the function uses the satisfies type operator to verify that the type of value is properly narrowed down to number which satisfies the desired type MyNumber.
That would be great to end the story here, but in TypeScript the body of a type predicate function is not checked to be actually correct in regards of examining the given value. The only requirement is that it returns a boolean. One can easily write a function like below and get no errors from the TypeScript compiler:
functionisMyNumber(value: unknown): value is MyNumber{// Just trust me!returntrue}
This is why we have to check the type predicate functions thoroughly to avoid misguiding ourselves that the code does what it does. One way, as you have seen, is to always use the satisfies type operator before actually returning true (even better if there is only one return true in the whole function to be sure). The other way is the topic of this post: generating test values for a given type.
What
Generating test values boils down to creating two sets of values: valid and invalid. These values are then passed to the type predicate function. The function has to return true for all the valid values and false for all the invalid values.
For our simple example the sets of values are as follows:
Test frameworks offer a way to iterate over a list of values, but for the sake of readability I'm using the explicit syntax.
So far so good. As a next example let's manually generate a set of test values for a more complex type.
typeMyObject={a: number,b: string}
To avoid the combinatorial explosion of possible values here we're gonna pick only a few values for every type:
for the number type we are checking that 0 and 42 are valid and "42" and undefined are invalid
for the string type we're checking that "" and "foo" are valid and undefined and 1 are invalid
for the object type itself the valid values are any combination of both valid properties, and the invalid values are where any of a and b attributes are missing or any of them have an invalid value
How
Now we need to somehow generate the test values for the types the predicates check. Given that in the process of the predicates generation we already parse input types into a simple nested structure (see TypeModel) it must be trivial to traverse it and dump a bunch of test values. Right? To a degree.
First things that pops to mind is to write a recursive function (or a recursive interface) that takes a type, goes about its internal specifics–like iterating over the attributes in an object–and returns a test value. Let's try writing a simple example of such a function:
Seems ok, until we find out that the function returns only one combined value with only one test value per attribute. Never fear, we always can return an array of values instead! But, then we still return a nested structure (array of arrays of objects or something) which we'd need to traverse once more to flatten it into an array of final test values suitable for the predicate to test on. Still doable, but what does this function return in turn? It should be a flattened array of all flattened values. Some progress, but it certainly reminds of what we started with. Could we do better and merge all this in one efficient pass?
Yes! To be able to traverse the nested type tree, permute the values any way we want and keep the code readable we just need to find a way to yield one test value at a time instead of returning arrays and merging them on each step.
Yielding a value. One step at a time. This definitely rings a bell… Use a callback? Nope. An iterator? Close. A coroutine? Bullseye! A generator, to be correct as in JavaScript they are known as generator functions. Generators as class of computation in JavaScript allow to pull a single value at a time and also have nice syntax and state management out of the box with unlimited nesting. We also get laziness for free to cover the cases when there is no need to generate all the values at once saving some CPU cycles. Implementing this level of control with the raw JS functions is still possible of course as JS is a Turing complete language and every closure in JS can technically be a coroutine. But it's not gonna be pretty (we're gonna try it anyway a bit later in this post).
Generators to the rescue!
A simple test value generator looks like this:
function*myNumber(){yield0yield42yield3.14}
Note the star after the function keyword. This star tells the JS engine to compile the function in a special way so that the yield keyword works as in a coroutine. The example above when called gives back an iterator that produces three values 0, 42 and 3.14. We can read all of them at once like this:
console.log([...myNumber()])
…or fetch one by one like this:
for(constvalueofmyNumber()){console.log(value)}
Ok, it seems that we've got the right tool. Now, how do we build with it? We'd need to find a way to combine several value generators into a compound generator and then just pull values from the top level generator and pass them to our tests. Let's try to implement a union first as it's the simplest of the combining generators:
And that's it. Generator syntax, it feels, is just made for infinite / lazy sequences computation. Let's implement the same logic manually for comparison. It looks something like this:
functionmyUnion(members){leti=0;letmember=undefined;returnfunction(){for(;i<members.length;i++){if(!member){// "instantiate" a membermember=members[i]()}constvalue=member();// no more values in the member iteratorif(value===undefined){continue;}returnvalue}// no more values in our iterator eitherreturnundefined};}
Already a lot even without the next() method and the rest of the proper iteration API.
I hope, at this point you're as deep in love with generators as I obviously am. Then you'll definitely like the beauty of generating an object using a generator function, our first combinatorially complex type (a product type):
function*myObject(attrA,attrB,attrC){for(constaofattrA())for(constbofattrB())for(constcofattrC())yield{ a, b, c }}
Wow, right? It looks like it's gonna spin N^3 times and produce a ton of values before returning any, but it's not the case. This generator first yields an object with the first value of a, the first value of b and the first value of c that gets returned immediately to the caller. The next value is gonna be still the first value of a, the first value of b and the second value of c. It's a DFS-like execution, yep. Once any of the inner generators is up the next outer progresses by one step. When the most outer is done in its turn the whole generator function returns and the caller finishes reading the test values. Hoorah!
You might have noticed that the myObject() function is a bit simplified. It's implemented for a fixed number of attributes with fixed names. This is an intentional simplification we have to make to demonstrate the raw power of the approach in a clear simple way. Of course, the real life code is around 10x longer, but it covers arbitrary objects with any number of optional attributes and any of those might have invalid test values. Next, take a look at the below example yielding both valid and invalid objects:
function*myObject(attrA,attrB,attrC){for(constaofattrA())for(constbofattrB())for(constcofattrC()){// validyield{ a, b, c }// invalid onesyield{}yield{ a }yield{ b }yield{ c }yield{ a, b }yield{ a, c }yield{ b, c }}}
Still pretty readable and easy to start with. Just using basic generators like this one already gives a good boost on attacking a combinatorial task at hand.
Finally, let's see how to combine them all into something meaningful.
A familiar combinatorial Christmas tree. Find the full source code of this example below.
I hope you've enjoyed the journey. If you're interested in how the type predicate generator tool deals with the vast amount of combination you're welcome to continue reading the footnotes below.
Footnotes
On the combinatorial explosion
First, let's calculate how many combinations a random everyday type is gonna require to test ALL possible cases. Just a single non-nested object type of 10 optional fields with a single possible value per attribute is 210 = 1024 combinations. And for a slightly bigger usual SaaS API response with 20 optional fields 220 = 1M! Let alone properties with multiple possible values or even nested objects with their own optional attributes. This is a no-go even for fuzz testing. To avoid checking all this almost infinite number of values the test generator takes the white box testing approach. It's when we write only a small number of targeted tests only for the known edge cases of the implementation we know.
The type predicate generator specifically makes two tradeoffs. First, for valid values it uses only the values that can potentially be misinterpreted by a bug in the generator (like an empty string or the number 0). And second, it does not permute all possible combinations of values effectively turning all the product types into sum types. Instead, it picks only one valid value at a time for each position keeping the rest values exactly the same effectively changing only one value at a time for the whole resulting test value. In other words imagine that a test value with all the properties and all the nesting is a tree. Each leaf of the tree is a primitive value in each position. There are not too many such leaves even in a relatively big type. The take each leaf and change it with all possible values for only this leaf and yield them. Then take the other leaf alone and change values for only that leaf not changing the others. And so on. In the end you should be getting at maximum M * N values where M is the number of all the leaves and N is the sum of all individual values of all the leaves. Compare this with NM.
Please leave a comment if you're interested in learning more on how it's implemented. Meanwhile you can take a look at the implementation in the tool itself.
On white box testing
Why white box testing approach is ok here? First and foremost, because we have to either use some kind of theorem proving formal language (which TS is not, yet) or reduce the problem space using a limited subset of the building blocks. Following this logic the test generator assumes that the type predicates use some sort of a DSL instead of the Turing complete JavaScript and tests this subset instead. This way we trust the system formed of the predicate generator and the test generator as a whole. In other terms we're operating in an inherently safe system as far as we control the DSL and its source code, no need to cover for the broad Byzantine environment that the wild JavaScript world can sometimes be.
The text was updated successfully, but these errors were encountered:
peter-leonov
changed the title
[WIP] A note on Combining Generators
A note on Combining Generators
Jan 18, 2025
peter-leonov
changed the title
A note on Combining Generators
Generating Safe Values using JavaScript Generators
Jan 20, 2025
peter-leonov
changed the title
Generating Safe Values using JavaScript Generators
Generating Test Values using JavaScript Generators
Jan 20, 2025
peter-leonov
changed the title
Generating Test Values using JavaScript Generators
[BLOG] Generating Test Values using JavaScript Generators
Jan 20, 2025
Btw, folks. In case your language of choice does not support coroutines there is a way to follow what JS community used to be doing before the full support for coroutines (and later async / await) got into the main engines. They used transpilation. Here is an example on how TypeScript compiler lowers the code for a generator function down to the old JS standard.
Combining Generators
So I've wrote yet another type predicate (guard) generator for TypeScript. Cool, cool, cool. What's next?
As the Predicate Generator is all about trust, type safety and correctness, why not to go the extra mile and also generate a test suite for the resulting type predicate function right next to the predicate itself!
Before you say it, I tried AI make it all for me first, but the results were unstable though still ok.
Why
Ok, we're generating tests for a type predicate function. Where to start? Let's first take a look at a simple predicate function.
You can see that this function uses a
typeof
operator to check ifvalue
is of the primitive typenumber
. In JS/TS all numbers are of this type. Then the function uses thesatisfies
type operator to verify that the type ofvalue
is properly narrowed down tonumber
whichsatisfies
the desired typeMyNumber
.That would be great to end the story here, but in TypeScript the body of a type predicate function is not checked to be actually correct in regards of examining the given value. The only requirement is that it returns a
boolean
. One can easily write a function like below and get no errors from the TypeScript compiler:This is why we have to check the type predicate functions thoroughly to avoid misguiding ourselves that the code does what it does. One way, as you have seen, is to always use the
satisfies
type operator before actually returningtrue
(even better if there is only onereturn true
in the whole function to be sure). The other way is the topic of this post: generating test values for a given type.What
Generating test values boils down to creating two sets of values: valid and invalid. These values are then passed to the type predicate function. The function has to return
true
for all the valid values andfalse
for all the invalid values.For our simple example the sets of values are as follows:
Having these values we can generate a unit test file like this:
Test frameworks offer a way to iterate over a list of values, but for the sake of readability I'm using the explicit syntax.
So far so good. As a next example let's manually generate a set of test values for a more complex type.
To avoid the combinatorial explosion of possible values here we're gonna pick only a few values for every type:
number
type we are checking that0
and42
are valid and"42"
andundefined
are invalidstring
type we're checking that""
and"foo"
are valid andundefined
and1
are invalida
andb
attributes are missing or any of them have an invalid valueHow
Now we need to somehow generate the test values for the types the predicates check. Given that in the process of the predicates generation we already parse input types into a simple nested structure (see
TypeModel
) it must be trivial to traverse it and dump a bunch of test values. Right? To a degree.First things that pops to mind is to write a recursive function (or a recursive interface) that takes a type, goes about its internal specifics–like iterating over the attributes in an object–and returns a test value. Let's try writing a simple example of such a function:
Seems ok, until we find out that the function returns only one combined value with only one test value per attribute. Never fear, we always can return an array of values instead! But, then we still return a nested structure (array of arrays of objects or something) which we'd need to traverse once more to flatten it into an array of final test values suitable for the predicate to test on. Still doable, but what does this function return in turn? It should be a flattened array of all flattened values. Some progress, but it certainly reminds of what we started with. Could we do better and merge all this in one efficient pass?
Yes! To be able to traverse the nested type tree, permute the values any way we want and keep the code readable we just need to find a way to yield one test value at a time instead of returning arrays and merging them on each step.
Yielding a value. One step at a time. This definitely rings a bell… Use a callback? Nope. An iterator? Close. A coroutine? Bullseye! A generator, to be correct as in JavaScript they are known as generator functions. Generators as class of computation in JavaScript allow to pull a single value at a time and also have nice syntax and state management out of the box with unlimited nesting. We also get laziness for free to cover the cases when there is no need to generate all the values at once saving some CPU cycles. Implementing this level of control with the raw JS functions is still possible of course as JS is a Turing complete language and every closure in JS can technically be a coroutine. But it's not gonna be pretty (we're gonna try it anyway a bit later in this post).
Generators to the rescue!
A simple test value generator looks like this:
Note the star after the
function
keyword. This star tells the JS engine to compile the function in a special way so that theyield
keyword works as in a coroutine. The example above when called gives back an iterator that produces three values0
,42
and3.14
. We can read all of them at once like this:…or fetch one by one like this:
Ok, it seems that we've got the right tool. Now, how do we build with it? We'd need to find a way to combine several value generators into a compound generator and then just pull values from the top level generator and pass them to our tests. Let's try to implement a union first as it's the simplest of the combining generators:
it can also be written in a shorter form:
And that's it. Generator syntax, it feels, is just made for infinite / lazy sequences computation. Let's implement the same logic manually for comparison. It looks something like this:
Already a lot even without the
next()
method and the rest of the proper iteration API.I hope, at this point you're as deep in love with generators as I obviously am. Then you'll definitely like the beauty of generating an object using a generator function, our first combinatorially complex type (a product type):
Wow, right? It looks like it's gonna spin N^3 times and produce a ton of values before returning any, but it's not the case. This generator first
yield
s an object with the first value ofa
, the first value ofb
and the first value ofc
that gets returned immediately to the caller. The next value is gonna be still the first value ofa
, the first value ofb
and the second value ofc
. It's a DFS-like execution, yep. Once any of the inner generators is up the next outer progresses by one step. When the most outer is done in its turn the whole generator function returns and the caller finishes reading the test values. Hoorah!You might have noticed that the
myObject()
function is a bit simplified. It's implemented for a fixed number of attributes with fixed names. This is an intentional simplification we have to make to demonstrate the raw power of the approach in a clear simple way. Of course, the real life code is around 10x longer, but it covers arbitrary objects with any number of optional attributes and any of those might have invalid test values. Next, take a look at the below exampleyield
ing both valid and invalid objects:Still pretty readable and easy to start with. Just using basic generators like this one already gives a good boost on attacking a combinatorial task at hand.
Finally, let's see how to combine them all into something meaningful.
The output is as follows (using only valid test values for readability):
A familiar combinatorial Christmas tree. Find the full source code of this example below.
I hope you've enjoyed the journey. If you're interested in how the type predicate generator tool deals with the vast amount of combination you're welcome to continue reading the footnotes below.
Footnotes
On the combinatorial explosion
First, let's calculate how many combinations a random everyday type is gonna require to test ALL possible cases. Just a single non-nested object type of 10 optional fields with a single possible value per attribute is 210 = 1024 combinations. And for a slightly bigger usual SaaS API response with 20 optional fields 220 = 1M! Let alone properties with multiple possible values or even nested objects with their own optional attributes. This is a no-go even for fuzz testing. To avoid checking all this almost infinite number of values the test generator takes the white box testing approach. It's when we write only a small number of targeted tests only for the known edge cases of the implementation we know.
The type predicate generator specifically makes two tradeoffs. First, for valid values it uses only the values that can potentially be misinterpreted by a bug in the generator (like an empty string or the number 0). And second, it does not permute all possible combinations of values effectively turning all the product types into sum types. Instead, it picks only one valid value at a time for each position keeping the rest values exactly the same effectively changing only one value at a time for the whole resulting test value. In other words imagine that a test value with all the properties and all the nesting is a tree. Each leaf of the tree is a primitive value in each position. There are not too many such leaves even in a relatively big type. The take each leaf and change it with all possible values for only this leaf and yield them. Then take the other leaf alone and change values for only that leaf not changing the others. And so on. In the end you should be getting at maximum M * N values where M is the number of all the leaves and N is the sum of all individual values of all the leaves. Compare this with NM.
Please leave a comment if you're interested in learning more on how it's implemented. Meanwhile you can take a look at the implementation in the tool itself.
On white box testing
Why white box testing approach is ok here? First and foremost, because we have to either use some kind of theorem proving formal language (which TS is not, yet) or reduce the problem space using a limited subset of the building blocks. Following this logic the test generator assumes that the type predicates use some sort of a DSL instead of the Turing complete JavaScript and tests this subset instead. This way we trust the system formed of the predicate generator and the test generator as a whole. In other terms we're operating in an inherently safe system as far as we control the DSL and its source code, no need to cover for the broad Byzantine environment that the wild JavaScript world can sometimes be.
The text was updated successfully, but these errors were encountered: