Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Understanding the post-drop block/br type system #736

Closed
kripken opened this issue Aug 1, 2016 · 80 comments
Closed

Understanding the post-drop block/br type system #736

kripken opened this issue Aug 1, 2016 · 80 comments
Milestone

Comments

@kripken
Copy link
Member

kripken commented Aug 1, 2016

Looks like the drop changes in 0xc change the block/br type system significantly. I've been trying to figure out what's going on from the new spec tests, but can't seem to. Is this documented somewhere?

For example, one source of puzzlement is that this should be invalid

(block $block1
  (br $block1
    (unreachable)
  )
) 

Intuitively that seems like it should be valid, as no value is sent to the block? So I can only guess it is invalid because the type of unreachable is unacceptable here. Yet, on the other hand this should be valid,

(block $block1
  (br $block1
    (br $block0
      (i32.const 8)
    )
  )
)

So somehow the type of br is different from the type of unreachable? That seems odd as both halt control flow.

@kripken
Copy link
Member Author

kripken commented Aug 1, 2016

To be specific, those examples are from the tests in the test repo's binary-0xc branch.

@kripken
Copy link
Member Author

kripken commented Aug 1, 2016

Which is identical to the spec repo's branch of the same name.

@binji
Copy link
Member

binji commented Aug 1, 2016

Are you talking about the as-br-value tests in unreachable.wast and br.wast?

@kripken
Copy link
Member Author

kripken commented Aug 1, 2016

They are from br.wast. I go through them sequentially in alphabetical order. It'll be a while before I get to 'u' for unreachable ;) But I do have block.wast passing.

@binji
Copy link
Member

binji commented Aug 1, 2016

Ah, I see. unreachable.wast has a similar test, but it validates:

  ;; unreachable.wast
  (func "as-br-value" (result i32)
    (block (br 0 (unreachable)))
  )
---
;; br.wast
(assert_invalid
  (module (func $type-arg-poly-vs-empty
    (block (br 0 (unreachable)))
  ))
  "arity mismatch"
)

AIUI, the idea is that if a br has a value argument, then the target block must either use the value or drop it. The fact that the value expression is unreachable doesn't matter, since unreachable can be used in place of any type.

(block $block1
  (br $block1
    (br $block0
      (i32.const 8)
    )
  )
)

This one doesn't make sense in isolation. Is there supposed to be a $block0 wrapping this, or is it a typo of this test from br.wast?

  (func "as-br-value" (result i32)
    (block (br 0 (br 0 (i32.const 9))))
  )

@kripken
Copy link
Member Author

kripken commented Aug 1, 2016

(The first example is from type-arg-poly-vs-empty, and the second is reduced from nested-br-value - yes, there is an outer block.)

Ok, thanks, the unreachable's type was confusing me perhaps. If I follow you, then it looks like the type system has gotten more complex and now an additional internal type is needed? What I mean is that before, binaryen had

  • i32, etc., the "concrete" normal types
  • unreachable, the type of unreachable, br, etc.. That is a type that vanishes when merged with other types, e.g. if one arm of an if is unreachable while the other is an i32, the combined merged type of the if is i32.
  • none, the type of e.g. store and set_local (after we made them not return a value), values of this type must not be consumed. E.g. if one arm of an if has this type, then no matter the type of the other, the if will have this type (and must not be consumed).

Those types appear to no longer be sufficient, as if I understand things now, then (block (br 0 (unreachable))) must be dropped, and neither of unreachable and none require that? In other words a new type like this must be added,

  • dropme, a type that can only be consumed by drop, and in fact must be consumed by it.

Is that correct?

@titzer
Copy link

titzer commented Aug 1, 2016

For the stack machine, we made the decision not to typecheck unreachable code. That's not really an undue burden on the V8 and SM implementations, since they already modeled reachability for SSA renaming. For binaryen, I assume this is an additional thing to check.

I'm not entirely sure that one can easily model the reachability requirement with types alone. Instead it is state attached to each block and to the current "stack".

@kripken
Copy link
Member Author

kripken commented Aug 1, 2016

I see. In that case, maybe it makes more sense for binaryen to just jump to that state, instead of trying to get to where binary-0xc is right now? (which looks like post-drop but pre-full stack machine) Unless that's far off.

@rossberg-chromium, when do you expect ml-proto's binary-0xc branch to get to the final state of full stack machine / not typechecking unreachable code etc.?

@kripken
Copy link
Member Author

kripken commented Aug 1, 2016

(Or is there already another branch somewhere that is at that point?)

@titzer
Copy link

titzer commented Aug 1, 2016

0xC will be updated to be the full stack machine over the next few days.

@titzer
Copy link

titzer commented Aug 1, 2016

(in the design repo)

@binji
Copy link
Member

binji commented Aug 1, 2016

I think I was confused about this too, not sure what the differences a "full stack machine" means for the type system.

@titzer
Copy link

titzer commented Aug 1, 2016

Some details are available in this slide deck: https://docs.google.com/presentation/d/1dRsN5lKY60d3IOILi4bttJXOX4ge-2tA1PaEX-d5So0/edit#slide=id.p

But I think @rossberg-chromium 's rules will make it easier to see.

@ghost
Copy link

ghost commented Aug 2, 2016

@titzer Interesting slide thank you. It claims to support multiple values, which is any area of concern for me.

So this seems to extend the break operators to be able to pick a number of values from the top of the stack and retain these at the exit of the target block while discarding the other values in the block?

The problem is that it is not possible to access the multiple values from a call. For example it is not possible to drop the top-of-stack value to expose the next value. Perhaps the proposal is to remove that restriction on drop too?

It is not even possible to store these values in local variables as the set_local operator returns no values so would be invalid while the other values are still on the stack.

Fwiw I believe that any re-arranging of values on the values stack is compatible with an expression based presentation. This can be demonstrated with a pick operator that chooses values from the values stack, popping them all, and then pushing some back in a different order.

The other show stopper is that the 's-expression syntax breaks down for stack machine', or in other words it is not presentable in an expression based format. There might be ways to work around that with canonicalization, but I have not worked through it all.

So it still appears to me that the proposal is not tenable and might not meet the expectations of the web community.

It really does not seem a big burden to keep the void values on the values stack. As I suggested long ago this could be done efficiently by keeping a counter of them in a stack location, so the stack entry would be void <n>. We could then adopt the get_value proposal and now have lexical scoped const variables, which are just values on the stack, and calls returning multiple values would simply be presented as destructing to multiple lexical scoped const variables. We might as well then define the function arguments as const variables too rather than local variables, so they could just be values on the values stack too. This is just one proposal, and at the other extreme would be the expressionless proposal using canonicalization to define a standard expression based presentation.

@ghost ghost mentioned this issue Aug 2, 2016
@qwertie
Copy link

qwertie commented Aug 2, 2016

While I don't fully understand every detail of the slide deck (I can't even tell who wrote or uploaded it) I think I get the gist of it.

The slides note that "void-in-the-middle is not consistent with the standard interpretation of an AST, without additional blocks". That's true, yet nothing prevents us from constructing a tree from the postorder and calling it an AST.

I think being able to display expressions using infix notation will be a great feature of Wasm even if it moves away from an AST interpretation. So in a comment in #697 I've explored a continuum of potential notations from "tall-postorder" to "wide-tree".

@rossberg
Copy link
Member

rossberg commented Aug 2, 2016

@kripken, the well-typedness of your original examples depends on the surrounding expression: if the block is put into a void context then the br is no longer allowed to have an argument. That is because the explicit drop change was undoing WebAssembly/spec#271, i.e., the syntactic arity of a br again has to match the block's arity.

But with a full stack machine this becomes a moot point, so you might want to skip implementing the intermediate step. The stack machine implementation of the spec is currently on the stack branch. It should be almost up to date with what we discussed, including updated tests (although it's currently lacking more interesting new tests). I'm still working on the arity changes, though.

@rossberg
Copy link
Member

rossberg commented Aug 2, 2016

@binji, the "explicit drop" changes are the biggest part, as far as type system changes for implementations are concerned. The rest towards "full stack machine" should in fact be a (minor) simplification for those that did not materialise an AST: mostly, the validator no longer has to remember when voids are "pushed on the stack". It also does not need to check syntactic arities anymore.

For the spec, the move to a stack machine is quite a fundamental change, however. The stack branch of the spec repo basically required redoing the entire core of the interpreter. The declarative type system now needs to be factored in a different manner, because there is no expression AST anymore.

@rossberg
Copy link
Member

rossberg commented Aug 2, 2016

@titzer, don't worry, unreachability is straightforward to model with typing rules. Already done. ;)

@kripken
Copy link
Member Author

kripken commented Aug 2, 2016

@rossberg-chromium: ok, thanks, I guess I'll plan to go straight to the stack branch then.

I still worry about having to understand the new type system from the tests (which has always been a tricky process), so I might wait until the spec is written, unless we think that will be far off?

@kripken
Copy link
Member Author

kripken commented Aug 2, 2016

@rossberg-chromium: I am also concerned about the full rewrite of the core of your interpreter. Is that due to something specific in your implementation, or would you expect all existing interpreters to need rewrites?

@rossberg
Copy link
Member

rossberg commented Aug 2, 2016

@kripken, yes, the difference is that ml-proto defined everything in terms of the expression AST, and that has vanished. Browser implementations did not actually build an AST, so the new semantics is a closer match to what they already do. As Ben mentioned in his presentation, it was a 200 loc change in V8, and done in like 2 days (including a complete impl of multi-returns).

In a nutshell, the accumulative changes in typing over 0xB are:

  • A void operation is valid anywhere.
  • The result of a block is the last non-void operation (or void if none).
  • All other non-void results inside a block must be consumed as operands, possibly by a drop.
  • All constraints on syntactic arities are gone.

None of this requires any new type. It's just an error if a block leaves more than one thing on the stack (in the MVP).

In addition, we discussed getting rid of polymorphism by not typechecking unreachable code, where unreachability can be defined as follows in terms of the flat operator sequence:

  • Any operator after an unreachable, br, br_table, or return is unreachable.
  • Any operator after an unreachable operator is also unreachable, up to the end or else terminating the current block.
  • Any operator after an end is unreachable if all exits from the respective block, loop, or if are unreachable (i.e., the end itself, all breaks targeting it, and the else of an if).

There are nicer ways to describe this in terms of a special bottom type, which is pretty much the unreachable type you already have.

@titzer
Copy link

titzer commented Aug 2, 2016

Just to be clear, it was a ~200 line change to the decoder; the rest of the changes to V8 were about 1500 lines, mostly to tests and the interpreter. The change is here: https://codereview.chromium.org/2176653002/

@kripken
Copy link
Member Author

kripken commented Aug 2, 2016

Thanks for the details @rossberg-chromium!

So this sounds worrying. Much like ml-proto, binaryen defines everything in terms of an AST. At this point the binaryen optimizer is a non-trivial amount of code (includes a register allocator, duplicate function eliminator, and many other passes), and the binaryen utilities like s2wasm and asm2wasm are likewise non-trivial in size and are a core part of the toolchain story for WebAssembly. If all that needs to be rewritten because it's AST-based then it's a serious matter.

But I feel like I don't have a full grasp on the issue yet. If this is just validation, I'd be happy to rewrite binaryen's binary and s-expression validators from scratch, that's not too bad.

But if valid WebAssembly code can't be represented, transformed, and optimized the way the binaryen AST currently is, then that would justify considering scrapping binaryen, in which case we would need to figure out other solutions for the toolchain functionality it currently provides - worrying in the MVP timeframe.

@kripken
Copy link
Member Author

kripken commented Aug 3, 2016

On the ml-proto stack branch, this should be valid:

(module
  (func $a
    (loop (i32.ctz (loop (br 1))))
  )
)

Is that due to loop no longer having a label to break out?

@rossberg
Copy link
Member

rossberg commented Aug 3, 2016

@kripken, yes. The branch targets the outer loop.

@rossberg
Copy link
Member

rossberg commented Aug 3, 2016

@kripken, re your previous question: it should be fairly straightforward to keep using an AST in binaryen if you are willing to build one that it is a superset of actual Wasm, i.e., adds a form of let-node. Luke and Dan have suggested this as well.

For ml-proto that wasn't an option because it is supposed to directly mirror the actual language as defined (and that definition has changed). Tools have more leeway in using custom ILs.

@kripken
Copy link
Member Author

kripken commented Aug 3, 2016

@rossberg-chromium: I'm a little confused though because e.g. float_exprs.wast on the stack branch still has a loop with (loop $exit $cont. Is that sugar for an extra block?

About the AST issue, I'm still not sure what adding a let node would look like in terms of compiling to and from wasm, especially keeping that lossless. Also unsure about code-size optimizations with such a construct (if the AST is "unnatural" then proper opts might be hard). But I guess I'll implement it when I get to the tests that need it and it'll be more concrete for me. Hopefully it won't be a problem.

@kripken
Copy link
Member Author

kripken commented Aug 3, 2016

Another question, I see in nop.wast these:

  (func $3-ary (param i32 i32 i32) (result i32)
    get_local 0 get_local 1 get_local 2 i32.sub i32.add
  )
[..]
  (func "as-br_table-everywhere" (param i32) (result i32)
    block
     (nop) (nop) (get_local 0) (nop) (nop) (get_local 0) (nop) (nop)
     br_table 1 0 0
    end
  )

Is that the new flat stack machine text format? Is it intended to be mixable with the old notation as in the second function (no ambiguity issues that the parens would have prevented)?

@kripken
Copy link
Member Author

kripken commented Aug 3, 2016

It looks like you are allowing non-parenned operations to pop parenned ones, e.g.

(get_local 0) (nop) drop

I assume that the drop drops the get_local? This seems quite odd to me, mixing an ast and a stack machine. It seems less confusing to allow just one of the two forms in each function (or perhaps block context).

(Of course I think it's interesting to investigate a mixed-mode text format for the future; my comment is about the changes to the s-expression format for the test suite today.)

@ghost
Copy link

ghost commented Aug 11, 2016

@rossberg-chromium It can be a type error in a s-exp text format, not necessarily the one being proposed, and that is my point. @kripken might not choose your text format even if using the same code. There is type information for the number of results expected for a function, in the function signature.

@kripken
Copy link
Member Author

kripken commented Aug 11, 2016

@rossberg-chromium: I see, and I guess the only way to do a return without a value is by (return (nop))?

@ghost
Copy link

ghost commented Aug 12, 2016

@kripken I doubt it would be (return (nop)) rather just (return). From what I understand of the proposal, regular expressions such as return with arguments can not consume a zero-value expression so could not consume nop here.

The point @rossberg-chromium is making might also be that the s-exp no longer has much syntax, that arguments push to the stack and counts are not checked except for the wasm encoding being valid, but there seems no reason that binaryen has to follow. For example I presume that the following is valid in the @rossberg-chromium proposal, but there is no reason for the text format parser to accept this as valid and I don't think the test suit should include such ill-formed syntax for tests.

(i32.add (i32.const 1) (i32.const 2) (i32.const 3))
(i32.add (i32.const 4) (i32.const 5) (i32.const 6))
(i32.add)
(i32.add)
(i32.add)
==
(i32.add (i32.const 1)
         (i32.add (i32.add (i32.const 2) (i32.const 3))
                  (i32.add (i32.const 4)
                           (i32.add (i32.const 5) (i32.const 6)))))

@rossberg
Copy link
Member

@kripken, no, it's still just (return). But what that actually means is determined by the type of the surrounding function now, not by a syntactic arity.

S-exprs are just syntax sugar, i.e., macro-like things without inherent semantics. So Wasm itself can't check anything about them (the type system doesn't know anything about them) and thus would (have to) allow weird abuses. But some kind of linter could check for them.

@JSStats, your example is currently invalid, because binary operators still only allow two S-expr operands. The only way to abuse the syntax is by putting void-expressions somewhere, e.g.:

(i32.const 1)
(i32.add (nop) (i32.const 2))

which of course is no better.

@ghost
Copy link

ghost commented Aug 12, 2016

@rossberg-chromium Ok, thank you.

Problematic but perhaps valid:

(i32.add (call $give_me_i32_i32) (nop))

Problematic and perhaps invalid in some contexts:

(i32.add (call $give_me_i32_i32) (call $give_me_i32_i32))

I wonder if the text parser needs to know some type information. Binaryen might still need an AST in order to catch such problems and to be able to report them.

@rossberg
Copy link
Member

@JSStats, the MVP does not yet support multiple returns, so you can't currently write these examples.

If a "parser" needs type information then it is no longer just a parser. It is in fact a translator from one (external) language into another (internal) language. There is nothing wrong with that for many use cases (including Binaryen), but technically you are dealing with two different languages at that point. The stack machine's S-expr interpretation as "macros" is intentionally dumb-as-bread because we don't want to get into the business of specifying two languages for Wasm plus a compilation function between them.

@ghost
Copy link

ghost commented Aug 12, 2016

@rossberg-chromium Ok, I see your point, and that does seem to support abandoning the s-exp format for a testing format, and just using a sequence of opcodes plus perhaps some white space conventions to make it clear how blocks nest. My only concern then is that wasm development might not recognize the constraint of supporting a one-to-one translator to a familiar text format, and there might be no demonstration or testing that this is possible.

@kripken
Copy link
Member Author

kripken commented Aug 12, 2016

@rossberg-chromium "S-exprs are just syntax sugar, i.e., macro-like things without inherent semantics. So Wasm itself can't check anything about them (the type system doesn't know anything about them) and thus would (have to) allow weird abuses."

I didn't fully appreciate what that meant until now. Hasn't this change made s-expressions much worse than before the stack change?

@kripken
Copy link
Member Author

kripken commented Aug 12, 2016

@rossberg-chromium Actually, wait, I still don't understand this. Are you saying that in

module (func $type-return-num-vs-enpty (return (i32.const 0))))

the return does not pop the i32.const, and that we know that because the function does not return a value? But if so, then why isn't there a drop on the i32.const?

@rossberg
Copy link
Member

@kripken, yes, that definitely has made them worse, in the sense that they have much looser meaning and checking now. But unfortunately, everything else would require defining another language.

The example does not require drop because return, like every control transfer, discards the operand stack anyway. Like you could always do (i32.add (i32.const 0) (return)).

@kripken
Copy link
Member Author

kripken commented Aug 12, 2016

That seems confusing to me, because when (i32.const 0) (return) appears in a function body (or block) then only the last of the two is a candidate for being consumed (as the return value of the function or block). Whereas if they appear in i32.add then both are candidates for being consumed.

@ghost
Copy link

ghost commented Aug 12, 2016

@kripken The example (i32.add (i32.const 0) (return)) depends on the non-validation of unreachable code, so I don't think binaryen needs to even be capable of generating the respective wasm binary code for this. It does not validate that the last i32.add operator actually had arguments. An interesting example would be one that binaryen might consider invalid, and can not otherwise generated, but would still be valid wasm binary code?

@rossberg
Copy link
Member

@kripken, I am not sure I understand what you are saying. But it's worth noting that, once we generalise to multi-values post-MVP, any number of operators/expressions in a block is a candidate for consumption, e.g., (i32.add (block (const 1) (const 2))) will be fine.

@JSStats, no, this example has always been valid, through the polymorphic result type of return.

@ghost
Copy link

ghost commented Aug 16, 2016

@rossberg-chromium Yes, missed that. But (f32.add (i32.const 0) (return)) will be valid, but is not currently valid?

The unreachable code still needs to be valid opcodes, it's just not typed checked?

But binaryen need not accept (i32.add (block (const 1) (const 2))) as valid code, rather it could express it as:

const (i32 c0, i32 c1) = { values(1, 2); }
c0 + c1;

Perhaps a reason to do so is to prevent (i32.add (i32.add (block (const 1) (const 2) (const 3)))) being valid syntax which seems to break any semblance of being functional.

@kf2093
Copy link

kf2093 commented Jan 17, 2017

I am trying to understand the control flow implied by block, loop, br, br_if and end. It is br and br_if that I am not sure I've understood (from the binary encoding description). Because of my uncertainty of br and br_if, I'm also doubting whether I've understood what end does. I've done my best to understand them by creating different tests and compare the outputs. I've created a pdf document containing how I believe the control flow to be affected by the different statements. Could you verify that this is indeed correct or not?
WASM block, loop, br, br_if, end clarification.pdf

@ghost
Copy link

ghost commented Jan 17, 2017

@kf2093 Slides look right, but that is not the half of it. The end could be br 0 except that at present end is more restricted and requires that the values stack have just the number of values the block returns whereas br 0 will unwind excess values. The end on a loop is a bit odd as there is no label for the exit, so it is not the same as br 0, there was once two labels on a loop but now it is necessary to wrap a block around the loop to exit in generally. You will want to understand the interaction with the values stack to better understand it all.

@kf2093
Copy link

kf2093 commented Jan 17, 2017

Once I figure out the control flow I was going to further dig deeper into the affects on the value stack.
A while true loop in C, compiled to .ll, then .s, then .wast, then .wasm results in:

<function code header info e.g. local entries, size, etc>
loop, type_empty,
br, 0x00,
end,
end

There are two ends in this case. I am guessing there is an implicit block for each function code although there is not an outer explicit block wasm instruction.

I think from my slides what I am trying to confirm is that for a loop a br 0 instruction will jump to the beginning of the loop; when an end instruction is encountered in a loop, the program counter will be updated to execute the instruction after the end instruction;
whilst for a block, a br 0 instruction will exit the block and execute the next instruction (after the end).

@rossberg
Copy link
Member

@kf2093, you're reading is correct: a br targeting a block is like a break, a br targeting a loop is like a continue. However, please note that the issue tracker is not the right forum for asking basic questions about the Wasm design.

@kf2093
Copy link

kf2093 commented Jan 17, 2017

Sorry about that! Where's the best place to ask? (Hopefully I can contribute more to the project once I get up to scratch)

@titzer
Copy link

titzer commented Jan 17, 2017 via email

@kf2093
Copy link

kf2093 commented Jan 17, 2017

I've understood the control flow aspects to br/br_if/loop/block/end. I hope this next question is more along the lines of what can be discussed here.

For an interpreter a branch instruction to a loop can be resolved by looking at the location stored on the control stack. However, a branch (br or br_if) instruction to a block (implies that execution should exit the specified block) requires the interpreter to traverse through all instructions until it reaches the associated end instruction; which seems rather inefficient -- at least for an in-place interpreter (I'm sure this is not the case when compiling).

Should interpreters be perform a first pass when loading in wasm to resolve block/end pairs?
Or should a wasm offset be applied to the br and br_if instructions to identify the jump location?

@rossberg
Copy link
Member

@kf2093, you are commenting on an old issue about the details of the static validation rules for blocks. So I'm afraid that question still is off-topic.

@ghost
Copy link

ghost commented Jan 17, 2017

@kf2093 Yes, wasm is not optimized for interpreters, and the code needs to be validated before running it, so at least one pass will be needed and during that pass you could add information on branch target locations or transform it to another representation. There are some interpreter implementations to look at, e.g. the spec interpreter, the v8 source, and in binaryen. It seems fine to me to use the relative control stack indexes to reference the target labels: efficient use of the control stack and minimal redundant information in the encoding.

@flagxor flagxor added this to the MVP milestone Feb 3, 2017
@rossberg
Copy link
Member

rossberg commented Feb 7, 2017

The original question seems to have been clarified long ago. Feel free to reopen if not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants