Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Relay][RFC] Relay IR Text Format #1781

Merged
merged 64 commits into from
Dec 2, 2018
Merged

Conversation

joshpoll
Copy link
Contributor

@joshpoll joshpoll commented Sep 28, 2018

[RFC]: Relay IR Text Format

Please comment on syntax in #1782.

This PR introduces the Relay IR text format. It is intended to be used similarly to LLVM's text format. For example, a text format makes it easier to debug Relay programs and optimization passes as well as serve as a target language for machine learning front ends.

This PR will include

  • An ANTLR grammar for the Relay IR text format.

  • A parser that uses the grammar to produce a Relay AST.

  • A pretty printer that prints a Relay AST in text format. Follow-up PR.

The syntax is heavily inspired by LLVM, Rust, and ReasonML.

Note: The ANTLR grammar accepts some programs that aren't valid Relay programs in order to have better error messages at parse time.

@tqchen
Copy link
Member

tqchen commented Sep 28, 2018

CC @dmlc/tvm-team

@tqchen
Copy link
Member

tqchen commented Sep 28, 2018

The main point we want to address is how to print the code in graph form (without all the lets). This is important because most people construct their net in this way. As an example, we could have

add(%x, add(%y, %z))

but a preferred way of printing it should be

%1 = add(%y, %z)
%2 = add(%x, %1)

@nhynes
Copy link
Member

nhynes commented Sep 28, 2018

how to print the code in graph form

Out of curiosity, what's wrong with lets? The line per assign is easy to debug and parse. If a user wanted to visualize the graph, the Relay IR can be lowered to graphviz or revered in Tensorboard. Since everything has a unique name, it'll be easy to jump back and forth.

@tqchen
Copy link
Member

tqchen commented Sep 28, 2018

I just feel there is a tension that user still starts with the old habit of constructing things via DSL(graph form), which makes them cluttered in that case. The printer need to print out the program faithfully still in this case without having to generate a long line of nested calls

@tqchen
Copy link
Member

tqchen commented Sep 28, 2018

Let us move the discussion to the corresponding RFC issue. @joshpoll can you cross post the proposal there and use the PR to mainly comment on implementation details?


// non-negative ints
// INT: '0' | [1-9] DIGIT* ; // no leading zeros
INT: DIGIT+ ;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems it will still accept leading zeros.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We now intentionally parse accept leading zeros to be more forgiving. I'll delete the old version.

// TODO: is this really the desired outline? if there are defns should there be an expression at the end?
prog: option* (expr | defn+) EOF ;

option: 'set' ident BOOL_LIT ;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is option's purpose? Why it is tied to bool literal only?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are intended to be boolean compiler options not none/some options. I think @jroesch can speak more directly to their purpose.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the previous iteration our design goal was to enable the text format to be able to set pragmas/options for the program.

: '(' expr ')' # parens
| '-' expr # neg
| expr op=('*'|'/') expr # binOp
| expr op=('+'|'-') expr # binOp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need explicit operator associativity?

Copy link
Contributor Author

@joshpoll joshpoll Sep 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The ANTLR grammar is used to automatically generate the lexer and the bulk of the parser, so it's important to encode associativity here.

| expr op=('=='|'!=') expr # binOp

// function definition and application
| expr '(' (expr (',' expr)*)? ')' # call
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we need the function name as an indent instead of expr since expr will accept too much.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of agree; however, that becomes restrictive if we want to use curried functions. It's possible to restrict what expressions can be used in calls in the parser itself.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to control the form of the program we should not do it in the parser as it makes the parser overly restrictive and forces semantic information into the parser instead of later analysis phases.

It also disallows programs which are semantically equivalent.

For example:

let x = f(x, y, z);
x(z, y)

Instead of:

 f(x, y, z)(z, y)

Even if it looks ugly we should let people generate the text format how they see fit imo.

For example a simple compiler targeting Relay might generate code like:

(fn (...) { ... })(x, y, z)

| 'if' expr body 'else' body # ifElse

// sequencingg
| 'let' MUT? ident (':' type_)? '=' expr ';' expr # let
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does expr ';' expr mean here?

Copy link
Contributor Author

@joshpoll joshpoll Sep 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relay doesn't have statements, only expressions. Thus let %x = 1; is not valid Relay, because it doesn't have a value. let expressions must always contain an identity, an assignment, and a body that might use that assignment (which is the value of the let expression). For example,

let %x = 1;
%x

is valid Relay and evaluates to 1.

This is similar to the syntax used by ReasonML, which is a version of OCaml made to look like JavaScript.

| ident # identExpr
| scalar # scalarExpr
| expr '.' INT # project
| 'debug' # debug
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think expr should also accept body ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good idea, and I'll look into it. It's not immediately clear what the semantics of the body should be, since Relay has no concept of it.

fun: 'fn' paramList '=>' type_? body ;
defn: 'def' ident paramList '=>' type_? body ;

paramList: '(' param (',' param)* ')' ;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fun does not accept zero parameters.

// bool

baseType
: 'int(' INT ',' INT ')' # intType
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Int can be any arbitrary number, is that expected?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

INT is really a natural number, which is what we want for projection and the inputs here, which represent bits and lanes. Maybe I should change the name to reflect that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the type checker should enforce constraints on this, it will yield a better error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about adding a type-level function syntax and moving this sort of parsing into the python parser?

@MarisaKirisame
Copy link
Contributor

The main point we want to address is how to print the code in graph form (without all the lets). This is important because most people construct their net in this way. As an example, we could have

add(%x, add(%y, %z))

but a preferred way of printing it should be

%1 = add(%y, %z)
%2 = add(%x, %1)

I am a bit confused. doesnt the lower code has two let?


type_
: '(' type_ ')' # parensType
| type_ op=('*'|'/') type_ # binOpType
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I ask to provide short examples in comments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think examples might be better suited to docs, but I'm open to this.


// a program is a list of options followed by either an expression or a list of definitions
// TODO: is this really the desired outline? if there are defns should there be an expression at the end?
prog: option* (expr | defn+) EOF ;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to expr at the end. Why don't we want local defns?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defns are global function definitions. The idiomatic way to do local definitions is

let foo = fn (x, y) => {
    ...
};
...

It's difficult to see a better syntactic way to express them, but I'm open to ideas.

LE: '<=' ;
GE: '>=' ;
EQ: '==' ;
NE: '!=' ;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we define operators as a sequence of "+-*/%..." symbols plus associativity? This would probably reduce the size of grammar.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of these lines is to provide symbols that represent these pieces of syntax in the rest of the parser. I'm considering splitting the lexer and parser files as this would probably improve readability. This is idiomatic ANTLR as far as I can tell. A wealth of example grammars can be found here.

// sugar for let _ = WriteRef(ident, expr); expr
| ident '=' expr ';' expr # writeRef

| ident # identExpr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should global expressions consist of global idents only? If so, how to express it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the rest of the parser, we restrict the kinds of idents that can appear in certain positions. The ANTLR grammar is more lenient, though, because it allows us to produce better error messages when someone accidentally uses the wrong type of ident.

@joshpoll joshpoll force-pushed the relay-text-format branch 3 times, most recently from e6951dc to faa79bc Compare October 5, 2018 23:55
@joshpoll
Copy link
Contributor Author

joshpoll commented Oct 5, 2018

grammar/py2 and grammar/py3 are build artifacts. They are included to avoid Java as a dependency.

@tqchen
Copy link
Member

tqchen commented Oct 6, 2018

Let us avoid include building artifacts for now, as long as the parser is a separate dep it is fine. Later we can to do a binary release of artifact where the build can depend on

@joshpoll joshpoll mentioned this pull request Oct 6, 2018
66 tasks
@joshpoll joshpoll force-pushed the relay-text-format branch 4 times, most recently from e824e2f to e75de2c Compare October 13, 2018 22:47
@@ -69,7 +69,7 @@ struct TypeAlphaEq : TypeVisitor<const Type&> {

void VisitType_(const IncompleteTypeNode* bt1, const Type& t2) final {
if (const IncompleteTypeNode* bt2 = t2.as<IncompleteTypeNode>()) {
equal = equal && bt1 == bt2;
equal = equal && bt1->kind == bt2->kind;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you have this in a seprate 1-line pr? I need it to restore anf testing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tqchen tqchen merged commit d3bc59d into apache:master Dec 2, 2018
@tqchen
Copy link
Member

tqchen commented Dec 2, 2018

Thanks, @jroesch @joshpoll @grwlf @yuruofeifei @junrushao1994 @MarisaKirisame , this is now merged

FrozenGene pushed a commit to FrozenGene/tvm that referenced this pull request Dec 27, 2018
@ZihengJiang ZihengJiang mentioned this pull request Feb 1, 2019
wweic pushed a commit to neo-ai/tvm that referenced this pull request Feb 20, 2019
wweic pushed a commit to neo-ai/tvm that referenced this pull request Feb 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants