Update tuple syntax to use {..} instead of (..) and allow named fields #138

Merged
merged 1 commit into from
Jul 31, 2023
52 changes: 41 additions & 11 deletions specs/src/lang/language_primitives.md
Expand Up @@ -149,7 +149,7 @@ TODO

Yurt provides 4 scalar types built-in: Booleans, integers, reals and strings. Yurt also provides tuples as a compound built-in type.

The syntax for a types is as follows:
The syntax for types is as follows:

```ebnf
<ty> ::= "bool"
Expand All @@ -158,10 +158,14 @@ The syntax for a types is as follows:
| "string"
| <tuple-ty>

<tuple-ty> ::= "(" ( <ty> "," ... ) ")"
<tuple-ty> ::= "{" ( [ <ident> ":" ] <ty> "," ... ) "}"
```

For example, in `let t: (int, real, string) = (5, 3.0, "foo")`, `(int, real, string)` is a tuple type.
For example, in `let t: { int, real, string } = { 5, 3.0, "foo" }`, `{ int, real, string }` is a tuple type. `{ x: int, y: real, string }` is also a tuple type where some of the fields are named.

Names of tuple fields modify the type of the tuple. That is, `{ x: int }` and `{ y: int }` are different types. However, they both coerce to `{ int }`.
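
The coercion rule can be sketched as a small check. This is only an illustration under stated assumptions: the `Type` enum, `coerces_to`, and the `tuple` helper are hypothetical names rather than the `yurtc` implementation, and the rule used here (names must not conflict, and a name present on only one side is acceptable) is one plausible reading of the text.

```rust
// Hypothetical, simplified model of Yurt types; not the actual `yurtc` AST.
#[derive(Clone, Debug, PartialEq)]
enum Type {
    Int,
    Real,
    Tuple(Vec<(Option<String>, Type)>),
}

/// Assumed rule: `from` coerces to `to` when both are the same scalar type, or
/// both are tuples of equal length whose fields coerce pairwise and whose
/// names never conflict (a name present on only one side is fine).
fn coerces_to(from: &Type, to: &Type) -> bool {
    match (from, to) {
        (Type::Tuple(a), Type::Tuple(b)) => {
            a.len() == b.len()
                && a.iter().zip(b).all(|((name_a, ty_a), (name_b, ty_b))| {
                    let names_ok = match (name_a, name_b) {
                        (Some(x), Some(y)) => x == y,
                        _ => true,
                    };
                    names_ok && coerces_to(ty_a, ty_b)
                })
        }
        // Scalars (or a scalar paired with a tuple) must match exactly.
        _ => from == to,
    }
}

/// Convenience constructor building a tuple type from `(name, type)` pairs.
fn tuple(fields: &[(Option<&str>, Type)]) -> Type {
    Type::Tuple(
        fields
            .iter()
            .map(|(name, ty)| (name.map(str::to_string), ty.clone()))
            .collect(),
    )
}
```

Under this sketch, `{ x: int }` and `{ y: int }` compare unequal and do not coerce into each other, while each of them coerces to `{ int }`.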

Note that the grammar disallows empty tuple types `{ }`.

## Expressions

Expand All @@ -180,7 +184,7 @@ Expressions represent values and have the following syntax:
| <real-literal>
| <string-literal>
| <tuple-expr>
| <tuple-index-expr>
| <tuple-field-access-expr>
| <if-expr>
| <cond-expr>
| <call-expr>
Expand Down Expand Up @@ -281,23 +285,49 @@ let string = "first line\
third line";
```

#### Tuple Expressions and Tuple Indexing Expressions
#### Tuple Expressions and Tuple Field Access Expressions

Tuple Expressions are written as:

```ebnf
<tuple-expr> ::= "(" ( <expr> "," ... ) ")"
<tuple-expr> ::= "{" ( [ <ident> ":" ] <expr> "," ... ) "}"
```

For example: `let t = (5, 3, "foo");`.
For example: `let t = { x: 5, 3, "foo" };`. The type of this tuple can be inferred by the compiler to be `{ x: int, int, string }`.

The following is another example:

```rust
let t: { x: int, real } = { 6, 5.0 };
```

where the type of the tuple is indicated by the type annotation and has a named field `x`, but that named field is not actually used in the tuple expression. This is allowed because `{ x: int, real }` and `{ int, real }` coerce into each other.

Tuple fields can be initialized out of order only if all the fields have names and their names are used in the tuple expression. For example, the following is allowed:

```rust
let t: { x: int, y: real } = { y: 5.0, x: 6 };
```

while the following are not:

```rust
let t: { x: int, real } = { 5.0, x: 6 };
let t: { x: int, y: real } = { 5.0, x: 6 };
let t: { x: int, y: real } = { 5.0, 6 }; // This is a type mismatch!
```
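
One way to picture the ordering rule is a helper that maps each declared field to the expression field that initializes it. The sketch below is an assumption-laden illustration, not the compiler's algorithm: `Ty`, `Field`, and `match_fields` are hypothetical names, field names are assumed to be distinct, and the fallback to purely positional matching when any field is unnamed is an interpretation of the rule above.

```rust
// Hypothetical, simplified types for illustration; not the `yurtc` AST.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Ty {
    Int,
    Real,
}

/// One tuple field: an optional name plus its (declared or inferred) type.
type Field<'a> = (Option<&'a str>, Ty);

/// Returns, for each declared field, the position of the expression field
/// that initializes it, or an error message if the initialization is rejected.
fn match_fields(declared: &[Field], given: &[Field]) -> Result<Vec<usize>, String> {
    if declared.len() != given.len() {
        return Err("field count mismatch".to_string());
    }
    // Lookup by name is only attempted when every field on both sides is named.
    let all_named = declared.iter().chain(given).all(|(name, _)| name.is_some());
    declared
        .iter()
        .enumerate()
        .map(|(i, (decl_name, decl_ty))| {
            let pos = if all_named {
                given
                    .iter()
                    .position(|(name, _)| name == decl_name)
                    .ok_or_else(|| format!("no initializer for declared field {}", i))?
            } else {
                // Otherwise fields are matched strictly by position.
                i
            };
            let (given_name, given_ty) = given[pos];
            if given_name.is_some() && decl_name.is_some() && given_name != *decl_name {
                return Err(format!("field name mismatch at position {}", i));
            }
            if given_ty != *decl_ty {
                return Err(format!("type mismatch at position {}", i));
            }
            Ok(pos)
        })
        .collect()
}
```

With this sketch, `{ y: 5.0, x: 6 }` against `{ x: int, y: real }` yields the mapping `[1, 0]`, while the three rejected examples above all fail with a type mismatch because they fall back to positional matching.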

Tuple expressions that contain a single _unnamed_ field require the trailing `,` as in `let t = { 4.0, };`. Otherwise, the expression becomes a code block that simply evaluates to its contained expression. Tuple expressions that contain a single _named_ field do not require the trailing `,`.

Note that the grammar disallows empty tuple expressions `{ }`.

Tuple indexing expressions are written as:
Tuple field access expressions are written as:

```ebnf
<tuple-index-expr> ::= <expr> "." [0-9]+
<tuple-field-access-expr> ::= <expr> "." ( [0-9]+ | <ident> )
```

For example: `let second = t.1;` which extracts the second element of tuple `t` and stores it into `second`.
For example: `let second = t.1;` which extracts the second element of tuple `t` and stores it into `second`. Named fields can be accessed by name or by index. For example, if `x` is the third field of tuple `t`, then `t.2` and `t.x` are equivalent.
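
The equivalence of `t.2` and `t.x` can be illustrated by resolving both forms to the same zero-based position. This is a minimal sketch under assumptions: `FieldRef` and `resolve_field` are hypothetical names, and the local enum merely stands in for whatever representation the compiler uses for a field reference.

```rust
// Hypothetical representation of the right-hand side of a field access.
#[derive(Clone, Debug, PartialEq)]
enum FieldRef {
    Index(usize),
    Name(String),
}

/// Resolves a field reference to a zero-based position in a tuple whose
/// fields carry the given optional names. Returns `None` if nothing matches.
fn resolve_field(names: &[Option<&str>], field: &FieldRef) -> Option<usize> {
    match field {
        // An index is valid only if it is within bounds.
        FieldRef::Index(i) if *i < names.len() => Some(*i),
        FieldRef::Index(_) => None,
        // A name resolves to the position of the field carrying that name.
        FieldRef::Name(wanted) => names.iter().position(|n| *n == Some(wanted.as_str())),
    }
}
```

For a tuple whose third field is named `x`, both `FieldRef::Index(2)` and `FieldRef::Name("x")` resolve to position `2`.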

#### "If" Expressions

Expand Down Expand Up @@ -349,7 +379,7 @@ The precedence of Yurt operators and expressions is ordered as follows, going fr

| Operator | Associativity |
| -------------------------------- | -------------------- |
| Tuple index expressions | left to right |
| Tuple field access expressions | left to right |
| Unary `-`, Unary `+`, `!` | |
| `*`, `/`, `%` | left to right |
| Binary `+`, Binary `-` | left to right |
10 changes: 6 additions & 4 deletions yurtc/src/ast.rs
@@ -1,3 +1,5 @@
use itertools::Either;

#[derive(Clone, Debug, PartialEq)]
pub(super) enum UseTree {
Glob,
Expand Down Expand Up @@ -46,7 +48,7 @@ pub(super) enum Type {
Int,
Bool,
String,
Tuple(Vec<Type>),
Tuple(Vec<(Option<Ident>, Type)>),
}

#[derive(Clone, Debug, PartialEq)]
Expand All @@ -69,10 +71,10 @@ pub(super) enum Expr {
Block(Block),
If(IfExpr),
Cond(CondExpr),
Tuple(Vec<Expr>),
TupleIndex {
Tuple(Vec<(Option<Ident>, Expr)>),
TupleFieldAccess {
tuple: Box<Expr>,
index: usize,
field: Either<usize, Ident>,
},
}

126 changes: 71 additions & 55 deletions yurtc/src/parser.rs
Expand Up @@ -4,6 +4,7 @@ use crate::{
lexer::{self, Token, KEYWORDS},
};
use chumsky::{prelude::*, Stream};
use itertools::Either;
use regex::Regex;
use std::{fs::read_to_string, path::Path};

Expand Down Expand Up @@ -278,12 +279,19 @@ fn expr<'sc>() -> impl Parser<Token<'sc>, ast::Expr, Error = ParseError<'sc>> +
.then(args.clone())
.map(|(name, args)| ast::Expr::Call { name, args });

let tuple = args
.validate(|args, span, emit| {
if args.is_empty() {
let tuple_fields = (ident().then_ignore(just(Token::Colon)))
.or_not()
.then(expr.clone())
.separated_by(just(Token::Comma))
.allow_trailing()
.delimited_by(just(Token::BraceOpen), just(Token::BraceClose));

let tuple = tuple_fields
.validate(|tuple_fields, span, emit| {
if tuple_fields.is_empty() {
emit(ParseError::EmptyTupleExpr { span })
}
args
tuple_fields
})
.map(ast::Expr::Tuple);

Expand All @@ -304,72 +312,78 @@ fn expr<'sc>() -> impl Parser<Token<'sc>, ast::Expr, Error = ParseError<'sc>> +
and_or_op(
Token::DoubleAmpersand,
ast::BinaryOp::LogicalAnd,
comparison_op(additive_op(multiplicative_op(tuple_index(atom)))),
comparison_op(additive_op(multiplicative_op(tuple_field_access(atom)))),
),
)
})
}

fn tuple_index<'sc, P>(
fn tuple_field_access<'sc, P>(
parser: P,
) -> impl Parser<Token<'sc>, ast::Expr, Error = ParseError<'sc>> + Clone
where
P: Parser<Token<'sc>, ast::Expr, Error = ParseError<'sc>> + Clone,
{
let indices =
filter_map(|span, token| match &token {
Token::IntLiteral(num_str) => num_str
.parse::<usize>()
.map(|index| vec![index])
.map_err(|_| ParseError::InvalidIntegerTupleIndex {
span,
index: num_str,
}),

// If the next token is of the form `<int>.<int>` which, to the lexer, looks like a real,
// break it apart manually.
Token::RealLiteral(num_str) => {
match Regex::new(r"[0-9]+\.[0-9]+")
.expect("valid regex")
.captures(num_str)
{
Some(_) => {
// Collect the spans for the two integers
let dot_index = num_str
.chars()
.position(|c| c == '.')
.expect("guaranteed by regex");
let spans = [
span.start..span.start + dot_index,
span.start + dot_index + 1..span.end,
];

// Split at `.` then collect the two indices as `usize`. Report errors as
// needed
num_str
.split('.')
.zip(spans.iter())
.map(|(index, span)| {
index.parse::<usize>().map_err(|_| {
ParseError::InvalidIntegerTupleIndex {
span: span.clone(),
index,
}
let indices = filter_map(|span, token| match &token {
// Field access with an identifier
Token::Ident(ident) => Ok(vec![Either::Right(ast::Ident(
ident.to_owned().to_string(),
))]),

// Field access with an integer
Token::IntLiteral(num_str) => num_str
.parse::<usize>()
.map(|index| vec![Either::Left(index)])
.map_err(|_| ParseError::InvalidIntegerTupleIndex {
span,
index: num_str,
}),

// If the next token is of the form `<int>.<int>` which, to the lexer, looks like a real,
// break it apart manually.
Token::RealLiteral(num_str) => {
match Regex::new(r"[0-9]+\.[0-9]+")
.expect("valid regex")
.captures(num_str)
{
Some(_) => {
// Collect the spans for the two integers
let dot_index = num_str
.chars()
.position(|c| c == '.')
.expect("guaranteed by regex");
let spans = [
span.start..span.start + dot_index,
span.start + dot_index + 1..span.end,
];

// Split at `.` then collect the two indices as `usize`. Report errors as
// needed
num_str
.split('.')
.zip(spans.iter())
.map(|(index, span)| {
index
.parse::<usize>()
.map_err(|_| ParseError::InvalidIntegerTupleIndex {
span: span.clone(),
index,
})
})
.collect::<Result<Vec<usize>, _>>()
}
None => Err(ParseError::InvalidTupleIndex { span, index: token }),
.map(Either::Left)
})
.collect::<Result<Vec<Either<usize, ast::Ident>>, _>>()
}
None => Err(ParseError::InvalidTupleIndex { span, index: token }),
}
_ => Err(ParseError::InvalidTupleIndex { span, index: token }),
});
}
_ => Err(ParseError::InvalidTupleIndex { span, index: token }),
});

parser
.then(just(Token::Dot).ignore_then(indices).repeated().flatten())
.foldl(|expr, index| ast::Expr::TupleIndex {
.foldl(|expr, field| ast::Expr::TupleFieldAccess {
tuple: Box::new(expr),
index,
field,
})
}
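
The handling of `<int>.<int>` tokens in `tuple_field_access` can be hard to follow inside the combinator chain. The standalone sketch below isolates just that step, using only the standard library. It is an illustration, not the code in this PR: `split_real_as_indices` is a hypothetical name, errors are plain strings instead of `ParseError`, and `str::split_once` replaces the regex check.

```rust
use std::ops::Range;

/// Splits a token such as `"1.2"` (lexed as a real) into two tuple indices,
/// each paired with its own source span. `span` covers the whole token.
fn split_real_as_indices(
    num_str: &str,
    span: Range<usize>,
) -> Result<Vec<(usize, Range<usize>)>, String> {
    let (first, second) = num_str
        .split_once('.')
        .ok_or_else(|| format!("`{}` is not of the form <int>.<int>", num_str))?;
    // The dot sits right after the first run of digits.
    let dot = first.len();
    let parts = vec![
        (first, span.start..span.start + dot),
        (second, span.start + dot + 1..span.end),
    ];
    // Parse each half as `usize`; the first failure becomes the error.
    parts
        .into_iter()
        .map(|(digits, part_span)| {
            digits
                .parse::<usize>()
                .map(|index| (index, part_span))
                .map_err(|_| format!("invalid tuple index `{}`", digits))
        })
        .collect()
}
```

For instance, a token `1.2` spanning `10..13` splits into index `1` at `10..11` and index `2` at `12..13`, so an access such as `t.1.2` can be folded into two nested field accesses.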

Expand Down Expand Up @@ -484,10 +498,12 @@ fn ident<'sc>() -> impl Parser<Token<'sc>, ast::Ident, Error = ParseError<'sc>>

fn type_<'sc>() -> impl Parser<Token<'sc>, ast::Type, Error = ParseError<'sc>> + Clone {
recursive(|type_| {
let tuple = type_
let tuple = (ident().then_ignore(just(Token::Colon)))
.or_not()
.then(type_)
.separated_by(just(Token::Comma))
.allow_trailing()
.delimited_by(just(Token::ParenOpen), just(Token::ParenClose))
.delimited_by(just(Token::BraceOpen), just(Token::BraceClose))
.validate(|args, span, emit| {
if args.is_empty() {
emit(ParseError::EmptyTupleType { span })