This is a toy language inspired by the C ternary operator.
It originated when I was thinking it would be nice to have an equivalent of the C ternary operator for the switch statement; this then expanded to: why not make everything an operator and eliminate keywords?
Further to this, I arrived at the following guides for implementation:
- Operators not keywords
- Binary operators only
- Only use Standard C
- Use it as a sandbox for other ideas
Development tools:
- Windows 7
- jEdit Editor
- CodeBlocks (to build and debug)
- Git Source Control Management
- FreeCommander File Manager
- ConEmu Console Terminal
Implemented as an executable syntax tree interpreter.
UTF-8 encoding.
Comments are started with `#`.
If followed by one of `(`, `[`, `{`, the comment is terminated by the corresponding `)`, `]`, `}`. The active brackets can be nested.
Otherwise comments are terminated at the end of line.
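The comment rules can be sketched as a small scanner. This is a Python illustration only, not the interpreter's C code; the function name `skip_comment` is hypothetical, and it assumes that only the bracket type that opened the comment takes part in nesting and that an unterminated bracketed comment runs to the end of the source.

```python
PAIRS = {"(": ")", "[": "]", "{": "}"}

def skip_comment(source, start):
    """Return the index just past the comment that begins at source[start]."""
    assert source[start] == "#"
    pos = start + 1
    opener = source[pos] if pos < len(source) else ""
    if opener in PAIRS:
        # Bracketed comment: count only the active bracket pair (assumption).
        closer = PAIRS[opener]
        depth = 0
        while pos < len(source):
            if source[pos] == opener:
                depth += 1
            elif source[pos] == closer:
                depth -= 1
                if depth == 0:
                    return pos + 1
            pos += 1
        return pos  # unterminated: consume the rest (assumption)
    # Line comment: stop at, but do not consume, the end of line.
    while pos < len(source) and source[pos] not in "\r\n":
        pos += 1
    return pos
```

For example, applying it to `#(a (b) c) x` leaves ` x` as the remaining source.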
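- Decimal integer: `[0-9]+([Ee][0-9]+)?`
- Hexadecimal integer: `(0x|0X)[0-9A-Fa-f]+([Pp][0-9]+)?`
- Decimal floating-point: `[0-9]+\.[0-9]*([Ee][-+]?[0-9]+)?`
- Hexadecimal floating-point: `(0x|0X)[0-9A-Fa-f]+\.[0-9A-Fa-f]*([Pp][-+]?[0-9]+)?`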
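To see which texts each numeric pattern accepts, the patterns can be tried with Python's `re` module. This is a sketch for experimentation: the form names and the function `classify_number` are hypothetical, and the hexadecimal integer pattern is assumed to accept hexadecimal digits.

```python
import re

# (name, pattern) pairs; the names are labels chosen for this sketch.
NUMBER_FORMS = [
    ("decimal integer", r"[0-9]+([Ee][0-9]+)?"),
    # Assumption: hexadecimal digits are allowed after the 0x prefix.
    ("hexadecimal integer", r"(0x|0X)[0-9A-Fa-f]+([Pp][0-9]+)?"),
    ("decimal floating-point", r"[0-9]+\.[0-9]*([Ee][-+]?[0-9]+)?"),
    ("hexadecimal floating-point",
     r"(0x|0X)[0-9A-Fa-f]+\.[0-9A-Fa-f]*([Pp][-+]?[0-9]+)?"),
]

def classify_number(text):
    """Return the name of the first form that matches all of text, or None."""
    for name, pattern in NUMBER_FORMS:
        if re.fullmatch(pattern, text):
            return name
    return None
```

Note that, taken literally, a signed exponent is only accepted by the floating-point forms, so `1e-3` matches none of the patterns while `1.e-3` matches the decimal floating-point one.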
Strings are delimited by `"` and can contain escape sequences (see Characters section).
Characters are delimited by `'` and can contain an escape sequence.
Escape sequences are initiated by `\`, followed by:
- `0` inserts a nul character.
- `n` inserts a new-line character.
- `t` inserts a horizontal-tab character.
- `U` or `u` followed by up to 8 hexadecimal digits specifying a Unicode code-point.
- `W` or `w` followed by up to 4 hexadecimal digits specifying a Unicode code-point.
- `X` or `x` followed by up to 2 hexadecimal digits specifying a Unicode code-point.
- End-of-Line characters; these are elided, including CR-LF and LF-CR pairings.
- for other characters, acts as a quoting mechanism.
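The escape rules above could be decoded along these lines. This is a minimal Python sketch, not the interpreter's implementation; `decode_escapes` is a hypothetical name, and two behaviours are assumptions: a hexadecimal escape with no digits falls back to quoting its letter, and a trailing lone backslash is kept as-is.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"
SIMPLE = {"0": "\0", "n": "\n", "t": "\t"}
MAX_HEX = {"u": 8, "w": 4, "x": 2}  # maximum digit count per escape letter

def decode_escapes(text):
    """Expand backslash escape sequences in text."""
    out = []
    pos = 0
    while pos < len(text):
        ch = text[pos]
        pos += 1
        if ch != "\\" or pos == len(text):
            out.append(ch)  # ordinary character, or a trailing backslash
            continue
        code = text[pos]
        pos += 1
        if code in SIMPLE:
            out.append(SIMPLE[code])
        elif code.lower() in MAX_HEX:
            end = pos
            limit = pos + MAX_HEX[code.lower()]
            while end < len(text) and end < limit and text[end] in HEX_DIGITS:
                end += 1
            if end == pos:
                out.append(code)  # no digits: treat as quoting (assumption)
            else:
                # chr() raises ValueError above U+10FFFF.
                out.append(chr(int(text[pos:end], 16)))
                pos = end
        elif code in "\r\n":
            # Elide the end of line, including the second half of a
            # CR-LF or LF-CR pair.
            if pos < len(text) and text[pos] in "\r\n" and text[pos] != code:
                pos += 1
        else:
            out.append(code)  # quoting mechanism
    return "".join(out)
```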
Bracketed expressions are initiated by one of `(`, `[`, `{`, and terminated by the corresponding `)`, `]`, `}`.
`(` `)` are elided, replaced in the syntax tree by the bracketed sub-expression.
`[` `]` and `{` `}` are represented in the syntax tree by distinct operators.
`[` `]` is used to define arrays/environments.
`{` `}` is used to designate an evaluation block.
Where the bracketed expression is an operand-less operator sans space, then this forms a distinct operator.
An array can be indexed, associative, or a mix of both. Arrays can also act as an environment (aka name-space or scope).
Indexes are zero-based. Assigning to the last index + 1 appends a new entry.
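One way to picture a mixed indexed/associative array is as an ordered part plus a keyed part. The Python class below is a hypothetical model, not the interpreter's data structure; rejecting an index beyond last + 1 is an assumption, since the text only describes the append case.

```python
class MixedArray:
    """Toy model of an array with both indexed and associative entries."""

    def __init__(self):
        self.indexed = []  # zero-based entries
        self.named = {}    # associative entries

    def __setitem__(self, key, value):
        if isinstance(key, int):
            if key == len(self.indexed):
                self.indexed.append(value)  # last + 1 appends
            elif 0 <= key < len(self.indexed):
                self.indexed[key] = value
            else:
                # Assumption: gaps are not allowed.
                raise IndexError("index out of range: %d" % key)
        else:
            self.named[key] = value

    def __getitem__(self, key):
        if isinstance(key, int):
            if not 0 <= key < len(self.indexed):
                raise IndexError("index out of range: %d" % key)
            return self.indexed[key]
        return self.named[key]
```

Assigning `a[0]`, then `a[1]`, then `a["name"]` on a fresh `MixedArray` gives two indexed entries and one associative entry.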
The following environments are predefined:
- local, which is the default scope within a function. It can be specifically invoked using the `(:)` operator. There are no predefined identifiers in the local environment.
- static, which is the default scope within a source file. It can be specifically invoked using the `{:}` operator. There are no predefined identifiers in the static environment.
- global, which is available to all. It can be specifically invoked using the `[:]` operator. Unless oboe is invoked with the `--math` option, there are no predefined identifiers in the global environment.
- system, which can be accessed via the sigil operator.
When an environment is applied to an expression or expression-list, it is automatically linked to the current environment.
An anonymous environment can be utilized to limit the scope of variables.
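Linked environments can be modelled as a chain of name tables. The Python sketch below is illustrative only: the `Environment` class is hypothetical, and it assumes that linking means a failed lookup continues in the environment that was current when the new one was applied.

```python
class Environment:
    """A name table linked to the environment it was created in."""

    def __init__(self, parent=None):
        self.names = {}
        self.parent = parent

    def declare(self, name, value):
        self.names[name] = value

    def lookup(self, name):
        # Walk outwards through the links (assumed lookup order).
        env = self
        while env is not None:
            if name in env.names:
                return env.names[name]
            env = env.parent
        raise NameError(name)

# An anonymous environment limits the scope of `temp` to itself.
outer = Environment()
outer.declare("x", 1)
anonymous = Environment(parent=outer)
anonymous.declare("temp", 2)
```

Here `anonymous.lookup("x")` finds the outer declaration, while `outer.lookup("temp")` fails because `temp` never leaves the anonymous environment.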
An evaluation block (`{` `}`) is used to demarcate a block of code; primarily this will be used with conditional expressions, to isolate a block of code and avoid unwanted interaction with the `;` operator, which is utilized to designate alternate program flow paths.
- applicate, has no lexical representation, but is invoked by adjacency.
- `,` sequence, creates a list of expressions.
- `;` assemblage, creates a list of sequences/expressions.
See `lex.h` for permitted lexeme characters.
Where an operand-less operator is bracketed sans space, then this forms a distinct operator.
User-defined operators can be named by prefixing an identifier with '`' and can also be terminated with another back-tick.
All operators are inherently binary; when used as a unary operator, the operator is still parsed at the same precedence level; therefore, when an operator is used as a unary operator in a sub-expression, the sub-expression should be parenthesized.
See `lex.h` for permitted lexeme characters.
Expressions are evaluated left to right.
Precedence levels, in decreasing order, are:
- Primary (Values, Identifiers, Sub-expressions)
- Applicate
- Binding
- Exponential
- Multiplicative
- Additive
- Bitwise
- Relational
- Logical
- Conditional
- Assigning
- Declarative
- Interstitial
- Sequence
- Assemblage
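A left-to-right parser driven by a precedence table like the one above might look as follows. This Python sketch is not the interpreter's parser: it covers only a hypothetical subset of operators, assumes tokens are separated by spaces, and assumes operators at the same level group left to right. The level numbers are arbitrary; only their order matters.

```python
# Hypothetical subset of the table: a lower number binds tighter.
LEVELS = {
    "*": 4, "/": 4,            # Multiplicative
    "+": 5, "-": 5,            # Additive
    "&": 6, "|": 6,            # Bitwise
    "<": 7, ">": 7, "==": 7,   # Relational
    "&&": 8, "||": 8,          # Logical
}
TIGHTEST = min(LEVELS.values())
LOOSEST = max(LEVELS.values())

def parse_primary(tokens):
    """Parse a value, an identifier, or a parenthesized sub-expression."""
    token = tokens.pop(0)
    if token == "(":
        inner = parse(tokens, LOOSEST)
        assert tokens.pop(0) == ")", "missing closing parenthesis"
        return inner  # parentheses are elided from the tree
    return int(token) if token.isdigit() else token

def parse(tokens, level):
    """Parse operators at `level` and tighter into (op, left, right) tuples."""
    if level < TIGHTEST:
        return parse_primary(tokens)
    left = parse(tokens, level - 1)
    while tokens and LEVELS.get(tokens[0]) == level:
        op = tokens.pop(0)
        right = parse(tokens, level - 1)
        left = (op, left, right)  # left-to-right grouping
    return left

def parse_expression(text):
    tokens = text.split()
    tree = parse(tokens, LOOSEST)
    assert not tokens, "unexpected trailing tokens"
    return tree
```

Because Bitwise sits above Relational in the list, `a & b < c` groups as `(a & b) < c` in this sketch, unlike in C.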
Although the goal is binary operators only, the simplicity of the parsing implementation gives us unary operators for free; it would require more code to enforce binary-only. However, in the syntax tree all operators are binary, with unary operations represented by a non-value operand (internally this is the Zen type: Zero/Empty/Null). The empty parenthesis `()` operator can be used to specify Zen explicitly.
The more detailed grammar (e.g. declaration, selection, iteration) is handled at runtime; but is built from binary operators.
left-operand `;` right-operand
An assemblage may be evaluated differently when used as an operand, but is otherwise evaluated thus:
left-operand is evaluated, then right-operand is evaluated, the result of evaluating the right-operand is returned.
left-operand `,` right-operand
A sequence may be evaluated differently when used as an operand, but is otherwise evaluated thus:
left-operand is evaluated, then right-operand is evaluated, and a new sequence of the results is created. Individual operators (e.g. conditional, iteration or selection) may handle sequences differently in certain instances.
operand `..` operand
either:
referent `:` operand
or:
referent `::` operand
or:
referent `:^` reference
or:
referent `(` parameter? (`,` parameter)* `)` `:` operand
or:
[precedence-operator-string]? operator-string `(` parameter? (`,` parameter)* `)` `:` operand
or:
operator-string `:` operator-string
Normally, non-operator declarations are made in the static environment (source-file, function, ...); if the global environment operator `[:]` is applied to the declaration then it is made in the global environment. Operator declarations are always made in the global environment.
Declarations within a non-static scope (e.g. within a function) can be made static by applying the static environment operator `{:}`. They will be visible to all functions that share the same static scope, e.g. within the same source file.
either:
reference `=` operand
or:
reference `=^` reference
either:
condition `?` true-operand
condition is evaluated, and if the result, when cast to a boolean value, evaluates to true, then true-operand is evaluated.
or:
condition `?` `(` true-operand `;` false-operand `)`
condition is evaluated, and if the result, when cast to a boolean value, evaluates to true, then true-operand is evaluated; otherwise false-operand is evaluated.
Sequences in condition are evaluated as a simple list of expressions, each evaluated in turn; the result of evaluating the final expression in the sequence determines the condition.
operand `?` Zen
Zen `?` operand
Either form evaluates operand and returns its boolean value.
The `!` operator is as above, except the condition is inverted.
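The conditional rules can be illustrated with a tiny tree evaluator. This Python sketch is not the interpreter's code: nodes are hypothetical `(operator, left, right)` tuples, `None` stands in for Zen, and returning Zen when a condition is false and no false-operand is given is an assumption.

```python
ZEN = None  # stand-in for the Zen (Zero/Empty/Null) operand

def flatten(node, op):
    """Flatten a chain of `op` nodes into a flat list of operands."""
    if isinstance(node, tuple) and node[0] == op:
        return flatten(node[1], op) + flatten(node[2], op)
    return [node]

def evaluate(node):
    if not isinstance(node, tuple):
        return node  # a literal value evaluates to itself
    op, left, right = node
    if op == ";":    # assemblage: evaluate both, return the right result
        evaluate(left)
        return evaluate(right)
    if op == ",":    # sequence: a new sequence of the results
        return [evaluate(item) for item in flatten(node, ",")]
    if op in ("?", "!"):
        return conditional(left, right, invert=(op == "!"))
    raise ValueError("unknown operator: %r" % (op,))

def conditional(condition, branches, invert=False):
    if condition is ZEN:  # Zen ? operand
        return bool(evaluate(branches)) != invert
    # A sequence in the condition: the final result decides.
    results = [evaluate(item) for item in flatten(condition, ",")]
    truth = bool(results[-1]) != invert
    if branches is ZEN:   # operand ? Zen
        return truth
    if isinstance(branches, tuple) and branches[0] == ";":
        _, on_true, on_false = branches
        return evaluate(on_true if truth else on_false)
    return evaluate(branches) if truth else ZEN  # assumption: Zen when false
```

With this model, `("?", 1, (";", "yes", "no"))` evaluates to `"yes"`, and the same tree under `"!"` evaluates to `"no"`.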
either:
condition `?:` `(` (case-expression `:` action-expression `;` )+ default-action-expression? `)`
or:
Zen `?:` `(` (case-expression `:` action-expression `;` )+ default-action-expression? `)`
Sequences in condition are evaluated as a simple list of expressions, each evaluated in turn; the result of evaluating the final expression in the sequence determines the condition.
either:
iteration-control `?*` iteration-operand
or:
iteration-control `?*` `(` iteration-operand `;` no-iteration-operand `)`
no-iteration-operand is evaluated if the controlling condition never evaluates true.
or:
iteration-control `?*` Zen
where iteration-control is either:
condition
or:
`(` initialization `;` condition `)`
or:
`(` initialization `;` condition `;` recalculation `)`
or:
`(` identifier `:` range [ `&&` condition] `)`
or:
`(` identifier `=` range [ `&&` condition] `)`
or:
`(` identifier `:` sequence [ `&&` condition] `)`
or:
`(` identifier `=` sequence [ `&&` condition] `)`
or:
`(` identifier `:` array`[`range`]` [ `&&` condition] `)`
or:
`(` identifier `=` array`[`range`]` [ `&&` condition] `)`
or:
`(` identifier `:` `[`initializer`]` [ `&&` condition] `)`
or:
`(` identifier `=` `[`initializer`]` [ `&&` condition] `)`
The `!*` operator is as above, except the condition is inverted; this does not apply to ranges.
Sequences in initialization, condition and recalculation are evaluated as a simple list of expressions, each evaluated in turn; in the case of condition, the result of evaluating the final expression in the sequence determines the condition.
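The condition-driven forms of iteration can be modelled with plain callables. This Python sketch is an illustration, not the interpreter's implementation: the function `iterate` and its parameter names are hypothetical, and it assumes recalculation runs after each pass of the body, as in a C `for` loop. The range, sequence and array forms are not covered.

```python
def iterate(condition, body, initialization=None, recalculation=None,
            no_iteration=None):
    """Run body while condition() is true; return whether it ever ran."""
    if initialization is not None:
        initialization()
    ran = False
    while condition():
        ran = True
        body()
        if recalculation is not None:
            recalculation()  # assumption: after the body, as in C
    if not ran and no_iteration is not None:
        # The controlling condition never evaluated true.
        no_iteration()
    return ran

# Usage: count from 0 to 2, recording each value.
state = {"i": 0}
seen = []
iterate(
    condition=lambda: state["i"] < 3,
    body=lambda: seen.append(state["i"]),
    initialization=lambda: state.update(i=0),
    recalculation=lambda: state.update(i=state["i"] + 1),
)
```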
left-operand operator right-operand
The following operators are built-in:
- `&&` logical AND
- `||` logical OR
- `<` less than
- `<=` less than or equal
- `==` equal
- `<>` not equal
- `>=` greater than or equal
- `>` greater than
- `&` bitwise AND
- `|` bitwise OR
- `~` bitwise XOR
- `+` add
- `-` subtract
- `*` multiply
- `/` divide
- `//` modulo
- `<<` shift left
- `>>` shift right
- `<<<` extract left
- `>>>` extract right
- `<<>` rotate left
- `<>>` rotate right
The non-comparative operators also have a self-assigning form, e.g.
reference `+=` operand
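Several of the built-in symbols differ from their C spelling (`<>` for not equal, `~` for XOR, `//` for modulo). The Python sketch below maps a subset of them onto Python's `operator` module and models the self-assigning form as `reference = reference op operand`. It is an assumption-laden illustration: the names `apply` and `self_assign` are hypothetical, the logical, divide, extract and rotate operators are omitted because their exact semantics are not described here, and results for negative operands may differ from a C implementation.

```python
import operator

COMPARATIVE = {
    "<": operator.lt, "<=": operator.le, "==": operator.eq,
    "<>": operator.ne, ">=": operator.ge, ">": operator.gt,
}
NON_COMPARATIVE = {
    "&": operator.and_, "|": operator.or_, "~": operator.xor,
    "+": operator.add, "-": operator.sub, "*": operator.mul,
    "//": operator.mod,  # modulo, not floor division
    "<<": operator.lshift, ">>": operator.rshift,
}
BINARY = {**COMPARATIVE, **NON_COMPARATIVE}

def apply(symbol, left, right):
    """Apply a built-in binary operator to two already-evaluated operands."""
    return BINARY[symbol](left, right)

def self_assign(env, name, symbol, operand):
    """Model `name op= operand` as `name = name op operand`."""
    base = symbol[:-1]
    if not symbol.endswith("=") or base not in NON_COMPARATIVE:
        raise ValueError("no self-assigning form: %r" % symbol)
    env[name] = apply(base, env[name], operand)
    return env[name]
```

For example, with `env = {"x": 5}`, calling `self_assign(env, "x", "+=", 2)` leaves `env["x"]` at 7.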
either:
left-operand right-operand
or:
operator right-operand
operand `@` identifier
Used to access the attributes and functions of operand (e.g. type query, type conversion).
When Zen is the operand, it provides access to the system library.