`to-be-named`

This project is a hobby language. I'm just trying to learn a little about assembly and what it takes to create a compiler.

In this project you can find code for a tokenizer, a parser and a generator. The generator outputs Netwide Assembler (NASM) source, from which you can later generate a binary for many platforms ...apparently. I haven't tried other platforms, but it works on my machine.

Usage

Write a text file with valid syntax (see next section) and the extension `.tbd` (will change when I find a good name ...or not).

Build the project with `go build`; it will produce an executable named `compiler`.
Run `./compiler path-to-source-file.tbd`.
- It will produce an `output.asm`.
Use `nasm` to generate the object file; the output format depends on the target platform.
- It will produce an `output.o`.
- Example for x86-64 Linux: `nasm -f elf64 output.asm`
- Use `nasm --help` and see the `-f` format section for other systems.
- I don't know how other systems react to the assembly my generator outputs, so fingers crossed.
Run `ld output.o -o runnable`.
- It will generate an executable named `runnable`.
- (I don't remember where I installed `ld`)
Then you can just run it with `./runnable`.

For now the only thing you can do with the language is return different exit codes. To see the result, run `echo $?`, which prints the exit code of the last command you executed. I don't know how to do that on other platforms...

Syntax

For now you can only initialize variables of type int and return that as the exit code, here's an example:

int num = 420
exit num

Or simply return some value like this:

exit 69

Will expand when new things get added.
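
One detail worth knowing when trying the examples above: on Linux the parent shell only sees the low 8 bits of an exit status, so `exit 420` would be expected to show up as 164 in `echo $?`. This assumes the generated program passes the value straight to the Linux exit system call, which is an assumption about the generator. The helper name `ObservedStatus` is made up for this sketch:

```go
package main

import "fmt"

// ObservedStatus mirrors how Linux reports an exit status to the parent:
// only the low 8 bits are kept. Hypothetical helper, not part of the project.
func ObservedStatus(code int) int {
	return code & 0xFF
}

func main() {
	fmt.Println(ObservedStatus(420)) // 420 = 256 + 164
	fmt.Println(ObservedStatus(69))  // already fits in 8 bits
}
```

Values from 0 to 255 come back unchanged, so `exit 69` reads back as 69.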

How does it work?

Tokenizer

The tokenizer reads a sequence of runes that contains the source code and returns a list of tokens.

This does not check for grammatical errors like a missing closing parenthesis; it just returns the list.

Tokens

| Separators | Value | Description |
| --- | --- | --- |
| SEP | `\n` | Separator |
| P_L | `(` | Parentheses left |
| P_R | `)` | Parentheses right |
| B_L | `{` | Brace left |
| B_R | `}` | Brace right |
| SB_L | `[` | Square brace left |
| SB_R | `]` | Square brace right |

| Operations | Value | Description |
| --- | --- | --- |
| ADD | `+` | Addition |
| SUB | `-` | Subtraction |
| MUL | `*` | Multiplication |
| DIV | `/` | Division |

| Others | Value | Description |
| --- | --- | --- |
| EQ | `=` | Assignment operator |

| Keyword | Value | Description |
| --- | --- | --- |
| INT | `int` | Integer type |
| BOOL | `bool` | Boolean type |
| EXIT | `exit` | Exit command |

| Matchers | Regex | Description |
| --- | --- | --- |
| IDENTIFIER | `[a-zA-Z][a-zA-Z0-9_]*` | Identifies names of variables |
| INT_LITERAL | `[0-9]+` | Represents literal numbers |
| BOOL_LITERAL | `true\|false` | Represents literal booleans |
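
To make the tokenizer description concrete, here is a minimal sketch of a rune-based tokenizer that uses the token names listed above. It is not the project's actual code: the `Token` struct, the `Tokenize` function and the choice to silently skip whitespace and unknown characters are all assumptions made for illustration.

```go
package main

import "fmt"

// Token is a hypothetical token shape; the real project may store more.
type Token struct {
	Kind  string
	Value string
}

// Single-character tokens, named as in the tables above.
var singles = map[rune]string{
	'\n': "SEP", '(': "P_L", ')': "P_R", '{': "B_L", '}': "B_R",
	'[': "SB_L", ']': "SB_R", '+': "ADD", '-': "SUB", '*': "MUL",
	'/': "DIV", '=': "EQ",
}

// Words with a fixed meaning; any other word is an IDENTIFIER.
var words = map[string]string{
	"int": "INT", "bool": "BOOL", "exit": "EXIT",
	"true": "BOOL_LITERAL", "false": "BOOL_LITERAL",
}

func isDigit(r rune) bool  { return r >= '0' && r <= '9' }
func isLetter(r rune) bool { return (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z') }

// Tokenize walks the runes once and emits tokens. It performs no grammar
// checks, so an unbalanced "(" is returned like any other token.
func Tokenize(src []rune) []Token {
	var tokens []Token
	for i := 0; i < len(src); {
		r := src[i]
		if kind, ok := singles[r]; ok {
			tokens = append(tokens, Token{kind, string(r)})
			i++
			continue
		}
		start := i
		switch {
		case isDigit(r):
			for i < len(src) && isDigit(src[i]) {
				i++
			}
			tokens = append(tokens, Token{"INT_LITERAL", string(src[start:i])})
		case isLetter(r):
			for i < len(src) && (isLetter(src[i]) || isDigit(src[i]) || src[i] == '_') {
				i++
			}
			word := string(src[start:i])
			kind, ok := words[word]
			if !ok {
				kind = "IDENTIFIER"
			}
			tokens = append(tokens, Token{kind, word})
		default:
			i++ // spaces and unrecognized runes are skipped in this sketch
		}
	}
	return tokens
}

func main() {
	// The second line has an unmatched "(" on purpose.
	for _, t := range Tokenize([]rune("int num = 420\nexit (num")) {
		fmt.Printf("%s %q\n", t.Kind, t.Value)
	}
}
```

The unmatched parenthesis in the sample input still comes out as a plain `P_L` token, which is the point: catching that kind of error is left to the parser.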

Parser

[To be refactored]

Takes a list of tokens and generates an abstract syntax tree (AST), which represents the syntactic structure of the source code.

This process drops separators as they are only needed to delimit different parts of the code but do not need to be saved in the AST, since these limits would be implied by the structure itself.

Some nodes of the tree do refer to values from the tokens. For example, a node that represents an assignment might want to save the type to be assigned, the name of the variable to create, and the value. The value might in turn be an expression node that represents an addition, since you could assign the result of a sum to the variable instead of a literal.
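
As an illustration of the paragraph above, this is one way such nodes could be modelled in Go. The type names (`Expr`, `IntLiteral`, `Identifier`, `BinaryOp`, `Declaration`) and the `Describe` helper are hypothetical and are not taken from the project's parser, which may be shaped differently.

```go
package main

import (
	"fmt"
	"strconv"
)

// Expr is any node that produces a value.
type Expr interface{ exprNode() }

type IntLiteral struct{ Value int }
type Identifier struct{ Name string }

// BinaryOp holds an operator and its two operand expressions.
type BinaryOp struct {
	Op          string // "+", "-", "*" or "/"
	Left, Right Expr
}

func (IntLiteral) exprNode() {}
func (Identifier) exprNode() {}
func (BinaryOp) exprNode()   {}

// Declaration keeps the values taken from the tokens: the type, the
// variable name, and the expression that computes the value.
type Declaration struct {
	Type  string
	Name  string
	Value Expr
}

// Describe renders an expression with explicit grouping, to show that the
// tree structure makes separators unnecessary.
func Describe(e Expr) string {
	switch n := e.(type) {
	case IntLiteral:
		return strconv.Itoa(n.Value)
	case Identifier:
		return n.Name
	case BinaryOp:
		return "(" + Describe(n.Left) + " " + n.Op + " " + Describe(n.Right) + ")"
	}
	return "?"
}

func main() {
	// AST for the source line: int total = price + 2
	decl := Declaration{
		Type:  "int",
		Name:  "total",
		Value: BinaryOp{Op: "+", Left: Identifier{"price"}, Right: IntLiteral{2}},
	}
	fmt.Println(decl.Type, decl.Name, "=", Describe(decl.Value))
}
```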

Grammar

$$
\begin{aligned}
[\text{Program}] &\to [\text{Statement}]^* \\
[\text{Statement}] &\to
\begin{cases}
\text{int}\space\textit{identifier}\space\text{=}\space[\text{Expression}] \\
\textit{identifier}\space\text{=}\space[\text{Expression}] \\
\text{exit}\space[\text{Expression}] \\
[\text{Scope}]
\end{cases} \\
[\text{Scope}] &\to
\begin{cases}
\{[\text{Statement}]^*\}
\end{cases} \\
[\text{Expression}] &\to
\begin{cases}
[\text{Term}] \\
[\text{Operation}]
\end{cases} \\
[\text{Term}] &\to
\begin{cases}
\textit{literal} \\
\textit{identifier}
\end{cases} \\
[\text{Operation}] &\to
\begin{cases}
[\text{Expression}]\space\text{+}\space[\text{Expression}] \\
[\text{Expression}]\space\text{-}\space[\text{Expression}] \\
[\text{Expression}]\space\text{*}\space[\text{Expression}] \\
[\text{Expression}]\space\text{/}\space[\text{Expression}]
\end{cases}
\end{aligned}
$$
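
The sketch below shows how three of the statement forms in the grammar could be recognized with a small hand-written parser. It is not the project's parser. To stay short it works on whitespace-separated words instead of real tokens, leaves out scopes, and groups operations to the left with no operator precedence, because the grammar as written does not say how `1 + 2 * 3` should be grouped. All names (`parser`, `ParseStatement`, and so on) are made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// parser walks a flat list of words; a real parser would walk tokens.
type parser struct {
	toks []string
	pos  int
}

func (p *parser) peek() string {
	if p.pos < len(p.toks) {
		return p.toks[p.pos]
	}
	return "" // end of input
}

func (p *parser) next() string {
	t := p.peek()
	p.pos++
	return t
}

func isOp(t string) bool { return t == "+" || t == "-" || t == "*" || t == "/" }

// term accepts a literal or an identifier.
func (p *parser) term() (string, error) {
	t := p.next()
	if t == "" || isOp(t) || t == "=" {
		return "", fmt.Errorf("expected a literal or identifier, got %q", t)
	}
	return t, nil
}

// expression parses Term (op Term)*, grouping to the left.
func (p *parser) expression() (string, error) {
	left, err := p.term()
	if err != nil {
		return "", err
	}
	for isOp(p.peek()) {
		op := p.next()
		right, err := p.term()
		if err != nil {
			return "", err
		}
		left = "(" + left + " " + op + " " + right + ")"
	}
	return left, nil
}

// ParseStatement handles declaration, assignment and exit statements.
func ParseStatement(line string) (string, error) {
	p := &parser{toks: strings.Fields(line)}
	prefix := ""
	switch first := p.next(); first {
	case "exit":
		prefix = "exit "
	case "int":
		prefix = "declare " + p.next() + " = "
		if p.next() != "=" {
			return "", fmt.Errorf("expected = in declaration")
		}
	default:
		prefix = "assign " + first + " = "
		if p.next() != "=" {
			return "", fmt.Errorf("expected = after %q", first)
		}
	}
	e, err := p.expression()
	if err != nil {
		return "", err
	}
	if p.pos < len(p.toks) {
		return "", fmt.Errorf("unexpected %q", p.peek())
	}
	return prefix + e, nil
}

func main() {
	for _, line := range []string{"int num = 400 + 20", "num = num / 2", "exit num", "exit +"} {
		out, err := ParseStatement(line)
		fmt.Printf("%q -> %q, err: %v\n", line, out, err)
	}
}
```

A parser built on a real AST would return nodes instead of strings; strings are used here only to keep the grouping visible.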

Generator

[To be refactored]
