
EZSharp Compiler

EZSharp is a Java-like language originally created by Professor Eugene Zima at Wilfrid Laurier University. This compiler is a project for the CP471 course at WLU taught by Professor Nakhat Fatima. The compiler is written in Rust to take advantage of its performance and memory safety features.

Features

These are the features required to complete the project; the list is updated as the project progresses.

  1. ✅ Lexical Analysis
  2. ✅ Syntax Analysis
  3. ✅ Semantic Analysis
  4. ✅ Intermediary Code Generation
  5. ❌ Optimization
  6. ❌ Assembly Code Generation

Building

To build the project, you need to have Rust installed. You can install Rust by following the instructions on the official website.

Once you have Rust installed, you can build the project by running the following command in the root directory of the project:

cargo build

This will create a target directory in the root of the project with the compiled binary.
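
Cargo also supports an optimized release build, which places the binary under target/release instead of target/debug:

# Debug build (default); the binary is written to target/debug/
cargo build

# Optimized release build; the binary is written to target/release/
cargo build --release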

Usage

To compile an EZSharp file, you can run the following command in the root directory of the project:

cargo run -- <path-to-ezsharp-file>

Or you can run the compiled binary for the current release directly (tested on Windows 11 x86-64 only):

./ezsharp_compiler.exe <path-to-ezsharp-file>

The compiler outputs the tokens found during Lexical Analysis, the symbols found during Syntax Analysis, and the 3-TAC program:

  • The outputs are logged to files called tokens.log, symbol_table.log, and o.tac respectively, in a directory called logs in the root of the project.
  • Any errors found during Lexical Analysis are also logged to a file called lexical_errors.log in the same directory.
  • Any errors found during Syntax Analysis are also logged to a file called syntax_errors.log in the same directory.

The compiler supports the following options:

  • --log-folder <path-to-log-folder>: Folder to log the outputs and errors to (default is logs).
  • --output <output-file-name>: Output file for the 3-TAC program (default is o.tac).
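
For example, to write the logs to a custom folder and give the 3-TAC output a different name (the folder and file names here are arbitrary, and the options are assumed to be passed after the input file):

cargo run -- test_programs/Test10.cp --log-folder my_logs --output program.tac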

Examples

The test_programs directory contains some sample EZSharp programs that can be used to test the compiler.

Given the following EZSharp program in a file called test_programs/Test10.cp:

// Hello world
def int add(int x, int y)
    int z;
    return x + y;
fed;

int x, c, a[(5 > 3) * 3], b;
double d[1];
c = -23;
x = 3 + 2;

// while with brackets
while ((x + a[0]) < 3 and (x / 3 > 2)) do
    x = x + 1;
    if (x > 3) then
        print add(add(x, 3), 2);
    fi;
od;

a[1] = 2 + a[x * 3];.

It can be compiled by running the following command in the root directory of the project:

cargo run -- test_programs/Test10.cp

This outputs the following tokens to the tokens.log file:

Kdef on line 2
Kint on line 2
Identifier("add") on line 2
Soparen on line 2
Kint on line 2
Identifier("x") on line 2
...
Identifier("x") on line 20
Omultiply on line 20
Tint(3) on line 20
Scbracket on line 20
Ssemicolon on line 20
Speriod on line 20
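
The token names appear to follow a simple prefix convention: K for keywords, S for separators, O for operators, and T for typed literals. A minimal sketch of what such a token type might look like in Rust (illustrative only, not necessarily the enum the compiler actually uses):

// Illustrative token type; variant names mirror the log output above.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    // Keywords, e.g. Kdef, Kint
    Kdef,
    Kint,
    // Separators, e.g. Soparen, Scbracket, Ssemicolon, Speriod
    Soparen,
    Scbracket,
    Ssemicolon,
    Speriod,
    // Operators, e.g. Omultiply
    Omultiply,
    // Typed literals, e.g. Tint(3)
    Tint(i32),
    // Identifiers, e.g. Identifier("add")
    Identifier(String),
}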

It also outputs the following symbols to the symbol_table.log file:

{
	add: func(int, int) -> int
	{
		x: int
		y: int
		z: int
	}
	x: int
	c: int
	a: [int; 3]
	b: int
	d: [double; 1]
}
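
Each pair of braces in the log corresponds to one scope: the outer scope holds the globals and the add function, and the function's parameters and locals live in a nested scope. A minimal sketch of how such a table could be modelled (illustrative only; these are not the compiler's actual types):

use std::collections::HashMap;

// Illustrative symbol kinds matching the entries in the log above.
#[derive(Debug)]
enum SymbolType {
    Int,                                                     // x: int
    Double,                                                  // element type of d: [double; 1]
    Array { elem: Box<SymbolType>, len: usize },             // a: [int; 3]
    Func { params: Vec<SymbolType>, ret: Box<SymbolType> },  // add: func(int, int) -> int
}

// One scope per pair of braces; nested scopes hold function bodies.
#[derive(Debug, Default)]
struct Scope {
    symbols: HashMap<String, SymbolType>,
    children: Vec<Scope>,
}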

And it outputs the following 3-TAC code to the o.tac file:

	Goto main0;
add1:
	BeginFunc 4;
	y1 = GetParams 4;
	x1 = GetParams 4;
	t0_ = x1 + y1;
	Return t0_;
	EndFunc;
...
	PopParams 4;
fi4:
	Goto while0;
od1:
	t19_ = x0 * 3;
	*(a0 + 1) = 2 + *(a0 + t19_);
	EndFunc;
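
The instruction forms visible in the excerpt (labels, Goto, BeginFunc/EndFunc, GetParams/PopParams, binary assignments, Return, and pointer-style array accesses) could be summarized roughly as follows; this is an illustrative sketch, not the compiler's internal representation:

// Illustrative 3-TAC instruction forms, mirroring the o.tac excerpt above.
enum TacInstruction {
    Label(String),                                                // add1:, fi4:, od1:
    Goto(String),                                                 // Goto main0;
    BeginFunc(usize),                                             // BeginFunc 4;
    EndFunc,                                                      // EndFunc;
    GetParam { dest: String, size: usize },                       // y1 = GetParams 4;
    PopParams(usize),                                             // PopParams 4;
    BinOp { dest: String, lhs: String, op: String, rhs: String }, // t0_ = x1 + y1;
    Return(String),                                               // Return t0_;
    ArrayStore { base: String, offset: String, value: String },   // *(a0 + 1) = ...;
}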

Future Improvements

  • Lexical Analysis

    • Automate First and Follow sets generation
  • Syntax Analysis

    • Give better error messages (using empty cells in LL(1) table)
    • Clean up symbol table creation (completed in Semantic Analysis)
    • Add support for parentheses in boolean expressions (completed in Semantic Analysis)
    • Add support for negated expressions (grammar change) (completed in Semantic Analysis)
  • Semantic Analysis

    • Clean up semantic actions
  • Intermediary Code Generation

    • Use standard 3-TAC format
    • 3-TAC optimization
    • Properly propagate types to temp variables

Additional Notes

  • The productions for this grammar and the First and Follow sets were generated manually and can be found in the simplified_productions.txt and first_follow_set.txt files, respectively.
  • The LL(1) table is generated automatically using the First and Follow sets, but a copy of what it looks like can be found in the LL1_table.csv file.
    • The production indices are the same as the indices for the productions found in the syntax_analysis/productions.rs file.
    • A production index outside the bounds of the productions array is used to represent all productions that derive epsilon.
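
As a rough illustration of how a table-driven LL(1) parser can use such an out-of-bounds sentinel (the constant names and table shape here are hypothetical, not the ones in syntax_analysis/productions.rs):

// Hypothetical sketch of an LL(1) table lookup with a sentinel index for
// epsilon productions, as described above.
const NUM_PRODUCTIONS: usize = 100;                 // hypothetical production count
const EPSILON_PRODUCTION: usize = NUM_PRODUCTIONS;  // one past the end of the array

fn lookup(table: &[Vec<Option<usize>>], nonterminal: usize, terminal: usize) -> Result<Option<usize>, String> {
    match table[nonterminal][terminal] {
        // Sentinel: the nonterminal derives epsilon, so nothing is pushed.
        Some(EPSILON_PRODUCTION) => Ok(None),
        // A real production: the index matches syntax_analysis/productions.rs.
        Some(index) => Ok(Some(index)),
        // Empty cell: a syntax error (and a place to produce better messages).
        None => Err(format!("unexpected terminal {} while expanding nonterminal {}", terminal, nonterminal)),
    }
}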