Rex is a tiny regular expression engine written in Rust. It features a parser and interpreter and supports many common RegEx features.
cargo run -- <input file> <expression> [options]
Or, to build a binary:
cargo build
./rex <input file> <expression> [options]
Arguments can be provided in any order. In some shells (like zsh), expressions should be wrapped in quotation marks.
To run the tests:
cargo test
- `-ng` / `--no-groups`: Ignore matching groups (order of operations still applies).
- `-b` / `--benchmark`: Benchmark performance (results will not be printed).
- Concatenation: `abc`
- Union: `a|b`
- Kleene closure: `a*`
  - One or more: `a+`
  - Zero or one: `a?`
- Grouping: `(a|b)*`
  - All groups are matching groups
- Escaping: `a\*`
  - Common escape codes: `\t`, `\n`, `\v`, `\f`, `\r`
  - Unicode escape codes: `\u2603`
    - Multi-character Unicode will compile but fail to interpret (TODO)
  - ASCII escape codes (hex or dec): `\x61`, `\97`
    - ASCII escape codes will always be valid: `\x61b` = `ab`, `\971` = `a1`
- Charsets: `[abc]`
  - Negation: `[^xyz]`
  - Ranges: `[a-zA-Z]`
    - Can have a set of multiple character classes (e.g.: `[\s\w]`)
    - Can understand when `-` is meant literally (e.g.: `[\w-]`)
    - Can join characters with themselves (e.g.: `[a-a]`)
    - Cannot join character classes (e.g.: `[\w-~]`)
    - Cannot join characters "out of order" (e.g.: `[a-A]`)
- Common Perl ASCII character classes:
  - `.`: Any Unicode character (including `\n`)
  - `\d`: digit (`[0-9]`)
  - `\D`: not digit
  - `\w`: word (`[a-zA-Z0-9_]`)
  - `\W`: not word
  - `\s`: whitespace (`[\n-\r]`)
    - This range includes `\f`, which some versions of Perl do not
  - `\S`: not whitespace
  - `\N`: not newline (`[^\n]`)
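
The character classes and range rules listed above can be sketched in a few lines of Rust. This is only an illustration of the documented behaviour, not Rex's actual code: the function names `class_matches` and `checked_range` are hypothetical, and the class definitions are taken literally from the list (so `\s` is exactly the range `\n` through `\r`).

```rust
use std::ops::RangeInclusive;

/// Hypothetical helper (not part of Rex): reports whether `c` belongs to the
/// class written as `\<name>`, using the definitions from the list above.
/// Returns `None` for names that are not in that list.
fn class_matches(name: char, c: char) -> Option<bool> {
    let digit = c.is_ascii_digit(); // [0-9]
    let word = c.is_ascii_alphanumeric() || c == '_'; // [a-zA-Z0-9_]
    let space = ('\n'..='\r').contains(&c); // [\n-\r]: \n, \v, \f, \r
    match name {
        'd' => Some(digit),
        'D' => Some(!digit),
        'w' => Some(word),
        'W' => Some(!word),
        's' => Some(space),
        'S' => Some(!space),
        'N' => Some(c != '\n'), // [^\n]
        _ => None,
    }
}

/// Hypothetical helper (not part of Rex): builds a charset range such as
/// `a-z`. A character may be joined with itself (`a-a`), but endpoints that
/// are "out of order" (`a-A`) are rejected.
fn checked_range(lo: char, hi: char) -> Result<RangeInclusive<char>, String> {
    if lo <= hi {
        Ok(lo..=hi)
    } else {
        Err(format!("range {}-{} is out of order", lo, hi))
    }
}
```

For example, `class_matches('s', '\x0C')` accepts the form feed, while `checked_range('a', 'A')` returns an error because `a` sorts after `A`.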