
String Literals

7ombie edited this page Aug 27, 2021 · 4 revisions

Like WAT, PHANTASM delimits string literals with the Quote character (the double-quote character, not the apostrophe). However, the escape sequences are different.

Slash characters have no special meaning inside string literals. Instead, PHANTASM uses curly braces to wrap escape sequences.

Note: PHANTASM files can only contain ASCII printables and Newline characters, which (naturally) extends to strings.

Escaping Escape Characters

To represent an actual brace within a string literal, the required character must be doubled-up:

  • {{: Represents a single Open Brace character.
  • }}: Represents a single Close Brace character.

Note: Treating closing braces like opening braces is not strictly necessary (as the lexer could disambiguate a regular single closing brace from one used to terminate an escape sequence), but it is much easier for humans to parse literals if braces (that are balanced in the expressed string) are also balanced in the literal.
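The doubling rule is easy to apply mechanically when generating PHANTASM source from another program. The following Python helper is a minimal sketch of that idea; the name `escape_braces` is hypothetical, and the function only handles braces (it assumes the caller deals with quotes and non-ASCII characters separately).

```python
def escape_braces(text: str) -> str:
    """Double every brace so it is read as a literal brace character.

    Hypothetical helper: only braces are handled here. Quote characters
    and non-ASCII characters would need escape sequences of their own.
    """
    return text.replace("{", "{{").replace("}", "}}")


print(escape_braces("set = {1, 2, 3}"))  # set = {{1, 2, 3}}
```

Because both kinds of brace are doubled, braces that are balanced in the expressed string stay balanced in the literal.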

Escape Sequences & Expressions

PHANTASM uses a pair of curly braces to wrap an escape sequence. Each escape sequence can contain any number of space-separated escape expressions. Each expression evaluates to a single Unicode character.

Escape expressions can be named escape expressions or hexadecimal escape expressions.

Named Escape Expressions

Named escape expressions contain only lowercase letters. The spellings are defined by PHANTASM, and only commonplace characters have names.

The most common characters have two or more names, which permits shorter aliases for the longer names. The current set of all names is summarized below:

  • b | backspace: The ASCII Backspace character.
  • f | ff | formfeed: The ASCII Formfeed character.
  • n | newline: The ASCII Newline character.
  • q | quote: The ASCII Quote (or double-quote) character.
  • r | cr | return: The ASCII Carriage Return character.
  • s | space: The ASCII Space character.
  • t | tab: The ASCII Tab character.
  • v | vt | vtab: The ASCII Vertical Tab character.
  • z | zero | null: The ASCII Null character (a zero-byte).
  • times | multiply: The Multiplication Sign.
  • divide: The Division Sign.
  • en | endash: The En Dash character.
  • em | emdash: The Em Dash character.
  • paragraph: The Paragraph Marker.
  • section: The Section Marker.
  • check: The Check Mark character.
  • cross: The Cross Mark character.
  • up: The Up Arrow.
  • down: The Down Arrow.
  • left: The Left Arrow.
  • right: The Right Arrow.
  • ellipsis: The Ellipsis character.
  • star: The (five-pointed) White Star character.

Note: More names will be added over time.
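As an illustration, the ASCII names listed above can be modeled as a simple lookup table in which every alias maps to the same character. This Python sketch is not taken from the PHANTASM implementation; `NAMED_ESCAPES` and `resolve_name` are hypothetical names, and the non-ASCII names (times, endash, check and so on) are left out because this page does not state their code points.

```python
# Each entry pairs the aliases of one character with its ASCII code.
# Only the ASCII names are included; the code points of the non-ASCII
# names are not given on this page, so they are omitted rather than guessed.
_ASCII_NAMES = [
    (("b", "backspace"), 0x08),
    (("f", "ff", "formfeed"), 0x0C),
    (("n", "newline"), 0x0A),
    (("q", "quote"), 0x22),
    (("r", "cr", "return"), 0x0D),
    (("s", "space"), 0x20),
    (("t", "tab"), 0x09),
    (("v", "vt", "vtab"), 0x0B),
    (("z", "zero", "null"), 0x00),
]

# Flatten the aliases so that every spelling is a key in one dictionary.
NAMED_ESCAPES = {
    name: chr(code) for names, code in _ASCII_NAMES for name in names
}


def resolve_name(name: str) -> str:
    """Return the character for a named escape expression (hypothetical helper)."""
    try:
        return NAMED_ESCAPES[name]
    except KeyError:
        raise ValueError(f"unrecognized escape name: {name!r}") from None
```

Raising an error for unknown names mirrors the rule that unrecognized escapes are syntax errors, which is what leaves room for new names later.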

Hexadecimal Escape Expressions

Hexadecimal escape expressions use uppercase hexadecimal digits to express the Unicode code point for the desired character:

  • {1F680}: A rocket emoji (🚀).
  • {1F600}: A grinning face emoji (😀).
  • {1F600 1F680}: A grinning face, (immediately) followed by a rocket emoji (😀🚀).
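Going the other way, a tool that emits PHANTASM source could build a hexadecimal escape sequence from a run of characters. The Python function below is a rough sketch of that; `to_hex_escape` is a hypothetical name, and it assumes that code points are written without leading zeros, which this page does not specify.

```python
def to_hex_escape(chars: str) -> str:
    """Wrap the code points of `chars` in one brace-delimited escape sequence.

    Hypothetical helper. Each code point is written in uppercase hex with
    no leading zeros (an assumption), separated by single spaces.
    """
    return "{" + " ".join(format(ord(ch), "X") for ch in chars) + "}"


# U+1F600 is the grinning face and U+1F680 is the rocket.
print(to_hex_escape("\U0001F600\U0001F680"))  # {1F600 1F680}
```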

Named expressions can be mixed with hexadecimal expressions within a single escape sequence. The following example creates indented bullet points with smiley face emojis:

"{n t 1F600} Bulletpoint One {n t 1F600} Bulletpoint Two"

Note: Unrecognized escape sequences are always syntax errors, so the syntax can be expanded in the future.
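Putting the rules on this page together, a decoder for the body of a string literal (the text between the quotes) might look like the following Python sketch. It is not the PHANTASM lexer: `decode_literal`, `decode_expression` and `NAMES` are hypothetical, the name table is a small subset, and the sketch assumes that expressions are separated by exactly one space and that an empty escape sequence is an error.

```python
HEX_DIGITS = set("0123456789ABCDEF")

# Subset of the named escapes, enough for the example on this page.
NAMES = {
    "n": "\n", "newline": "\n",
    "t": "\t", "tab": "\t",
    "q": '"', "quote": '"',
    "s": " ", "space": " ",
}


def decode_expression(expr: str) -> str:
    """Evaluate one escape expression to a single character."""
    # Uppercase hex digits never collide with the lowercase names.
    if expr and set(expr) <= HEX_DIGITS:
        return chr(int(expr, 16))
    if expr in NAMES:
        return NAMES[expr]
    raise ValueError(f"unrecognized escape expression: {expr!r}")


def decode_literal(body: str) -> str:
    """Decode the text between the quotes of a string literal."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if body.startswith("{{", i) or body.startswith("}}", i):
            out.append(ch)  # a doubled brace is one literal brace
            i += 2
        elif ch == "{":
            end = body.find("}", i)
            if end == -1:
                raise ValueError("unterminated escape sequence")
            # Assumption: expressions are separated by exactly one space.
            for expr in body[i + 1:end].split(" "):
                out.append(decode_expression(expr))
            i = end + 1
        elif ch == "}":
            raise ValueError("single close brace outside an escape sequence")
        else:
            out.append(ch)
            i += 1
    return "".join(out)


decoded = decode_literal("{n t 1F600} Bulletpoint One {n t 1F600} Bulletpoint Two")
print(decoded)
```

Applied to the bullet point example, each `{n t 1F600}` sequence decodes to a newline, a tab and a grinning face, in that order.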