notes
jonmeow committed Mar 19, 2024
1 parent e3c02cc commit 1c86143
Showing 1 changed file with 57 additions and 25 deletions.
82 changes: 57 additions & 25 deletions proposals/p3797.md
@@ -22,8 +22,8 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
- [Rationale](#rationale)
- [Alternatives considered](#alternatives-considered)
- [Other raw identifier syntaxes](#other-raw-identifier-syntaxes)
- [Restrict `r#` to current and future keywords](#restrict-r-to-current-and-future-keywords)
- [Don't require `r#` for references to raw identifiers](#dont-require-r-for-references-to-raw-identifiers)
- [Restrict raw identifier syntax to current and future keywords](#restrict-raw-identifier-syntax-to-current-and-future-keywords)
- [Don't require syntax for references to raw identifiers](#dont-require-syntax-for-references-to-raw-identifiers)
- [Don't provide raw identifier syntax](#dont-provide-raw-identifier-syntax)

<!-- tocstop -->
@@ -129,40 +129,72 @@ Advantages to `r#` are:
literals. The `r#` syntax offers consistency with this, and will hopefully
be recognizable to users.
- Consistency with Rust.
- Avoids reserving an otherwise unused character for a syntax that should have
narrow usage.

Disadvantages are that any `r`-prefixed identifier parses substantially slower,
as noted in [PR #3044](https://github.com/carbon-language/carbon-lang/pull/3344)
which implemented `r#` syntax.

Other syntaxes we considered are:

- `#` without `r`.
A disadvantage is that any `r`-prefixed identifier parses substantially slower,
as noted by the benchmarks in
[PR #3044](https://github.com/carbon-language/carbon-lang/pull/3344) which
implemented `r#` syntax. A 2% overall benchmark slowdown indicates that
`r`-prefixed identifiers are around 2x slower, because they make up about 1 in
55 identifiers. This may be reduced if we enable tail calls and other
optimizations.
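
The 2x estimate can be checked with a short calculation. The sketch below
assumes the benchmark time is dominated by identifiers of roughly equal cost,
and that only `r`-prefixed identifiers are slowed down. Both are simplifying
assumptions, and the function name is illustrative rather than taken from the
benchmark.

```python
def per_identifier_slowdown(overall_slowdown: float, affected_fraction: float) -> float:
    """Return the slowdown factor k for the affected identifiers.

    Assumes total time scales as (1 - f) + f * k, where f is the fraction of
    affected identifiers, so 1 + overall_slowdown == (1 - f) + f * k.
    """
    return 1.0 + overall_slowdown / affected_fraction


# 2% overall slowdown, with about 1 in 55 identifiers starting with `r`.
factor = per_identifier_slowdown(0.02, 1 / 55)
print(f"{factor:.2f}x")
```

Under these assumptions the factor comes out slightly above 2, which matches
the "around 2x" figure.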

Various other prefixes have been discussed, mostly using a special character
prefix in order to restrict the lexing impact. In particular:

- `\` prefix.
- Similar to `\` escaping in strings.
- More intuitive "escaping" semantic for some developers versus `r#`.
- Creates a different meaning for `\n` as an identifier versus `\n` as a
character escape.
- Some of this could be addressed by restricting `\` raw identifiers
to only keywords in the language, meaning `\n` would only be a
character escape. The alternative
[Restrict raw identifier syntax to current and future keywords](#restrict-raw-identifier-syntax-to-current-and-future-keywords)
applies to this solution.
- `#` prefix without `r`.
- Would be more consistent with string literals, and avoid the lexing
overhead.
- We are considering using a `#` prefix for metaprogramming, so the `r`
offers a way to keep the `#` prefix available for other purposes.
- `#if` may look to C++ developers like a compiler directive, rather than
a raw identifier.
- Backticks, as in Swift.
a raw identifier for `if`.
- Backticks, consistent with Swift.
- We prefer not to use backticks for Carbon syntax so that it is easy to
write in Markdown, which uses backticks for inline code.
- `@` prefix, as in C#.
- `@` prefix, consistent with C#.
- We've also discussed using `@` for attributes, similar to Python.
- Other currently unused characters, such as `~` or `%`.
- Reserves a character for a feature with limited usage.
- Misses an opportunity to provide cross-language consistency.

### Restrict `r#` to current and future keywords
- Other currently unused characters, such as `~`, `$`, or `%`.
- We expect raw identifiers to be relatively rare. There may be future
uses for these characters that allow us to serve a broader use-case.
- While we could change raw string literal syntax to use the same
character, it would be helpful if raw string literal syntax had some
degree of cross-language syntactic consistency in order to reduce
learning curves.

Raw identifier syntax is expected to be an edge case of the language. As a
consequence, developers reading it will probably rely on their understanding of
similar syntax, either from other parts of Carbon or from other languages. This
means it's helpful if the syntax can be understood on its own. If it's
confusable with C++ syntax, its relative rarity could exacerbate
understandability issues.
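
To illustrate where the lexing cost of `r#` comes from, here is a toy lexer
sketch. It is not the Carbon toolchain's lexer: the `lex_word` function, the
regular expression, and the keyword subset are all assumptions made for the
example. The point is that the raw-identifier check sits on the path of every
word starting with `r`, whereas a special-character prefix would be dispatched
on a character that ordinary identifiers never start with.

```python
import re

# Hypothetical identifier pattern and keyword subset, for illustration only.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
KEYWORDS = frozenset({"if", "fn", "var"})


def lex_word(source: str, pos: int) -> tuple[str, str, int]:
    """Lex one word starting at pos; return (kind, name, end_position)."""
    # Extra check that only matters for words starting with `r`: is this the
    # raw identifier prefix `r#` followed by an identifier?
    if source.startswith("r#", pos):
        raw = IDENT.match(source, pos + 2)
        if raw:
            # A raw identifier is never treated as a keyword.
            return ("identifier", raw.group(), raw.end())
    word = IDENT.match(source, pos)
    if word is None:
        raise ValueError(f"no identifier at position {pos}")
    kind = "keyword" if word.group() in KEYWORDS else "identifier"
    return (kind, word.group(), word.end())
```

In this sketch, `lex_word("r#if", 0)` yields an identifier named `if`, while
`lex_word("if", 0)` yields the keyword.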

### Restrict raw identifier syntax to current and future keywords

We had discussed maintaining a list of current and future keywords, and only
allowing `r#` in those cases. We aren't doing that because it creates an
additional burden for software evolution, that the language must release a
version that "declares" future keywords without turning them into actual
keywords.

### Don't require `r#` for references to raw identifiers
allowing raw identifier syntax in those cases. If this were done as part of the
toolchain, releases would need to push versions that "declare" future keywords
without turning them into actual keywords. For a library that used those
identifiers, it would initially be compatible with compiler versions up to and
including the "future" keyword version; upon using raw identifier syntax, that
would become the minimum compiler version. This creates a compiler versioning
dependency that it might be helpful to avoid.
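
The versioning dependency can be made concrete with a small model. Everything
in this sketch is hypothetical: the `Release` type, the three release numbers,
and the use of `await` as a future keyword are assumptions chosen only to show
how the compatible version range shifts once a library switches to raw
identifier syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    version: int
    keywords: frozenset
    future_keywords: frozenset  # "declared" but not yet actual keywords


# Hypothetical release history: `await` is declared in 2, a keyword in 3.
RELEASES = [
    Release(1, frozenset(), frozenset()),
    Release(2, frozenset(), frozenset({"await"})),
    Release(3, frozenset({"await"}), frozenset()),
]


def accepts(release: Release, name: str, raw: bool) -> bool:
    """Whether a release accepts `name` as an identifier, raw or plain."""
    if raw:
        # Raw syntax is restricted to current and declared future keywords.
        return name in release.keywords or name in release.future_keywords
    # A plain identifier works until the name becomes an actual keyword.
    return name not in release.keywords


def compatible_versions(name: str, raw: bool) -> list:
    return [r.version for r in RELEASES if accepts(r, name, raw)]


print(compatible_versions("await", raw=False))
print(compatible_versions("await", raw=True))
```

In this model the plain spelling compiles with releases 1 and 2, and the raw
spelling compiles with releases 2 and 3, so switching to raw syntax makes the
"future keyword" release the minimum compiler version.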

As an alternative approach, Carbon could provide a command line option which
libraries could use to specify future keywords that are used in the program.
While some systems such as `bazel` allow libraries to indicate options they need
for compilation, other build systems such as `cmake` might require library users
to update their dependencies as well.

### Don't require syntax for references to raw identifiers

We could say that, in a scope where a raw identifier has been declared, the
token without `r#` now refers to the identifier instead of the keyword. If the