Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Compiler Bug]: Stripping curly braces breaks character escapes (breaks JSX spec 1.5.1) #29648

Open
1 of 4 tasks
coravacav opened this issue May 29, 2024 · 3 comments
Open
1 of 4 tasks

Comments

@coravacav
Copy link

What kind of issue is this?

  • React Compiler core (the JS output is incorrect, or your app works incorrectly after optimization)
  • babel-plugin-react-compiler (build issue installing or using the Babel plugin)
  • eslint-plugin-react-compiler (build issue installing or using the eslint plugin)
  • react-compiler-healthcheck (build issue installing or using the healthcheck script)

Link to repro

https://playground.react.dev/#N4Igzg9grgTgxgUxALhAgHgBwjALgAgBMEAzAQygBsCSoA7OXASwjvwAkIBbBACgEp8wADpt8MBLlhteo-PPwAeAHxyF8xQGVuCAMLdsdBHQK4MuALzAA5AEFhogKLWAvvgD0qseq079XQ2NTcwthEHsnMI8vH081fgBuURdREBcgA

Repro steps

  1. Use a string with a newline character (\n) passed as a prop to a component with curly braces prop={"text\ntext"}
  2. Observe that react compiler outputs prop="text\ntext" instead of prop={"text\ntext"}

This is incorrect behavior. The newline character is no longer properly treated as a newline but rather as the raw text \n.

You can see this behavior be correctly also incorrectly handled in the second component, where prop="text\ntext" becomes prop="text\\ntext", resulting in the wrong text being displayed on the screen.

JSX rules specifically don't allow this kind of conversion, noted in 1.5.1 of the spec

Historically, string characters within JSXAttributeValue and JSXText are extended to allow the presence of HTML character references to make copy-pasting between HTML and JSX easier, at the cost of not supporting \ EscapeSequence of ECMAScript's StringLiteral.

How often does this bug happen?

Every time

What version of React are you using?

19.0.0-rc-6f23540c7d-20240528

@coravacav coravacav added Component: Optimizing Compiler Status: Unconfirmed A potential issue that we haven't yet confirmed as a bug Type: Bug labels May 29, 2024
@josephsavona
Copy link
Contributor

Thanks for reporting, this looks like a bug. We'll prioritize a fix.

@josephsavona josephsavona removed the Status: Unconfirmed A potential issue that we haven't yet confirmed as a bug label May 30, 2024
@SeekingLight233
Copy link

SeekingLight233 commented Jun 8, 2024

image

It seems like this issue existed even before BuildHIR🤔

poteto added a commit that referenced this issue Jun 20, 2024
This PR extends the previous logic added in #29141 to also account for
other kinds of non-ascii characters such as `\n`. Because these control
characters are individual special characters (and not 2 characters `\`
and `n`) we match based on unicode which was already being checked for
non-Latin characters.

This allows control characters to continue to be compiled equivalently
to its original source if it was provided in a JsxExpressionContainer.
However note that this PR does not convert JSX attributes that are
StringLiterals to JsxExpressionContainer, to preserve the original
source code as it was written.

Alternatively we could always emit a JsxExpressionContainer if it was
used in the source and not try to down level it to some other node
kind. But since we already do this I opted to keep this behavior.

Partially addresses #29648.

ghstack-source-id: 42a80bb1c8bc6d655b8b96bd5f967d141ee7b466
Pull Request resolved: #29997
poteto added a commit that referenced this issue Jun 20, 2024
This PR extends the previous logic added in #29141 to also account for
other kinds of non-ascii characters such as `\n`. Because these control
characters are individual special characters (and not 2 characters `\`
and `n`) we match based on unicode which was already being checked for
non-Latin characters.

This allows control characters to continue to be compiled equivalently
to its original source if it was provided in a JsxExpressionContainer.
However note that this PR does not convert JSX attributes that are
StringLiterals to JsxExpressionContainer, to preserve the original
source code as it was written.

Alternatively we could always emit a JsxExpressionContainer if it was
used in the source and not try to down level it to some other node
kind. But since we already do this I opted to keep this behavior.

Partially addresses #29648.

ghstack-source-id: 965cf50df43ff8f8211990223ef1ecb2dfdcfec7
Pull Request resolved: #29997
poteto added a commit that referenced this issue Jun 20, 2024
This PR extends the previous logic added in #29141 to also account for
other kinds of non-ascii characters such as `\n`. Because these control
characters are individual special characters (and not 2 characters `\`
and `n`) we match based on unicode which was already being checked for
non-Latin characters.

This allows control characters to continue to be compiled equivalently
to its original source if it was provided in a JsxExpressionContainer.
However note that this PR does not convert JSX attributes that are
StringLiterals to JsxExpressionContainer, to preserve the original
source code as it was written.

Alternatively we could always emit a JsxExpressionContainer if it was
used in the source and not try to down level it to some other node
kind. But since we already do this I opted to keep this behavior.

Partially addresses #29648.

ghstack-source-id: e5f3ca7f378c331de024e5118d5790f9b92d03a3
Pull Request resolved: #29997
poteto added a commit that referenced this issue Jun 20, 2024
This PR extends the previous logic added in #29141 to also account for
other kinds of non-ascii characters such as `\n`. Because these control
characters are individual special characters (and not 2 characters `\`
and `n`) we match based on unicode which was already being checked for
non-Latin characters.

This allows control characters to continue to be compiled equivalently
to its original source if it was provided in a JsxExpressionContainer.
However note that this PR does not convert JSX attributes that are
StringLiterals to JsxExpressionContainer, to preserve the original
source code as it was written.

Alternatively we could always emit a JsxExpressionContainer if it was
used in the source and not try to down level it to some other node
kind. But since we already do this I opted to keep this behavior.

Partially addresses #29648.

ghstack-source-id: 552bd447ceacc9c8802f13e637f1cca61c3f7831
Pull Request resolved: #29997
poteto added a commit that referenced this issue Jun 20, 2024
This PR extends the previous logic added in #29141 to also account for
other kinds of non-ascii characters such as `\n`. Because these control
characters are individual special characters (and not 2 characters `\`
and `n`) we match based on unicode which was already being checked for
non-Latin characters.

This allows control characters to continue to be compiled equivalently
to its original source if it was provided in a JsxExpressionContainer.
However note that this PR does not convert JSX attributes that are
StringLiterals to JsxExpressionContainer, to preserve the original
source code as it was written.

Alternatively we could always emit a JsxExpressionContainer if it was
used in the source and not try to down level it to some other node
kind. But since we already do this I opted to keep this behavior.

Partially addresses #29648.

ghstack-source-id: 032700cc4ddb191e81af7c611b4d4345b02568a2
Pull Request resolved: #29997
@poteto
Copy link
Member

poteto commented Jun 20, 2024

Thanks for reporting! I have a PR #29997 for preserving the JsxExpressionContainer if it contains non-ascii characters, which fixes the first example in your repro. However the 2nd case is an issue with @babel/parser as @SeekingLight233 pointed out. We could perhaps also check if jsx attribute string literals contain those special characters and convert it into a JsxExpressionContainer

poteto added a commit that referenced this issue Jun 20, 2024
This PR extends the previous logic added in #29141 to also account for
other kinds of non-ascii characters such as `\n`. Because these control
characters are individual special characters (and not 2 characters `\`
and `n`) we match based on unicode which was already being checked for
non-Latin characters.

This allows control characters to continue to be compiled equivalently
to its original source if it was provided in a JsxExpressionContainer.
However note that this PR does not convert JSX attributes that are
StringLiterals to JsxExpressionContainer, to preserve the original
source code as it was written.

Alternatively we could always emit a JsxExpressionContainer if it was
used in the source and not try to down level it to some other node
kind. But since we already do this I opted to keep this behavior.

Partially addresses #29648.

ghstack-source-id: ecc61c9f0bece90d18623b3c570fea05fbcd811a
Pull Request resolved: #29997
poteto added a commit that referenced this issue Jun 21, 2024
This PR extends the previous logic added in #29141 to also account for
other kinds of non-ascii characters such as `\n`. Because these control
characters are individual special characters (and not 2 characters `\`
and `n`) we match based on unicode which was already being checked for
non-Latin characters.

This allows control characters to continue to be compiled equivalently
to its original source if it was provided in a JsxExpressionContainer.
However note that this PR does not convert JSX attributes that are
StringLiterals to JsxExpressionContainer, to preserve the original
source code as it was written.

Alternatively we could always emit a JsxExpressionContainer if it was
used in the source and not try to down level it to some other node
kind. But since we already do this I opted to keep this behavior.

Partially addresses #29648.

ghstack-source-id: ecc61c9f0bece90d18623b3c570fea05fbcd811a
Pull Request resolved: #29997
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants