Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Emit opted-in string literals into data section as UTF8 #76036

Merged
merged 22 commits into from
Jan 15, 2025

Conversation

jjonescz
Copy link
Member

@jjonescz jjonescz commented Nov 22, 2024

Spec: #76139

When this opt-in feature flag is enabled by users, string literals with length over the configured threshold are emitted into data section of the PE file, hence allowing users to overcome the "error CS8103: Combined length of user strings used by the program exceeds allowed limit."

VB support is not implemented in this PR, could be added in the future if there's demand.

I've manually tried this on our unit tests (e.g., Emit2 compiled standalone resulted in significantly smaller file size due to the use of UTF8 instead of UTF16; run time was the same; combining Emit2 and Emit3 into one assembly worked without emit errors). It was also tested on aspnetcore benchmarks and that didn't reveal any significant runtime/build perf changes.

Main motivation for this feature are Razor projects which can accumulate lots of string literals over time (due to how .razor file compilation works) and then customers are one day hit with this error without any prior warnings (imagine they are fixing a small bug and suddenly they cannot compile anymore without refactoring their codebase into multiple projects).

@dotnet-issue-labeler dotnet-issue-labeler bot added Area-Compilers untriaged Issues and PRs which have not yet been triaged by a lead labels Nov 22, 2024
@jjonescz jjonescz force-pushed the DataSectionStringLiterals branch from 16f5293 to 43aa343 Compare November 22, 2024 16:42
@jjonescz jjonescz force-pushed the DataSectionStringLiterals branch from 43aa343 to e9e701d Compare November 25, 2024 13:45
@jjonescz jjonescz marked this pull request as ready for review November 25, 2024 15:25
@jjonescz jjonescz requested a review from a team as a code owner November 25, 2024 15:25
@AlekseyTs
Copy link
Contributor

AlekseyTs commented Nov 27, 2024

@jjonescz Please add details how do we achieve the goal: what artifacts are getting generated, what APIs are needed and how they are used, what compilation phases/layers are involved and how, etc. Basically, I think we need something like an implementation speclet for the compiler feature. #Closed

@AlekseyTs
Copy link
Contributor

AlekseyTs commented Nov 27, 2024

I stopped reviewing the PR because I think first we need to agree on the implementation strategy and have a speclet to review against. #Closed

@jjonescz
Copy link
Member Author

jjonescz commented Nov 28, 2024

@AlekseyTs thanks for the review. Sounds like a good idea to get a spec reviewed first. Opened a separate PR for that: #76139 #Closed

@AlekseyTs
Copy link
Contributor

AlekseyTs commented Dec 5, 2024

@jjonescz The current PR title: "Add feature flag for emitting string literals into data section as UTF8" sounds like the only thing this PR is doing is adding a flag. I suggest to adjust the title to reflect the main goal/purpose of the PR, which I think is to store values of string literals as Utf-8 encoded bytes in a blob, etc. By comparison to that, the feature flag is a minor detail, which, I think, can be dropped from the title if its length becomes too big. #Closed

@jjonescz jjonescz changed the title Add feature flag for emitting string literals into data section as UTF8 Emit opted-in string literals into data section as UTF8 Dec 5, 2024
@jjonescz jjonescz marked this pull request as ready for review December 25, 2024 10:24
Copy link
Contributor

@AlekseyTs AlekseyTs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (commit 21)

@jaredpar jaredpar requested a review from cston January 13, 2025 23:44
@jaredpar
Copy link
Member

jaredpar commented Jan 13, 2025

@cston PTAL #Resolved

@jjonescz jjonescz merged commit 98f293e into dotnet:main Jan 15, 2025
24 checks passed
@jjonescz jjonescz deleted the DataSectionStringLiterals branch January 15, 2025 16:59
@dotnet-policy-service dotnet-policy-service bot added this to the Next milestone Jan 15, 2025
@jaredpar
Copy link
Member

/backport to /refs/release/dev17.3

Copy link

Started backporting to /refs/release/dev17.3: https://github.com/dotnet/roslyn/actions/runs/12818583301

Copy link

@jaredpar an error occurred while backporting to "/refs/release/dev17.3", please check the run log for details!

Error: The specified backport target branch "/refs/release/dev17.3" wasn't found in the repo.

@jaredpar
Copy link
Member

/backport to /release/dev17.13

Copy link

Started backporting to /release/dev17.13: https://github.com/dotnet/roslyn/actions/runs/12818638613

Copy link

@jaredpar an error occurred while backporting to "/release/dev17.13", please check the run log for details!

Error: The specified backport target branch "/release/dev17.13" wasn't found in the repo.

@jaredpar
Copy link
Member

/backport to /refs/release/dev17.13

Copy link

Started backporting to /refs/release/dev17.13: https://github.com/dotnet/roslyn/actions/runs/12818690111

Copy link

@jaredpar an error occurred while backporting to "/refs/release/dev17.13", please check the run log for details!

Error: The specified backport target branch "/refs/release/dev17.13" wasn't found in the repo.

@jaredpar
Copy link
Member

/backport to release/dev17.13

Copy link

Started backporting to release/dev17.13: https://github.com/dotnet/roslyn/actions/runs/12818749822

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Compilers Feature - String Literals in Data Section as UTF8 untriaged Issues and PRs which have not yet been triaged by a lead
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants