-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve codegen for concatenation of string
and char
(attempt 2)
#71793
Conversation
…ser-provided span-based `string.Concat` when possible
Is it important to try them one by one, or could we simply mark them all missing in one go? #Closed Refers to: src/Compilers/CSharp/Test/Emit2/CodeGen/CodeGenSpanBasedStringConcatTests.cs:312 in e4e26f1. [](commit_id = e4e26f1, deletion_comment = False) |
Is it really better to not go with the span approach, use extra space in the string blob of the assembly image, and then allocate the string at runtime? #Closed Refers to: src/Compilers/CSharp/Test/Emit2/CodeGen/CodeGenSpanBasedStringConcatTests.cs:348 in e4e26f1. [](commit_id = e4e26f1, deletion_comment = False) |
I guess trying one by one doesn't hurt, but I would also add a flavor when all of them are missing together. In reply to: 1912258587 Refers to: src/Compilers/CSharp/Test/Emit2/CodeGen/CodeGenSpanBasedStringConcatTests.cs:312 in e4e26f1. [](commit_id = e4e26f1, deletion_comment = False) |
Looking at Refers to: src/Compilers/CSharp/Test/Emit2/CodeGen/CodeGenSpanBasedStringConcatTests.cs:1134 in e4e26f1. [](commit_id = e4e26f1, deletion_comment = False) |
I am surprised the span overload is not used when it is available along with necessary conversion helpers. For other methods as well. Probably there is a problem with decomposing/composing the arguments. I haven't looked at the implementation yet. #Closed Refers to: src/Compilers/CSharp/Test/Emit2/CodeGen/CodeGenSpanBasedStringConcatTests.cs:2503 in e4e26f1. [](commit_id = e4e26f1, deletion_comment = False) |
Perhaps this is the same issue with "premature char to string conversion", which could probably be mitigated by going from a single character string constant back to char when we look at the decomposed arguments. In reply to: 1912464476 Refers to: src/Compilers/CSharp/Test/Emit2/CodeGen/CodeGenSpanBasedStringConcatTests.cs:2503 in e4e26f1. [](commit_id = e4e26f1, deletion_comment = False) |
~~I am not comfortable with this inconsistency. I think we should have a simpler approach, we either unwrap things, or we don't. The order of processing shouldn't matter. ~~ #Closed Refers to: src/Compilers/CSharp/Test/Emit2/CodeGen/CodeGenSpanBasedStringConcatTests.cs:4480 in e4e26f1. [](commit_id = e4e26f1, deletion_comment = False) |
After a closer look, I changed my mind. In reply to: 1912530749 Refers to: src/Compilers/CSharp/Test/Emit2/CodeGen/CodeGenSpanBasedStringConcatTests.cs:4480 in e4e26f1. [](commit_id = e4e26f1, deletion_comment = False) |
Are you ok if I replace existing tests, where we test positive scenarios with missing unimportant members 1 by 1 with equivalent ones, but where all unimportant members are missing? I agree that tests where all members are missing are better (since this is basically a real-life scenario on old target frameworks like .NET Framework), but new test file is already 4500+ lines long, so adding such tests on top of what we have will make things even harder to follow.
Compiler can take advantage of the fact, that Benchmark
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
BenchmarkRunner.Run<Benchmarks>();
[MemoryDiagnoser]
public class Benchmarks
{
private string _s;
[Params(1, 1000)]
public int Length { get; set; }
[GlobalSetup]
public void Setup() => _s = new string('s', Length);
[Benchmark]
public string UseConstString() => M1(_s);
[Benchmark]
public string UseSpan() => M2(_s);
private static string M1(string s) => string.Concat("a", s);
private static string M2(string s)
{
char ch = 'a';
return string.Concat(new ReadOnlySpan<char>(in ch), s);
}
} If you worry that we take space in string blob, than I don't think it matters. Just think about it: it took a decade of development and a project filled to the brim with IL baselines to reach string blob size limitation in only 1 project in the whole roslyn. An avarage developer would probably never face this error in his professional life at all. We save far more space in IL bytes by lowering that char to string than we can potentially save from a string blob. |
src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter_StringConcat.cs
Outdated
Show resolved
Hide resolved
Done with review pass (commit 70) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (commit 71)
@333fred, @dotnet/roslyn-compiler For the second review |
} | ||
else if (arg.ConstantValueOpt is { IsString: true, StringValue: [char c] }) | ||
{ | ||
preparedArgs.Add( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we not set needsSpanRefParamConstructor
or the other tracking locals here? Is it just not worth it for the case where it's a bunch of one-character string additions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is part of Aleksey's unification suggestion on handling constants. This operation might be harmful to both codegen size and runtime perf, so in the end a compromise was made to enable this code path only when we already have other non-constant char, so we would take span-based approach anyway. There are a bunch of tests verifying that we don't take span-based path when there are only constant char.ToString
s in concatenation arguments: Concat{Two|Three|Four}_ConstantCharToString
, Concat{Two|Three|Four}_AllConstantCharToStrings
@DoctorKrolic I suggest to not act on the resolution from #72102 in context of this PR. I will follow up on that separately after this PR is merged, unless, of course, you really want to do that work. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM. I suspect my refactoring comment will become obviated when we adapt to the public api decision today; feel free to resolve it if it's not necessary.
I think the adjustment around of usage of |
Agreed. Just slightly worried, that there will be a time gap, when the codebase contains unwanted APIs, but I guess, you'll handle this pretty quickly, so it won't be dangerously large |
Looks like there are compilation errors? You may need to merge |
I don't know how, but the method was actually somehow missing. Replaced with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (commit 74). @AlekseyTs for the last couple of changes since your approval. Thanks for your persistence here DoctorKrolic!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (commit 74)
@DoctorKrolic Thank you for the contribution. |
@AlekseyTs Special thank you for such patient and consistent review! |
* upstream/main: (1045 commits) Fix excessive compilation times due to speculative parsing after an incomplete string (#72565) Convert official pipeline to 1ES template (#72430) Fix #69628 Analyzer summary should show suppressor ID (#72569) Fix event hookup even when in a projection buffer Drop win32-ia32 language server support Remove workaround for .net7 r2r assembly loading issue remove unused usings Share compilation when generators don't produce any generated files Fix potential exception in AssetProvider.SynchronizeAssetsAsync (#72597) Fix Update __arglist.md (#72523) Improve code gen for concatenation of `string` and `char` (#71793) Reduce File I/O under the AnalyzerAssemblyLoader folder (#72412) Reduce allocations in AbstractTypeMap (#72588) disable diagnostics when solution crawler option is disabled disable diagnostics when solution crawler option is disabled better thread transitions Fix Add check Fix ...
Closes #66827
This PR takes approach suggested in #70971 (comment) by @AlekseyTs with some implementation details chaged. E.g. it turned out is's easier to handle
constantChar.ToString()
pattern inConvertConcatExprToString
rather than inTryFoldTwoConcatOperands
and so on.Also this PR implements merging user-provided span-basedAnd @AlekseyTs I have a question for you regarding item 1 in your proposal: the suggestion looks reasonable but it seems to me that it is related not to our case of concatenation of strings and chars, but rather can be treated as a general optimization. If I am wrong, can you please provide an example of a unit test, which is related to the theme of the PR and where codegen can be improved by that change? I am of course open for implementing new optimizations, but this PR is quite big on its own, so if that optimization is not directly related to the main theme, maybe it's better to implement in a later PR rather than in this onestring.Concat
with additional concatenation arguments, which was not described in the original proposal.