Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

expander optimization #11069

Closed
wants to merge 14 commits into from
Closed

Conversation

SimaTian
Copy link
Member

@SimaTian SimaTian commented Dec 2, 2024

changes:

  1. The AddArgumentFromSlices is an optimization that bypasses the need for argumentBuilder which has noticeable overhead.
    I use indices to directly calculate the argument instead.

  2. The lines 2126 - 2142
    The expressionCapture.Captures?.Any iteration was visible in the profiler.
    I pulled both branches into one loop since it looked slightly better. Performance improvement is speculative at best, can be
    removed if we decide that it is too risky or something.

  3. Finally the src/StringTools/WeakStringCache.Concurrent.cs
    the .Count call on the ConcurrentDictionary is locking everything which is not great since we do it quite often. my change aims to introduce separate count for that to limit the number of locks we take and to unblock the dictionary.

All together when profiling on my machine, after killing the outliers (MSBuild sometimes has a random-ish spikes in build time that I decided to skip) my aggregated output was ~this:

  1. 23 cold runs for our current Main branch, with -maxcpucount:10 -tl:true and average time of 62.103s
  2. 13 cold runs for this branch with same arguments, average time was 61.31s
    Though these are speculative at best, I've yet to find a reasonable and stable enough way of checking the impact.

While using VS profiler for running my changes on a single-process basis, the improvement looked to be somewhere in the vicinity of 0.2-03%

@maridematte
Copy link
Contributor

Thank you for tackling performance, I just have a few comments before getting to the specifics of the code.

  1. There are a lot of commented out code that doesn't seem that will be used in the future, please clean it up.
  2. For the PR description, it is very confusing. Please add the reason that these changes need to be made, and separate what you changed and the performance impact into two separate sections.

@SimaTian
Copy link
Member Author

SimaTian commented Dec 3, 2024

First of all, thank you for looking at my experimental changes. Much appreciated.

Thank you for tackling performance, I just have a few comments before getting to the specifics of the code.

  1. There are a lot of commented out code that doesn't seem that will be used in the future, please clean it up.
  2. For the PR description, it is very confusing. Please add the reason that these changes need to be made, and separate what you changed and the performance impact into two separate sections.
  1. I originally intended to keep the old code around for comparison purposes. That being said I forget that Git will make it visible anyways so I will remedy that soon-ish.
  2. I will try.

Copy link
Member

@JanKrivanek JanKrivanek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for digging into this!!

The main reason why I do not want to approve now - This is nontrivial change in core logic and the PR has no mention of test coverage. If it's already covered by pre-existing tests, please call those out. If not - I'd prefer a tailored test(s) to be added first (can be same PR, but idealy a separate commit, that shows the test were passing on previous version).

src/StringTools/WeakStringCache.Concurrent.cs Show resolved Hide resolved
src/Build/Evaluation/Expander.cs Outdated Show resolved Hide resolved
src/Build/Evaluation/Expander.cs Outdated Show resolved Hide resolved
src/Build/Evaluation/Expander.cs Show resolved Hide resolved
@SimaTian
Copy link
Member Author

SimaTian commented Dec 4, 2024

Thank you for the review. I wasn't expecting approval anytime soon. This is mostly to open the discussion about this area in particular.
Before we go ahead with an eventual merge(after resolving all concerns and/or discarding things we deem invalid), I would prefer to have some way to quantify the results other than "it looks good in my local profiler" or "while running on my machine, it looked to be faster after averaging X vs Y runs."

Regarding the test coverage:
We have 348 expander tests covering this area in general, see Expander_tests.cs
If I were to call out one test in particular that made my life somewhat uncomfortable, it would be Medley - a bunch of random error checks and edge cases.
Every test there does some expansion somewhere.
Furthermore, I was able to build the Orchard even while some tests were failing - I don't think I can do much better test wise, unless we want to spend a lot more time on this.

@SimaTian SimaTian requested a review from JanKrivanek December 9, 2024 15:03
@JanKrivanek
Copy link
Member

All sounds perfect!

Can you run an exp insertion, request perf runs (DDRITs and Speedometer) a post results here?

Copy link
Member

@JanKrivanek JanKrivanek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

// Trim from the end.
// There is some extra logic here to avoid edge case where we would trim a whitespace only character with
// one slice from the start, and then once more from the end, resulting in invalid indices.
while (((firstSlice < lastSlice) || (firstSlice == lastSlice && firstSliceIdx < lastSliceIdx)) && Char.IsWhiteSpace(arg, lastSliceIdx - 1))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like (firstSlice <= lastSlice && Char.IsWhiteSpace(arg, firstSliceIdx) can be moved to the separate method with descriptive name and reused in the while loop here and on the line 794

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This checks checks against lastSliceIndx and also checks to prevent duplicate whitespace removal (the second part after the ||)
It was similar-ish before, then I run into the "all whitespace" edge case and crashed.
that is what the firstSlice == lastSlice && firstSliceIdx < lastSliceIdx is for.
So the function would have to be something like
IsEdgeWhitespace(arg, firstSlice, lastSlice, idxToCheck) and then some extra logic for this case.
Is that more readable? I would like to say no, but that could be just my laziness/lack of feeling for cases such as this.


// If the argument is in quotes, we want to remove those
if ((arg[firstSliceIdx] == '\'' && arg[lastSliceIdx - 1] == '\'') ||
(arg[firstSliceIdx] == '`' && arg[lastSliceIdx - 1] == '`') ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did you remove the static fields for these repetitive chars?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comparison didn't use them (I'm not 100% sure why, but I kept it that way). Since I'm working with indices, I didn't need them for .Trim() calls anymore so they became unused and the ./build.cmd started complaining.

@JanProvaznik JanProvaznik requested a review from Copilot December 10, 2024 10:42
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 suggestion.

Comments skipped due to low confidence (1)

src/Build/Evaluation/Expander.cs:778

  • Add a test case to verify the behavior of the AddArgumentFromSlices method, especially for edge cases such as arguments with quotes and 'null' values.
private static void AddArgumentFromSlices(List<string> arguments, List<Tuple<int, int>> slices, string arg)

src/StringTools/WeakStringCache.Concurrent.cs Show resolved Hide resolved

while (firstSlice < lastSlice)
{
argValue += arg.Substring(firstSliceIdx, slices[firstSlice].Item2 - firstSliceIdx);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit confused why would this improve perf, i guess there are usually few arguments. In general repeated concatenations are perf antipattern. Couldn't the StringTools.SpanBasedStringBuilder be optimized instead?

Copy link
Member Author

@SimaTian SimaTian Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I saw/googled, the concat should be faster / similar speedwise up to 3-4 concatenations, after that the stringbuilder is faster.
Most of resolved variables have 1 or 2 slices to concatenate(beyond the initial empty string) so I opted for the simplicity.

There is an argument to be made to have a split there based on number of slices and use a stringbuilder for 3+.
As for optimizing SpanBasedStringBuilder - that is not our code but something from Microsoft.NET.StringTools.
I think that the main difference is that for most cases, the span based string builder is an overkill.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the perf improvement, the first two lines of the previous version function were doing this:

  argumentBuilder.Trim();
  string argValue = argumentBuilder.ToString();

and then doing everything with String, throwing away any and all advantage the spanbased stringbuilder might've had.
So there is an option to kill most of my changes and replace the string with SpanBasedChar - it could achieve similar if not identical results. The main cost would be probably further profiling since it's rather large replacement once again.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bit more googling and having a static StringBuilder and reusing the instance should have the best from the both worlds - reasonable performance for small concatenation counts while avoiding the danger of allocation.

src/Build/Evaluation/Expander.cs Outdated Show resolved Hide resolved
src/Build/Evaluation/Expander.cs Outdated Show resolved Hide resolved
src/Build/Evaluation/Expander.cs Outdated Show resolved Hide resolved
src/Build/Evaluation/Expander.cs Outdated Show resolved Hide resolved
@YuliiaKovalova
Copy link
Member

@SimaTian , I am slightly concerned about code readability factor.
It's great that applying these changes brings perf improvements , but it's quite difficult to understand a new logic in the amended parts.

@maridematte
Copy link
Contributor

I agree with @YuliiaKovalova. The code readability is not great here, it took me sometime to understand what is going on, and this concerns me if we need to debug it. My current opinion is that the performance improvement on this one is not worth the code readability trade-off.

@SimaTian
Copy link
Member Author

SimaTian commented Jan 3, 2025

I will split out the two smaller non-problematic improvements to a separate PR and kill this one.

@SimaTian SimaTian closed this Jan 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants