Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of String.StartsWith(constant string) #27682

Closed
thargol1 opened this issue Oct 20, 2018 · 10 comments
Closed

Improve performance of String.StartsWith(constant string) #27682

thargol1 opened this issue Oct 20, 2018 · 10 comments
Labels
area-System.Runtime enhancement Product code improvement that does NOT require public API changes/additions help wanted [up-for-grabs] Good issue for external contributors tenet-performance Performance related issue
Milestone

Comments

@thargol1
Copy link

JIT: Improve performance on String.StartsWith(constant string)

I like using String.StartsWith because it's self explanatory and make code easy to read.
How ever on some occasions I ran into performance issues and had to change my code, to faster less readable code.

I created an example and a benchmark. The code with char compare is over 20 times faster.

Can't the JIT-compiler to this kind of optimisation for me?

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;

namespace ConsoleApp2
{
	public class Program
	{
		private static void Main(string[] args) => BenchmarkRunner.Run<StringStartsWith>();

		public class StringStartsWith
		{

			[Params("", "http://hello.world", "hello world", "HttP://test")]
			public string v;

			[Benchmark]
			public bool StartsWithTest() => 
				v.StartsWith("http://", StringComparison.InvariantCultureIgnoreCase);

			[Benchmark]
			public bool CharTest() =>
					   (v.Length >= 7)
					&& (v[0] == 'h' || v[0] == 'H')
					&& (v[1] == 't' || v[1] == 'T')
					&& (v[2] == 't' || v[2] == 'T')
					&& (v[3] == 'p' || v[3] == 'P')
					&& (v[4] == ':')
					&& (v[5] == '/')
					&& (v[6] == '/');
		}
	}
}

Benchmark results:

BenchmarkDotNet=v0.11.1, OS=Windows 10.0.17763
Intel Core i7-3820 CPU 3.60GHz (Ivy Bridge), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=2.1.403
  [Host]     : .NET Core 2.1.5 (CoreCLR 4.6.26919.02, CoreFX 4.6.26919.02), 64bit RyuJIT
  DefaultJob : .NET Core 2.1.5 (CoreCLR 4.6.26919.02, CoreFX 4.6.26919.02), 64bit RyuJIT
Method v Mean Error StdDev
StartsWithTest 13.7086 ns 0.0377 ns 0.0334 ns
CharTest 0.4983 ns 0.0054 ns 0.0045 ns
StartsWithTest HttP://test 96.5470 ns 0.3262 ns 0.3051 ns
CharTest HttP://test 4.1528 ns 0.0126 ns 0.0117 ns
StartsWithTest hello world 81.3532 ns 0.2773 ns 0.2594 ns
CharTest hello world 1.6564 ns 0.0030 ns 0.0027 ns
StartsWithTest http://hello.world 96.4200 ns 0.3082 ns 0.2883 ns
CharTest http://hello.world 3.8201 ns 0.0199 ns 0.0176 ns
@thargol1 thargol1 changed the title Improve performance of String.StartsWith(constans string) Improve performance of String.StartsWith(constant string) Oct 20, 2018
@stephentoub
Copy link
Member

For reference, note that Invariant and Ordinal are not the same thing; the actual comparison for your CharTest would be with OrdinalIgnoreCase rather than InvariantCultureIgnoreCase, in which case you should see a significant difference, e.g. on .NET Core 2.1 adding an OrdinalIgnoreCase variant:

                  Method |                  v |       Mean |     Error |    StdDev |
------------------------ |------------------- |-----------:|----------:|----------:|
   StartsWithOrdinalTest |                    | 10.0892 ns | 0.0980 ns | 0.0917 ns |
 StartsWithInvariantTest |                    | 13.0126 ns | 0.2786 ns | 0.2861 ns |
                CharTest |                    |  0.2914 ns | 0.0185 ns | 0.0173 ns |
   StartsWithOrdinalTest |        HttP://test | 22.6900 ns | 0.3001 ns | 0.2808 ns |
 StartsWithInvariantTest |        HttP://test | 79.7019 ns | 0.8092 ns | 0.7569 ns |
                CharTest |        HttP://test |  1.8364 ns | 0.0302 ns | 0.0268 ns |
   StartsWithOrdinalTest |        hello world | 17.3224 ns | 0.3743 ns | 0.4311 ns |
 StartsWithInvariantTest |        hello world | 70.6186 ns | 1.4413 ns | 1.6598 ns |
                CharTest |        hello world |  1.0366 ns | 0.0517 ns | 0.0690 ns |
   StartsWithOrdinalTest | http://hello.world | 21.3477 ns | 0.1348 ns | 0.1125 ns |
 StartsWithInvariantTest | http://hello.world | 79.0416 ns | 0.7784 ns | 0.7281 ns |
                CharTest | http://hello.world |  2.3849 ns | 0.0425 ns | 0.0377 ns |

And StartsWith + OrdinalIgnoreCase has already been significantly improved since 2.1. Here's the same test running on current coreclr bits for .NET Core 3.0:

                  Method |                  v |       Mean |     Error |    StdDev |
------------------------ |------------------- |-----------:|----------:|----------:|
   StartsWithOrdinalTest |                    |  1.5173 ns | 0.0632 ns | 0.0621 ns |
 StartsWithInvariantTest |                    |  4.9249 ns | 0.1277 ns | 0.1872 ns |
                CharTest |                    |  0.3025 ns | 0.0395 ns | 0.0455 ns |
   StartsWithOrdinalTest |        HttP://test | 12.2989 ns | 0.1928 ns | 0.1709 ns |
 StartsWithInvariantTest |        HttP://test | 72.5324 ns | 1.4310 ns | 1.8098 ns |
                CharTest |        HttP://test |  1.7594 ns | 0.0679 ns | 0.1324 ns |
   StartsWithOrdinalTest |        hello world |  5.9746 ns | 0.1480 ns | 0.2260 ns |
 StartsWithInvariantTest |        hello world | 61.6073 ns | 1.2060 ns | 1.2384 ns |
                CharTest |        hello world |  0.8205 ns | 0.0168 ns | 0.0149 ns |
   StartsWithOrdinalTest | http://hello.world |  9.2609 ns | 0.2274 ns | 0.2792 ns |
 StartsWithInvariantTest | http://hello.world | 70.9415 ns | 1.5548 ns | 1.8509 ns |
                CharTest | http://hello.world |  2.0295 ns | 0.0690 ns | 0.0821 ns |

@danmoseley
Copy link
Member

@thargol1 on 3.0, is StartsWith fast enough for your scenario -- or at least, fast enough in all but one or two cases where it is not too painful for you to break out into chars?

In my career I have only in one scenario found a need to break out into chars because the path was so hot. I am not sure this is common enough for the JIT work to be done to recognize StartsWith + Ordinal + short constant string.

@thargol1
Copy link
Author

@danmosemsft .Net core 3.0 seems fast enough for me.

I did some further investigation on .Net Framework and .Net core 2.2. A part of the performance difference is due to the fact that StartsWith internally calls a version of string compare ordinal ignorecase. When the characters do not match string compare performs an extra calculation to return a value less or greater than zero. For StartsWith a simple false would suffice.

A very small performance can be gained using xor:
In ordinal ignore case charA ^ charB returns either 0 of 32 for equal a-z characters.

Part of my test code:

const uint ZA = (uint)('z' - 'a');
char charA = source[length];
char charB = pattern[length];
uint x = (uint)charA ^ (uint)charB;
if (x == 0 || ((x == 0x20) && (uint)((charA | 0x20) - 'a') <= ZA))
{
	length--;
}
else
{
	return false;
}

Perhaps you can test slightly altered code in 3.0 to speed up it a little bit more (less than 5%).

@danmoseley
Copy link
Member

@thargol1 missed your response. Do you want to offer a PR for this, with your measurements?

@thargol1
Copy link
Author

@danmosemsft I don't do PR's currently, simply because I don't know how it works, still have to read the github manual on that.

This my testcode. I tried to isolate the actual test-code as much as possible, so the results are as accurate as possible.
I changed the int-returntype to bool, because for StartsWith you don't need integers.

Testcode:

[ReturnValueValidator]
public class CharCompareOrdinalTest
{
	[Params('a','A','0')]
	public int A;

	[Params('a','B','[')]
	public int B;

	[Benchmark(Baseline =true)]
	public bool Original() => OrginalCharOrdinalCompare(A, B);

	[Benchmark]
	public bool Custom() => CustomCharOrdinalCompare(A, B);

	public static bool OrginalCharOrdinalCompare(int charA, int charB)
	{
		// https://github.com/Microsoft/referencesource/blob/master/mscorlib/system/string.cs
		if ((uint)(charA - 'a') <= (uint)('z' - 'a')) charA -= 0x20;
		if ((uint)(charB - 'a') <= (uint)('z' - 'a')) charB -= 0x20;

		return charA == charB;
	}

	public static bool CustomCharOrdinalCompare(int charA, int charB)
	{
		uint x = (uint)charA ^ (uint)charB;
		return x == 0 || ((x == 0x20) && (uint)((charA | 0x20) - 'a') <= (uint)('z' - 'a'));
	}
}

Results:

BenchmarkDotNet=v0.11.4, OS=Windows 10.0.17763.379 (1809/October2018Update/Redstone5)
Intel Core i7-3820 CPU 3.60GHz (Ivy Bridge), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.0.100-preview3-010431
  [Host]     : .NET Core 3.0.0-preview3-27503-5 (CoreCLR 4.6.27422.72, CoreFX 4.7.19.12807), 64bit RyuJIT
  DefaultJob : .NET Core 3.0.0-preview3-27503-5 (CoreCLR 4.6.27422.72, CoreFX 4.7.19.12807), 64bit RyuJIT

Method A B Mean Error StdDev Median Ratio RatioSD
Original 0 B 1.8640 ns 0.1253 ns 0.1111 ns 1.8189 ns 1.00 0.00
Custom 0 B 1.6399 ns 0.0770 ns 0.0721 ns 1.6258 ns 0.89 0.07
Original 0 [ 2.1994 ns 0.0209 ns 0.0185 ns 2.2002 ns 1.00 0.00
Custom 0 [ 0.4884 ns 0.3174 ns 0.8742 ns 0.0000 ns 0.93 0.08
Original 0 a 1.8573 ns 0.1183 ns 0.1215 ns 1.8868 ns 1.00 0.00
Custom 0 a 1.4531 ns 0.0297 ns 0.0263 ns 1.4455 ns 0.79 0.06
Original A B 2.1554 ns 0.0190 ns 0.0178 ns 2.1595 ns 1.00 0.00
Custom A B 1.3569 ns 0.0534 ns 0.0500 ns 1.3486 ns 0.63 0.02
Original A [ 2.2571 ns 0.0774 ns 0.0686 ns 2.2430 ns 1.00 0.00
Custom A [ 0.4355 ns 0.2929 ns 0.7716 ns 0.0000 ns 0.79 0.12
Original A a 0.4519 ns 0.3025 ns 0.7970 ns 0.0000 ns ? ?
Custom A a 0.4250 ns 0.3123 ns 0.7543 ns 0.0000 ns ? ?
Original a B 1.3982 ns 0.0158 ns 0.0148 ns 1.3955 ns 1.00 0.00
Custom a B 1.5232 ns 0.0402 ns 0.0376 ns 1.5188 ns 1.09 0.03
Original a [ 1.2900 ns 0.0863 ns 0.0808 ns 1.2765 ns 1.00 0.00
Custom a [ 1.7122 ns 0.1181 ns 0.1160 ns 1.7111 ns 1.33 0.10
Original a a 1.4140 ns 0.0694 ns 0.0649 ns 1.4010 ns 1.00 0.00
Custom a a 1.8392 ns 0.0289 ns 0.0256 ns 1.8283 ns 1.30 0.06

Conclusion:
Indeterminate: my code is often faster, but not always. So maybe this inspires someone to an even better algorithm that is always faster.

@danmoseley
Copy link
Member

OK, thanks I"ll mark this as up for grabs in case someone wants to look further.

cc @GrabYourPitchforks as he's an expert in using XOR with checking ASCII ranges... 😈

@GrabYourPitchforks
Copy link
Member

As @stephentoub said, we made substantial performance improvements to this for 3.0. The reference sources referred to are for Full Framework 4.7.2 and aren't representative of what's currently in Core. See dotnet/coreclr@dd6c690 for the current code in Core. (This commit is for String.Equals(..., OrdinalIgnoreCase), but the same code path is invoked for String.StartsWith(..., OrdinalIgnoreCase).

I'm not quite sure what specifically this issue is asking for. Can somebody clarify?

@thargol1
Copy link
Author

@GrabYourPitchforks Well I finally found the core 3.0 code thanks to your link. The XOR-variant was removed and replace by code that is faster I assume.

So it seems someone had the same idea as I had, but before me and than someone else even found a better way.

I close this issue.... Thanx for the support.

@danmoseley
Copy link
Member

@thargol1 thanks for the suggestion though. We would welcome contributions in other areas, if you decide you'd like to become more famliar with Github and the .NET codebase. Just look for issues tagged "easy" or "up-for-grabs"..

@GrabYourPitchforks
Copy link
Member

I also strongly recommend checking out the https://source.dot.net/ tool as an easy way to browse up-to-date Core sources. It's updated every few days. Type String.StartsWith into the upper-left search field on that site and you can browse the sources, even being able to click through to other methods that the higher-level method calls into.

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 3.0 milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 15, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Runtime enhancement Product code improvement that does NOT require public API changes/additions help wanted [up-for-grabs] Good issue for external contributors tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

5 participants