Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: String.Contains(char) #24068

Closed
danmoseley opened this issue Nov 7, 2017 · 9 comments
Closed

Proposal: String.Contains(char) #24068

danmoseley opened this issue Nov 7, 2017 · 9 comments
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.Runtime
Milestone

Comments

@danmoseley
Copy link
Member

Rationale and Usage

Finding a character in a string is a fairly common primitive operation. Unfortunately we lead users to the pit of failure because mystring.Contains(char) will bind to the Linq version, which can easily be 20x slower than mystring.IndexOf(char) != -1. By adding these to string, the next recompile will give a performance improvement.

Proposed API

    public sealed partial class String : System.Collections.Generic.IEnumerable<char>, System.Collections.IEnumerable, System.IComparable, System.IComparable<string>, System.IConvertible, System.IEquatable<string>, System.ICloneable
    {
        public bool Contains(char value) { throw null; }
        public bool Contains(char value, StringComparison comparisonType) { throw null; }
        public bool Contains(string value) { throw null; }  // already exists 
        public bool Contains(string value, StringComparison comparisonType) { throw null; } // already exists
        public int IndexOf(char value, StringComparison comparisonType) { throw null; } // to implement above
    }

the implementations will simply be

public bool Contains(char value)
{
    return IndexOf(value) != -1;
}

public bool Contains(char value, StringComparison comparisonType)
{
    return IndexOf(value, comparisonType) != -1;
}

Microbenchmark

Searching in 10 and 1000 character strings:

    Method |       Mean |     Error |    StdDev |
---------- |-----------:|----------:|----------:|
 LinqShort | 142.750 ns | 0.4997 ns | 0.4430 ns |
  LinqLong | 180.121 ns | 0.4741 ns | 0.3428 ns |
 FastShort |   7.575 ns | 0.0196 ns | 0.0174 ns |
  FastLong |   8.508 ns | 0.0403 ns | 0.0377 ns |

Variations

The StringComparison overload is in order to search case insensitively if desired. The IndexOf overload is needed to implement it. These could be discarded as the 90% case I would expect to not take a comparison.

@stephentoub
Copy link
Member

public bool Contains(char value, StringComparison comparisonType)

Why is this needed? Is it purely for Ordinal and OrdinalIgnoreCase?

return IndexOf(value, comparisonType) != -1;

Does such an IndexOf overload exist that takes a char and a StringComparison?

@danmoseley
Copy link
Member Author

Is it purely for Ordinal and OrdinalIgnoreCase?

Yes

Does such an IndexOf overload exist that takes a char and a StringComparison?

Oops, no. I added it to the proposal. But I don't this comparison is too important, it could be set aside.

@jnm2
Copy link
Contributor

jnm2 commented Nov 7, 2017

Just yesterday I had a use case for String.Contains(char) and I thought for sure we already had this but it was String.Contains(string, StringComparison) I was thinking of. +1.

@tannergooding
Copy link
Member

If the LINQ version is so much worse, is there room for a targeted optimization to improve/special case It?

@danmoseley
Copy link
Member Author

dotnet/corefx#25097

@tannergooding
Copy link
Member

even beyond that, why not

if (comparer == null)
{
    if (typeof(TSource) == typeof(string))
    {
        return string.IndexOf() != -1;
    }
    else
    {
        foreach (element in source) { EqualityComparer.Default.Equals() }
    }
}

IIRC, typeof(T) == typeof(...) is treated as a constant by the JIT, so doing something like the above might be worthwhile for string itself.

@stephentoub
Copy link
Member

TSource would be char, not string. It could do "if (typeof(TSource) == typeof(char))" and then within that would need to do a cast/type check on the source instance to see if it was a string. That will add some overhead when TSource==char and the source isn't a string, so the question would be whether such overhead is measurable and worthwhile to optimize for the string case.

@danmoseley
Copy link
Member Author

I believe this is ready for review.

@karelz
Copy link
Member

karelz commented Nov 21, 2017

FYI: The API review discussion was recorded - see https://youtu.be/o5JFZtgaJLs?t=1169 (1 min duration)

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 2.1.0 milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 19, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Runtime
Projects
None yet
Development

No branches or pull requests

7 participants