Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API Proposal]: ReadOnlySpan<char> overloads for StringNormalizationExtensions #87757

Closed
udaken opened this issue Jun 19, 2023 · 3 comments · Fixed by #110465
Closed

[API Proposal]: ReadOnlySpan<char> overloads for StringNormalizationExtensions #87757

udaken opened this issue Jun 19, 2023 · 3 comments · Fixed by #110465
Labels
api-approved API was approved in API review, it can be implemented area-System.Globalization in-pr There is an active PR which will close this issue when it is merged
Milestone

Comments

@udaken
Copy link

udaken commented Jun 19, 2023

Edited by @tarekgh

Background and motivation

If we try to verify that the string represented by ReadOnlySpan<char> is Unicode normalized, we currently have to turn it into a string once.
Overloading for ReadOnlySpan<char> is needed to reduce allocation.

API Proposal

namespace System;

// Doesn't throw on the operation even when having wrong character sequence. 

public static class StringNormalizationExtensions {
+    public static bool IsNormalized(this ReadOnlySpan<char> source, NormalizationForm normalizationForm = NormalizationForm.FormC); // will return false in invalid sequence
+    public static OperationStatus TryNormalize(this ReadOnlySpan<char> source, Span<char> destination, out int charsWritten, NormalizationForm normalizationForm = NormalizationForm.FormC);
}

Original Proposal - Alternative Proposal

namespace System;

// throws on wrong sequence as String.Normalize does. 

public static class StringNormalizationExtensions {
+    public static bool IsNormalized(this ReadOnlySpan<char> strInput, NormalizationForm normalizationForm = NormalizationForm.FormC);
+    public static bool TryNormalize(this ReadOnlySpan<char> strInput, Span<char> destination, out int charsWritten, NormalizationForm normalizationForm = NormalizationForm.FormC);
}

API Usage

ReadOnlySpan<char> partOfhugeArray =  hugeArray.AsSpan(start, length);
if (!partOfhugeArray.IsNormalized())
    throw new Exception("Text must be Unicode normalized.");
/// ...

No response

Risks

No response

@udaken udaken added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Jun 19, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jun 19, 2023
@ghost
Copy link

ghost commented Jun 19, 2023

Tagging subscribers to this area: @dotnet/area-system-globalization
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and motivation

If we try to verify that the string represented by ReadOnlySpan<char> is Unicode normalized, we currently have to turn it into a string once.
Overloading for ReadOnlySpan<char> is needed to reduce allocation.

API Proposal

namespace System;

public static class StringNormalizationExtensions {
+    public static bool IsNormalized(this ReadOnlySpan<char> strInput, NormalizationForm normalizationForm = NormalizationForm.FormC);
+    public static bool TryNormalize(this ReadOnlySpan<char> strInput, Span<char> destination, out int charsWritten, NormalizationForm normalizationForm = NormalizationForm.FormC);
}

API Usage

ReadOnlySpan<char> partOfhugeArray =  hugeArray.AsSpan(start, length);
if (!partOfhugeArray.IsNormalized())
    throw new Exception("Text must be Unicode normalized.");
/// ...

Alternative Designs

No response

Risks

No response

Author: udaken
Assignees: -
Labels:

api-suggestion, area-System.Globalization, untriaged

Milestone: -

@udaken udaken changed the title [API Proposal]: [API Proposal]: ReadOnlySpan<char> overloads for StringNormalizationExtensions Jun 19, 2023
@tarekgh tarekgh added this to the Future milestone Jun 19, 2023
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Jun 19, 2023
@tarekgh tarekgh modified the milestones: Future, 10.0.0 Nov 18, 2024
@tarekgh
Copy link
Member

tarekgh commented Nov 18, 2024

@udaken I made a few edits to the proposal and hope that works for you.

@tarekgh tarekgh added api-ready-for-review API is ready for review, it is NOT ready for implementation blocking Marks issues that we want to fast track in order to unblock other important work labels Nov 19, 2024
@bartonjs
Copy link
Member

bartonjs commented Nov 19, 2024

Video

  • TryNormalize as proposed mixes semantics of OperationStatus and the Try pattern, and needs to pick one or the other.
    • We went with Try
  • We added GetNormalizedLength for callers who are concerned about too many double-and-call-again for the Try-Write.
namespace System;

public static partial class StringNormalizationExtensions
{
    public static bool IsNormalized(this ReadOnlySpan<char> source, NormalizationForm normalizationForm = NormalizationForm.FormC);
    public static bool TryNormalize(this ReadOnlySpan<char> source, Span<char> destination, out int charsWritten, NormalizationForm normalizationForm = NormalizationForm.FormC);
    public static int GetNormalizedLength(this ReadOnlySpan<char> source, NormalizationForm normalizationForm = NormalizationForm.FormC);
}

@bartonjs bartonjs added api-approved API was approved in API review, it can be implemented and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation api-ready-for-review API is ready for review, it is NOT ready for implementation blocking Marks issues that we want to fast track in order to unblock other important work labels Nov 19, 2024
@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Dec 6, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Jan 9, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Globalization in-pr There is an active PR which will close this issue when it is merged
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants