Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[cs] funky String.lastIndexOf performance #7746

Closed
Simn opened this issue Feb 5, 2019 · 5 comments · Fixed by #11551
Closed

[cs] funky String.lastIndexOf performance #7746

Simn opened this issue Feb 5, 2019 · 5 comments · Fixed by #11551
Assignees
Labels
platform-cs Everything related to c#
Milestone

Comments

@Simn
Copy link
Member

Simn commented Feb 5, 2019

HL

Our benchmarks suggest that lastIndexOf does something weird when there should be a match on the first attempt:

Case: StringScan
  Suite: length 20003
           indexOf hit: 3,116,116 (100.00%)
          indexOf find:    33,330 (  1.06%)
       lastIndexOf hit:    16,091 (  0.51%)
      lastIndexOf find:    16,075 (  0.51%)
        indexOf nofind:    11,043 (  0.35%)
    lastIndexOf nofind:    11,030 (  0.35%)
            split find:     8,278 (  0.26%)
          split nofind:     7,323 (  0.23%)

The "lastIndexOf hit" should have the same result as "indexOf hit" because the provided offset points directly at the substring we're looking for.

Neko has the same characteristics but I suppose we're not gonna fix that.

C#

This is also weird:

Case: StringScan
  Suite: length 20003
       lastIndexOf hit: 12,904,904 (100.00%)
           indexOf hit:  6,123,123 ( 47.45%)
      lastIndexOf find:     91,203 (  0.70%)
          indexOf find:     91,160 (  0.70%)
        indexOf nofind:     45,690 (  0.35%)
    lastIndexOf nofind:     43,590 (  0.33%)
          split nofind:      9,047 (  0.07%)
            split find:      8,327 (  0.06%)

indexOf must be doing something worse than lastIndexOf here.

@Simn Simn added platform-cs Everything related to c# platform-hl Everything related to HashLink labels Feb 5, 2019
@ncannasse
Copy link
Member

What does "hit" means there? I'm not sure what's the problem. Performances?

@Simn
Copy link
Member Author

Simn commented Feb 8, 2019

Yes it's about performance. "hit" means that it uses a startIndex which is close (in this case directly on) the sub-string we're looking for. The test is here: https://github.com/HaxeFoundation/haxe/blob/development/tests/benchs/src/cases/StringScan.hx#L16

The performance characteristics suggests that it walks a large portion of the string instead of starting at startIndex.

@ncannasse
Copy link
Member

I'm not surprised that several platforms exhibit the same behavior, as I mostly copy/pasted the implementation when adding a new platform.

@ncannasse
Copy link
Member

I made a small fix in lastIndexOf in ac8b78a
But the problem remain that we don't have a native way to search bytes backward, which would be necessary to reach the required performances.

@ncannasse
Copy link
Member

I've opened HaxeFoundation/hashlink#239 not to forget about it

@ncannasse ncannasse removed the platform-hl Everything related to HashLink label May 17, 2019
@ncannasse ncannasse changed the title [hl] [cs] [neko] funky String.lastIndexOf performance [cs] funky String.lastIndexOf performance May 17, 2019
@Simn Simn added this to the Backlog milestone May 22, 2019
@Simn Simn modified the milestones: Backlog, Later Mar 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
platform-cs Everything related to c#
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants