Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[stdlib] Add Span.count() #3792

Open
wants to merge 22 commits into
base: main
Choose a base branch
from
Open

Conversation

martinvuyk
Copy link
Contributor

Add Span.count().

This was part of PR #3686 but I split it off to have its independent review.

CC: @JoeLoser yet another place where we need #3548

Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
@martinvuyk martinvuyk requested a review from a team as a code owner November 21, 2024 14:55
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
@lsh
Copy link
Contributor

lsh commented Dec 17, 2024

!sync

@modularbot modularbot added the imported-internally Signals that a given pull request has been imported internally. label Dec 17, 2024
@skongum02 skongum02 deleted the branch modular:main January 29, 2025 18:58
@skongum02 skongum02 closed this Jan 29, 2025
@skongum02 skongum02 reopened this Jan 29, 2025
@skongum02 skongum02 changed the base branch from nightly to main January 29, 2025 20:44
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
@martinvuyk martinvuyk requested a review from a team as a code owner February 27, 2025 13:58
StringSlice[origin](
ptr=sub.unsafe_ptr().bitcast[Byte](),
length=len(sub) * sizeof[Scalar[D]](),
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion This seems to construct StringSlice values of pointers to arbitrary byte slices, which may not contain valid UTF-8. Could we do this refactoring in a different order to avoid that, and instead add Span.count() methods that don't rely on calling into StringSlice.count()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't really because we don't have Span.find(). I've been trying unsuccessfully to convince Joe since september of last year that we need it in #3548 . There have been a lot of PRs where I realized we need it. It's the best way to be "correct" about the way we search for arbitrary datatypes in memory, while centralizing the implementation and letting it be used for other things beside strings (DB indexing, etc.).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this function that does the counting of overlapping sequences as an afterthought to have the same option as Python for bytes.count(). But my main goal is getting the parametrized count function into the stdlib because I need it to optimize some other string code. We could split the pythonic count to another PR and/or wait until we have Span.find() WDYT?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
imported-internally Signals that a given pull request has been imported internally.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants