-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[stdlib] Add Span.count()
#3792
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
!sync |
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
stdlib/src/memory/span.mojo
Outdated
StringSlice[origin]( | ||
ptr=sub.unsafe_ptr().bitcast[Byte](), | ||
length=len(sub) * sizeof[Scalar[D]](), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion This seems to construct StringSlice
values of pointers to arbitrary byte slices, which may not contain valid UTF-8. Could we do this refactoring in a different order to avoid that, and instead add Span.count()
methods that don't rely on calling into StringSlice.count()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't really because we don't have Span.find()
. I've been trying unsuccessfully to convince Joe since september of last year that we need it in #3548 . There have been a lot of PRs where I realized we need it. It's the best way to be "correct" about the way we search for arbitrary datatypes in memory, while centralizing the implementation and letting it be used for other things beside strings (DB indexing, etc.).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this function that does the counting of overlapping sequences as an afterthought to have the same option as Python for bytes.count()
. But my main goal is getting the parametrized count function into the stdlib because I need it to optimize some other string code. We could split the pythonic count to another PR and/or wait until we have Span.find()
WDYT?
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Add
Span.count()
.This was part of PR #3686 but I split it off to have its independent review.
CC: @JoeLoser yet another place where we need #3548