-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[stdlib] Fix iterator ergonomics #3700
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
I think the name is dictated by the compiler, thus can't be changed. |
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Yeah, its name is coupled in the compiler, so we'd need to change it both in the library and compiler. I'm not wedded to the name that @Mogball chose here, but I'll let him chime in as well. CC: @ConnorGray in case you have opinions too. |
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
This function is temporary anyways, so the name doesn't really matter |
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
… fix-iter-ergonomics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of making everything implement dunder bool, did you consider changing iter
function to just directly call self.__hasmore__()
in the debug_assert
usage?
If I'm not so sure about implicit conversion and such, but to me this looks prettier: iterator = iter("hello world")
fn example_optional_wrapper() -> Optional[StaticString]:
return next(iterator) if iterator else None |
IMO |
@gryznar does this seem readable to you? fn read(
self, start: UInt = 0
) -> (UInt, JsonType, Self._Sp, UInt) as output:
"""Read the next `JsonType` **skipping over whitespace**.
Args:
start: The absolute offset in bytes to start reading a valid
`JsonType`.
Returns:
The absolute offset, its `JsonType`, the `Span` containing it, and
the start of the next item (does not check bounds).
"""
alias `{` = UInt8(ord("{"))
alias `}` = UInt8(ord("}"))
alias `[` = UInt8(ord("["))
alias `]` = UInt8(ord("]"))
alias `,` = UInt8(ord(","))
alias `"` = UInt8(ord('"'))
alias ` ` = UInt8(ord(" "))
alias `\n` = UInt8(ord("\n"))
alias `\t` = UInt8(ord("\t"))
alias `\r` = UInt8(ord("\r"))
invalid = start, JsonType(JsonType.invalid), self._buffer, start
ptr = self._buffer.unsafe_ptr()
debug_assert(
start <= len(self._buffer), "start is bigger than buffer length"
)
iterator = StringSlice[origin](
unsafe_from_utf8_ptr=ptr + start, len=len(self._buffer) - start
).__iter__()
iter_len = len(iterator)
if iter_len == 0:
output = invalid
return
char = iterator.__next__()
while (
char.byte_length() > 0
and Self.isspace(char.unsafe_ptr()[0])
and iterator.__hasmore__()
):
char = iterator.__next__()
if not char:
output = invalid
return
char_p = char.unsafe_ptr()
b0_char = char_p[0]
if b0_char == `{`:
if not iterator.__hasmore__():
output = invalid
return
char = iterator.__next__()
char_p = char.unsafe_ptr()
b0_char = char_p[0]
if Self.isspace(b0_char):
if not iterator.__hasmore__():
output = invalid
return
char = iterator.__next__()
char_p = char.unsafe_ptr()
b0_char = char_p[0]
if b0_char == `}`:
... # TODO: object end
elif b0_char == `,`:
... # TODO: next item
elif b0_char == `[`:
if not iterator.__hasmore__():
output = invalid
return
char = iterator.__next__()
char_p = char.unsafe_ptr()
b0_char = char_p[0]
if Self.isspace(b0_char):
if not iterator.__hasmore__():
output = invalid
return
char = iterator.__next__()
char_p = char.unsafe_ptr()
b0_char = char_p[0]
if b0_char == `]`:
... # TODO: array end
elif b0_char == `,`:
... # TODO: next item
elif b0_char == `"`:
... # TODO: string
# TODO: value
# TODO: escaped_value
# TODO: parametrized support for NaN
# TODO: parametrized support for Infinity
# TODO: parametrized support for -Infinity
output = invalid yes this very much something I threw together without much thought and can do capturing functions to prettify some of the stuff, but it is still very ugly even adding |
@martinvuyk if in case above most code operates on dunders, it is more consistent to do the same with |
This PR is exactly about the ergonomics of iterator related operations... dunders aren't meant to be used directly. |
@gryznar Python raises when iterators have no more values, which is problematic for many reasons >>> next(iter([]))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration The Iterator proposal I did (which was rejected) was like Rust. Rust returns What I am proposing here is still indirectly following the Optional style logic and IMO seems coherent with the way Mojo is turning out: fn print_next_value(inout iterator: _StringSliceIter) -> Bool:
if not iterator:
return False
print(next(iterator))
return True
fn print_newline_optional(a: Optional[StringLiteral]):
if not a:
print("a is empty")
return
iterator = iter(a.value())
while print_next_value(iterator):
continue
print("finished printing")
fn main():
a = "hello world"
print_newline_optional(a) And most Python code needing to work with iterators directly (which very few people do) is very clunky (if you follow best practices) from typing import Iterator
def print_next_value(iterator: Iterator[str]) -> bool:
try:
print(next(iterator))
return True
except StopIteration:
return False
except:
raise Exception("python can surprise you in unpleasant ways :')")
def print_newline_optional(a: str | None):
if a is None:
print("a is empty")
return
iterator = iter(a)
while print_next_value(iterator):
continue
print("finished printing")
if __name__ == "__main__":
a = "hello world"
print_newline_optional(a) We can actually "improve" Python here, there is no need to do everything the exact same. We can keep compatibility by adding an iterator wrapper that behaves like Python's if we wish to (probably easier said than done). |
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
!sync |
…slice tests Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
@JoeLoser gentle ping, and also to ask if you want me to split this PR. IMHO it makes no difference since this barely changes any functionality, it's just wrappers and docstrings. But if it will be faster for you guys just let me know and I'll do it. |
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
Fix iterator ergonomics. Add
iter() -> Iterator
andnext[T](Iterator[T]) -> T
for most common iterators, add missingreversed() -> Iterator[forward=False]
for Stringlike types