You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
In the above code, the non_whitespace.cloned() call actually copies the token (and string) even when many callsites only need to check what it is and could get by with a &
Suggestions for improvements
The biggest suggestion is to stop copying each token unless it actually needed a clone. For example instead of this
implParser{// returns an OWNED TokenWithLocationpubfnpeek_token(&self) -> TokenWithLocation{
...}}
Make it like this
implParser{// returns an reference to TokenWithLocation// (the & means no copying!)pubfnpeek_token_ref(&self) -> &TokenWithLocation{
...}}
I think we could do this without massive breaking changes by adding new functions like peek_token_ref() that returned a reference to the token rather than a clone
Ideas
I played around with it a bit and here is what I came up with. I think a similar pattern could be applied to other places
implParser{
...
/// Return the first non-whitespace token that has not yet been processed/// (or None if reached end-of-file) and mark it as processed. OK to call/// repeatedly after reaching EOF.pubfnnext_token(&mutself) -> TokenWithLocation{self.next_token_ref().clone()}/// Return the first non-whitespace token that has not yet been processed/// (or None if reached end-of-file) and mark it as processed. OK to call/// repeatedly after reaching EOF.pubfnnext_token_ref(&mutself) -> &TokenWithLocation{loop{self.index += 1;// skip whitespaceifletSome(TokenWithLocation{token:Token::Whitespace(_),location: _,}) = self.tokens.get(self.index - 1){continue}break;}if(self.index - 1) < self.tokens.len(){&self.tokens[self.index - 1]}else{eof_token()}}
...
}staticEOF_TOKEN:OnceLock<TokenWithLocation> = OnceLock::new();fneof_token() -> &'static TokenWithLocation{EOF_TOKEN.get_or_init(|| {TokenWithLocation::wrap(Token::EOF)})}
The text was updated successfully, but these errors were encountered:
Part of #1557
While looking at the flamegraph posted to that ticket, one obvious source of improvement is to stop copying each token so much.
Functions like
peek_token()
, andexpect_token()
copy theToken
(which often includes a string)For example
https://github.com/apache/datafusion-sqlparser-rs/blob/2e90e105a74bf9f50f2bad6c22992759ddb06880/src/parser/mod.rs#L3314-L3334
In the above code, the
non_whitespace.cloned()
call actually copies the token (and string) even when many callsites only need to check what it is and could get by with a &Suggestions for improvements
The biggest suggestion is to stop copying each token unless it actually needed a clone. For example instead of this
Make it like this
I think we could do this without massive breaking changes by adding new functions like
peek_token_ref()
that returned a reference to the token rather than a cloneIdeas
I played around with it a bit and here is what I came up with. I think a similar pattern could be applied to other places
The text was updated successfully, but these errors were encountered: