-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix deprecations with copy! and Vector{UInt8}(str) #234
Conversation
@@ -272,7 +273,7 @@ function parse_string(ps::ParserState) | |||
if c == BACKSLASH | |||
c = advance!(ps) | |||
if c == LATIN_U # Unicode escape | |||
append!(b, Vector{UInt8}(string(read_unicode_escape!(ps)))) | |||
append!(b, get_bytes(string(read_unicode_escape!(ps)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
append!(b, codeunits(s))
would work here … no need to force a copy to be made as in get_bytes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem was that codeunits
wasn't defined in Compat.jl
to make it work on v0.6.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even using codeunits
is not the best approach here.
Instead of having to create a string just to append it, it would be better to pass b
to a read_unicode_escape!(b, ps)
function, that directly push!
ed the 1-4 bytes from the unicode escape.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another question - why do you think my change would force a copy? It uses Vector{UInt8}
on v0.6
, which doesn't copy (and hence could be unsafe), and codeunits
on v0.7
, which doesn't copy either.
I think your objection was incorrect.
pc = ParserContext{unparameterize_type(dicttype), inttype}() | ||
ps = MemoryParserState(Vector{UInt8}(String(str)), 1) | ||
ps = MemoryParserState(get_bytes(str), 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better not to make a copy here.
Since MemoryParserState
is only ever created from a String
, it would be better to just (a) change the utf8data
field to a String
field and (b) call codeunit(s, i)
to extract a byte from the string (e.g. in the byteat
function). Note that codeunit
already works in Julia 0.6.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but I'm trying to make the minimal changes here (as I was told many times before to do 🙂 ) to make it work on both v0.6 and later.
For a fast JSON parser, I'll probably just do it based on my Strs.jl package, not String
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My suggestions require hardly any more code, and don't make more copies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They don't make less copies either. My change actually requires gets rid of the copies on v0.7, and doesn't add them for v0.6.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your version has makes more copies on 0.7 than 0.6, whereas mine does not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On v0.7, it calls codeunits
. Isn't that what your's does on v0.7?
@@ -262,6 +262,7 @@ function read_unicode_escape!(ps) | |||
end | |||
end | |||
|
|||
get_bytes(str) = Vector{UInt8}(@static isdefined(Base, :codeunits) ? codeunits(str) : str) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is not necessary given the changes I suggest below.
However, it would be good to have a Compat.codeunits(s)
function that just returns Vector{UInt8}(s)
in Julia 0.6.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just pushed a PR for Compat to add codeunits
and ncodeunits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If those are merged soon, then yes, I'll change this to use codeunits.
I tried to put this bump message on JuliaLang/Compat.jl#474 but wasn't able to: The failure on Travis is because of something unrelated on master:
Could this be merged so that I can fix #234 the way Steven would like? |
Sure, performance can be fixed later. I'd rather not have so many deprecations. |
@TotalVerb, I think @ScottPJones was asking for the Compat PR to be merged, not this one. I really wish this PR had not been merged. Now, instead of noisy deprecations, we have silent performance problems that we will have to remember to chase down later. |
I just came back to ask when the update to Compat would be tagged, so that I could push an update to this PR to use |
Don't blame this on your tendency to raise unrelated issues (which you are doing again). The problem with this PR has nothing to do with minimalism: it updates code which made no copies in 0.6 to code that makes copies in 0.7, rather than keeping the equivalent behavior. It is a regression. I'm not in favor of PRs that introduce regressions in order to silence deprecations, whereas avoiding the regression required few or no additional lines of code. |
@TotalVerb, I'd prefer that you revert. |
This reverts commit 6a61914.
@stevengj: Stop making these incorrect statements, please. This change was NOT a regression. Also, I'd said that I'd change it to use |
Your change did introduce additional copies. |
No description provided.