Fix deprecations with copy! and Vector{UInt8}(str) #234

ScottPJones · 2018-01-24T19:20:33Z

No description provided.

stevengj · 2018-01-24T20:35:10Z

src/Parser.jl

@@ -272,7 +273,7 @@ function parse_string(ps::ParserState)
        if c == BACKSLASH
            c = advance!(ps)
            if c == LATIN_U  # Unicode escape
-                append!(b, Vector{UInt8}(string(read_unicode_escape!(ps))))
+                append!(b, get_bytes(string(read_unicode_escape!(ps))))


append!(b, codeunits(s)) would work here … no need to force a copy to be made as in get_bytes

ScottPJones · 2018-01-24

The problem was that codeunits wasn't defined in Compat.jl to make it work on v0.6.

ScottPJones · 2018-01-29

Even using codeunits is not the best approach here.
Instead of having to create a string just to append it, it would be better to pass b to a read_unicode_escape!(b, ps) function that directly push!es the 1-4 bytes from the Unicode escape.

ScottPJones · 2018-01-29

Another question: why do you think my change would force a copy? It uses Vector{UInt8} on v0.6, which doesn't copy (and hence could be unsafe), and codeunits on v0.7, which doesn't copy either.
I think your objection was incorrect.
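The idea above of pushing the escape's bytes straight into b, rather than building a String first, could look roughly like the following. This is only a sketch for Julia 0.7 or later: push_utf8! is a hypothetical helper name, not something in JSON.jl, and it assumes the code point has already been decoded (surrogate pairs combined) by the caller.

```julia
# Hypothetical helper (not part of JSON.jl): append the UTF-8 encoding of the
# code point `cp` to the byte buffer `b` without creating an intermediate String.
# Assumes `cp` is a valid, already-combined Unicode code point.
function push_utf8!(b::Vector{UInt8}, cp::UInt32)
    if cp < 0x80
        # 1 byte: 0xxxxxxx
        push!(b, cp % UInt8)
    elseif cp < 0x800
        # 2 bytes: 110xxxxx 10xxxxxx
        push!(b, (0xc0 | (cp >> 6)) % UInt8)
        push!(b, (0x80 | (cp & 0x3f)) % UInt8)
    elseif cp < 0x10000
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        push!(b, (0xe0 | (cp >> 12)) % UInt8)
        push!(b, (0x80 | ((cp >> 6) & 0x3f)) % UInt8)
        push!(b, (0x80 | (cp & 0x3f)) % UInt8)
    else
        # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        push!(b, (0xf0 | (cp >> 18)) % UInt8)
        push!(b, (0x80 | ((cp >> 12) & 0x3f)) % UInt8)
        push!(b, (0x80 | ((cp >> 6) & 0x3f)) % UInt8)
        push!(b, (0x80 | (cp & 0x3f)) % UInt8)
    end
    return b
end
```

A read_unicode_escape!(b, ps) along the lines proposed in the comment would presumably call something like this once it has parsed the hex digits.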

stevengj · 2018-01-24T20:40:40Z

src/Parser.jl

    pc = ParserContext{unparameterize_type(dicttype), inttype}()
-    ps = MemoryParserState(Vector{UInt8}(String(str)), 1)
+    ps = MemoryParserState(get_bytes(str), 1)


It would be better not to make a copy here.

Since MemoryParserState is only ever created from a String, it would be better to just (a) change the utf8data field to a String field and (b) call codeunit(s, i) to extract a byte from the string (e.g. in the byteat function). Note that codeunit already works in Julia 0.6.

ScottPJones · 2018-01-24

Yes, but I'm trying to make the minimal changes here (as I was told many times before to do 🙂 ) to make it work on both v0.6 and later.
For a fast JSON parser, I'll probably just do it based on my Strs.jl package, not String.

stevengj · 2018-01-24

My suggestions require hardly any more code, and don't make more copies.

ScottPJones · 2018-01-29

They don't make fewer copies either. My change actually gets rid of the copies on v0.7, and doesn't add them for v0.6.

stevengj · 2018-01-29

Your version makes more copies on 0.7 than on 0.6, whereas mine does not.

ScottPJones · 2018-01-30

On v0.7, it calls codeunits. Isn't that what yours does on v0.7?
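For readers following the suggestion at the top of this thread (keep the String itself and read bytes with codeunit(s, i)), a minimal sketch might look like this. The type and field names are hypothetical stand-ins and are not the actual MemoryParserState definition from JSON.jl.

```julia
# Hypothetical stand-in for MemoryParserState: it holds the String directly
# instead of a Vector{UInt8}, so constructing it does not copy the input.
mutable struct StringParserState
    data::String   # the input text, kept as-is
    pos::Int       # index of the current byte (code unit)
end

# Read the current byte without materializing a byte vector.
byteat(ps::StringParserState) = codeunit(ps.data, ps.pos)

# Number of bytes remaining, including the current one.
remaining(ps::StringParserState) = sizeof(ps.data) - ps.pos + 1
```

Because codeunit(s, i) is available on both Julia 0.6 and later versions, a layout like this would not need a version check.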

stevengj · 2018-01-24T20:41:43Z

src/Parser.jl

@@ -262,6 +262,7 @@ function read_unicode_escape!(ps)
    end
 end

+get_bytes(str) = Vector{UInt8}(@static isdefined(Base, :codeunits) ? codeunits(str) : str)


This function is not necessary given the changes I suggest below.

However, it would be good to have a Compat.codeunits(s) function that just returns Vector{UInt8}(s) in Julia 0.6.

stevengj · 2018-01-24

Just pushed a PR for Compat to add codeunits and ncodeunits

ScottPJones · 2018-01-24

If those are merged soon, then yes, I'll change this to use codeunits.

ScottPJones · 2018-01-26T15:09:56Z

I tried to put this bump message on JuliaLang/Compat.jl#474 but wasn't able to:

The failure on Travis is because of something unrelated on master:

Error During Test at /home/travis/.julia/v0.7/Compat/test/runtests.jl:1079
  Test threw an exception of type MethodError
  Expression: Compat.IteratorSize(v) == Base.HasShape()
  MethodError: no method matching Base.HasShape()
  Stacktrace:
   [1] top-level scope at /home/travis/.julia/v0.7/Compat/test/runtests.jl:1079

Could this be merged so that I can fix #234 the way Steven would like?

TotalVerb · 2018-01-29T07:06:04Z

Sure, performance can be fixed later. I'd rather not have so many deprecations.

stevengj · 2018-01-29T07:09:16Z

@TotalVerb, I think @ScottPJones was asking for the Compat PR to be merged, not this one.

I really wish this PR had not been merged. Now, instead of noisy deprecations, we have silent performance problems that we will have to remember to chase down later.

TotalVerb · 2018-01-29T07:14:41Z

@stevengj I can revert if necessary, but I intend to revisit MemoryParserState anyway (as would be necessary to fix #232), so I figured it would not hurt that much. We can avoid tagging in the meantime.

ScottPJones · 2018-01-29T13:16:48Z

I just came back to ask when the update to Compat would be tagged, so that I could push an update to this PR to use codeunits and ncodeunits.
As soon as that is tagged, I'll make another PR to JSON to use them (I don't like performance problems either, as you well know!). However, regarding performance problems, there are much more significant ones in this code, such as creating strings just to calculate the length when there are \u escapes, which is why (in this PR) I didn't worry about fixing all of its problems (having had keeping PRs focused rather beaten into me!)

stevengj · 2018-01-29T17:18:42Z

Don't blame this on your tendency to raise unrelated issues (which you are doing again). The problem with this PR has nothing to do with minimalism: it updates code which made no copies in 0.6 to code that makes copies in 0.7, rather than keeping the equivalent behavior. It is a regression. I'm not in favor of PRs that introduce regressions in order to silence deprecations, whereas avoiding the regression required few or no additional lines of code.

stevengj · 2018-01-29T17:20:35Z

@TotalVerb, I'd prefer that you revert.

ScottPJones · 2018-01-30T14:17:29Z

> it updates code which made no copies in 0.6 to code that makes copies in 0.7, rather than keeping the equivalent behavior

@stevengj: Stop making these incorrect statements, please. This change was NOT a regression.
I made a change that did not introduce any additional copies, and was simpler than your proposed change.

Also, I'd said that I'd change it to use codeunits as soon as that is available in Compat.
Your PR has been merged, but no new release has yet been tagged, so it still cannot be used.

stevengj · 2018-02-01T16:02:44Z

Your change did introduce additional copies. Vector{UInt8}(codeunits(s)) makes a copy in 0.7.
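The difference being argued about can be checked directly on Julia 0.7 or later. The snippet below is only an illustrative check, not code from the PR: codeunits(s) returns a wrapper around the string, while converting that wrapper to a Vector{UInt8} allocates a separate array.

```julia
s = "abc"

# codeunits(s) wraps the string; it is an AbstractVector{UInt8}
# but not a Vector{UInt8}, and it does not copy the bytes.
cu = codeunits(s)
wraps_string = (cu isa AbstractVector{UInt8}) && !(cu isa Vector{UInt8})

# Converting to Vector{UInt8} produces an independent array:
# mutating it leaves the original string untouched.
v = Vector{UInt8}(codeunits(s))
v[1] = UInt8('x')
string_unchanged = (s == "abc")
```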

Fix deprecations with copy! and Vector{UInt8}(str)

651662b

stevengj reviewed Jan 24, 2018

stevengj mentioned this pull request Jan 24, 2018

codeunits and ncodeunits JuliaLang/Compat.jl#474

Merged

TotalVerb merged commit 6a61914 into JuliaIO:master Jan 29, 2018

TotalVerb mentioned this pull request Jan 29, 2018

Revise MemoryParserState #235

Open

5 tasks

ScottPJones deleted the spj/fixdeps branch January 29, 2018 13:16

TotalVerb added a commit that referenced this pull request Jan 30, 2018

Revert "Fix deprecations with copy! and Vector{UInt8}(str) (#234)"

00327c2

This reverts commit 6a61914.

stevengj mentioned this pull request Feb 1, 2018

reduce number of string/buffer allocations #236

Merged
