Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change in behaviour from 1.1.4 to 1.2.3 #344

Closed
yob opened this issue Jul 25, 2020 · 3 comments
Closed

Change in behaviour from 1.1.4 to 1.2.3 #344

yob opened this issue Jul 25, 2020 · 3 comments

Comments

@yob
Copy link

yob commented Jul 25, 2020

We recently attempted an upgrade from 1.14 to 1.2.3 and encountered a behaviour change that failed some of our test suite.

I assume it may be because we are doing something not recommended, but I thought I'd post it here in case it's a bug.

Here's a basic reproduction script:

require "pg"

puts "pg version: #{PG::VERSION}"

conn = PG.connect(dbname: "test")

conn.exec("CREATE TEMP TABLE test (flag bool, data bytea)")

puts "insert with null byte"
conn.send_query_params(
  "INSERT INTO test VALUES ($1,$2)",
  [true, {format: 1, value: "H\0ello"}],
  nil,
  PG::BasicTypeMapForQueries.new(conn),
)

puts conn.exec("SELECT flag, data FROM test").first.inspect

conn.exec("DELETE from test")

puts "insert with ascii bytes"
conn.send_query_params(
  "INSERT INTO test VALUES ($1,$2)",
  [true, {format: 1, value: "Hello"}],
  nil,
  PG::BasicTypeMapForQueries.new(conn),
)

puts conn.exec("SELECT flag, data FROM test").first.inspect

output on 1.1.4

$ ruby test.rb 
pg version: 1.1.4
insert with null byte
{"flag"=>"t", "data"=>"\\x4800656c6c6f"}
insert with ascii bytes
{"flag"=>"t", "data"=>"\\x48656c6c6f"}

output on 1.2.3

$ ruby test.rb 
pg version: 1.2.3
insert with null byte
nil
insert with ascii bytes
{"flag"=>"t", "data"=>"\\x7b22666f726d6174223a312c2276616c7565223a2248656c6c6f227d"}

Inserting with a null byte fails silently, and inserting with only ascii bytes results in the entire hash ({format: 1, value: "Hello"}) being serialized as raw bytes.

I notice the docs for send_query_params say:

type_map can be a PG::TypeMap derivation (such as PG::BasicTypeMapForQueries). This will type cast the params from various Ruby types before transmission based on the encoders defined by the type map. When a type encoder is used the format and oid of a given bind parameter are retrieved from the encoder instead out of the hash form described above.

Given that, attempting to use binary format and a type map may not be the correct thing to do. However, it seems to work (contrary to the docs) < 1.2. If its the wrong approach, can you recommend an alternative way to use type maps while also using the binary format for some columns?

@larskanis
Copy link
Collaborator

Thank you for raising this issue! It can be divided into two parts:

  1. A Hash is serialized as JSON since pg-1.2.0
  2. send_query_params() sends data only, but doesn't wait for the result, so that it's discarded

To 1: I think this is kind of unintended regression. pg-1.2 added Hash to JSON serialization to BasicTypeMapForQueries but a type encoder takes preference over Hash input form. See API docs:

When a type encoder is used the format and oid of a given bind parameter are retrieved from the encoder instead out of the hash form described above.

So the classic Hash form is no longer usable in combination with a BasicTypeMapForQueries. If I would have noticed this earlier, I would have added encode_hash_as= similar to encode_array_as=. But since pg-1.2 is out since 8 months, I hesitate to revert to the pg-1.1 behavior for Hashs, since this could break existing code based on pg-1.2 and it's JSON serialization.

To 2: send_query_params() sends data only, but doesn't wait for the result. Instead the subsequent exec call implicit retrieves the result, but discards it silently. You should use exec_params instead. It properly raises invalid input syntax for type bytea (PG::InvalidTextRepresentation) when trying to insert null byte into JSON. send_* methods are part of the asynchronous API and they must be used properly.

@yob
Copy link
Author

yob commented Aug 4, 2020

So the classic Hash form is no longer usable in combination with a BasicTypeMapForQueries. If I would have noticed this earlier, I would have added encode_hash_as= similar to encode_array_as=. But since pg-1.2 is out since 8 months, I hesitate to revert to the pg-1.1 behavior for Hashs, since this could break existing code based on pg-1.2 and it's JSON serialization.

Ahh. Thanks for clarifying this.

We'll have a discussion about how best to handle the new behaviour - maybe we'll have to try submitting the bytea data in the text format and watch how it performs.

You should use exec_params instead. It properly raises invalid input syntax for type bytea (PG::InvalidTextRepresentation) when trying to insert null byte into JSON

Oops, this shows my limited understanding of the postgres API. Although my reproduction script uses the async API incorrectly thanks to my ignorance, I've checked out production code and it's using it correctly. Issue (1) is the root of the issue we ran into.

@larskanis
Copy link
Collaborator

We'll have a discussion about how best to handle the new behaviour - maybe we'll have to try submitting the bytea data in the text format and watch how it performs.

Rails has it's own mechanisms to work with binary data. You usually can assign the binary data directly to the attribute of type bytea and rails handles de-/encoding transparently.

If you want to use PG::BasicTypeMapForQueries you can use a helper class like so:

class BinaryData < String
end

conn = PG.connect
conn.exec("CREATE TEMP TABLE test (flag bool, data bytea, i int)")
conn.type_map_for_results = PG::BasicTypeMapForResults.new(conn)
conn.type_map_for_queries = PG::BasicTypeMapForQueries.new(conn)
conn.type_map_for_queries[BinaryData] = PG::BinaryEncoder::Bytea.new

bd = BinaryData.new("ab\xff\0cd")
conn.exec_params("INSERT INTO test (data, i) VALUES ($1, $2)", [bd, 3])

res = conn.exec("SELECT * FROM  test")
p res.to_a  # => [{"flag"=>nil, "data"=>"ab\xFF\x00cd", "i"=>3}]

The part starting from conn.type_map_for_queries[BinaryData] = PG::BinaryEncoder::Bytea.new should also work in rails, since it's using PG::TypeMapByClass internally. And it should work since pg-0.18.0.

larskanis added a commit to larskanis/ruby-pg that referenced this issue Aug 4, 2020
It has always been easy to register a binary encoder to PG::BasicTypeMapForQueries like so:

  class BinaryData < String; end
  tm[BinaryData] = PG::BinaryEncoder::Bytea.new

However it might be more convenient to add a default class that can be used without registration.

Related to ged#344
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants