-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add utility for transforming Trino ROW type columns into Ruby hashes #117
Conversation
def blank?(obj) | ||
obj.respond_to?(:empty?) ? !!obj.empty? : !obj | ||
end | ||
|
||
def starts_with?(str, prefix) | ||
prefix.respond_to?(:to_str) && str[0, prefix.length] == prefix | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are adopted from ActiveSupport.
90727cb
to
fcb8c10
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the pull request! I really like this idea.
Transforming structured types to objects looks fine, but transforming primitive types by default may be a problematic as the current logic doesn't seem to cover all use cases. Isn't it better to have a mechanism to control transformations and/or to allow using custom transformers?
Therefore, I wonder if modifying Trino::Client#run_with_names to do this makes sense. However, this is a breaking change, so I didn't put it in here. What do you think?
Backward compatibility is important. It would be better to add an option to run_with_names
or new method.
@takezoe thank you for reviewing this!
I like this idea a lot and I'll work on it 👍 I think this will cover your questions around timestamps and varbinary; we can keep these implementations that work for us, but move their definition into something that gets configured vs being a part of the library.
Great idea to handle these! I'll look into the Trino spec for these to see what will work. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Scalar transformation is now configurable and it looks really good to me. I think including your scalar parser in this library is also fine if you want.
@takezoe thanks again for reviewing! I'm still looking into including support for
What do you think about me adding an example to the README of using |
Sounds good! Would you include it in this PR? |
@kmcq For now, we will merge this PR and release new version. It would be great if you can create another PR to update README. |
Will do @takezoe! Thanks for merging and sorry I was slow to update the docs. |
@kmcq Thanks for your contribution. 🎉 I published a new version, 2.1.0. |
Hello! I'm a longtime user of this gem, thanks so much for creating and maintaining it ✨
Purpose
My team uses Ruby to parse a lot of data with Trino and many of the columns of the tables we handle are Trino ROW types (e.g. objects like a User). I'd like to upstream the code we've been using to parse these into Ruby hashes; we find it to be quite useful.
Overview
What's in here
Mostly I've added a new class,
Trino::Client::ColumnValueParser
, that parses the Trino type of a column in order to correctly transform the array that Trino returns for a ROW type into a Ruby hash. It handles ROW types that have ROW types (and ARRAY types too for what it's worth) embedded within them, as this is something we need.We also exclusively stream results with something like
so I've added methods to
Trino::Client::Query
to accommodate transforming results like this:It seemed like the best tests to modify were the ones in
client_spec.rb
. I updated an existing test to show thatrun_with_names
returns arrays and then added a new spec that transforms the ROW types into hashes uses a helper method I added for folks who don't have access to thequery
object, like those usingrun_with_names
.What's not in here
Within my team's work, having ROW types returned as arrays of elements, as Trino does by default, isn't as useful as having them transformed into Hashes like this PR enables. Therefore, I wonder if modifying
Trino::Client#run_with_names
to do this makes sense. However, this is a breaking change, so I didn't put it in here. What do you think? If you'd like, I could do that change in a subsequent PR. We don't userun_with_names
so I don't have an opinion here. I also considered adding a new method that is arun_with_names
clone but with this change, but figured I'd ask what you think would be useful before doing so.Finally, I haven't updated the README at all. I'd be happy to! But I wanted to see if you think this is useful enough for folks to add it there.
Cheers, thanks for reading ✨
-- @kmcq
Checklist