perf: optimize row merging #628
Conversation
- Extract the read-rows-acceptance-test.json based tests into their own file
- Update the JSON to match the latest available in https://github.com/googleapis/conformance-tests/tree/main/bigtable/v2
- Use a parameterized pytest test to run all of the scenarios (instead of creating a function for each JSON blob)
- Use JSON protobufs to parse the file

I left a TODO to allow easy updates of the file; unfortunately it is not straightforward, as the canonical protos get renamed for the Python GAPIC.

The next PR will extract the row merging functionality from row_data to make it easier to maintain.
No region tags are edited in this PR. (This comment is generated by snippet-bot.)
Can you add the documentation of row_merger to the Sphinx documentation? We should also update the usages of row_data.Cell / row_data.PartialRowData / row_data.InvalidChunk in the documentation to use row_merger instead.
This is in preparation for extracting row merging into a separate class. See googleapis#628
The row merging logic LGTM. I left some nits that you can fix or not.
* chore: move row value classes out of row_data This is in preparation for extracting row merging into a separate class. See #628 Co-authored-by: Anthonios Partheniou <partheniou@google.com>
# Conflicts: # google/cloud/bigtable/row_data.py
# Conflicts: # google/cloud/bigtable/row_merger.py
This PR rewrites the row merging logic to be more correct and improve performance:
Cells are accumulated in a nested `{family: {qualifier: []}}` structure; since Python dicts preserve insertion order, this should maintain server-side ordering (families in creation order and qualifiers in lexicographic order).

Overall this improves performance by 20% and, in my opinion, is a lot more readable.
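The ordering argument can be demonstrated with a small sketch (illustrative names, not code from this PR): Python dicts preserve insertion order, so appending cells into a nested `{family: {qualifier: []}}` dict keeps families and qualifiers in exactly the order the server emitted them.

```python
# Accumulate cells into a nested {family: {qualifier: [values]}} dict.
# Insertion order of dict keys is guaranteed in Python 3.7+, so the
# structure reflects server-side ordering without any explicit sorting.
cells = {}

def append_cell(family, qualifier, value):
    """Append a cell value, creating family/qualifier buckets on demand."""
    cells.setdefault(family, {}).setdefault(qualifier, []).append(value)

# Cells arrive in server order: families in creation order,
# qualifiers in lexicographic order within each family.
append_cell("cf1", "a", b"v1")
append_cell("cf1", "b", b"v2")
append_cell("cf2", "a", b"v3")

assert list(cells) == ["cf1", "cf2"]          # family creation order kept
assert list(cells["cf1"]) == ["a", "b"]       # qualifier order kept
```

Because ordering falls out of insertion order, no per-row sort pass is needed, which is part of where the performance win comes from.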