perf: optimize ResultSet decoding #1244
Merged
product-auto-label bot added the size: m (Pull request size is medium.) and api: spanner (Issues related to the googleapis/python-spanner API.) labels on Nov 28, 2024.
olavloite force-pushed the optimize-result-set-decoding branch 2 times, most recently from e5667f0 to 2591dbd, on November 28, 2024 12:24.
product-auto-label bot added the size: l (Pull request size is large.) label and removed the size: m label on Nov 28, 2024.
ResultSet decoding went through a long if-elif-else construct for every row and every column to determine how to decode that specific cell. This made decoding large result sets significantly slower than necessary, because the decoder for a column only needs to be determined once for the entire ResultSet.

This change therefore collects the decoders once before starting to decode any rows. It does this by:

1. Iterating over the columns in the ResultSet and getting a decoder for the specific type of each column.
2. Storing those decoders as function references in an array.
3. Picking the appropriate function directly from this array each time a column needs to be decoded.

Selecting and decoding a query result with 100 rows consisting of 24 columns (one for each supported data type) takes ~35-40ms without this change, and ~15-20ms with this change.

The following benchmarks were executed locally against an in-memory mock Spanner server running in Java. The latter was chosen because:

1. We have a random ResultSet generator in Java that can be used for this.
2. Having the mock Spanner server running in a separate process and in another programming language reduces the chance that the mock server itself has an impact on the differences that we see between the different runs.
Results without this change (100 iterations):

```
Elapsed: 43.5490608215332 ms
Elapsed: 39.53838348388672 ms
Elapsed: 38.68389129638672 ms
Elapsed: 38.26117515563965 ms
Elapsed: 38.28692436218262 ms
Elapsed: 38.12098503112793 ms
Elapsed: 39.016008377075195 ms
Elapsed: 38.15174102783203 ms
Elapsed: 38.3448600769043 ms
Elapsed: 38.00082206726074 ms
Elapsed: 38.0091667175293 ms
Elapsed: 38.02800178527832 ms
Elapsed: 38.03110122680664 ms
Elapsed: 38.42306137084961 ms
Elapsed: 38.535356521606445 ms
Elapsed: 38.86699676513672 ms
Elapsed: 38.702964782714844 ms
Elapsed: 38.881778717041016 ms
Elapsed: 38.08116912841797 ms
Elapsed: 38.084983825683594 ms
Elapsed: 38.04278373718262 ms
Elapsed: 38.74492645263672 ms
Elapsed: 38.57111930847168 ms
Elapsed: 38.17009925842285 ms
Elapsed: 38.64407539367676 ms
Elapsed: 38.00559043884277 ms
Elapsed: 38.06161880493164 ms
Elapsed: 38.233280181884766 ms
Elapsed: 38.48695755004883 ms
Elapsed: 38.71011734008789 ms
Elapsed: 37.92428970336914 ms
Elapsed: 38.8491153717041 ms
Elapsed: 38.90705108642578 ms
Elapsed: 38.20919990539551 ms
Elapsed: 38.07401657104492 ms
Elapsed: 38.30099105834961 ms
Elapsed: 38.07377815246582 ms
Elapsed: 38.61117362976074 ms
Elapsed: 39.58392143249512 ms
Elapsed: 39.69216346740723 ms
Elapsed: 38.27810287475586 ms
Elapsed: 37.88185119628906 ms
Elapsed: 38.763999938964844 ms
Elapsed: 39.05320167541504 ms
Elapsed: 38.82408142089844 ms
Elapsed: 38.47217559814453 ms
Elapsed: 38.024187088012695 ms
Elapsed: 38.07687759399414 ms
Elapsed: 38.11931610107422 ms
Elapsed: 37.9488468170166 ms
Elapsed: 38.04421424865723 ms
Elapsed: 38.57421875 ms
Elapsed: 39.543867111206055 ms
Elapsed: 38.4981632232666 ms
Elapsed: 37.89806365966797 ms
Elapsed: 38.0861759185791 ms
Elapsed: 38.72990608215332 ms
Elapsed: 38.47217559814453 ms
Elapsed: 38.71774673461914 ms
Elapsed: 38.27619552612305 ms
Elapsed: 38.08403015136719 ms
Elapsed: 38.6350154876709 ms
Elapsed: 38.03229331970215 ms
Elapsed: 39.01100158691406 ms
Elapsed: 38.4981632232666 ms
Elapsed: 38.25807571411133 ms
Elapsed: 38.59400749206543 ms
Elapsed: 38.83624076843262 ms
Elapsed: 38.584232330322266 ms
Elapsed: 39.54625129699707 ms
Elapsed: 38.268089294433594 ms
Elapsed: 39.3218994140625 ms
Elapsed: 37.9948616027832 ms
Elapsed: 38.05804252624512 ms
Elapsed: 38.88821601867676 ms
Elapsed: 38.08021545410156 ms
Elapsed: 38.22588920593262 ms
Elapsed: 37.97507286071777 ms
Elapsed: 38.03110122680664 ms
Elapsed: 37.91308403015137 ms
Elapsed: 38.00201416015625 ms
Elapsed: 38.529157638549805 ms
Elapsed: 38.44308853149414 ms
Elapsed: 38.87534141540527 ms
Elapsed: 38.85912895202637 ms
Elapsed: 38.48695755004883 ms
Elapsed: 38.41686248779297 ms
Elapsed: 38.10882568359375 ms
Elapsed: 37.98198699951172 ms
Elapsed: 38.50507736206055 ms
Elapsed: 38.16986083984375 ms
Elapsed: 38.07711601257324 ms
Elapsed: 37.92715072631836 ms
Elapsed: 37.93692588806152 ms
Elapsed: 38.04588317871094 ms
Elapsed: 38.62190246582031 ms
Elapsed: 38.5129451751709 ms
Elapsed: 37.960052490234375 ms
Elapsed: 37.99295425415039 ms
Elapsed: 38.45930099487305 ms
```

Results with this change:

```
Elapsed: 21.09503746032715 ms
Elapsed: 17.00878143310547 ms
Elapsed: 17.43626594543457 ms
Elapsed: 16.201019287109375 ms
Elapsed: 16.66712760925293 ms
Elapsed: 15.926837921142578 ms
Elapsed: 16.408205032348633 ms
Elapsed: 16.13783836364746 ms
Elapsed: 16.27206802368164 ms
Elapsed: 17.15087890625 ms
Elapsed: 16.06607437133789 ms
Elapsed: 16.852855682373047 ms
Elapsed: 23.713111877441406 ms
Elapsed: 17.20905303955078 ms
Elapsed: 16.60609245300293 ms
Elapsed: 16.30997657775879 ms
Elapsed: 15.933990478515625 ms
Elapsed: 15.688180923461914 ms
Elapsed: 16.228914260864258 ms
Elapsed: 16.252994537353516 ms
Elapsed: 16.33000373840332 ms
Elapsed: 15.842676162719727 ms
Elapsed: 16.328096389770508 ms
Elapsed: 16.4949893951416 ms
Elapsed: 16.47210121154785 ms
Elapsed: 16.674041748046875 ms
Elapsed: 15.768766403198242 ms
Elapsed: 16.48569107055664 ms
Elapsed: 15.876054763793945 ms
Elapsed: 16.852140426635742 ms
Elapsed: 16.035079956054688 ms
Elapsed: 16.407012939453125 ms
Elapsed: 15.882015228271484 ms
Elapsed: 16.71886444091797 ms
Elapsed: 15.86294174194336 ms
Elapsed: 16.566038131713867 ms
Elapsed: 15.904903411865234 ms
Elapsed: 16.289234161376953 ms
Elapsed: 16.14999771118164 ms
Elapsed: 16.31784439086914 ms
Elapsed: 16.106843948364258 ms
Elapsed: 16.581058502197266 ms
Elapsed: 16.435861587524414 ms
Elapsed: 15.904903411865234 ms
Elapsed: 16.408205032348633 ms
Elapsed: 16.062021255493164 ms
Elapsed: 16.256093978881836 ms
Elapsed: 15.87367057800293 ms
Elapsed: 16.23702049255371 ms
Elapsed: 16.745805740356445 ms
Elapsed: 15.92707633972168 ms
Elapsed: 16.142845153808594 ms
Elapsed: 16.492843627929688 ms
Elapsed: 21.553754806518555 ms
Elapsed: 17.05002784729004 ms
Elapsed: 16.932964324951172 ms
Elapsed: 16.810894012451172 ms
Elapsed: 16.577720642089844 ms
Elapsed: 15.714168548583984 ms
Elapsed: 16.2351131439209 ms
Elapsed: 16.072988510131836 ms
Elapsed: 16.038894653320312 ms
Elapsed: 16.055822372436523 ms
Elapsed: 16.378164291381836 ms
Elapsed: 15.806913375854492 ms
Elapsed: 15.5792236328125 ms
Elapsed: 15.954732894897461 ms
Elapsed: 15.566825866699219 ms
Elapsed: 15.707969665527344 ms
Elapsed: 15.514135360717773 ms
Elapsed: 15.43116569519043 ms
Elapsed: 15.332937240600586 ms
Elapsed: 15.470027923583984 ms
Elapsed: 15.269756317138672 ms
Elapsed: 15.250921249389648 ms
Elapsed: 15.47694206237793 ms
Elapsed: 15.306949615478516 ms
Elapsed: 15.72728157043457 ms
Elapsed: 15.938043594360352 ms
Elapsed: 16.324996948242188 ms
Elapsed: 16.198158264160156 ms
Elapsed: 15.982627868652344 ms
Elapsed: 16.308069229125977 ms
Elapsed: 17.843246459960938 ms
Elapsed: 15.820026397705078 ms
Elapsed: 16.428232192993164 ms
Elapsed: 15.978097915649414 ms
Elapsed: 16.347885131835938 ms
Elapsed: 16.026020050048828 ms
Elapsed: 16.362905502319336 ms
Elapsed: 16.900062561035156 ms
Elapsed: 17.3337459564209 ms
Elapsed: 17.65608787536621 ms
Elapsed: 20.101070404052734 ms
Elapsed: 18.137216567993164 ms
Elapsed: 16.952991485595703 ms
Elapsed: 16.7691707611084 ms
Elapsed: 16.71290397644043 ms
Elapsed: 16.3421630859375 ms
Elapsed: 16.36195182800293 ms
```
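The decoder collection described in this change (resolve a decoder per column once, store the function references in a list, then index into that list per cell) can be illustrated with a minimal sketch. The names `_DECODERS`, `get_decoders` and `decode_rows`, and the use of plain type-name strings, are assumptions made for illustration; they are not the actual python-spanner internals.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence

Decoder = Callable[[Any], Any]

# Hypothetical mapping from a column type name to a decoder function.
# A real client supports many more types; this is only an illustration.
_DECODERS: Dict[str, Decoder] = {
    "INT64": int,      # assumed to arrive as a string, e.g. "42"
    "FLOAT64": float,
    "STRING": str,
    "BOOL": bool,      # assumed to arrive as a real bool already
}


def get_decoders(column_types: Sequence[str]) -> List[Decoder]:
    """Resolve one decoder per column, once for the whole result set."""
    return [_DECODERS[type_name] for type_name in column_types]


def decode_rows(
    column_types: Sequence[str],
    raw_rows: Sequence[Sequence[Optional[Any]]],
) -> List[List[Any]]:
    """Decode all rows using the pre-resolved decoder list."""
    decoders = get_decoders(column_types)  # type dispatch happens here only
    return [
        # Per cell: a list index and a function call, no if-elif chain.
        [None if cell is None else decoders[i](cell) for i, cell in enumerate(row)]
        for row in raw_rows
    ]


rows = decode_rows(
    ["INT64", "STRING", "FLOAT64"],
    [["1", "a", "1.5"], ["2", None, "2.5"]],
)
print(rows)  # [[1, 'a', 1.5], [2, None, 2.5]]
```

The type dispatch cost is paid once per column instead of once per cell, which is why the saving grows with the number of rows.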
olavloite force-pushed the optimize-result-set-decoding branch from 2591dbd to cd3b204 on November 28, 2024 14:51.
harshachinta approved these changes on Dec 2, 2024.
Optimize ResultSet Decoding
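ResultSet decoding went through a long if-elif-else construct for every row and every column to determine how to decode that specific cell. This made decoding large result sets significantly slower than necessary, because the decoder for a column only needs to be determined once for the entire ResultSet.
This change therefore collects the decoders once before starting to decode any rows. It does this by:
1. Iterating over the columns in the ResultSet and getting a decoder for the specific type of each column.
2. Storing those decoders as function references in an array.
3. Picking the appropriate function directly from this array each time a column needs to be decoded.
Selecting and decoding a query result with 100 rows consisting of 24 columns (one for each supported data type) takes ~35-40ms without this change, and ~15-20ms with this change. The following benchmarks were executed locally against an in-memory mock Spanner server running in Java. The latter was chosen because:
1. We have a random ResultSet generator in Java that can be used for this.
2. Having the mock Spanner server running in a separate process and in another programming language reduces the chance that the mock server itself has an impact on the differences that we see between the different runs.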
Lazy Decoding
This change also introduces an option to lazily decode result sets. This option can be used if the application wants to decode some or all of the data manually, or to access the raw underlying data. It can significantly reduce the decode time for large result sets, especially if the results contain structured data (e.g. JSON) and the application only wants to access the underlying strings.
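The idea behind lazy decoding can be sketched as follows. This is a self-contained illustration, not the python-spanner API: the class `LazyRows`, its `lazy_decode` flag, and the `decode_row` / `decode_column` helpers are hypothetical names chosen for this example.

```python
import json
from typing import Any, Callable, Iterator, List, Sequence

Decoder = Callable[[Any], Any]


class LazyRows:
    """Hypothetical row iterator that can skip decoding (illustration only)."""

    def __init__(
        self,
        decoders: Sequence[Decoder],
        raw_rows: Sequence[Sequence[Any]],
        lazy_decode: bool = False,
    ) -> None:
        self._decoders = list(decoders)
        self._raw_rows = raw_rows
        self._lazy_decode = lazy_decode

    def __iter__(self) -> Iterator[Sequence[Any]]:
        for row in self._raw_rows:
            # With lazy decoding the raw row is handed out untouched.
            yield row if self._lazy_decode else self.decode_row(row)

    def decode_row(self, row: Sequence[Any]) -> List[Any]:
        """Decode every column of a raw row on demand."""
        return [self.decode_column(row, i) for i in range(len(row))]

    def decode_column(self, row: Sequence[Any], index: int) -> Any:
        """Decode a single column of a raw row on demand."""
        cell = row[index]
        return None if cell is None else self._decoders[index](cell)


raw = [["1", '{"a": 1}'], ["2", '{"a": 2}']]
lazy = LazyRows([int, json.loads], raw, lazy_decode=True)

first = next(iter(lazy))
print(first[1])                      # raw JSON string, json.loads not called
print(lazy.decode_column(first, 0))  # only the first column is decoded
```

In this sketch the application that only needs the JSON text never pays for `json.loads`, while it can still decode individual columns or whole rows when required.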
Benchmarks
Results without this change (100 iterations):
Results with this change:
Benchmarking Lazy Decode
Reading the same result set with lazy decoding enabled (and no further decoding of the data):