chore(query): turn on new agg hashtable #15155
Docker Image for PR
Performance test on ClickBench/hits
Note: We use
Query | Local singleton: main | Local singleton: pr | Local singleton: improve | Local cluster two nodes: main | Local cluster two nodes: pr | Local cluster two nodes: improve | Local cluster three nodes: main | Local cluster three nodes: pr | Local cluster three nodes: improve | Cloud small: main | Cloud small: pr | Cloud small: improve | Cloud medium: main | Cloud medium: pr | Cloud medium: improve
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
Q1 | 0.006s | 0.006s | 0% | 0.008s | 0.008s | 0% | 0.012s | 0.012s | 0% | 0.033s | 0.032s | 3% | 0.025s | 0.024s | 4%
Q2 | 0.059s | 0.059s | 0% | 0.079s | 0.078s | 1% | 0.084s | 0.083s | 1% | 0.168s | 0.106s | 37% | 0.109s | 0.079s | 28%
Q3 | 0.17s | 0.161s | 5% | 0.175s | 0.182s | -4% | 0.189s | 0.186s | 2% | 0.139s | 0.117s | 16% | 0.172s | 0.115s | 33%
Q4 | 0.191s | 0.185s | 3% | 0.199s | 0.208s | -5% | 0.21s | 0.206s | 2% | 0.338s | 0.283s | 16% | 0.345s | 0.318s | 8%
Q5 | 0.544s | 0.542s | 0% | 0.885s | 0.811s | 8% | 1.075s | 1.049s | 2% | 0.423s | 0.419s | 1% | 0.571s | 0.513s | 10%
Q6 | 0.932s | 0.858s | 8% | 1.381s | 1.3s | 6% | 1.714s | 1.303s | 24% | 0.666s | 0.592s | 11% | 0.75s | 0.631s | 16%
Q7 | 0.049s | 0.049s | 0% | 0.07s | 0.07s | 0% | 0.067s | 0.067s | 0% | 0.092s | 0.092s | 0% | 0.099s | 0.099s | 0%
Q8 | 0.061s | 0.061s | 0% | 0.081s | 0.088s | -9% | 0.097s | 0.102s | -5% | 0.089s | 0.082s | 8% | 0.111s | 0.086s | 23%
Q9 | 0.862s | 0.872s | -1% | 1.11s | 1.102s | 1% | 1.336s | 1.207s | 10% | 0.604s | 0.673s | -11% | 0.782s | 0.707s | 10%
Q10 | 1.023s | 1.044s | -1% | 1.323s | 1.274s | 4% | 1.618s | 1.376s | 15% | 0.73s | 0.802s | -10% | 0.938s | 0.717s | 24%
Q11 | 0.415s | 0.413s | 0% | 0.488s | 0.48s | 2% | 0.557s | 0.504s | 10% | 0.424s | 0.398s | 6% | 0.472s | 0.435s | 8%
Q12 | 0.465s | 0.461s | 1% | 0.54s | 0.528s | 2% | 0.613s | 0.572s | 7% | 0.402s | 0.38s | 5% | 0.441s | 0.405s | 8%
Q13 | 1.081s | 0.881s | 19% | 1.652s | 1.416s | 14% | 1.995s | 1.417s | 29% | 0.722s | 0.533s | 26% | 0.838s | 0.686s | 18%
Q14 | 1.837s | 1.424s | 22% | 2.514s | 2.041s | 19% | 2.909s | 2.122s | 27% | 1.158s | 0.864s | 25% | 1.022s | 0.934s | 9%
Q15 | 1.279s | 0.983s | 23% | 1.876s | 1.536s | 18% | 2.27s | 1.563s | 31% | 0.817s | 0.603s | 26% | 0.885s | 0.711s | 20%
Q16 | 0.927s | 0.696s | 25% | 1.501s | 1.162s | 23% | 1.737s | 1.393s | 20% | 0.563s | 0.461s | 18% | 0.752s | 0.699s | 7%
Q17 | 3.03s | 1.62s | 47% | 4.154s | 2.631s | 37% | 4.361s | 2.934s | 33% | 1.714s | 1.049s | 39% | 1.435s | 1.047s | 27%
Q18 | 1.663s | 0.969s | 42% | 1.757s | 1.2s | 32% | 1.6s | 1.313s | 18% | 1.084s | 0.777s | 28% | 0.542s | 0.519s | 4%
Q19 | 6.223s | 2.737s | 56% | 8.269s | 4.699s | 43% | 9.316s | 5.06s | 46% | 3.326s | 1.872s | 44% | 2.515s | 1.642s | 35%
Q20 | 0.007s | 0.007s | 0% | 0.04s | 0.04s | 0% | 0.016s | 0.016s | 0% | 0.047s | 0.047s | 0% | 0.06s | 0.06s | 0%
Q21 | 2.497s | 2.444s | 2% | 2.614s | 2.566s | 2% | 2.682s | 2.669s | 0% | 1.515s | 1.512s | 0% | 1.109s | 0.813s | 27%
Q22 | 2.819s | 2.795s | 1% | 2.958s | 2.941s | 1% | 3.082s | 3.082s | 0% | 1.821s | 1.82s | 0% | 1.597s | 0.998s | 38%
Q23 | 5.848s | 5.808s | 1% | 6.087s | 6.007s | 1% | 6.488s | 6.269s | 3% | 4.036s | 3.95s | 2% | 2.915s | 2.161s | 26%
Q24 | 3.412s | 3.398s | 1% | 3.223s | 3.204s | 1% | 3.441s | 3.392s | 1% | 2.551s | 2.566s | -1% | 2.693s | 2.524s | 6%
Q25 | 0.718s | 0.721s | -4% | 0.738s | 0.734s | 1% | 0.783s | 0.783s | 0% | 0.755s | 0.726s | 4% | 0.79s | 0.447s | 43%
Q26 | 0.53s | 0.536s | -1% | 0.515s | 0.511s | 1% | 0.564s | 0.547s | 3% | 0.388s | 0.38s | 2% | 0.388s | 0.336s | 13%
Q27 | 0.769s | 0.774s | -1% | 0.775s | 0.768s | 1% | 0.825s | 0.804s | 3% | 0.748s | 0.743s | 1% | 0.8s | 0.462s | 42%
Q28 | 3.177s | 3.15s | 1% | 3.187s | 3.186s | 0% | 3.263s | 3.256s | 0% | 1.623s | 1.652s | -2% | 1.183s | 0.949s | 20%
Q29 | 4.359s | 4.394s | -1% | 4.730s | 4.701s | 1% | 5.426s | 4.714s | 13% | 3.074s | 3.033s | 1% | 2.026s | 1.833s | 10%
Q30 | 0.137s | 0.137s | 0% | 0.14s | 0.14s | 0% | 0.147s | 0.147s | 0% | 0.169s | 0.169s | 0% | 0.168s | 0.168s | 0%
Q31 | 1.097s | 0.993s | 9% | 1.519s | 1.413s | 7% | 1.823s | 1.386s | 24% | 0.843s | 0.74s | 12% | 0.882s | 0.692s | 22%
Q32 | 1.72s | 1.296s | 25% | 2.259s | 1.946s | 14% | 2.586s | 1.897s | 27% | 1.331s | 1.099s | 17% | 1.349s | 1.181s | 12%
Q33 | 9.633s | 3.772s | 61% | 13.7s | 7.863s | 43% | 19.029s | 8.305s | 56% | 4.258s | 2.322s | 45% | 3.62s | 2.352s | 35%
Q34 | 5.485s | 5.125s | 7% | 8.278s | 7.39s | 11% | 8.982s | 8.419s | 6% | 2.96s | 2.412s | 19% | 2.936s | 2.698s | 8%
Q35 | 5.45s | 5.106s | 6% | 8.361s | 7.348s | 12% | 8.968s | 8.478s | 5% | 2.963s | 2.483s | 16% | 2.996s | 2.682s | 10%
Q36 | 0.738s | 0.513s | 30% | 1.104s | 0.908s | 18% | 1.406s | 1.07s | 24% | 0.491s | 0.41s | 16% | 0.721s | 0.704s | 2%
Q37 | 0.147s | 0.138s | 6% | 0.244s | 0.17s | 30% | 0.313s | 0.192s | 39% | 0.167s | 0.115s | 31% | 0.242s | 0.184s | 24%
Q38 | 0.136s | 0.136s | 0% | 0.159s | 0.159s | 0% | 0.167s | 0.181s | -8% | 0.124s | 0.121s | 2% | 0.131s | 0.131s | 0%
Q39 | 0.103s | 0.107s | -4% | 0.12s | 0.121s | -1% | 0.124s | 0.126s | -1% | 0.105s | 0.104s | 1% | 0.118s | 0.117s | 1%
Q40 | 0.281s | 0.254s | 10% | 0.432s | 0.33s | 24% | 0.493s | 0.33s | 33% | 0.236s | 0.203s | 14% | 0.326s | 0.224s | 31%
Q41 | 0.04s | 0.039s | 3% | 0.08s | 0.062s | 23% | 0.068s | 0.066s | 3% | 0.086s | 0.083s | 3% | 0.097s | 0.075s | 23%
Q42 | 0.045s | 0.037s | 18% | 0.058s | 0.059s | -2% | 0.063s | 0.064s | -1% | 0.08s | 0.073s | 9% | 0.083s | 0.069s | 17%
Q43 | 0.032s | 0.032s | 0% | 0.051s | 0.054s | -6% | 0.06s | 0.052s | 17% | 0.075s | 0.072s | 4% | 0.079s | 0.068s | 14%
There are explain tests that need to be fixed.
We use random and fixed SQL queries to cover the correctness tests. The test code is at https://github.com/sundy-li/mmbend/tree/master/examples
Great 👍 Wizard [SELECTS] tests passed, with results compared to Snowflake. SQL selects script here: https://github.com/datafuselabs/wizard/blob/main/checksb/sql/selects/check.sql How to run the tests:
May I know whether the tests were carried out under a memory limit?
There is no major difference if blocks are spilled, because the bottleneck is then I/O.
Got it, thanks.
I disagree with that. The older one is the general-purpose hashtable you are referring to.
Oh, OK, I meant without the I/O effect, since most aggregation workloads are memory-bound. Also, the implementation above seems to need indirect access? Maybe I missed something, but memory bandwidth is important for high-cardinality keys.
Indirect access here means that we first compare only the high 16 bits of the hash. If they match, we proceed to compare the actual data. A general-purpose hashtable compares the entire hash value before proceeding to compare the actual data.
OK, got it, thanks. A general-purpose hashtable compares hash values only if it stores the hash; otherwise the hash is used just for probing. Either way, the memory layout is more cache-friendly.
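For readers following the exchange above, here is a minimal sketch of the "compare the high 16 bits first" idea. It is not the code from this PR: the `SaltedTable` type, the 48-bit index packing, the `EMPTY` marker, and linear probing are all assumptions chosen for illustration.

```rust
// Hypothetical layout: each slot packs the high 16 bits of the hash (a "salt")
// with a 48-bit index into a separate key store.
const SALT_SHIFT: u32 = 48;
const INDEX_MASK: u64 = (1u64 << SALT_SHIFT) - 1;
// Assumed empty marker; a real index never reaches 2^48 - 1 in this sketch.
const EMPTY: u64 = u64::MAX;

struct SaltedTable {
    slots: Vec<u64>,   // (salt << 48) | index, or EMPTY
    keys: Vec<String>, // actual key data, stored out of line
}

impl SaltedTable {
    fn new(capacity: usize) -> Self {
        assert!(capacity.is_power_of_two());
        Self { slots: vec![EMPTY; capacity], keys: Vec::new() }
    }

    /// Returns the index of `key` in `keys`, inserting it if absent.
    fn probe_or_insert(&mut self, key: &str, hash: u64) -> usize {
        // No resizing in this sketch, so keep at least one slot free.
        assert!(self.keys.len() < self.slots.len(), "table full");
        let mask = self.slots.len() - 1;
        let salt = hash >> SALT_SHIFT;
        let mut pos = (hash as usize) & mask;
        loop {
            let slot = self.slots[pos];
            if slot == EMPTY {
                let index = self.keys.len();
                self.keys.push(key.to_string());
                self.slots[pos] = (salt << SALT_SHIFT) | index as u64;
                return index;
            }
            // Cheap check: compare only the 16-bit salt held in the slot.
            if slot >> SALT_SHIFT == salt {
                // Salt matched: follow the index and compare the actual data.
                let index = (slot & INDEX_MASK) as usize;
                if self.keys[index] == key {
                    return index;
                }
            }
            pos = (pos + 1) & mask; // linear probing
        }
    }
}
```

In this sketch a salt mismatch rejects a slot without touching the key store, so the out-of-line key is only read when the 16-bit check passes. Two different keys can still share a salt, which is why the full key comparison remains.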
* fix hits-q18-perf
* turn on new agg hashtable
* rewrite sqllogical test
* fix sqllogical test
---------
Co-authored-by: jw <freejw@gmail.com>
(cherry picked from commit 7e9b835)
I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/
Summary
In the previous PR, we implemented a new aggregation hash table. It now supports both singleton and cluster environments, and we have also added support for spilling. This PR includes performance tests and tries to enable the new aggregation hash table.
Tests
Type of change