Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(bitmap): change the buffer unit from u8 to usize #7030

Merged
merged 5 commits into from
Dec 23, 2022
Merged

Conversation

wangrunji0408
Copy link
Contributor

I hereby agree to the terms of the Singularity Data, Inc. Contributor License Agreement.

What's changed and what's your intention?

This PR changes the underlying buffer of Bitmap from Vec<u8> to Vec<usize>.
Processing in the unit of machine word rather than bytes makes it more efficient.

The bench results in bitmap of size 1024:

bench Vec<u8> Vec<usize> change
new_zeros 22.748 ns 18.920 ns -17.121%
new_ones 24.573 ns 22.103 ns -10.967%
get 344.78 ps 313.85 ps -9.2259%
and 105.73 ns 26.811 ns -74.634%
or 105.75 ns 26.709 ns -74.742%
not 30.109 ns 23.285 ns -22.672%

This PR also changes some function names and adds some docs.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • All checks passed in ./risedev check (or alias, ./risedev c)

Refer to a related PR or issue link (optional)

#6868

Signed-off-by: Runji Wang <wangrunji0408@163.com>
Signed-off-by: Runji Wang <wangrunji0408@163.com>
Signed-off-by: Runji Wang <wangrunji0408@163.com>
Signed-off-by: Runji Wang <wangrunji0408@163.com>
Copy link
Member

@BugenZhao BugenZhao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. 🚀🚀🚀

@codecov
Copy link

codecov bot commented Dec 23, 2022

Codecov Report

Merging #7030 (418d885) into main (3e3e4d2) will increase coverage by 0.01%.
The diff coverage is 95.95%.

@@            Coverage Diff             @@
##             main    #7030      +/-   ##
==========================================
+ Coverage   72.87%   72.89%   +0.01%     
==========================================
  Files        1040     1039       -1     
  Lines      167141   167082      -59     
==========================================
- Hits       121810   121798      -12     
+ Misses      45331    45284      -47     
Flag Coverage Δ
rust 72.89% <95.95%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/common/src/util/mod.rs 0.00% <ø> (ø)
src/frontend/src/scheduler/plan_fragmenter.rs 69.35% <0.00%> (ø)
src/storage/src/table/batch_table/storage_table.rs 85.79% <0.00%> (ø)
src/stream/src/common/table/state_table.rs 84.10% <0.00%> (ø)
src/stream/src/executor/dynamic_filter.rs 93.42% <ø> (ø)
src/stream/src/executor/watermark_filter.rs 90.87% <ø> (ø)
...tream/src/executor/managed_state/dynamic_filter.rs 90.96% <75.00%> (ø)
src/common/src/buffer/bitmap.rs 96.78% <97.76%> (+0.59%) ⬆️
src/common/src/array/bool_array.rs 90.40% <100.00%> (ø)
src/common/src/array/data_chunk.rs 97.30% <100.00%> (ø)
... and 24 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@lmatz lmatz merged commit 0cf49e3 into main Dec 23, 2022
@lmatz lmatz deleted the wrj/bitmap-usize branch December 23, 2022 06:40
lmatz pushed a commit that referenced this pull request Jan 3, 2023
* bitmap: use pointer-sized element for underlying buffer

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* use `Box<[usize]>` to save 8 bytes

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* rename functions and add docs

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* add bench for bitmap

Signed-off-by: Runji Wang <wangrunji0408@163.com>

Signed-off-by: Runji Wang <wangrunji0408@163.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
lmatz added a commit that referenced this pull request Jan 3, 2023
* fix: clean the verbose logs of "failed to send message to actor" (#6973)

As title.

```
// println!("{:?}", &chunk);
StreamChunk { cardinality: 4, capacity: 4, .. }

// println!("{:#?}", &chunk);
StreamChunk { cardinality: 4, capacity: 4, data:
+----+---+---+
|  + | 1 | 6 |
|  - | 2 |   |
| U- | 3 | 7 |
| U+ | 4 |   |
+----+---+---+
}
```


Approved-By: BugenZhao

Co-Authored-By: Eric Fu <eric@singularity-data.com>

* feat(stream): Make scale DAG aware (#7013)

**This section will be used as the commit message. Please do not leave this empty!**

Please explain **IN DETAIL** what the changes are in this PR and why they are needed:

- Let `Reschedule` accept more than one downstream fragment.
- Let the dispatcher id calculated from the exchange operator id together with the upstream and downstream id.


Approved-By: BugenZhao

Co-Authored-By: Dylan Chen <zilin@singularity-data.com>

* chore: log AST in sqlparser (#7012)

Log AST in sqlparser for better debugging. Also make it easier to configure logs.


Approved-By: lmatz

Co-Authored-By: xxchan <xxchan22f@gmail.com>

* feat(streaming): do not backfill for empty table (#7009)

If the snapshot is empty, we don't need to backfill and can immediately finish the progress. This can speed up some tests.

```
dev=> create materialized view mv2 as select * from t;
CREATE_MATERIALIZED_VIEW
Time: 1033.834 ms (00:01.034)

dev=> delete from t;
DELETE 1
Time: 9.869 ms

dev=> create materialized view mv3 as select * from t;
CREATE_MATERIALIZED_VIEW
Time: 18.550 ms
```

Note that every executor requires a barrier for the first message. So if there are few records in the table (but not empty), we cannot adapt this optimization. The further plan might be to issue next checkpoints more frequently for this case.


Approved-By: chenzl25

Co-Authored-By: Bugen Zhao <i@bugenzhao.com>

* refactor(logging): be aware of RUST_LOG env (#7016)

This PR supports overwriting log filters predefined in `init_risingwave_logger`, by specifying RUST_LOG environment variable.
One use case is I want to suppress certain logs in CI, to avoid large log size.


Approved-By: BugenZhao
Approved-By: xxchan

Co-Authored-By: zwang28 <84491488@qq.com>

* perf(bitmap): change the buffer unit from `u8` to `usize` (#7030)

* bitmap: use pointer-sized element for underlying buffer

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* use `Box<[usize]>` to save 8 bytes

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* rename functions and add docs

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* add bench for bitmap

Signed-off-by: Runji Wang <wangrunji0408@163.com>

Signed-off-by: Runji Wang <wangrunji0408@163.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* refactor(optimizer): rename optimizer rule filename. (#7038)

rename optimizer rule

* chore(connector): fix log level (#7043)

`info -> debug` otherwise print every record.

Approved-By: tabVersion

* fix: check sink properties and validate them in advance to avoid panic or recovery (#7041)

Check whether sink properties provided in frontend and simply validate them when building executors to avoid panic or recovery.

Approved-By: BugenZhao

* fix(batch): match `probe_row` once for `SemiJoin` with `non_equi` predicates (#7033)

### Problem

- For a probe key, after chunk is spilled, we may continue appending rows for processing.
- This happens even though the probe key already has found matching build row.
- These rows should be discarded instead / not appended.
- As a result, when processing the next spilled chunk, they are also included inside, and finally returned in the results as duplicate rows.
- As you can see in the [expected results before fix](https://github.com/risingwavelabs/risingwave/pull/7033/commits/ec6eac5fbac700ed9c03662592a30886a198d248), we have 5 probe rows, but results return 8 output.

### Solution

- if probe row match, break and do matching for next probe row. Then we won't process duplicate matches for matched probe rows.
- If probe row not match, continue appending. The appended rows will be contained in buffered chunk and processed later.

Approved-By: chenzl25

* perf(expr): complete expression benchmark framework (#6995)

This PR completes the micro-benchmark framework for expressions.

It can be run with:
```sh
cd src/expr
# run all benches
cargo bench --bench expr -- --quick
# list all benches
cargo bench --bench expr -- --list
# run specified benches
cargo bench --bench expr -- --quick "add\(int32,int32\)"
```

The detailed bench results have been updated to #6868.
Here is a statistical overview of all expressions:

<img width="470" alt="截屏2022-12-20 21 13 24" src="https://user-images.githubusercontent.com/15158738/208812177-529f62aa-4590-4dbc-b93b-9d8f1151418c.png">

To enumerate all valid expressions, we utilize the function signature maps defined in the frontend. We moved them into the expr crate in order to avoid dependency on the frontend thus reducing the compilation time.

Approved-By: lmatz

* perf(expr): prebuild the AC for `to_char` (#7048)

As title, put it on lazy-static to avoid building every time.

bench | Before time(us) | After time(us) | Change(%)
-- | -- | -- | --
to_char(timestamp,varchar) | 11060.000 | 283.900 | -97.4%

Approved-By: TennyZhuang

* fix(alloc): missing padding in realloc (#7046)

as title

Approved-By: TennyZhuang

* fix(parser): fix parsing nested wildcard struct field access (#7024)

Fix #7011: nested wildcard struct field access panics when there are additional parentheses.

Also did some minor style refactoring and added some comments.

Approved-By: st1page
Approved-By: yezizp2012

* fix: fix parser test of wildcard struct field with additional parentheses (#7052)

Fix parser test of wildcard struct field with additional parentheses: https://buildkite.com/risingwavelabs/main-cron/builds/282#01854176-ffb0-4018-aa7b-fe128a411a41

Approved-By: xxchan

* feat(optimizer): support union merge rule (#7037)

**This section will be used as the commit message. Please do not leave this empty!**

Please explain **IN DETAIL** what the changes are in this PR and why they are needed:

- Merge all binary unions into multi inputs union.

Approved-By: st1page

* fix: fix period non-zero panic in election (#7065)

Fix #7061

Approved-By: lmatz

* perf(expr): optimize lower/upper/trim/md5 (#7047)

This PR optimizes `lower`/`upper`/`trim`/`md5` operations by avoiding generating String.

bench | Before time(us) | After time(us) | Change(%)
-- | -- | -- | --
md5(varchar) | 447.040 | 338.540 | -24.3%
ltrim(varchar,varchar) | 38.893 | 20.666 | -46.9%
rtrim(varchar,varchar) | 38.870 | 22.311 | -42.6%
trim(varchar,varchar) | 38.327 | 20.831 | -45.6%
lower(varchar) | 36.172 | 10.851 | -70.0%
upper(varchar) | 34.607 | 10.946 | -68.4%

Approved-By: lmatz

* perf(expr): reduce format parsing in to_char (#7051)

Reduce duplicated format parsing during evaluation of `to_char`.

The perf improve by about 20% with a constant format string "YYYY/MM/DD HH24:MI:SS", but downgraded with the current benchmark framework due to #7050 .

Approved-By: lmatz
Approved-By: wangrunji0408

* perf(expr): cover the fast path for `to_char(timestamp)` (#7056)

This PR changes the input of bench `to_char(timestamp,varchar)`.
It makes the second argument a constant format string to cover the fast path of evaluation.

```
to_char(timestamp,varchar)
time:   [245.77 µs 245.80 µs 245.89 µs]
change: [-70.241% -70.126% -70.010%] (p = 0.07 > 0.05)
```

Approved-By: lmatz

* perf(expr): vectorize infallible operations (10x speedup) (#7055)

This PR vectorizes the following infallible operations:
- `and/or/not`
- `bitwise_{and/or/xor/not}`
- `is[_not]_{true/false}`
- `eq/ne/gt/lt/ge/le`
- `is[_not]_distinct_from`
- `round/ceil/floor`

making them 10x faster on average, and up to 50x speed up.

The distribution curve of all operation times before and after this PR:
<img width="491" alt="perf-stat" src="https://user-images.githubusercontent.com/15158738/209499628-1631c779-b6d5-49be-b842-8dcf89a23e1a.png">

<details>
<summary>Click to show full results</summary>

bench | Before time(us) | After time(us) | Change(%) | Speedup
-- | -- | -- | -- | --
and(boolean,boolean) | 11.110 | 0.417 | -96.2% | 25.6
bitwise_and(int16,int16) | 5.284 | 0.142 | -97.3% | 36.2
bitwise_and(int16,int32) | 4.540 | 0.170 | -96.3% | 25.7
bitwise_and(int16,int64) | 4.535 | 0.257 | -94.3% | 16.7
bitwise_and(int32,int16) | 4.509 | 0.183 | -95.9% | 23.6
bitwise_and(int32,int32) | 4.495 | 0.185 | -95.9% | 23.3
bitwise_and(int32,int64) | 4.549 | 0.232 | -94.9% | 18.6
bitwise_and(int64,int16) | 4.489 | 0.241 | -94.6% | 17.6
bitwise_and(int64,int32) | 4.500 | 0.254 | -94.4% | 16.7
bitwise_and(int64,int64) | 4.508 | 0.268 | -94.0% | 15.8
bitwise_not(int16) | 4.363 | 0.116 | -97.3% | 36.5
bitwise_not(int32) | 4.353 | 0.164 | -96.2% | 25.6
bitwise_not(int64) | 4.358 | 0.219 | -95.0% | 18.9
bitwise_or(int16,int16) | 5.278 | 0.142 | -97.3% | 36.3
bitwise_or(int16,int32) | 4.603 | 0.171 | -96.3% | 25.9
bitwise_or(int16,int64) | 4.556 | 0.242 | -94.7% | 17.8
bitwise_or(int32,int16) | 4.531 | 0.169 | -96.3% | 25.8
bitwise_or(int32,int32) | 4.495 | 0.182 | -95.9% | 23.7
bitwise_or(int32,int64) | 4.532 | 0.250 | -94.5% | 17.1
bitwise_or(int64,int16) | 4.493 | 0.237 | -94.7% | 17.9
bitwise_or(int64,int32) | 4.600 | 0.251 | -94.5% | 17.3
bitwise_or(int64,int64) | 4.495 | 0.254 | -94.3% | 16.7
bitwise_xor(int16,int16) | 5.292 | 0.137 | -97.4% | 37.6
bitwise_xor(int16,int32) | 4.561 | 0.172 | -96.2% | 25.5
bitwise_xor(int16,int64) | 4.526 | 0.238 | -94.7% | 18.0
bitwise_xor(int32,int16) | 4.486 | 0.179 | -96.0% | 24.0
bitwise_xor(int32,int32) | 4.508 | 0.200 | -95.6% | 21.6
bitwise_xor(int32,int64) | 4.574 | 0.266 | -94.2% | 16.2
bitwise_xor(int64,int16) | 4.596 | 0.254 | -94.5% | 17.1
bitwise_xor(int64,int32) | 4.502 | 0.292 | -93.5% | 14.4
bitwise_xor(int64,int64) | 4.502 | 0.276 | -93.9% | 15.3
ceil(decimal) | 7.564 | 3.507 | -53.6% | 1.2
ceil(float64) | 4.358 | 0.208 | -95.2% | 20.0
equal(boolean,boolean) | 7.444 | 0.136 | -98.2% | 53.5
equal(date,date) | 7.079 | 0.491 | -93.1% | 13.4
equal(date,timestamp) | 7.572 | 1.033 | -86.4% | 6.3
equal(decimal,decimal) | 12.715 | 8.304 | -34.7% | 0.5
equal(decimal,float32) | 12.879 | 5.474 | -57.5% | 1.4
equal(decimal,float64) | 12.945 | 5.349 | -58.7% | 1.4
equal(decimal,int16) | 11.359 | 4.574 | -59.7% | 1.5
equal(decimal,int32) | 11.384 | 4.472 | -60.7% | 1.5
equal(decimal,int64) | 11.410 | 4.615 | -59.6% | 1.5
equal(float32,decimal) | 12.972 | 5.566 | -57.1% | 1.3
equal(float32,float32) | 7.219 | 0.678 | -90.6% | 9.6
equal(float32,float64) | 7.337 | 0.853 | -88.4% | 7.6
equal(float32,int16) | 6.858 | 0.841 | -87.7% | 7.2
equal(float32,int32) | 6.826 | 0.712 | -89.6% | 8.6
equal(float32,int64) | 7.003 | 0.667 | -90.5% | 9.5
equal(float64,decimal) | 12.835 | 5.507 | -57.1% | 1.3
equal(float64,float32) | 7.271 | 0.869 | -88.0% | 7.4
equal(float64,float64) | 7.220 | 0.922 | -87.2% | 6.8
equal(float64,int16) | 6.772 | 0.878 | -87.0% | 6.7
equal(float64,int32) | 6.752 | 0.757 | -88.8% | 7.9
equal(float64,int64) | 6.724 | 0.723 | -89.2% | 8.3
equal(int16,decimal) | 11.437 | 4.623 | -59.6% | 1.5
equal(int16,float32) | 6.785 | 0.746 | -89.0% | 8.1
equal(int16,float64) | 6.767 | 0.709 | -89.5% | 8.6
equal(int16,int16) | 7.911 | 0.476 | -94.0% | 15.6
equal(int16,int32) | 7.087 | 0.494 | -93.0% | 13.4
equal(int16,int64) | 7.105 | 0.565 | -92.0% | 11.6
equal(int32,decimal) | 11.105 | 4.449 | -59.9% | 1.5
equal(int32,float32) | 6.749 | 0.641 | -90.5% | 9.5
equal(int32,float64) | 6.714 | 0.596 | -91.1% | 10.3
equal(int32,int16) | 7.086 | 0.492 | -93.1% | 13.4
equal(int32,int32) | 7.094 | 0.489 | -93.1% | 13.5
equal(int32,int64) | 7.118 | 0.546 | -92.3% | 12.0
equal(int64,decimal) | 11.331 | 4.622 | -59.2% | 1.5
equal(int64,float32) | 6.732 | 0.593 | -91.2% | 10.3
equal(int64,float64) | 6.808 | 0.567 | -91.7% | 11.0
equal(int64,int16) | 7.103 | 0.572 | -92.0% | 11.4
equal(int64,int32) | 7.089 | 0.550 | -92.2% | 11.9
equal(int64,int64) | 7.086 | 0.556 | -92.1% | 11.7
equal(interval,interval) | 8.289 | 2.381 | -71.3% | 2.5
equal(interval,time) | 7.969 | 1.874 | -76.5% | 3.3
equal(time,interval) | 7.669 | 2.039 | -73.4% | 2.8
equal(time,time) | 7.523 | 0.546 | -92.7% | 12.8
equal(timestamp,date) | 7.622 | 1.040 | -86.4% | 6.3
equal(timestamp,timestamp) | 8.041 | 1.159 | -85.6% | 5.9
equal(timestampz,timestampz) | 7.099 | 0.545 | -92.3% | 12.0
equal(varchar,varchar) | 12.215 | 12.335 | 1.0% | -0.0
floor(decimal) | 7.532 | 2.890 | -61.6% | 1.6
floor(float64) | 4.357 | 0.211 | -95.2% | 19.6
greater_than_or_equal(boolean,boolean) | 7.376 | 0.174 | -97.6% | 41.3
greater_than_or_equal(date,date) | 7.082 | 0.509 | -92.8% | 12.9
greater_than_or_equal(date,timestamp) | 7.970 | 2.087 | -73.8% | 2.8
greater_than_or_equal(decimal,decimal) | 12.622 | 8.342 | -33.9% | 0.5
greater_than_or_equal(decimal,float32) | 13.727 | 6.046 | -56.0% | 1.3
greater_than_or_equal(decimal,float64) | 13.906 | 5.957 | -57.2% | 1.3
greater_than_or_equal(decimal,int16) | 12.191 | 4.627 | -62.0% | 1.6
greater_than_or_equal(decimal,int32) | 12.040 | 4.524 | -62.4% | 1.7
greater_than_or_equal(decimal,int64) | 12.128 | 4.636 | -61.8% | 1.6
greater_than_or_equal(float32,decimal) | 13.212 | 6.129 | -53.6% | 1.2
greater_than_or_equal(float32,float32) | 7.594 | 0.697 | -90.8% | 9.9
greater_than_or_equal(float32,float64) | 7.631 | 0.874 | -88.6% | 7.7
greater_than_or_equal(float32,int16) | 7.751 | 1.160 | -85.0% | 5.7
greater_than_or_equal(float32,int32) | 7.620 | 1.021 | -86.6% | 6.5
greater_than_or_equal(float32,int64) | 7.632 | 0.984 | -87.1% | 6.8
greater_than_or_equal(float64,decimal) | 13.328 | 5.996 | -55.0% | 1.2
greater_than_or_equal(float64,float32) | 7.758 | 0.963 | -87.6% | 7.1
greater_than_or_equal(float64,float64) | 7.607 | 0.930 | -87.8% | 7.2
greater_than_or_equal(float64,int16) | 7.764 | 1.217 | -84.3% | 5.4
greater_than_or_equal(float64,int32) | 7.673 | 1.102 | -85.6% | 6.0
greater_than_or_equal(float64,int64) | 7.621 | 1.035 | -86.4% | 6.4
greater_than_or_equal(int16,decimal) | 11.404 | 4.960 | -56.5% | 1.3
greater_than_or_equal(int16,float32) | 6.930 | 0.832 | -88.0% | 7.3
greater_than_or_equal(int16,float64) | 6.948 | 0.860 | -87.6% | 7.1
greater_than_or_equal(int16,int16) | 7.886 | 0.458 | -94.2% | 16.2
greater_than_or_equal(int16,int32) | 7.125 | 0.491 | -93.1% | 13.5
greater_than_or_equal(int16,int64) | 7.102 | 0.569 | -92.0% | 11.5
greater_than_or_equal(int32,decimal) | 11.355 | 4.701 | -58.6% | 1.4
greater_than_or_equal(int32,float32) | 6.858 | 0.707 | -89.7% | 8.7
greater_than_or_equal(int32,float64) | 6.852 | 0.761 | -88.9% | 8.0
greater_than_or_equal(int32,int16) | 7.129 | 0.509 | -92.9% | 13.0
greater_than_or_equal(int32,int32) | 7.176 | 0.506 | -93.0% | 13.2
greater_than_or_equal(int32,int64) | 7.121 | 0.566 | -92.1% | 11.6
greater_than_or_equal(int64,decimal) | 11.442 | 4.946 | -56.8% | 1.3
greater_than_or_equal(int64,float32) | 6.861 | 0.692 | -89.9% | 8.9
greater_than_or_equal(int64,float64) | 6.861 | 0.728 | -89.4% | 8.4
greater_than_or_equal(int64,int16) | 7.163 | 0.578 | -91.9% | 11.4
greater_than_or_equal(int64,int32) | 7.094 | 0.563 | -92.1% | 11.6
greater_than_or_equal(int64,int64) | 7.131 | 0.560 | -92.1% | 11.7
greater_than_or_equal(interval,interval) | 8.229 | 2.400 | -70.8% | 2.4
greater_than_or_equal(interval,time) | 8.928 | 2.687 | -69.9% | 2.3
greater_than_or_equal(time,interval) | 7.858 | 2.796 | -64.4% | 1.8
greater_than_or_equal(time,time) | 7.745 | 0.669 | -91.4% | 10.6
greater_than_or_equal(timestamp,date) | 7.252 | 1.433 | -80.2% | 4.1
greater_than_or_equal(timestamp,timestamp) | 8.461 | 2.235 | -73.6% | 2.8
greater_than_or_equal(timestampz,timestampz) | 7.088 | 0.547 | -92.3% | 12.0
greater_than_or_equal(varchar,varchar) | 12.385 | 12.439 | 0.4% | -0.0
greater_than(boolean,boolean) | 6.948 | 0.175 | -97.5% | 38.6
greater_than(date,date) | 6.702 | 0.488 | -92.7% | 12.7
greater_than(date,timestamp) | 7.562 | 2.103 | -72.2% | 2.6
greater_than(decimal,decimal) | 11.878 | 8.252 | -30.5% | 0.4
greater_than(decimal,float32) | 12.946 | 6.008 | -53.6% | 1.2
greater_than(decimal,float64) | 12.897 | 6.000 | -53.5% | 1.1
greater_than(decimal,int16) | 12.214 | 4.701 | -61.5% | 1.6
greater_than(decimal,int32) | 12.067 | 4.527 | -62.5% | 1.7
greater_than(decimal,int64) | 12.238 | 4.724 | -61.4% | 1.6
greater_than(float32,decimal) | 12.488 | 6.309 | -49.5% | 1.0
greater_than(float32,float32) | 7.017 | 0.818 | -88.3% | 7.6
greater_than(float32,float64) | 7.145 | 1.071 | -85.0% | 5.7
greater_than(float32,int16) | 7.701 | 1.176 | -84.7% | 5.5
greater_than(float32,int32) | 7.635 | 1.028 | -86.5% | 6.4
greater_than(float32,int64) | 7.552 | 0.989 | -86.9% | 6.6
greater_than(float64,decimal) | 12.434 | 6.205 | -50.1% | 1.0
greater_than(float64,float32) | 7.115 | 1.114 | -84.3% | 5.4
greater_than(float64,float64) | 7.162 | 1.161 | -83.8% | 5.2
greater_than(float64,int16) | 7.722 | 1.217 | -84.2% | 5.3
greater_than(float64,int32) | 7.700 | 1.095 | -85.8% | 6.0
greater_than(float64,int64) | 7.598 | 1.052 | -86.1% | 6.2
greater_than(int16,decimal) | 11.414 | 5.041 | -55.8% | 1.3
greater_than(int16,float32) | 6.936 | 0.848 | -87.8% | 7.2
greater_than(int16,float64) | 6.858 | 0.879 | -87.2% | 6.8
greater_than(int16,int16) | 7.271 | 0.469 | -93.6% | 14.5
greater_than(int16,int32) | 6.766 | 0.496 | -92.7% | 12.6
greater_than(int16,int64) | 6.682 | 0.570 | -91.5% | 10.7
greater_than(int32,decimal) | 11.305 | 4.732 | -58.1% | 1.4
greater_than(int32,float32) | 6.886 | 0.714 | -89.6% | 8.6
greater_than(int32,float64) | 6.820 | 0.745 | -89.1% | 8.2
greater_than(int32,int16) | 6.668 | 0.493 | -92.6% | 12.5
greater_than(int32,int32) | 6.983 | 0.491 | -93.0% | 13.2
greater_than(int32,int64) | 6.699 | 0.547 | -91.8% | 11.2
greater_than(int64,decimal) | 11.416 | 5.031 | -55.9% | 1.3
greater_than(int64,float32) | 6.898 | 0.679 | -90.2% | 9.2
greater_than(int64,float64) | 6.808 | 0.730 | -89.3% | 8.3
greater_than(int64,int16) | 6.772 | 0.573 | -91.5% | 10.8
greater_than(int64,int32) | 6.692 | 0.551 | -91.8% | 11.1
greater_than(int64,int64) | 6.719 | 0.547 | -91.9% | 11.3
greater_than(interval,interval) | 7.674 | 2.447 | -68.1% | 2.1
greater_than(interval,time) | 8.831 | 2.754 | -68.8% | 2.2
greater_than(time,interval) | 7.853 | 3.016 | -61.6% | 1.6
greater_than(time,time) | 7.229 | 0.778 | -89.2% | 8.3
greater_than(timestamp,date) | 7.123 | 1.456 | -79.6% | 3.9
greater_than(timestamp,timestamp) | 7.808 | 2.193 | -71.9% | 2.6
greater_than(timestampz,timestampz) | 6.691 | 0.561 | -91.6% | 10.9
greater_than(varchar,varchar) | 11.620 | 11.953 | 2.9% | -0.0
is_distinct_from(boolean,boolean) | 7.967 | 0.293 | -96.3% | 26.2
is_distinct_from(date,date) | 6.830 | 0.546 | -92.0% | 11.5
is_distinct_from(date,timestamp) | 7.031 | 1.068 | -84.8% | 5.6
is_distinct_from(decimal,decimal) | 12.050 | 8.189 | -32.0% | 0.5
is_distinct_from(decimal,float32) | 12.670 | 6.270 | -50.5% | 1.0
is_distinct_from(decimal,float64) | 12.565 | 6.312 | -49.8% | 1.0
is_distinct_from(decimal,int16) | 12.300 | 4.674 | -62.0% | 1.6
is_distinct_from(decimal,int32) | 12.231 | 4.553 | -62.8% | 1.7
is_distinct_from(decimal,int64) | 12.417 | 4.640 | -62.6% | 1.7
is_distinct_from(float32,decimal) | 12.468 | 6.309 | -49.4% | 1.0
is_distinct_from(float32,float32) | 6.885 | 0.707 | -89.7% | 8.7
is_distinct_from(float32,float64) | 6.894 | 0.899 | -87.0% | 6.7
is_distinct_from(float32,int16) | 7.597 | 0.927 | -87.8% | 7.2
is_distinct_from(float32,int32) | 7.529 | 0.804 | -89.3% | 8.4
is_distinct_from(float32,int64) | 7.413 | 0.768 | -89.6% | 8.7
is_distinct_from(float64,decimal) | 12.303 | 6.279 | -49.0% | 1.0
is_distinct_from(float64,float32) | 6.850 | 0.899 | -86.9% | 6.6
is_distinct_from(float64,float64) | 6.917 | 0.944 | -86.4% | 6.3
is_distinct_from(float64,int16) | 7.578 | 0.992 | -86.9% | 6.6
is_distinct_from(float64,int32) | 7.427 | 0.866 | -88.3% | 7.6
is_distinct_from(float64,int64) | 7.443 | 0.835 | -88.8% | 7.9
is_distinct_from(int16,decimal) | 12.005 | 4.642 | -61.3% | 1.6
is_distinct_from(int16,float32) | 7.492 | 0.839 | -88.8% | 7.9
is_distinct_from(int16,float64) | 7.403 | 0.801 | -89.2% | 8.2
is_distinct_from(int16,int16) | 7.458 | 0.500 | -93.3% | 13.9
is_distinct_from(int16,int32) | 6.834 | 0.549 | -92.0% | 11.5
is_distinct_from(int16,int64) | 6.749 | 0.653 | -90.3% | 9.3
is_distinct_from(int32,decimal) | 11.828 | 4.436 | -62.5% | 1.7
is_distinct_from(int32,float32) | 7.353 | 0.727 | -90.1% | 9.1
is_distinct_from(int32,float64) | 7.265 | 0.683 | -90.6% | 9.6
is_distinct_from(int32,int16) | 6.729 | 0.556 | -91.7% | 11.1
is_distinct_from(int32,int32) | 6.771 | 0.550 | -91.9% | 11.3
is_distinct_from(int32,int64) | 6.755 | 0.653 | -90.3% | 9.3
is_distinct_from(int64,decimal) | 11.955 | 4.716 | -60.6% | 1.5
is_distinct_from(int64,float32) | 7.365 | 0.671 | -90.9% | 10.0
is_distinct_from(int64,float64) | 7.292 | 0.654 | -91.0% | 10.2
is_distinct_from(int64,int16) | 6.727 | 0.672 | -90.0% | 9.0
is_distinct_from(int64,int32) | 6.734 | 0.624 | -90.7% | 9.8
is_distinct_from(int64,int64) | 6.767 | 0.614 | -90.9% | 10.0
is_distinct_from(interval,interval) | 7.889 | 2.420 | -69.3% | 2.3
is_distinct_from(interval,time) | 8.556 | 1.927 | -77.5% | 3.4
is_distinct_from(time,interval) | 8.243 | 2.084 | -74.7% | 3.0
is_distinct_from(time,time) | 6.954 | 0.600 | -91.4% | 10.6
is_distinct_from(timestamp,date) | 7.219 | 1.087 | -84.9% | 5.6
is_distinct_from(timestamp,timestamp) | 7.202 | 1.205 | -83.3% | 5.0
is_distinct_from(timestampz,timestampz) | 6.762 | 0.617 | -90.9% | 10.0
is_distinct_from(varchar,varchar) | 11.326 | 10.983 | -3.0% | 0.0
is_false(boolean) | 6.597 | 0.161 | -97.6% | 39.9
is_not_distinct_from(boolean,boolean) | 9.049 | 0.306 | -96.6% | 28.6
is_not_distinct_from(date,date) | 7.295 | 0.545 | -92.5% | 12.4
is_not_distinct_from(date,timestamp) | 7.507 | 1.080 | -85.6% | 6.0
is_not_distinct_from(decimal,decimal) | 12.880 | 8.241 | -36.0% | 0.6
is_not_distinct_from(decimal,float32) | 13.258 | 6.351 | -52.1% | 1.1
is_not_distinct_from(decimal,float64) | 13.361 | 6.318 | -52.7% | 1.1
is_not_distinct_from(decimal,int16) | 11.600 | 4.708 | -59.4% | 1.5
is_not_distinct_from(decimal,int32) | 11.458 | 4.586 | -60.0% | 1.5
is_not_distinct_from(decimal,int64) | 11.601 | 4.656 | -59.9% | 1.5
is_not_distinct_from(float32,decimal) | 13.335 | 6.294 | -52.8% | 1.1
is_not_distinct_from(float32,float32) | 7.374 | 0.717 | -90.3% | 9.3
is_not_distinct_from(float32,float64) | 7.293 | 0.905 | -87.6% | 7.1
is_not_distinct_from(float32,int16) | 7.000 | 0.945 | -86.5% | 6.4
is_not_distinct_from(float32,int32) | 6.987 | 0.827 | -88.2% | 7.4
is_not_distinct_from(float32,int64) | 7.081 | 0.785 | -88.9% | 8.0
is_not_distinct_from(float64,decimal) | 13.475 | 6.267 | -53.5% | 1.2
is_not_distinct_from(float64,float32) | 7.395 | 0.946 | -87.2% | 6.8
is_not_distinct_from(float64,float64) | 7.324 | 0.962 | -86.9% | 6.6
is_not_distinct_from(float64,int16) | 7.040 | 1.003 | -85.8% | 6.0
is_not_distinct_from(float64,int32) | 6.928 | 0.881 | -87.3% | 6.9
is_not_distinct_from(float64,int64) | 7.086 | 0.868 | -87.8% | 7.2
is_not_distinct_from(int16,decimal) | 11.357 | 4.619 | -59.3% | 1.5
is_not_distinct_from(int16,float32) | 6.800 | 0.846 | -87.6% | 7.0
is_not_distinct_from(int16,float64) | 6.768 | 0.812 | -88.0% | 7.3
is_not_distinct_from(int16,int16) | 8.060 | 0.515 | -93.6% | 14.6
is_not_distinct_from(int16,int32) | 7.237 | 0.558 | -92.3% | 12.0
is_not_distinct_from(int16,int64) | 7.135 | 0.657 | -90.8% | 9.9
is_not_distinct_from(int32,decimal) | 11.236 | 4.422 | -60.6% | 1.5
is_not_distinct_from(int32,float32) | 6.788 | 0.725 | -89.3% | 8.4
is_not_distinct_from(int32,float64) | 6.731 | 0.683 | -89.9% | 8.9
is_not_distinct_from(int32,int16) | 7.208 | 0.554 | -92.3% | 12.0
is_not_distinct_from(int32,int32) | 7.249 | 0.550 | -92.4% | 12.2
is_not_distinct_from(int32,int64) | 7.185 | 0.653 | -90.9% | 10.0
is_not_distinct_from(int64,decimal) | 11.405 | 4.683 | -58.9% | 1.4
is_not_distinct_from(int64,float32) | 6.803 | 0.690 | -89.9% | 8.9
is_not_distinct_from(int64,float64) | 6.781 | 0.660 | -90.3% | 9.3
is_not_distinct_from(int64,int16) | 7.143 | 0.655 | -90.8% | 9.9
is_not_distinct_from(int64,int32) | 7.235 | 0.635 | -91.2% | 10.4
is_not_distinct_from(int64,int64) | 7.289 | 0.625 | -91.4% | 10.7
is_not_distinct_from(interval,interval) | 8.328 | 2.402 | -71.2% | 2.5
is_not_distinct_from(interval,time) | 7.876 | 1.927 | -75.5% | 3.1
is_not_distinct_from(time,interval) | 7.637 | 2.105 | -72.4% | 2.6
is_not_distinct_from(time,time) | 7.426 | 0.632 | -91.5% | 10.7
is_not_distinct_from(timestamp,date) | 7.630 | 1.090 | -85.7% | 6.0
is_not_distinct_from(timestamp,timestamp) | 8.216 | 1.266 | -84.6% | 5.5
is_not_distinct_from(timestampz,timestampz) | 7.269 | 0.616 | -91.5% | 10.8
is_not_distinct_from(varchar,varchar) | 12.216 | 11.491 | -5.9% | 0.1
is_not_false(boolean) | 6.718 | 0.164 | -97.6% | 40.0
is_not_true(boolean) | 6.574 | 0.127 | -98.1% | 50.7
is_true(boolean) | 6.666 | 0.123 | -98.2% | 53.2
less_than_or_equal(boolean,boolean) | 7.450 | 0.177 | -97.6% | 41.0
less_than_or_equal(date,date) | 7.148 | 0.497 | -93.0% | 13.4
less_than_or_equal(date,timestamp) | 7.991 | 2.113 | -73.6% | 2.8
less_than_or_equal(decimal,decimal) | 12.874 | 8.402 | -34.7% | 0.5
less_than_or_equal(decimal,float32) | 13.990 | 6.012 | -57.0% | 1.3
less_than_or_equal(decimal,float64) | 13.920 | 6.017 | -56.8% | 1.3
less_than_or_equal(decimal,int16) | 11.602 | 4.707 | -59.4% | 1.5
less_than_or_equal(decimal,int32) | 11.361 | 4.481 | -60.6% | 1.5
less_than_or_equal(decimal,int64) | 11.618 | 4.653 | -59.9% | 1.5
less_than_or_equal(float32,decimal) | 13.420 | 6.116 | -54.4% | 1.2
less_than_or_equal(float32,float32) | 7.716 | 0.806 | -89.6% | 8.6
less_than_or_equal(float32,float64) | 7.734 | 1.062 | -86.3% | 6.3
less_than_or_equal(float32,int16) | 7.127 | 1.099 | -84.6% | 5.5
less_than_or_equal(float32,int32) | 7.178 | 0.974 | -86.4% | 6.4
less_than_or_equal(float32,int64) | 7.202 | 0.937 | -87.0% | 6.7
less_than_or_equal(float64,decimal) | 13.405 | 6.104 | -54.5% | 1.2
less_than_or_equal(float64,float32) | 7.756 | 1.101 | -85.8% | 6.0
less_than_or_equal(float64,float64) | 7.655 | 1.146 | -85.0% | 5.7
less_than_or_equal(float64,int16) | 7.175 | 1.170 | -83.7% | 5.1
less_than_or_equal(float64,int32) | 7.156 | 1.059 | -85.2% | 5.8
less_than_or_equal(float64,int64) | 7.170 | 1.017 | -85.8% | 6.1
less_than_or_equal(int16,decimal) | 12.477 | 4.948 | -60.3% | 1.5
less_than_or_equal(int16,float32) | 7.668 | 1.053 | -86.3% | 6.3
less_than_or_equal(int16,float64) | 7.630 | 0.956 | -87.5% | 7.0
less_than_or_equal(int16,int16) | 7.952 | 0.464 | -94.2% | 16.1
less_than_or_equal(int16,int32) | 7.172 | 0.486 | -93.2% | 13.8
less_than_or_equal(int16,int64) | 7.159 | 0.564 | -92.1% | 11.7
less_than_or_equal(int32,decimal) | 12.198 | 4.726 | -61.3% | 1.6
less_than_or_equal(int32,float32) | 7.588 | 0.854 | -88.7% | 7.9
less_than_or_equal(int32,float64) | 7.596 | 0.812 | -89.3% | 8.4
less_than_or_equal(int32,int16) | 7.163 | 0.495 | -93.1% | 13.5
less_than_or_equal(int32,int32) | 7.173 | 0.489 | -93.2% | 13.7
less_than_or_equal(int32,int64) | 7.208 | 0.555 | -92.3% | 12.0
less_than_or_equal(int64,decimal) | 12.139 | 4.885 | -59.8% | 1.5
less_than_or_equal(int64,float32) | 7.583 | 0.809 | -89.3% | 8.4
less_than_or_equal(int64,float64) | 7.609 | 0.778 | -89.8% | 8.8
less_than_or_equal(int64,int16) | 7.210 | 0.569 | -92.1% | 11.7
less_than_or_equal(int64,int32) | 7.174 | 0.549 | -92.3% | 12.1
less_than_or_equal(int64,int64) | 7.111 | 0.546 | -92.3% | 12.0
less_than_or_equal(interval,interval) | 8.402 | 2.372 | -71.8% | 2.5
less_than_or_equal(interval,time) | 8.266 | 2.701 | -67.3% | 2.1
less_than_or_equal(time,interval) | 8.573 | 2.868 | -66.5% | 2.0
less_than_or_equal(time,time) | 7.855 | 0.796 | -89.9% | 8.9
less_than_or_equal(timestamp,date) | 7.834 | 1.416 | -81.9% | 4.5
less_than_or_equal(timestamp,timestamp) | 8.426 | 2.240 | -73.4% | 2.8
less_than_or_equal(timestampz,timestampz) | 7.180 | 0.553 | -92.3% | 12.0
less_than_or_equal(varchar,varchar) | 12.270 | 12.479 | 1.7% | -0.0
less_than(boolean,boolean) | 7.098 | 0.180 | -97.5% | 38.4
less_than(date,date) | 6.805 | 0.492 | -92.8% | 12.8
less_than(date,timestamp) | 7.420 | 2.114 | -71.5% | 2.5
less_than(decimal,decimal) | 12.030 | 8.266 | -31.3% | 0.5
less_than(decimal,float32) | 12.915 | 5.947 | -54.0% | 1.2
less_than(decimal,float64) | 13.107 | 5.917 | -54.9% | 1.2
less_than(decimal,int16) | 11.521 | 4.635 | -59.8% | 1.5
less_than(decimal,int32) | 11.229 | 4.622 | -58.8% | 1.4
less_than(decimal,int64) | 11.420 | 4.744 | -58.5% | 1.4
less_than(float32,decimal) | 12.660 | 6.060 | -52.1% | 1.1
less_than(float32,float32) | 7.281 | 0.716 | -90.2% | 9.2
less_than(float32,float64) | 7.183 | 0.892 | -87.6% | 7.1
less_than(float32,int16) | 7.036 | 1.183 | -83.2% | 4.9
less_than(float32,int32) | 7.286 | 1.036 | -85.8% | 6.0
less_than(float32,int64) | 7.204 | 1.002 | -86.1% | 6.2
less_than(float64,decimal) | 12.515 | 5.981 | -52.2% | 1.1
less_than(float64,float32) | 7.029 | 1.009 | -85.6% | 6.0
less_than(float64,float64) | 7.042 | 0.995 | -85.9% | 6.1
less_than(float64,int16) | 6.965 | 1.280 | -81.6% | 4.4
less_than(float64,int32) | 7.003 | 1.137 | -83.8% | 5.2
less_than(float64,int64) | 7.010 | 1.091 | -84.4% | 5.4
less_than(int16,decimal) | 12.217 | 4.980 | -59.2% | 1.5
less_than(int16,float32) | 7.573 | 0.850 | -88.8% | 7.9
less_than(int16,float64) | 7.505 | 0.914 | -87.8% | 7.2
less_than(int16,int16) | 7.270 | 0.468 | -93.6% | 14.5
less_than(int16,int32) | 6.732 | 0.494 | -92.7% | 12.6
less_than(int16,int64) | 6.798 | 0.561 | -91.7% | 11.1
less_than(int32,decimal) | 11.969 | 4.819 | -59.7% | 1.5
less_than(int32,float32) | 7.480 | 0.730 | -90.2% | 9.2
less_than(int32,float64) | 7.452 | 0.791 | -89.4% | 8.4
less_than(int32,int16) | 6.776 | 0.488 | -92.8% | 12.9
less_than(int32,int32) | 6.739 | 0.492 | -92.7% | 12.7
less_than(int32,int64) | 6.785 | 0.567 | -91.6% | 11.0
less_than(int64,decimal) | 11.980 | 5.072 | -57.7% | 1.4
less_than(int64,float32) | 7.400 | 0.705 | -90.5% | 9.5
less_than(int64,float64) | 7.475 | 0.759 | -89.8% | 8.8
less_than(int64,int16) | 6.747 | 0.580 | -91.4% | 10.6
less_than(int64,int32) | 6.773 | 0.558 | -91.8% | 11.1
less_than(int64,int64) | 6.725 | 0.544 | -91.9% | 11.4
less_than(interval,interval) | 7.719 | 2.367 | -69.3% | 2.3
less_than(interval,time) | 8.346 | 2.622 | -68.6% | 2.2
less_than(time,interval) | 8.451 | 2.687 | -68.2% | 2.1
less_than(time,time) | 7.207 | 0.636 | -91.2% | 10.3
less_than(timestamp,date) | 6.849 | 1.414 | -79.4% | 3.8
less_than(timestamp,timestamp) | 7.881 | 2.070 | -73.7% | 2.8
less_than(timestampz,timestampz) | 6.719 | 0.550 | -91.8% | 11.2
less_than(varchar,varchar) | 11.779 | 11.933 | 1.3% | -0.0
not_equal(boolean,boolean) | 7.161 | 0.132 | -98.2% | 53.2
not_equal(date,date) | 6.731 | 0.511 | -92.4% | 12.2
not_equal(date,timestamp) | 7.162 | 1.032 | -85.6% | 5.9
not_equal(decimal,decimal) | 12.252 | 8.220 | -32.9% | 0.5
not_equal(decimal,float32) | 12.396 | 5.577 | -55.0% | 1.2
not_equal(decimal,float64) | 12.360 | 5.417 | -56.2% | 1.3
not_equal(decimal,int16) | 12.200 | 4.613 | -62.2% | 1.6
not_equal(decimal,int32) | 12.079 | 4.486 | -62.9% | 1.7
not_equal(decimal,int64) | 12.280 | 4.547 | -63.0% | 1.7
not_equal(float32,decimal) | 12.211 | 5.471 | -55.2% | 1.2
not_equal(float32,float32) | 6.836 | 0.666 | -90.3% | 9.3
not_equal(float32,float64) | 6.852 | 0.844 | -87.7% | 7.1
not_equal(float32,int16) | 7.538 | 0.890 | -88.2% | 7.5
not_equal(float32,int32) | 7.495 | 0.766 | -89.8% | 8.8
not_equal(float32,int64) | 7.502 | 0.730 | -90.3% | 9.3
not_equal(float64,decimal) | 12.281 | 5.416 | -55.9% | 1.3
not_equal(float64,float32) | 6.858 | 0.858 | -87.5% | 7.0
not_equal(float64,float64) | 6.806 | 0.903 | -86.7% | 6.5
not_equal(float64,int16) | 7.421 | 0.938 | -87.4% | 6.9
not_equal(float64,int32) | 7.378 | 0.821 | -88.9% | 8.0
not_equal(float64,int64) | 7.384 | 0.785 | -89.4% | 8.4
not_equal(int16,decimal) | 12.429 | 4.587 | -63.1% | 1.7
not_equal(int16,float32) | 7.415 | 0.792 | -89.3% | 8.4
not_equal(int16,float64) | 7.339 | 0.753 | -89.7% | 8.7
not_equal(int16,int16) | 7.355 | 0.473 | -93.6% | 14.6
not_equal(int16,int32) | 6.783 | 0.508 | -92.5% | 12.4
not_equal(int16,int64) | 6.795 | 0.605 | -91.1% | 10.2
not_equal(int32,decimal) | 11.881 | 4.394 | -63.0% | 1.7
not_equal(int32,float32) | 7.334 | 0.668 | -90.9% | 10.0
not_equal(int32,float64) | 7.339 | 0.620 | -91.5% | 10.8
not_equal(int32,int16) | 6.722 | 0.510 | -92.4% | 12.2
not_equal(int32,int32) | 6.797 | 0.506 | -92.5% | 12.4
not_equal(int32,int64) | 6.793 | 0.587 | -91.4% | 10.6
not_equal(int64,decimal) | 12.196 | 4.537 | -62.8% | 1.7
not_equal(int64,float32) | 7.398 | 0.621 | -91.6% | 10.9
not_equal(int64,float64) | 7.314 | 0.606 | -91.7% | 11.1
not_equal(int64,int16) | 6.778 | 0.598 | -91.2% | 10.3
not_equal(int64,int32) | 6.762 | 0.594 | -91.2% | 10.4
not_equal(int64,int64) | 6.821 | 0.569 | -91.7% | 11.0
not_equal(interval,interval) | 7.768 | 2.430 | -68.7% | 2.2
not_equal(interval,time) | 8.638 | 1.908 | -77.9% | 3.5
not_equal(time,interval) | 8.337 | 2.062 | -75.3% | 3.0
not_equal(time,time) | 7.151 | 0.574 | -92.0% | 11.5
not_equal(timestamp,date) | 7.156 | 1.053 | -85.3% | 5.8
not_equal(timestamp,timestamp) | 7.566 | 1.186 | -84.3% | 5.4
not_equal(timestampz,timestampz) | 6.760 | 0.574 | -91.5% | 10.8
not_equal(varchar,varchar) | 12.022 | 12.334 | 2.6% | -0.0
or(boolean,boolean) | 11.085 | 0.265 | -97.6% | 40.8
round_digit(decimal,int32) | 9.480 | 2.505 | -73.6% | 2.8
round(decimal) | 7.697 | 3.512 | -54.4% | 1.2
round(float64) | 4.430 | 0.201 | -95.5% | 21.1

</details>

To apply this optimization, we introduced several expression templates in the new `template_fast` module. They are specific to `PrimitiveArray` or `BoolArray`. Operations will be applied to array elements one by one, regardless of the null bitmap and without any branching. Thus the compiler can automatically vectorize them using SIMD instructions.

But given the no-branch requirement, we can not apply this technique on fallible operations such as arithmetics and most of the type casts. Although some of them can be addressed with pre- or post-checks (e.g. pre-check 0 for divide-by-zero error, post-check overflow for addition), they are highly operation-specific and hard to generalize. We'll explore the way to vectorize these fallible operations in the future.

Approved-By: soundOfDestiny
Approved-By: BugenZhao

* refactor(batch): refine visibility and dml executors (#7040)

- Be aware of visibility.
- Split batch chunks with chunk builder before inserting them into the streaming jobs.

Also refactor the implementation of `append_chunk` (`trunc_data_chunk`).

Approved-By: liurenjie1024

* perf(expr): optimize casting to varchar (#7066)

This PR optimizes the performance of casting values to varchar.

It introduced write API for `ToText`, so that strings can be directly written to array buffers without generating String.
The display function of interval and timestampz was also optimized.

<img width="581" alt="perf-cast" src="https://user-images.githubusercontent.com/15158738/209610088-859f0f77-5272-4cb8-bbe3-f743bc0cbe97.png">

<details>
<summary>Click to show full results</summary>

bench | Before time(us) | After time(us) | Change(%) | Speedup
-- | -- | -- | -- | --
cast(timestampz->varchar) | 508.640 | 121.600 | -76.1% | 3.2
cast(timestamp->varchar) | 166.200 | 58.245 | -65.0% | 1.9
cast(float64->varchar) | 78.386 | 57.597 | -26.5% | 0.4
cast(float32->varchar) | 57.903 | 37.384 | -35.4% | 0.5
cast(date->varchar) | 86.896 | 32.669 | -62.4% | 1.7
cast(time->varchar) | 47.508 | 28.428 | -40.2% | 0.7
cast(decimal->varchar) | 67.682 | 28.317 | -58.2% | 1.4
cast(int16->varchar) | 29.532 | 12.337 | -58.2% | 1.4
cast(int64->varchar) | 52.043 | 12.319 | -76.3% | 3.2
cast(int32->varchar) | 28.863 | 12.258 | -57.5% | 1.4
cast(boolean->varchar) | 26.826 | 6.396 | -76.2% | 3.2
bool_out(boolean) | 25.480 | 5.126 | -79.9% | 4.0

</details>

The `writer` argument of string functions was also changed from `StringWriter<'_>` to `&mut dyn Write`, making them decouple from array. I tried to use `&mut impl Write` but was blocked by annoying lifetime issues. Anyways, the performance of these operations is still slightly improved:

<img width="600" alt="perf-string-ops" src="https://user-images.githubusercontent.com/15158738/209610928-8036e4d1-e994-4178-8ce4-ff1340877e47.png">

<details>
<summary>Click to show full results</summary>

bench | Before time(us) | After time(us) | Change(%) | Speedup
-- | -- | -- | -- | --
rtrim(varchar,varchar) | 21.780 | 15.768 | -27.6% | 0.4
substr(varchar,int32,int32) | 11.126 | 8.090 | -27.3% | 0.4
rtrim(varchar) | 10.537 | 7.712 | -26.8% | 0.4
substr(varchar,int32) | 9.198 | 7.111 | -22.7% | 0.3
ltrim(varchar) | 9.661 | 8.010 | -17.1% | 0.2
trim(varchar) | 11.308 | 9.618 | -14.9% | 0.2
overlay(varchar,varchar,int32,int32) | 17.107 | 14.697 | -14.1% | 0.2
overlay(varchar,varchar,int32) | 13.408 | 12.007 | -10.4% | 0.1
ltrim(varchar,varchar) | 21.198 | 19.021 | -10.3% | 0.1
trim(varchar,varchar) | 20.876 | 19.205 | -8.0% | 0.1
split_part(varchar,varchar,int32) | 30.708 | 29.293 | -4.6% | 0.0
md5(varchar) | 346.010 | 331.670 | -4.1% | 0.0

</details>

Approved-By: BowenXiao1999
Approved-By: BugenZhao

* feat(optimizer): support share operator (#6956)

- `LogicalShare` operator is used to represent reusing of existing operators. It could have multiple parents which makes it different from other operators.
- Because most of our optimizations assume that our plan is a tree structure, in order to represent the DAG structured plan, we need to modify our optimizations and prevent them break our DAG plan back to a tree plan accidentally.
- Optimization including predicate pushdown, column pruning, heuristic optimizer, stream rewrite and to stream, all of them can break DAG plan back to a tree plan if we don't take care.
- Let me take predicate pushdown as an example to illustrate how to implement predicate pushdown for `LogicalShare`. We use a context for `LogicalShare` to keep track of how many times predicate has been pushdown for `LogicalShare`. Once pushdown times equal the parent number of the `LogicalShare`, we can merge all the previous predicates into one and then push it down for the input of `LogicalShare`.
- Heuristic optimizer's previous rules won't match any `LogicalShare`, so `LogicalShare` wouldn't affect its correctness.
- At the end of optimizer for batch query, we try to convert DAG back to Tree for now by removing `LogicalShare` (the rule named `DagToTreeRule`), because our batch executor doesn't support execute DAG plan directly currently.
- This PR also supports reusing source by `ShareSourceRewriter`. `ShareSourceRewriter` will replace all the sources occurred more than once in the streaming query with share operator.

Approved-By: st1page

Co-Authored-By: Dylan Chen <zilin@singularity-data.com>
Co-Authored-By: Dylan <chenzl25@mail2.sysu.edu.cn>

* perf(expr): vectorize infallible casts (#7079)

Similar to #7055, this PR vectorizes infallible casts.

<img width="516" alt="perf-infallible-cast" src="https://user-images.githubusercontent.com/15158738/209652449-61dc1513-7255-436c-aa36-e5a0d1dec384.png">

<details>
<summary>Click to show full results</summary>

bench | Before time(us) | After time(us) | Change(%) | Speedup
-- | -- | -- | -- | --
cast(int16->float32) | 4.434 | 0.146 | -96.7% | 29.3
cast(int16->int32) | 4.408 | 0.154 | -96.5% | 27.7
cast(float32->float64) | 4.432 | 0.187 | -95.8% | 22.7
cast(int32->int64) | 4.415 | 0.192 | -95.7% | 22.0
cast(int32->float64) | 4.422 | 0.194 | -95.6% | 21.8
cast(int16->int64) | 4.412 | 0.212 | -95.2% | 19.8
cast(timestamp->date) | 4.409 | 0.226 | -94.9% | 18.5
cast(timestamp->time) | 5.443 | 0.300 | -94.5% | 17.1
cast(date->timestamp) | 5.504 | 0.304 | -94.5% | 17.1
cast(int16->float64) | 4.430 | 0.298 | -93.3% | 13.9
cast(int32->decimal) | 5.582 | 0.592 | -89.4% | 8.4
cast(time->interval) | 5.511 | 0.727 | -86.8% | 6.6
cast(int64->decimal) | 5.739 | 0.766 | -86.7% | 6.5
cast(int16->decimal) | 5.760 | 0.845 | -85.3% | 5.8
cast(interval->time) | 5.903 | 1.289 | -78.2% | 3.6
cast(float32->decimal) | 21.970 | 18.170 | -17.3% | 0.2
cast(float64->decimal) | 40.131 | 36.049 | -10.2% | 0.1

</details>

Approved-By: BowenXiao1999

* fix: clean states in local barrier manager after actor dropped (#7082)

Trying to fix continuous recovery found in longevity and chaos test. I found that two problems might be the root cause of continuous recovery:
1. Fixed, unnecessary recovery triggered as described in #6989 . As I tested locally, when workload was very high, there were many ongoing barrier collect responses(up to 80+) when recovery. After recovery finished, each response would trigger a recovery process, because the whole cluster has already reset to previous committed epoch.
2. Before this PR, when force stopping actors in CN, the local manger will clean all states and then abort all actors. The problem is between cleaning states and aborting actors, the actors could also report epoch collected or error status to local barrier manager especially when the number of actors is high. This will cause a chain reaction in recovery.

I tested it locally and the recovery became normal. Besides, it could also be the cause of #6639 , #6715 .

Approved-By: fuyufjh
Approved-By: BugenZhao

* fix: return error if source executor failed to receive the first barrier (#7086)

**This section will be used as the commit message. Please do not leave this empty!**

Please explain **IN DETAIL** what the changes are in this PR and why they are needed:

this is a temp fix for #6931, and I believe that this can happen in rare cases.

From the log described in the issue, there is a failover that occurred before the panic. I think some part of meta failed to recover and the barrier channel closed for some reason.

It shows a possibility that meta node could fail to recover and compute node should be robots enough rather than panicking.

Approved-By: waruto210
Approved-By: xx01cyx

* fix(optimizer): fix hop window column pruning (#7085)

- Fix hop window column pruning.

Approved-By: st1page

* fix(streaming): fix memory leaks in streaming hash join (#7089)

Fix #6942. See the detailed discussions there.

The bug is inside the BTreeMap. For now, I will just remove that part of code because we don't rely on Allocator API to get memory usage now.

![image](https://user-images.githubusercontent.com/10192522/209688807-84ae0f84-9e17-44ae-8498-a378ee6b951e.png)

Approved-By: yuhao-su
Approved-By: BugenZhao

* fix: remove redundant `append_only: true` in explain (#7119)

fix: remove redundant `append_only: true` in explain result, since it has been expressed by the name "StreamAppendOnlyHashJoin".

Approved-By: chenzl25

* feat: Failover follower to leader (#6937)

https://github.com/risingwavelabs/risingwave/issues/6936

Approved-By: yezizp2012

* feat(meta): validate CDC connector properties during create source (#6938)

Validate connector properties on Meta (in `DebeziumSplitEnumerator`) when creating CDC source.
As the conclusion of https://github.com/risingwavelabs/rfcs/pull/29, we will deploy a sidecar connector node colocated with Meta on the cloud  to validate the connector properties.

Examples:
1. Wrong password
```
create materialized source products ( id INT,
name STRING,
description STRING,
PRIMARY KEY (id)
) with (
connector = 'mysql-cdc',
hostname = '127.0.0.1',
port = '3306',
username = 'root',
password = '12346',
database.name = 'mydb',
table.name = 'prodts',
server.id = '5085',
debezium.a.b = 'test'
) row format debezium_json;
ERROR:  QueryError: internal error: gRPC error (Client specified an invalid argument): Access denied for user 'root'@'localhost' (using password: YES)
```
2. Wrong table name
```
dev=> create materialized source products ( id INT,
name STRING,
description STRING,
PRIMARY KEY (id)
) with (
connector = 'mysql-cdc',
hostname = '127.0.0.1',
port = '3306',
username = 'root',
password = '123456',
database.name = 'mydb',
table.name = 'prodts',
server.id = '5085',
debezium.a.b = 'test'
) row format debezium_json;
ERROR:  QueryError: internal error: gRPC error (Client specified an invalid argument): table doesn't exist
```

Approved-By: tabVersion

* refactor: decouple memory management from stream, make it accessible for both batch and streaming (#7004)

Main idea:
1. Rename `LruManager` to `GlobalMemoryManager`, move it from `stream` crate to `compute` crate. Can not move to `common` as it depends on `risingwave_stream` and `risingwave_batch`. This comes from what we have discussed in the memory management rfc: https://github.com/risingwavelabs/rfcs/pull/26.

2. Fully decouple `risingwave_stream` and memory manager. Before this pr, streaming executor access to lru manager to create cache. However, this will cause cyclic reference if we move `LruManager` out from `risingwave_stream`. What executor really need is the watermark epoch, so instead of let `risingwave_stream` access to Memory Manager, just store the watermark epoch in the `LocalStreamManager` and when executors are building, they can read this value and then they can create cache with their own. Personally I think this is more clean: memory manager have access to stream/batch two components, and vic versa no.

3. Currently the memory manager ref is not stored anywhere. Thinking of where to store it. 🤔

Approved-By: liurenjie1024

* feat: support plan generating & execution for new DDL & DML design (#6836)

This PR applies `SourceExecutorV2`, `DmlExecutor`, `RowIdGenExecutor`, and `DmlManager` to query execution.
For example, the query plan for `CREATE TABLE t (v int)` will be:

```SQL
StreamMaterialize { columns: [v, _row_id(hidden)], pk_columns: [_row_id] }
└─StreamExchange { dist: HashShard(_row_id) }
└─StreamRowIdGen { row_id_index: 1 }
└─StreamDml { columns: [v, _row_id] }
└─StreamSource
```

Some explanations:
- `StreamSource` here contains no actual external streaming source. It is only responsible for receiving barriers.
- `StreamDml` will receive data from `InsertExecutor`, `DeleteExecutor`, and `UpdateExecutor`.
- `StreamRowIdGen` will generate row id for the data. In this case, the primary key is not defined by the user, so we internally add a `_row_id` column as the primary key. If the table has a user-defined primary key, then this executor can be eliminated.

Note that now **"source" stands for streaming source only**. There is **NO table source** now. Though `CREATE TABLE` will create a `StreamSource`, it actually contains nothing related to a source (catalog).

Approved-By: st1page
Approved-By: BugenZhao
Approved-By: yezizp2012

Co-Authored-By: xx01cyx <caoyuanxin0531@outlook.com>
Co-Authored-By: st1page <1245835950@qq.com>

* feat(frontend): avoid pk duplication  (#7095)

avoid pk duplication for streaming executors, and still allow join key duplication.

Approved-By: st1page

* feat(optimizer): improve column pruning for share operator and perform share source at the beginning (#7111)

- Improve column pruning for share operator. We need 2 round column pruning for DAG plan.
- Perform share source at the beginning so that we can benefit from predicate pushdown and column pruning.

Approved-By: st1page
Approved-By: fuyufjh

* fix(streaming): handle scaling for row id gen executor (#7122)

Correctly handle the scaling for RowIdGen executor. The logic used to work fine, but was lost in the refactoring in #6529.

Approved-By: yezizp2012
Approved-By: xx01cyx

* fix(optimizer): fix logical join o2i_col_mapping (#7108)

- Fix logical join `o2i_col_mapping` by using `output_indices` directly instead of inverse `i2o_col_mapping`.

Approved-By: fuyufjh

Co-Authored-By: Dylan Chen <zilin@singularity-data.com>
Co-Authored-By: xxchan <xxchan22f@gmail.com>

* fix(frontend): hash join do not deduplicate input pk (#7123)

Previously we want to deduplicate pk for streaming executors, however:
- agg will do prefix scan by group key, so we can not deduplicate group key
- hash join will do prefix scan by join key, so we can not deduplicate join key
- hash join need to be aware of input pk, and there might be an inconsistency between the pk of hash join state table and the input pk got in hash join executor

so we decided not to handle deduplicated input pk now, we may  complete the dedup task case by case, just like agg instead of add a general method in catalog builder.

Approved-By: yuhao-su

Co-Authored-By: congyi <15605187270@163.com>
Co-Authored-By: congyi wang <58715567+wcy-fdu@users.noreply.github.com>

* fix: fix NULL regexp capture group (#7129)

If a particular capture group didn't participate in the match, we should return `NULL` instead of skipping it.

fix https://github.com/risingwavelabs/risingwave/issues/7126

Approved-By: TennyZhuang

* chore(test): compress the test data (#7007)

Reduce size from 16MB to 400KB.

Approved-By: tabVersion

* fix: support kafka sink for struct and list type  (#7098)

**This section will be used as the commit message. Please do not leave this empty!**

fix type matches for struct and list in kafka sink

& add struct and list test cases in ut
& add script command for compress test cases into zip file

Approved-By: lmatz

Co-Authored-By: tabVersion <tabvision@bupt.icu>
Co-Authored-By: lmatz <lmatz823@gmail.com>

Signed-off-by: Runji Wang <wangrunji0408@163.com>
Co-authored-by: Eric Fu <eric@singularity-data.com>
Co-authored-by: Dylan <chenzl25@mail2.sysu.edu.cn>
Co-authored-by: Dylan Chen <zilin@singularity-data.com>
Co-authored-by: xxchan <xxchan22f@gmail.com>
Co-authored-by: Bugen Zhao <i@bugenzhao.com>
Co-authored-by: zwang28 <70626450+zwang28@users.noreply.github.com>
Co-authored-by: zwang28 <84491488@qq.com>
Co-authored-by: Runji Wang <wangrunji0408@163.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: August <pin@singularity-data.com>
Co-authored-by: Noel Kwan <47273164+kwannoel@users.noreply.github.com>
Co-authored-by: Yuhao Su <31772373+yuhao-su@users.noreply.github.com>
Co-authored-by: TennyZhuang <zty0826@gmail.com>
Co-authored-by: Bohan Zhang <tabvision@bupt.icu>
Co-authored-by: CAJan93 <jan.mensch@gmx.net>
Co-authored-by: StrikeW <wangsiyuanse@gmail.com>
Co-authored-by: Bowen <36908971+BowenXiao1999@users.noreply.github.com>
Co-authored-by: Yuanxin Cao <60498509+xx01cyx@users.noreply.github.com>
Co-authored-by: xx01cyx <caoyuanxin0531@outlook.com>
Co-authored-by: st1page <1245835950@qq.com>
Co-authored-by: congyi wang <58715567+wcy-fdu@users.noreply.github.com>
Co-authored-by: congyi <15605187270@163.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants