Skip to content

Commit

Permalink
chore: merge fixes into v0.1.15 (#7162)
Browse files Browse the repository at this point in the history
* fix: clean the verbose logs of "failed to send message to actor" (#6973)

As title.

```
// println!("{:?}", &chunk);
StreamChunk { cardinality: 4, capacity: 4, .. }

// println!("{:#?}", &chunk);
StreamChunk { cardinality: 4, capacity: 4, data:
+----+---+---+
|  + | 1 | 6 |
|  - | 2 |   |
| U- | 3 | 7 |
| U+ | 4 |   |
+----+---+---+
}
```


Approved-By: BugenZhao

Co-Authored-By: Eric Fu <eric@singularity-data.com>

* feat(stream): Make scale DAG aware (#7013)

**This section will be used as the commit message. Please do not leave this empty!**

Please explain **IN DETAIL** what the changes are in this PR and why they are needed:

- Let `Reschedule` accept more than one downstream fragment.
- Let the dispatcher id calculated from the exchange operator id together with the upstream and downstream id.


Approved-By: BugenZhao

Co-Authored-By: Dylan Chen <zilin@singularity-data.com>

* chore: log AST in sqlparser (#7012)

Log AST in sqlparser for better debugging. Also make it easier to configure logs.


Approved-By: lmatz

Co-Authored-By: xxchan <xxchan22f@gmail.com>

* feat(streaming): do not backfill for empty table (#7009)

If the snapshot is empty, we don't need to backfill and can immediately finish the progress. This can speed up some tests.

```
dev=> create materialized view mv2 as select * from t;
CREATE_MATERIALIZED_VIEW
Time: 1033.834 ms (00:01.034)

dev=> delete from t;
DELETE 1
Time: 9.869 ms

dev=> create materialized view mv3 as select * from t;
CREATE_MATERIALIZED_VIEW
Time: 18.550 ms
```

Note that every executor requires a barrier for the first message. So if there are few records in the table (but not empty), we cannot adapt this optimization. The further plan might be to issue next checkpoints more frequently for this case.


Approved-By: chenzl25

Co-Authored-By: Bugen Zhao <i@bugenzhao.com>

* refactor(logging): be aware of RUST_LOG env (#7016)

This PR supports overwriting log filters predefined in `init_risingwave_logger`, by specifying RUST_LOG environment variable.
One use case is I want to suppress certain logs in CI, to avoid large log size.


Approved-By: BugenZhao
Approved-By: xxchan

Co-Authored-By: zwang28 <84491488@qq.com>

* perf(bitmap): change the buffer unit from `u8` to `usize` (#7030)

* bitmap: use pointer-sized element for underlying buffer

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* use `Box<[usize]>` to save 8 bytes

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* rename functions and add docs

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* add bench for bitmap

Signed-off-by: Runji Wang <wangrunji0408@163.com>

Signed-off-by: Runji Wang <wangrunji0408@163.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* refactor(optimizer): rename optimizer rule filename. (#7038)

rename optimizer rule

* chore(connector): fix log level (#7043)

`info -> debug` otherwise print every record.

Approved-By: tabVersion

* fix: check sink properties and validate them in advance to avoid panic or recovery (#7041)

Check whether sink properties provided in frontend and simply validate them when building executors to avoid panic or recovery.

Approved-By: BugenZhao

* fix(batch): match `probe_row` once for `SemiJoin` with `non_equi` predicates (#7033)

### Problem

- For a probe key, after chunk is spilled, we may continue appending rows for processing.
- This happens even though the probe key already has found matching build row.
- These rows should be discarded instead / not appended.
- As a result, when processing the next spilled chunk, they are also included inside, and finally returned in the results as duplicate rows.
- As you can see in the [expected results before fix](https://github.com/risingwavelabs/risingwave/pull/7033/commits/ec6eac5fbac700ed9c03662592a30886a198d248), we have 5 probe rows, but results return 8 output.

### Solution

- if probe row match, break and do matching for next probe row. Then we won't process duplicate matches for matched probe rows.
- If probe row not match, continue appending. The appended rows will be contained in buffered chunk and processed later.

Approved-By: chenzl25

* perf(expr): complete expression benchmark framework (#6995)

This PR completes the micro-benchmark framework for expressions.

It can be run with:
```sh
cd src/expr
# run all benches
cargo bench --bench expr -- --quick
# list all benches
cargo bench --bench expr -- --list
# run specified benches
cargo bench --bench expr -- --quick "add\(int32,int32\)"
```

The detailed bench results have been updated to #6868.
Here is a statistical overview of all expressions:

<img width="470" alt="截屏2022-12-20 21 13 24" src="https://user-images.githubusercontent.com/15158738/208812177-529f62aa-4590-4dbc-b93b-9d8f1151418c.png">

To enumerate all valid expressions, we utilize the function signature maps defined in the frontend. We moved them into the expr crate in order to avoid dependency on the frontend thus reducing the compilation time.

Approved-By: lmatz

* perf(expr): prebuild the AC for `to_char` (#7048)

As title, put it on lazy-static to avoid building every time.

bench | Before time(us) | After time(us) | Change(%)
-- | -- | -- | --
to_char(timestamp,varchar) | 11060.000 | 283.900 | -97.4%

Approved-By: TennyZhuang

* fix(alloc): missing padding in realloc (#7046)

as title

Approved-By: TennyZhuang

* fix(parser): fix parsing nested wildcard struct field access (#7024)

Fix #7011: nested wildcard struct field access panics when there are additional parentheses.

Also did some minor style refactoring and added some comments.

Approved-By: st1page
Approved-By: yezizp2012

* fix: fix parser test of wildcard struct field with additional parentheses (#7052)

Fix parser test of wildcard struct field with additional parentheses: https://buildkite.com/risingwavelabs/main-cron/builds/282#01854176-ffb0-4018-aa7b-fe128a411a41

Approved-By: xxchan

* feat(optimizer): support union merge rule (#7037)

**This section will be used as the commit message. Please do not leave this empty!**

Please explain **IN DETAIL** what the changes are in this PR and why they are needed:

- Merge all binary unions into multi inputs union.

Approved-By: st1page

* fix: fix period non-zero panic in election (#7065)

Fix #7061

Approved-By: lmatz

* perf(expr): optimize lower/upper/trim/md5 (#7047)

This PR optimizes `lower`/`upper`/`trim`/`md5` operations by avoiding generating String.

bench | Before time(us) | After time(us) | Change(%)
-- | -- | -- | --
md5(varchar) | 447.040 | 338.540 | -24.3%
ltrim(varchar,varchar) | 38.893 | 20.666 | -46.9%
rtrim(varchar,varchar) | 38.870 | 22.311 | -42.6%
trim(varchar,varchar) | 38.327 | 20.831 | -45.6%
lower(varchar) | 36.172 | 10.851 | -70.0%
upper(varchar) | 34.607 | 10.946 | -68.4%

Approved-By: lmatz

* perf(expr): reduce format parsing in to_char (#7051)

Reduce duplicated format parsing during evaluation of `to_char`.

The perf improve by about 20% with a constant format string "YYYY/MM/DD HH24:MI:SS", but downgraded with the current benchmark framework due to #7050 .

Approved-By: lmatz
Approved-By: wangrunji0408

* perf(expr): cover the fast path for `to_char(timestamp)` (#7056)

This PR changes the input of bench `to_char(timestamp,varchar)`.
It makes the second argument a constant format string to cover the fast path of evaluation.

```
to_char(timestamp,varchar)
time:   [245.77 µs 245.80 µs 245.89 µs]
change: [-70.241% -70.126% -70.010%] (p = 0.07 > 0.05)
```

Approved-By: lmatz

* perf(expr): vectorize infallible operations (10x speedup) (#7055)

This PR vectorizes the following infallible operations:
- `and/or/not`
- `bitwise_{and/or/xor/not}`
- `is[_not]_{true/false}`
- `eq/ne/gt/lt/ge/le`
- `is[_not]_distinct_from`
- `round/ceil/floor`

making them 10x faster on average, and up to 50x speed up.

The distribution curve of all operation times before and after this PR:
<img width="491" alt="perf-stat" src="https://user-images.githubusercontent.com/15158738/209499628-1631c779-b6d5-49be-b842-8dcf89a23e1a.png">

<details>
<summary>Click to show full results</summary>

bench | Before time(us) | After time(us) | Change(%) | Speedup
-- | -- | -- | -- | --
and(boolean,boolean) | 11.110 | 0.417 | -96.2% | 25.6
bitwise_and(int16,int16) | 5.284 | 0.142 | -97.3% | 36.2
bitwise_and(int16,int32) | 4.540 | 0.170 | -96.3% | 25.7
bitwise_and(int16,int64) | 4.535 | 0.257 | -94.3% | 16.7
bitwise_and(int32,int16) | 4.509 | 0.183 | -95.9% | 23.6
bitwise_and(int32,int32) | 4.495 | 0.185 | -95.9% | 23.3
bitwise_and(int32,int64) | 4.549 | 0.232 | -94.9% | 18.6
bitwise_and(int64,int16) | 4.489 | 0.241 | -94.6% | 17.6
bitwise_and(int64,int32) | 4.500 | 0.254 | -94.4% | 16.7
bitwise_and(int64,int64) | 4.508 | 0.268 | -94.0% | 15.8
bitwise_not(int16) | 4.363 | 0.116 | -97.3% | 36.5
bitwise_not(int32) | 4.353 | 0.164 | -96.2% | 25.6
bitwise_not(int64) | 4.358 | 0.219 | -95.0% | 18.9
bitwise_or(int16,int16) | 5.278 | 0.142 | -97.3% | 36.3
bitwise_or(int16,int32) | 4.603 | 0.171 | -96.3% | 25.9
bitwise_or(int16,int64) | 4.556 | 0.242 | -94.7% | 17.8
bitwise_or(int32,int16) | 4.531 | 0.169 | -96.3% | 25.8
bitwise_or(int32,int32) | 4.495 | 0.182 | -95.9% | 23.7
bitwise_or(int32,int64) | 4.532 | 0.250 | -94.5% | 17.1
bitwise_or(int64,int16) | 4.493 | 0.237 | -94.7% | 17.9
bitwise_or(int64,int32) | 4.600 | 0.251 | -94.5% | 17.3
bitwise_or(int64,int64) | 4.495 | 0.254 | -94.3% | 16.7
bitwise_xor(int16,int16) | 5.292 | 0.137 | -97.4% | 37.6
bitwise_xor(int16,int32) | 4.561 | 0.172 | -96.2% | 25.5
bitwise_xor(int16,int64) | 4.526 | 0.238 | -94.7% | 18.0
bitwise_xor(int32,int16) | 4.486 | 0.179 | -96.0% | 24.0
bitwise_xor(int32,int32) | 4.508 | 0.200 | -95.6% | 21.6
bitwise_xor(int32,int64) | 4.574 | 0.266 | -94.2% | 16.2
bitwise_xor(int64,int16) | 4.596 | 0.254 | -94.5% | 17.1
bitwise_xor(int64,int32) | 4.502 | 0.292 | -93.5% | 14.4
bitwise_xor(int64,int64) | 4.502 | 0.276 | -93.9% | 15.3
ceil(decimal) | 7.564 | 3.507 | -53.6% | 1.2
ceil(float64) | 4.358 | 0.208 | -95.2% | 20.0
equal(boolean,boolean) | 7.444 | 0.136 | -98.2% | 53.5
equal(date,date) | 7.079 | 0.491 | -93.1% | 13.4
equal(date,timestamp) | 7.572 | 1.033 | -86.4% | 6.3
equal(decimal,decimal) | 12.715 | 8.304 | -34.7% | 0.5
equal(decimal,float32) | 12.879 | 5.474 | -57.5% | 1.4
equal(decimal,float64) | 12.945 | 5.349 | -58.7% | 1.4
equal(decimal,int16) | 11.359 | 4.574 | -59.7% | 1.5
equal(decimal,int32) | 11.384 | 4.472 | -60.7% | 1.5
equal(decimal,int64) | 11.410 | 4.615 | -59.6% | 1.5
equal(float32,decimal) | 12.972 | 5.566 | -57.1% | 1.3
equal(float32,float32) | 7.219 | 0.678 | -90.6% | 9.6
equal(float32,float64) | 7.337 | 0.853 | -88.4% | 7.6
equal(float32,int16) | 6.858 | 0.841 | -87.7% | 7.2
equal(float32,int32) | 6.826 | 0.712 | -89.6% | 8.6
equal(float32,int64) | 7.003 | 0.667 | -90.5% | 9.5
equal(float64,decimal) | 12.835 | 5.507 | -57.1% | 1.3
equal(float64,float32) | 7.271 | 0.869 | -88.0% | 7.4
equal(float64,float64) | 7.220 | 0.922 | -87.2% | 6.8
equal(float64,int16) | 6.772 | 0.878 | -87.0% | 6.7
equal(float64,int32) | 6.752 | 0.757 | -88.8% | 7.9
equal(float64,int64) | 6.724 | 0.723 | -89.2% | 8.3
equal(int16,decimal) | 11.437 | 4.623 | -59.6% | 1.5
equal(int16,float32) | 6.785 | 0.746 | -89.0% | 8.1
equal(int16,float64) | 6.767 | 0.709 | -89.5% | 8.6
equal(int16,int16) | 7.911 | 0.476 | -94.0% | 15.6
equal(int16,int32) | 7.087 | 0.494 | -93.0% | 13.4
equal(int16,int64) | 7.105 | 0.565 | -92.0% | 11.6
equal(int32,decimal) | 11.105 | 4.449 | -59.9% | 1.5
equal(int32,float32) | 6.749 | 0.641 | -90.5% | 9.5
equal(int32,float64) | 6.714 | 0.596 | -91.1% | 10.3
equal(int32,int16) | 7.086 | 0.492 | -93.1% | 13.4
equal(int32,int32) | 7.094 | 0.489 | -93.1% | 13.5
equal(int32,int64) | 7.118 | 0.546 | -92.3% | 12.0
equal(int64,decimal) | 11.331 | 4.622 | -59.2% | 1.5
equal(int64,float32) | 6.732 | 0.593 | -91.2% | 10.3
equal(int64,float64) | 6.808 | 0.567 | -91.7% | 11.0
equal(int64,int16) | 7.103 | 0.572 | -92.0% | 11.4
equal(int64,int32) | 7.089 | 0.550 | -92.2% | 11.9
equal(int64,int64) | 7.086 | 0.556 | -92.1% | 11.7
equal(interval,interval) | 8.289 | 2.381 | -71.3% | 2.5
equal(interval,time) | 7.969 | 1.874 | -76.5% | 3.3
equal(time,interval) | 7.669 | 2.039 | -73.4% | 2.8
equal(time,time) | 7.523 | 0.546 | -92.7% | 12.8
equal(timestamp,date) | 7.622 | 1.040 | -86.4% | 6.3
equal(timestamp,timestamp) | 8.041 | 1.159 | -85.6% | 5.9
equal(timestampz,timestampz) | 7.099 | 0.545 | -92.3% | 12.0
equal(varchar,varchar) | 12.215 | 12.335 | 1.0% | -0.0
floor(decimal) | 7.532 | 2.890 | -61.6% | 1.6
floor(float64) | 4.357 | 0.211 | -95.2% | 19.6
greater_than_or_equal(boolean,boolean) | 7.376 | 0.174 | -97.6% | 41.3
greater_than_or_equal(date,date) | 7.082 | 0.509 | -92.8% | 12.9
greater_than_or_equal(date,timestamp) | 7.970 | 2.087 | -73.8% | 2.8
greater_than_or_equal(decimal,decimal) | 12.622 | 8.342 | -33.9% | 0.5
greater_than_or_equal(decimal,float32) | 13.727 | 6.046 | -56.0% | 1.3
greater_than_or_equal(decimal,float64) | 13.906 | 5.957 | -57.2% | 1.3
greater_than_or_equal(decimal,int16) | 12.191 | 4.627 | -62.0% | 1.6
greater_than_or_equal(decimal,int32) | 12.040 | 4.524 | -62.4% | 1.7
greater_than_or_equal(decimal,int64) | 12.128 | 4.636 | -61.8% | 1.6
greater_than_or_equal(float32,decimal) | 13.212 | 6.129 | -53.6% | 1.2
greater_than_or_equal(float32,float32) | 7.594 | 0.697 | -90.8% | 9.9
greater_than_or_equal(float32,float64) | 7.631 | 0.874 | -88.6% | 7.7
greater_than_or_equal(float32,int16) | 7.751 | 1.160 | -85.0% | 5.7
greater_than_or_equal(float32,int32) | 7.620 | 1.021 | -86.6% | 6.5
greater_than_or_equal(float32,int64) | 7.632 | 0.984 | -87.1% | 6.8
greater_than_or_equal(float64,decimal) | 13.328 | 5.996 | -55.0% | 1.2
greater_than_or_equal(float64,float32) | 7.758 | 0.963 | -87.6% | 7.1
greater_than_or_equal(float64,float64) | 7.607 | 0.930 | -87.8% | 7.2
greater_than_or_equal(float64,int16) | 7.764 | 1.217 | -84.3% | 5.4
greater_than_or_equal(float64,int32) | 7.673 | 1.102 | -85.6% | 6.0
greater_than_or_equal(float64,int64) | 7.621 | 1.035 | -86.4% | 6.4
greater_than_or_equal(int16,decimal) | 11.404 | 4.960 | -56.5% | 1.3
greater_than_or_equal(int16,float32) | 6.930 | 0.832 | -88.0% | 7.3
greater_than_or_equal(int16,float64) | 6.948 | 0.860 | -87.6% | 7.1
greater_than_or_equal(int16,int16) | 7.886 | 0.458 | -94.2% | 16.2
greater_than_or_equal(int16,int32) | 7.125 | 0.491 | -93.1% | 13.5
greater_than_or_equal(int16,int64) | 7.102 | 0.569 | -92.0% | 11.5
greater_than_or_equal(int32,decimal) | 11.355 | 4.701 | -58.6% | 1.4
greater_than_or_equal(int32,float32) | 6.858 | 0.707 | -89.7% | 8.7
greater_than_or_equal(int32,float64) | 6.852 | 0.761 | -88.9% | 8.0
greater_than_or_equal(int32,int16) | 7.129 | 0.509 | -92.9% | 13.0
greater_than_or_equal(int32,int32) | 7.176 | 0.506 | -93.0% | 13.2
greater_than_or_equal(int32,int64) | 7.121 | 0.566 | -92.1% | 11.6
greater_than_or_equal(int64,decimal) | 11.442 | 4.946 | -56.8% | 1.3
greater_than_or_equal(int64,float32) | 6.861 | 0.692 | -89.9% | 8.9
greater_than_or_equal(int64,float64) | 6.861 | 0.728 | -89.4% | 8.4
greater_than_or_equal(int64,int16) | 7.163 | 0.578 | -91.9% | 11.4
greater_than_or_equal(int64,int32) | 7.094 | 0.563 | -92.1% | 11.6
greater_than_or_equal(int64,int64) | 7.131 | 0.560 | -92.1% | 11.7
greater_than_or_equal(interval,interval) | 8.229 | 2.400 | -70.8% | 2.4
greater_than_or_equal(interval,time) | 8.928 | 2.687 | -69.9% | 2.3
greater_than_or_equal(time,interval) | 7.858 | 2.796 | -64.4% | 1.8
greater_than_or_equal(time,time) | 7.745 | 0.669 | -91.4% | 10.6
greater_than_or_equal(timestamp,date) | 7.252 | 1.433 | -80.2% | 4.1
greater_than_or_equal(timestamp,timestamp) | 8.461 | 2.235 | -73.6% | 2.8
greater_than_or_equal(timestampz,timestampz) | 7.088 | 0.547 | -92.3% | 12.0
greater_than_or_equal(varchar,varchar) | 12.385 | 12.439 | 0.4% | -0.0
greater_than(boolean,boolean) | 6.948 | 0.175 | -97.5% | 38.6
greater_than(date,date) | 6.702 | 0.488 | -92.7% | 12.7
greater_than(date,timestamp) | 7.562 | 2.103 | -72.2% | 2.6
greater_than(decimal,decimal) | 11.878 | 8.252 | -30.5% | 0.4
greater_than(decimal,float32) | 12.946 | 6.008 | -53.6% | 1.2
greater_than(decimal,float64) | 12.897 | 6.000 | -53.5% | 1.1
greater_than(decimal,int16) | 12.214 | 4.701 | -61.5% | 1.6
greater_than(decimal,int32) | 12.067 | 4.527 | -62.5% | 1.7
greater_than(decimal,int64) | 12.238 | 4.724 | -61.4% | 1.6
greater_than(float32,decimal) | 12.488 | 6.309 | -49.5% | 1.0
greater_than(float32,float32) | 7.017 | 0.818 | -88.3% | 7.6
greater_than(float32,float64) | 7.145 | 1.071 | -85.0% | 5.7
greater_than(float32,int16) | 7.701 | 1.176 | -84.7% | 5.5
greater_than(float32,int32) | 7.635 | 1.028 | -86.5% | 6.4
greater_than(float32,int64) | 7.552 | 0.989 | -86.9% | 6.6
greater_than(float64,decimal) | 12.434 | 6.205 | -50.1% | 1.0
greater_than(float64,float32) | 7.115 | 1.114 | -84.3% | 5.4
greater_than(float64,float64) | 7.162 | 1.161 | -83.8% | 5.2
greater_than(float64,int16) | 7.722 | 1.217 | -84.2% | 5.3
greater_than(float64,int32) | 7.700 | 1.095 | -85.8% | 6.0
greater_than(float64,int64) | 7.598 | 1.052 | -86.1% | 6.2
greater_than(int16,decimal) | 11.414 | 5.041 | -55.8% | 1.3
greater_than(int16,float32) | 6.936 | 0.848 | -87.8% | 7.2
greater_than(int16,float64) | 6.858 | 0.879 | -87.2% | 6.8
greater_than(int16,int16) | 7.271 | 0.469 | -93.6% | 14.5
greater_than(int16,int32) | 6.766 | 0.496 | -92.7% | 12.6
greater_than(int16,int64) | 6.682 | 0.570 | -91.5% | 10.7
greater_than(int32,decimal) | 11.305 | 4.732 | -58.1% | 1.4
greater_than(int32,float32) | 6.886 | 0.714 | -89.6% | 8.6
greater_than(int32,float64) | 6.820 | 0.745 | -89.1% | 8.2
greater_than(int32,int16) | 6.668 | 0.493 | -92.6% | 12.5
greater_than(int32,int32) | 6.983 | 0.491 | -93.0% | 13.2
greater_than(int32,int64) | 6.699 | 0.547 | -91.8% | 11.2
greater_than(int64,decimal) | 11.416 | 5.031 | -55.9% | 1.3
greater_than(int64,float32) | 6.898 | 0.679 | -90.2% | 9.2
greater_than(int64,float64) | 6.808 | 0.730 | -89.3% | 8.3
greater_than(int64,int16) | 6.772 | 0.573 | -91.5% | 10.8
greater_than(int64,int32) | 6.692 | 0.551 | -91.8% | 11.1
greater_than(int64,int64) | 6.719 | 0.547 | -91.9% | 11.3
greater_than(interval,interval) | 7.674 | 2.447 | -68.1% | 2.1
greater_than(interval,time) | 8.831 | 2.754 | -68.8% | 2.2
greater_than(time,interval) | 7.853 | 3.016 | -61.6% | 1.6
greater_than(time,time) | 7.229 | 0.778 | -89.2% | 8.3
greater_than(timestamp,date) | 7.123 | 1.456 | -79.6% | 3.9
greater_than(timestamp,timestamp) | 7.808 | 2.193 | -71.9% | 2.6
greater_than(timestampz,timestampz) | 6.691 | 0.561 | -91.6% | 10.9
greater_than(varchar,varchar) | 11.620 | 11.953 | 2.9% | -0.0
is_distinct_from(boolean,boolean) | 7.967 | 0.293 | -96.3% | 26.2
is_distinct_from(date,date) | 6.830 | 0.546 | -92.0% | 11.5
is_distinct_from(date,timestamp) | 7.031 | 1.068 | -84.8% | 5.6
is_distinct_from(decimal,decimal) | 12.050 | 8.189 | -32.0% | 0.5
is_distinct_from(decimal,float32) | 12.670 | 6.270 | -50.5% | 1.0
is_distinct_from(decimal,float64) | 12.565 | 6.312 | -49.8% | 1.0
is_distinct_from(decimal,int16) | 12.300 | 4.674 | -62.0% | 1.6
is_distinct_from(decimal,int32) | 12.231 | 4.553 | -62.8% | 1.7
is_distinct_from(decimal,int64) | 12.417 | 4.640 | -62.6% | 1.7
is_distinct_from(float32,decimal) | 12.468 | 6.309 | -49.4% | 1.0
is_distinct_from(float32,float32) | 6.885 | 0.707 | -89.7% | 8.7
is_distinct_from(float32,float64) | 6.894 | 0.899 | -87.0% | 6.7
is_distinct_from(float32,int16) | 7.597 | 0.927 | -87.8% | 7.2
is_distinct_from(float32,int32) | 7.529 | 0.804 | -89.3% | 8.4
is_distinct_from(float32,int64) | 7.413 | 0.768 | -89.6% | 8.7
is_distinct_from(float64,decimal) | 12.303 | 6.279 | -49.0% | 1.0
is_distinct_from(float64,float32) | 6.850 | 0.899 | -86.9% | 6.6
is_distinct_from(float64,float64) | 6.917 | 0.944 | -86.4% | 6.3
is_distinct_from(float64,int16) | 7.578 | 0.992 | -86.9% | 6.6
is_distinct_from(float64,int32) | 7.427 | 0.866 | -88.3% | 7.6
is_distinct_from(float64,int64) | 7.443 | 0.835 | -88.8% | 7.9
is_distinct_from(int16,decimal) | 12.005 | 4.642 | -61.3% | 1.6
is_distinct_from(int16,float32) | 7.492 | 0.839 | -88.8% | 7.9
is_distinct_from(int16,float64) | 7.403 | 0.801 | -89.2% | 8.2
is_distinct_from(int16,int16) | 7.458 | 0.500 | -93.3% | 13.9
is_distinct_from(int16,int32) | 6.834 | 0.549 | -92.0% | 11.5
is_distinct_from(int16,int64) | 6.749 | 0.653 | -90.3% | 9.3
is_distinct_from(int32,decimal) | 11.828 | 4.436 | -62.5% | 1.7
is_distinct_from(int32,float32) | 7.353 | 0.727 | -90.1% | 9.1
is_distinct_from(int32,float64) | 7.265 | 0.683 | -90.6% | 9.6
is_distinct_from(int32,int16) | 6.729 | 0.556 | -91.7% | 11.1
is_distinct_from(int32,int32) | 6.771 | 0.550 | -91.9% | 11.3
is_distinct_from(int32,int64) | 6.755 | 0.653 | -90.3% | 9.3
is_distinct_from(int64,decimal) | 11.955 | 4.716 | -60.6% | 1.5
is_distinct_from(int64,float32) | 7.365 | 0.671 | -90.9% | 10.0
is_distinct_from(int64,float64) | 7.292 | 0.654 | -91.0% | 10.2
is_distinct_from(int64,int16) | 6.727 | 0.672 | -90.0% | 9.0
is_distinct_from(int64,int32) | 6.734 | 0.624 | -90.7% | 9.8
is_distinct_from(int64,int64) | 6.767 | 0.614 | -90.9% | 10.0
is_distinct_from(interval,interval) | 7.889 | 2.420 | -69.3% | 2.3
is_distinct_from(interval,time) | 8.556 | 1.927 | -77.5% | 3.4
is_distinct_from(time,interval) | 8.243 | 2.084 | -74.7% | 3.0
is_distinct_from(time,time) | 6.954 | 0.600 | -91.4% | 10.6
is_distinct_from(timestamp,date) | 7.219 | 1.087 | -84.9% | 5.6
is_distinct_from(timestamp,timestamp) | 7.202 | 1.205 | -83.3% | 5.0
is_distinct_from(timestampz,timestampz) | 6.762 | 0.617 | -90.9% | 10.0
is_distinct_from(varchar,varchar) | 11.326 | 10.983 | -3.0% | 0.0
is_false(boolean) | 6.597 | 0.161 | -97.6% | 39.9
is_not_distinct_from(boolean,boolean) | 9.049 | 0.306 | -96.6% | 28.6
is_not_distinct_from(date,date) | 7.295 | 0.545 | -92.5% | 12.4
is_not_distinct_from(date,timestamp) | 7.507 | 1.080 | -85.6% | 6.0
is_not_distinct_from(decimal,decimal) | 12.880 | 8.241 | -36.0% | 0.6
is_not_distinct_from(decimal,float32) | 13.258 | 6.351 | -52.1% | 1.1
is_not_distinct_from(decimal,float64) | 13.361 | 6.318 | -52.7% | 1.1
is_not_distinct_from(decimal,int16) | 11.600 | 4.708 | -59.4% | 1.5
is_not_distinct_from(decimal,int32) | 11.458 | 4.586 | -60.0% | 1.5
is_not_distinct_from(decimal,int64) | 11.601 | 4.656 | -59.9% | 1.5
is_not_distinct_from(float32,decimal) | 13.335 | 6.294 | -52.8% | 1.1
is_not_distinct_from(float32,float32) | 7.374 | 0.717 | -90.3% | 9.3
is_not_distinct_from(float32,float64) | 7.293 | 0.905 | -87.6% | 7.1
is_not_distinct_from(float32,int16) | 7.000 | 0.945 | -86.5% | 6.4
is_not_distinct_from(float32,int32) | 6.987 | 0.827 | -88.2% | 7.4
is_not_distinct_from(float32,int64) | 7.081 | 0.785 | -88.9% | 8.0
is_not_distinct_from(float64,decimal) | 13.475 | 6.267 | -53.5% | 1.2
is_not_distinct_from(float64,float32) | 7.395 | 0.946 | -87.2% | 6.8
is_not_distinct_from(float64,float64) | 7.324 | 0.962 | -86.9% | 6.6
is_not_distinct_from(float64,int16) | 7.040 | 1.003 | -85.8% | 6.0
is_not_distinct_from(float64,int32) | 6.928 | 0.881 | -87.3% | 6.9
is_not_distinct_from(float64,int64) | 7.086 | 0.868 | -87.8% | 7.2
is_not_distinct_from(int16,decimal) | 11.357 | 4.619 | -59.3% | 1.5
is_not_distinct_from(int16,float32) | 6.800 | 0.846 | -87.6% | 7.0
is_not_distinct_from(int16,float64) | 6.768 | 0.812 | -88.0% | 7.3
is_not_distinct_from(int16,int16) | 8.060 | 0.515 | -93.6% | 14.6
is_not_distinct_from(int16,int32) | 7.237 | 0.558 | -92.3% | 12.0
is_not_distinct_from(int16,int64) | 7.135 | 0.657 | -90.8% | 9.9
is_not_distinct_from(int32,decimal) | 11.236 | 4.422 | -60.6% | 1.5
is_not_distinct_from(int32,float32) | 6.788 | 0.725 | -89.3% | 8.4
is_not_distinct_from(int32,float64) | 6.731 | 0.683 | -89.9% | 8.9
is_not_distinct_from(int32,int16) | 7.208 | 0.554 | -92.3% | 12.0
is_not_distinct_from(int32,int32) | 7.249 | 0.550 | -92.4% | 12.2
is_not_distinct_from(int32,int64) | 7.185 | 0.653 | -90.9% | 10.0
is_not_distinct_from(int64,decimal) | 11.405 | 4.683 | -58.9% | 1.4
is_not_distinct_from(int64,float32) | 6.803 | 0.690 | -89.9% | 8.9
is_not_distinct_from(int64,float64) | 6.781 | 0.660 | -90.3% | 9.3
is_not_distinct_from(int64,int16) | 7.143 | 0.655 | -90.8% | 9.9
is_not_distinct_from(int64,int32) | 7.235 | 0.635 | -91.2% | 10.4
is_not_distinct_from(int64,int64) | 7.289 | 0.625 | -91.4% | 10.7
is_not_distinct_from(interval,interval) | 8.328 | 2.402 | -71.2% | 2.5
is_not_distinct_from(interval,time) | 7.876 | 1.927 | -75.5% | 3.1
is_not_distinct_from(time,interval) | 7.637 | 2.105 | -72.4% | 2.6
is_not_distinct_from(time,time) | 7.426 | 0.632 | -91.5% | 10.7
is_not_distinct_from(timestamp,date) | 7.630 | 1.090 | -85.7% | 6.0
is_not_distinct_from(timestamp,timestamp) | 8.216 | 1.266 | -84.6% | 5.5
is_not_distinct_from(timestampz,timestampz) | 7.269 | 0.616 | -91.5% | 10.8
is_not_distinct_from(varchar,varchar) | 12.216 | 11.491 | -5.9% | 0.1
is_not_false(boolean) | 6.718 | 0.164 | -97.6% | 40.0
is_not_true(boolean) | 6.574 | 0.127 | -98.1% | 50.7
is_true(boolean) | 6.666 | 0.123 | -98.2% | 53.2
less_than_or_equal(boolean,boolean) | 7.450 | 0.177 | -97.6% | 41.0
less_than_or_equal(date,date) | 7.148 | 0.497 | -93.0% | 13.4
less_than_or_equal(date,timestamp) | 7.991 | 2.113 | -73.6% | 2.8
less_than_or_equal(decimal,decimal) | 12.874 | 8.402 | -34.7% | 0.5
less_than_or_equal(decimal,float32) | 13.990 | 6.012 | -57.0% | 1.3
less_than_or_equal(decimal,float64) | 13.920 | 6.017 | -56.8% | 1.3
less_than_or_equal(decimal,int16) | 11.602 | 4.707 | -59.4% | 1.5
less_than_or_equal(decimal,int32) | 11.361 | 4.481 | -60.6% | 1.5
less_than_or_equal(decimal,int64) | 11.618 | 4.653 | -59.9% | 1.5
less_than_or_equal(float32,decimal) | 13.420 | 6.116 | -54.4% | 1.2
less_than_or_equal(float32,float32) | 7.716 | 0.806 | -89.6% | 8.6
less_than_or_equal(float32,float64) | 7.734 | 1.062 | -86.3% | 6.3
less_than_or_equal(float32,int16) | 7.127 | 1.099 | -84.6% | 5.5
less_than_or_equal(float32,int32) | 7.178 | 0.974 | -86.4% | 6.4
less_than_or_equal(float32,int64) | 7.202 | 0.937 | -87.0% | 6.7
less_than_or_equal(float64,decimal) | 13.405 | 6.104 | -54.5% | 1.2
less_than_or_equal(float64,float32) | 7.756 | 1.101 | -85.8% | 6.0
less_than_or_equal(float64,float64) | 7.655 | 1.146 | -85.0% | 5.7
less_than_or_equal(float64,int16) | 7.175 | 1.170 | -83.7% | 5.1
less_than_or_equal(float64,int32) | 7.156 | 1.059 | -85.2% | 5.8
less_than_or_equal(float64,int64) | 7.170 | 1.017 | -85.8% | 6.1
less_than_or_equal(int16,decimal) | 12.477 | 4.948 | -60.3% | 1.5
less_than_or_equal(int16,float32) | 7.668 | 1.053 | -86.3% | 6.3
less_than_or_equal(int16,float64) | 7.630 | 0.956 | -87.5% | 7.0
less_than_or_equal(int16,int16) | 7.952 | 0.464 | -94.2% | 16.1
less_than_or_equal(int16,int32) | 7.172 | 0.486 | -93.2% | 13.8
less_than_or_equal(int16,int64) | 7.159 | 0.564 | -92.1% | 11.7
less_than_or_equal(int32,decimal) | 12.198 | 4.726 | -61.3% | 1.6
less_than_or_equal(int32,float32) | 7.588 | 0.854 | -88.7% | 7.9
less_than_or_equal(int32,float64) | 7.596 | 0.812 | -89.3% | 8.4
less_than_or_equal(int32,int16) | 7.163 | 0.495 | -93.1% | 13.5
less_than_or_equal(int32,int32) | 7.173 | 0.489 | -93.2% | 13.7
less_than_or_equal(int32,int64) | 7.208 | 0.555 | -92.3% | 12.0
less_than_or_equal(int64,decimal) | 12.139 | 4.885 | -59.8% | 1.5
less_than_or_equal(int64,float32) | 7.583 | 0.809 | -89.3% | 8.4
less_than_or_equal(int64,float64) | 7.609 | 0.778 | -89.8% | 8.8
less_than_or_equal(int64,int16) | 7.210 | 0.569 | -92.1% | 11.7
less_than_or_equal(int64,int32) | 7.174 | 0.549 | -92.3% | 12.1
less_than_or_equal(int64,int64) | 7.111 | 0.546 | -92.3% | 12.0
less_than_or_equal(interval,interval) | 8.402 | 2.372 | -71.8% | 2.5
less_than_or_equal(interval,time) | 8.266 | 2.701 | -67.3% | 2.1
less_than_or_equal(time,interval) | 8.573 | 2.868 | -66.5% | 2.0
less_than_or_equal(time,time) | 7.855 | 0.796 | -89.9% | 8.9
less_than_or_equal(timestamp,date) | 7.834 | 1.416 | -81.9% | 4.5
less_than_or_equal(timestamp,timestamp) | 8.426 | 2.240 | -73.4% | 2.8
less_than_or_equal(timestampz,timestampz) | 7.180 | 0.553 | -92.3% | 12.0
less_than_or_equal(varchar,varchar) | 12.270 | 12.479 | 1.7% | -0.0
less_than(boolean,boolean) | 7.098 | 0.180 | -97.5% | 38.4
less_than(date,date) | 6.805 | 0.492 | -92.8% | 12.8
less_than(date,timestamp) | 7.420 | 2.114 | -71.5% | 2.5
less_than(decimal,decimal) | 12.030 | 8.266 | -31.3% | 0.5
less_than(decimal,float32) | 12.915 | 5.947 | -54.0% | 1.2
less_than(decimal,float64) | 13.107 | 5.917 | -54.9% | 1.2
less_than(decimal,int16) | 11.521 | 4.635 | -59.8% | 1.5
less_than(decimal,int32) | 11.229 | 4.622 | -58.8% | 1.4
less_than(decimal,int64) | 11.420 | 4.744 | -58.5% | 1.4
less_than(float32,decimal) | 12.660 | 6.060 | -52.1% | 1.1
less_than(float32,float32) | 7.281 | 0.716 | -90.2% | 9.2
less_than(float32,float64) | 7.183 | 0.892 | -87.6% | 7.1
less_than(float32,int16) | 7.036 | 1.183 | -83.2% | 4.9
less_than(float32,int32) | 7.286 | 1.036 | -85.8% | 6.0
less_than(float32,int64) | 7.204 | 1.002 | -86.1% | 6.2
less_than(float64,decimal) | 12.515 | 5.981 | -52.2% | 1.1
less_than(float64,float32) | 7.029 | 1.009 | -85.6% | 6.0
less_than(float64,float64) | 7.042 | 0.995 | -85.9% | 6.1
less_than(float64,int16) | 6.965 | 1.280 | -81.6% | 4.4
less_than(float64,int32) | 7.003 | 1.137 | -83.8% | 5.2
less_than(float64,int64) | 7.010 | 1.091 | -84.4% | 5.4
less_than(int16,decimal) | 12.217 | 4.980 | -59.2% | 1.5
less_than(int16,float32) | 7.573 | 0.850 | -88.8% | 7.9
less_than(int16,float64) | 7.505 | 0.914 | -87.8% | 7.2
less_than(int16,int16) | 7.270 | 0.468 | -93.6% | 14.5
less_than(int16,int32) | 6.732 | 0.494 | -92.7% | 12.6
less_than(int16,int64) | 6.798 | 0.561 | -91.7% | 11.1
less_than(int32,decimal) | 11.969 | 4.819 | -59.7% | 1.5
less_than(int32,float32) | 7.480 | 0.730 | -90.2% | 9.2
less_than(int32,float64) | 7.452 | 0.791 | -89.4% | 8.4
less_than(int32,int16) | 6.776 | 0.488 | -92.8% | 12.9
less_than(int32,int32) | 6.739 | 0.492 | -92.7% | 12.7
less_than(int32,int64) | 6.785 | 0.567 | -91.6% | 11.0
less_than(int64,decimal) | 11.980 | 5.072 | -57.7% | 1.4
less_than(int64,float32) | 7.400 | 0.705 | -90.5% | 9.5
less_than(int64,float64) | 7.475 | 0.759 | -89.8% | 8.8
less_than(int64,int16) | 6.747 | 0.580 | -91.4% | 10.6
less_than(int64,int32) | 6.773 | 0.558 | -91.8% | 11.1
less_than(int64,int64) | 6.725 | 0.544 | -91.9% | 11.4
less_than(interval,interval) | 7.719 | 2.367 | -69.3% | 2.3
less_than(interval,time) | 8.346 | 2.622 | -68.6% | 2.2
less_than(time,interval) | 8.451 | 2.687 | -68.2% | 2.1
less_than(time,time) | 7.207 | 0.636 | -91.2% | 10.3
less_than(timestamp,date) | 6.849 | 1.414 | -79.4% | 3.8
less_than(timestamp,timestamp) | 7.881 | 2.070 | -73.7% | 2.8
less_than(timestampz,timestampz) | 6.719 | 0.550 | -91.8% | 11.2
less_than(varchar,varchar) | 11.779 | 11.933 | 1.3% | -0.0
not_equal(boolean,boolean) | 7.161 | 0.132 | -98.2% | 53.2
not_equal(date,date) | 6.731 | 0.511 | -92.4% | 12.2
not_equal(date,timestamp) | 7.162 | 1.032 | -85.6% | 5.9
not_equal(decimal,decimal) | 12.252 | 8.220 | -32.9% | 0.5
not_equal(decimal,float32) | 12.396 | 5.577 | -55.0% | 1.2
not_equal(decimal,float64) | 12.360 | 5.417 | -56.2% | 1.3
not_equal(decimal,int16) | 12.200 | 4.613 | -62.2% | 1.6
not_equal(decimal,int32) | 12.079 | 4.486 | -62.9% | 1.7
not_equal(decimal,int64) | 12.280 | 4.547 | -63.0% | 1.7
not_equal(float32,decimal) | 12.211 | 5.471 | -55.2% | 1.2
not_equal(float32,float32) | 6.836 | 0.666 | -90.3% | 9.3
not_equal(float32,float64) | 6.852 | 0.844 | -87.7% | 7.1
not_equal(float32,int16) | 7.538 | 0.890 | -88.2% | 7.5
not_equal(float32,int32) | 7.495 | 0.766 | -89.8% | 8.8
not_equal(float32,int64) | 7.502 | 0.730 | -90.3% | 9.3
not_equal(float64,decimal) | 12.281 | 5.416 | -55.9% | 1.3
not_equal(float64,float32) | 6.858 | 0.858 | -87.5% | 7.0
not_equal(float64,float64) | 6.806 | 0.903 | -86.7% | 6.5
not_equal(float64,int16) | 7.421 | 0.938 | -87.4% | 6.9
not_equal(float64,int32) | 7.378 | 0.821 | -88.9% | 8.0
not_equal(float64,int64) | 7.384 | 0.785 | -89.4% | 8.4
not_equal(int16,decimal) | 12.429 | 4.587 | -63.1% | 1.7
not_equal(int16,float32) | 7.415 | 0.792 | -89.3% | 8.4
not_equal(int16,float64) | 7.339 | 0.753 | -89.7% | 8.7
not_equal(int16,int16) | 7.355 | 0.473 | -93.6% | 14.6
not_equal(int16,int32) | 6.783 | 0.508 | -92.5% | 12.4
not_equal(int16,int64) | 6.795 | 0.605 | -91.1% | 10.2
not_equal(int32,decimal) | 11.881 | 4.394 | -63.0% | 1.7
not_equal(int32,float32) | 7.334 | 0.668 | -90.9% | 10.0
not_equal(int32,float64) | 7.339 | 0.620 | -91.5% | 10.8
not_equal(int32,int16) | 6.722 | 0.510 | -92.4% | 12.2
not_equal(int32,int32) | 6.797 | 0.506 | -92.5% | 12.4
not_equal(int32,int64) | 6.793 | 0.587 | -91.4% | 10.6
not_equal(int64,decimal) | 12.196 | 4.537 | -62.8% | 1.7
not_equal(int64,float32) | 7.398 | 0.621 | -91.6% | 10.9
not_equal(int64,float64) | 7.314 | 0.606 | -91.7% | 11.1
not_equal(int64,int16) | 6.778 | 0.598 | -91.2% | 10.3
not_equal(int64,int32) | 6.762 | 0.594 | -91.2% | 10.4
not_equal(int64,int64) | 6.821 | 0.569 | -91.7% | 11.0
not_equal(interval,interval) | 7.768 | 2.430 | -68.7% | 2.2
not_equal(interval,time) | 8.638 | 1.908 | -77.9% | 3.5
not_equal(time,interval) | 8.337 | 2.062 | -75.3% | 3.0
not_equal(time,time) | 7.151 | 0.574 | -92.0% | 11.5
not_equal(timestamp,date) | 7.156 | 1.053 | -85.3% | 5.8
not_equal(timestamp,timestamp) | 7.566 | 1.186 | -84.3% | 5.4
not_equal(timestampz,timestampz) | 6.760 | 0.574 | -91.5% | 10.8
not_equal(varchar,varchar) | 12.022 | 12.334 | 2.6% | -0.0
or(boolean,boolean) | 11.085 | 0.265 | -97.6% | 40.8
round_digit(decimal,int32) | 9.480 | 2.505 | -73.6% | 2.8
round(decimal) | 7.697 | 3.512 | -54.4% | 1.2
round(float64) | 4.430 | 0.201 | -95.5% | 21.1

</details>

To apply this optimization, we introduced several expression templates in the new `template_fast` module. They are specific to `PrimitiveArray` or `BoolArray`. Operations will be applied to array elements one by one, regardless of the null bitmap and without any branching. Thus the compiler can automatically vectorize them using SIMD instructions.

But given the no-branch requirement, we can not apply this technique on fallible operations such as arithmetics and most of the type casts. Although some of them can be addressed with pre- or post-checks (e.g. pre-check 0 for divide-by-zero error, post-check overflow for addition), they are highly operation-specific and hard to generalize. We'll explore the way to vectorize these fallible operations in the future.

Approved-By: soundOfDestiny
Approved-By: BugenZhao

* refactor(batch): refine visibility and dml executors (#7040)

- Be aware of visibility.
- Split batch chunks with chunk builder before inserting them into the streaming jobs.

Also refactor the implementation of `append_chunk` (`trunc_data_chunk`).

Approved-By: liurenjie1024

* perf(expr): optimize casting to varchar (#7066)

This PR optimizes the performance of casting values to varchar.

It introduced write API for `ToText`, so that strings can be directly written to array buffers without generating String.
The display function of interval and timestampz was also optimized.

<img width="581" alt="perf-cast" src="https://user-images.githubusercontent.com/15158738/209610088-859f0f77-5272-4cb8-bbe3-f743bc0cbe97.png">

<details>
<summary>Click to show full results</summary>

bench | Before time(us) | After time(us) | Change(%) | Speedup
-- | -- | -- | -- | --
cast(timestampz->varchar) | 508.640 | 121.600 | -76.1% | 3.2
cast(timestamp->varchar) | 166.200 | 58.245 | -65.0% | 1.9
cast(float64->varchar) | 78.386 | 57.597 | -26.5% | 0.4
cast(float32->varchar) | 57.903 | 37.384 | -35.4% | 0.5
cast(date->varchar) | 86.896 | 32.669 | -62.4% | 1.7
cast(time->varchar) | 47.508 | 28.428 | -40.2% | 0.7
cast(decimal->varchar) | 67.682 | 28.317 | -58.2% | 1.4
cast(int16->varchar) | 29.532 | 12.337 | -58.2% | 1.4
cast(int64->varchar) | 52.043 | 12.319 | -76.3% | 3.2
cast(int32->varchar) | 28.863 | 12.258 | -57.5% | 1.4
cast(boolean->varchar) | 26.826 | 6.396 | -76.2% | 3.2
bool_out(boolean) | 25.480 | 5.126 | -79.9% | 4.0

</details>

The `writer` argument of string functions was also changed from `StringWriter<'_>` to `&mut dyn Write`, making them decouple from array. I tried to use `&mut impl Write` but was blocked by annoying lifetime issues. Anyways, the performance of these operations is still slightly improved:

<img width="600" alt="perf-string-ops" src="https://user-images.githubusercontent.com/15158738/209610928-8036e4d1-e994-4178-8ce4-ff1340877e47.png">

<details>
<summary>Click to show full results</summary>

bench | Before time(us) | After time(us) | Change(%) | Speedup
-- | -- | -- | -- | --
rtrim(varchar,varchar) | 21.780 | 15.768 | -27.6% | 0.4
substr(varchar,int32,int32) | 11.126 | 8.090 | -27.3% | 0.4
rtrim(varchar) | 10.537 | 7.712 | -26.8% | 0.4
substr(varchar,int32) | 9.198 | 7.111 | -22.7% | 0.3
ltrim(varchar) | 9.661 | 8.010 | -17.1% | 0.2
trim(varchar) | 11.308 | 9.618 | -14.9% | 0.2
overlay(varchar,varchar,int32,int32) | 17.107 | 14.697 | -14.1% | 0.2
overlay(varchar,varchar,int32) | 13.408 | 12.007 | -10.4% | 0.1
ltrim(varchar,varchar) | 21.198 | 19.021 | -10.3% | 0.1
trim(varchar,varchar) | 20.876 | 19.205 | -8.0% | 0.1
split_part(varchar,varchar,int32) | 30.708 | 29.293 | -4.6% | 0.0
md5(varchar) | 346.010 | 331.670 | -4.1% | 0.0

</details>

Approved-By: BowenXiao1999
Approved-By: BugenZhao

* feat(optimizer): support share operator (#6956)

- `LogicalShare` operator is used to represent reusing of existing operators. It could have multiple parents which makes it different from other operators.
- Because most of our optimizations assume that our plan is a tree structure, in order to represent the DAG structured plan, we need to modify our optimizations and prevent them break our DAG plan back to a tree plan accidentally.
- Optimization including predicate pushdown, column pruning, heuristic optimizer, stream rewrite and to stream, all of them can break DAG plan back to a tree plan if we don't take care.
- Let me take predicate pushdown as an example to illustrate how to implement predicate pushdown for `LogicalShare`. We use a context for `LogicalShare` to keep track of how many times predicate has been pushdown for `LogicalShare`. Once pushdown times equal the parent number of the `LogicalShare`, we can merge all the previous predicates into one and then push it down for the input of `LogicalShare`.
- Heuristic optimizer's previous rules won't match any `LogicalShare`, so `LogicalShare` wouldn't affect its correctness.
- At the end of optimizer for batch query, we try to convert DAG back to Tree for now by removing `LogicalShare` (the rule named `DagToTreeRule`), because our batch executor doesn't support execute DAG plan directly currently.
- This PR also supports reusing source by `ShareSourceRewriter`. `ShareSourceRewriter` will replace all the sources occurred more than once in the streaming query with share operator.

Approved-By: st1page

Co-Authored-By: Dylan Chen <zilin@singularity-data.com>
Co-Authored-By: Dylan <chenzl25@mail2.sysu.edu.cn>

* perf(expr): vectorize infallible casts (#7079)

Similar to #7055, this PR vectorizes infallible casts.

<img width="516" alt="perf-infallible-cast" src="https://user-images.githubusercontent.com/15158738/209652449-61dc1513-7255-436c-aa36-e5a0d1dec384.png">

<details>
<summary>Click to show full results</summary>

bench | Before time(us) | After time(us) | Change(%) | Speedup
-- | -- | -- | -- | --
cast(int16->float32) | 4.434 | 0.146 | -96.7% | 29.3
cast(int16->int32) | 4.408 | 0.154 | -96.5% | 27.7
cast(float32->float64) | 4.432 | 0.187 | -95.8% | 22.7
cast(int32->int64) | 4.415 | 0.192 | -95.7% | 22.0
cast(int32->float64) | 4.422 | 0.194 | -95.6% | 21.8
cast(int16->int64) | 4.412 | 0.212 | -95.2% | 19.8
cast(timestamp->date) | 4.409 | 0.226 | -94.9% | 18.5
cast(timestamp->time) | 5.443 | 0.300 | -94.5% | 17.1
cast(date->timestamp) | 5.504 | 0.304 | -94.5% | 17.1
cast(int16->float64) | 4.430 | 0.298 | -93.3% | 13.9
cast(int32->decimal) | 5.582 | 0.592 | -89.4% | 8.4
cast(time->interval) | 5.511 | 0.727 | -86.8% | 6.6
cast(int64->decimal) | 5.739 | 0.766 | -86.7% | 6.5
cast(int16->decimal) | 5.760 | 0.845 | -85.3% | 5.8
cast(interval->time) | 5.903 | 1.289 | -78.2% | 3.6
cast(float32->decimal) | 21.970 | 18.170 | -17.3% | 0.2
cast(float64->decimal) | 40.131 | 36.049 | -10.2% | 0.1

</details>

Approved-By: BowenXiao1999

* fix: clean states in local barrier manager after actor dropped (#7082)

Trying to fix continuous recovery found in longevity and chaos test. I found that two problems might be the root cause of continuous recovery:
1. Fixed, unnecessary recovery triggered as described in #6989 . As I tested locally, when workload was very high, there were many ongoing barrier collect responses(up to 80+) when recovery. After recovery finished, each response would trigger a recovery process, because the whole cluster has already reset to previous committed epoch.
2. Before this PR, when force stopping actors in CN, the local manger will clean all states and then abort all actors. The problem is between cleaning states and aborting actors, the actors could also report epoch collected or error status to local barrier manager especially when the number of actors is high. This will cause a chain reaction in recovery.

I tested it locally and the recovery became normal. Besides, it could also be the cause of #6639 , #6715 .

Approved-By: fuyufjh
Approved-By: BugenZhao

* fix: return error if source executor failed to receive the first barrier (#7086)

**This section will be used as the commit message. Please do not leave this empty!**

Please explain **IN DETAIL** what the changes are in this PR and why they are needed:

this is a temp fix for #6931, and I believe that this can happen in rare cases.

From the log described in the issue, there is a failover that occurred before the panic. I think some part of meta failed to recover and the barrier channel closed for some reason.

It shows a possibility that meta node could fail to recover and compute node should be robots enough rather than panicking.

Approved-By: waruto210
Approved-By: xx01cyx

* fix(optimizer): fix hop window column pruning (#7085)

- Fix hop window column pruning.

Approved-By: st1page

* fix(streaming): fix memory leaks in streaming hash join (#7089)

Fix #6942. See the detailed discussions there.

The bug is inside the BTreeMap. For now, I will just remove that part of code because we don't rely on Allocator API to get memory usage now.

![image](https://user-images.githubusercontent.com/10192522/209688807-84ae0f84-9e17-44ae-8498-a378ee6b951e.png)

Approved-By: yuhao-su
Approved-By: BugenZhao

* fix: remove redundant `append_only: true` in explain (#7119)

fix: remove redundant `append_only: true` in explain result, since it has been expressed by the name "StreamAppendOnlyHashJoin".

Approved-By: chenzl25

* feat: Failover follower to leader (#6937)

https://github.com/risingwavelabs/risingwave/issues/6936

Approved-By: yezizp2012

* feat(meta): validate CDC connector properties during create source (#6938)

Validate connector properties on Meta (in `DebeziumSplitEnumerator`) when creating CDC source.
As the conclusion of https://github.com/risingwavelabs/rfcs/pull/29, we will deploy a sidecar connector node colocated with Meta on the cloud  to validate the connector properties.

Examples:
1. Wrong password
```
create materialized source products ( id INT,
name STRING,
description STRING,
PRIMARY KEY (id)
) with (
connector = 'mysql-cdc',
hostname = '127.0.0.1',
port = '3306',
username = 'root',
password = '12346',
database.name = 'mydb',
table.name = 'prodts',
server.id = '5085',
debezium.a.b = 'test'
) row format debezium_json;
ERROR:  QueryError: internal error: gRPC error (Client specified an invalid argument): Access denied for user 'root'@'localhost' (using password: YES)
```
2. Wrong table name
```
dev=> create materialized source products ( id INT,
name STRING,
description STRING,
PRIMARY KEY (id)
) with (
connector = 'mysql-cdc',
hostname = '127.0.0.1',
port = '3306',
username = 'root',
password = '123456',
database.name = 'mydb',
table.name = 'prodts',
server.id = '5085',
debezium.a.b = 'test'
) row format debezium_json;
ERROR:  QueryError: internal error: gRPC error (Client specified an invalid argument): table doesn't exist
```

Approved-By: tabVersion

* refactor: decouple memory management from stream, make it accessible for both batch and streaming (#7004)

Main idea:
1. Rename `LruManager` to `GlobalMemoryManager`, move it from `stream` crate to `compute` crate. Can not move to `common` as it depends on `risingwave_stream` and `risingwave_batch`. This comes from what we have discussed in the memory management rfc: https://github.com/risingwavelabs/rfcs/pull/26.

2. Fully decouple `risingwave_stream` and memory manager. Before this pr, streaming executor access to lru manager to create cache. However, this will cause cyclic reference if we move `LruManager` out from `risingwave_stream`. What executor really need is the watermark epoch, so instead of let `risingwave_stream` access to Memory Manager, just store the watermark epoch in the `LocalStreamManager` and when executors are building, they can read this value and then they can create cache with their own. Personally I think this is more clean: memory manager have access to stream/batch two components, and vic versa no.

3. Currently the memory manager ref is not stored anywhere. Thinking of where to store it. 🤔

Approved-By: liurenjie1024

* feat: support plan generating & execution for new DDL & DML design (#6836)

This PR applies `SourceExecutorV2`, `DmlExecutor`, `RowIdGenExecutor`, and `DmlManager` to query execution.
For example, the query plan for `CREATE TABLE t (v int)` will be:

```SQL
StreamMaterialize { columns: [v, _row_id(hidden)], pk_columns: [_row_id] }
└─StreamExchange { dist: HashShard(_row_id) }
└─StreamRowIdGen { row_id_index: 1 }
└─StreamDml { columns: [v, _row_id] }
└─StreamSource
```

Some explanations:
- `StreamSource` here contains no actual external streaming source. It is only responsible for receiving barriers.
- `StreamDml` will receive data from `InsertExecutor`, `DeleteExecutor`, and `UpdateExecutor`.
- `StreamRowIdGen` will generate row id for the data. In this case, the primary key is not defined by the user, so we internally add a `_row_id` column as the primary key. If the table has a user-defined primary key, then this executor can be eliminated.

Note that now **"source" stands for streaming source only**. There is **NO table source** now. Though `CREATE TABLE` will create a `StreamSource`, it actually contains nothing related to a source (catalog).

Approved-By: st1page
Approved-By: BugenZhao
Approved-By: yezizp2012

Co-Authored-By: xx01cyx <caoyuanxin0531@outlook.com>
Co-Authored-By: st1page <1245835950@qq.com>

* feat(frontend): avoid pk duplication  (#7095)

avoid pk duplication for streaming executors, and still allow join key duplication.

Approved-By: st1page

* feat(optimizer): improve column pruning for share operator and perform share source at the beginning (#7111)

- Improve column pruning for share operator. We need 2 round column pruning for DAG plan.
- Perform share source at the beginning so that we can benefit from predicate pushdown and column pruning.

Approved-By: st1page
Approved-By: fuyufjh

* fix(streaming): handle scaling for row id gen executor (#7122)

Correctly handle the scaling for RowIdGen executor. The logic used to work fine, but was lost in the refactoring in #6529.

Approved-By: yezizp2012
Approved-By: xx01cyx

* fix(optimizer): fix logical join o2i_col_mapping (#7108)

- Fix logical join `o2i_col_mapping` by using `output_indices` directly instead of inverse `i2o_col_mapping`.

Approved-By: fuyufjh

Co-Authored-By: Dylan Chen <zilin@singularity-data.com>
Co-Authored-By: xxchan <xxchan22f@gmail.com>

* fix(frontend): hash join do not deduplicate input pk (#7123)

Previously we want to deduplicate pk for streaming executors, however:
- agg will do prefix scan by group key, so we can not deduplicate group key
- hash join will do prefix scan by join key, so we can not deduplicate join key
- hash join need to be aware of input pk, and there might be an inconsistency between the pk of hash join state table and the input pk got in hash join executor

so we decided not to handle deduplicated input pk now, we may  complete the dedup task case by case, just like agg instead of add a general method in catalog builder.

Approved-By: yuhao-su

Co-Authored-By: congyi <15605187270@163.com>
Co-Authored-By: congyi wang <58715567+wcy-fdu@users.noreply.github.com>

* fix: fix NULL regexp capture group (#7129)

If a particular capture group didn't participate in the match, we should return `NULL` instead of skipping it.

fix https://github.com/risingwavelabs/risingwave/issues/7126

Approved-By: TennyZhuang

* chore(test): compress the test data (#7007)

Reduce size from 16MB to 400KB.

Approved-By: tabVersion

* fix: support kafka sink for struct and list type  (#7098)

**This section will be used as the commit message. Please do not leave this empty!**

fix type matches for struct and list in kafka sink

& add struct and list test cases in ut
& add script command for compress test cases into zip file

Approved-By: lmatz

Co-Authored-By: tabVersion <tabvision@bupt.icu>
Co-Authored-By: lmatz <lmatz823@gmail.com>

Signed-off-by: Runji Wang <wangrunji0408@163.com>
Co-authored-by: Eric Fu <eric@singularity-data.com>
Co-authored-by: Dylan <chenzl25@mail2.sysu.edu.cn>
Co-authored-by: Dylan Chen <zilin@singularity-data.com>
Co-authored-by: xxchan <xxchan22f@gmail.com>
Co-authored-by: Bugen Zhao <i@bugenzhao.com>
Co-authored-by: zwang28 <70626450+zwang28@users.noreply.github.com>
Co-authored-by: zwang28 <84491488@qq.com>
Co-authored-by: Runji Wang <wangrunji0408@163.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: August <pin@singularity-data.com>
Co-authored-by: Noel Kwan <47273164+kwannoel@users.noreply.github.com>
Co-authored-by: Yuhao Su <31772373+yuhao-su@users.noreply.github.com>
Co-authored-by: TennyZhuang <zty0826@gmail.com>
Co-authored-by: Bohan Zhang <tabvision@bupt.icu>
Co-authored-by: CAJan93 <jan.mensch@gmx.net>
Co-authored-by: StrikeW <wangsiyuanse@gmail.com>
Co-authored-by: Bowen <36908971+BowenXiao1999@users.noreply.github.com>
Co-authored-by: Yuanxin Cao <60498509+xx01cyx@users.noreply.github.com>
Co-authored-by: xx01cyx <caoyuanxin0531@outlook.com>
Co-authored-by: st1page <1245835950@qq.com>
Co-authored-by: congyi wang <58715567+wcy-fdu@users.noreply.github.com>
Co-authored-by: congyi <15605187270@163.com>
  • Loading branch information
23 people authored Jan 3, 2023
1 parent a82fca6 commit fbb3c31
Show file tree
Hide file tree
Showing 355 changed files with 9,031 additions and 12,356 deletions.
38 changes: 38 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion ci/scripts/build-other.sh
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@ set -euo pipefail
source ci/scripts/common.env.sh

# Should set a stable version of connector node
STABLE_VERSION=9d2e1331680335661fd3e55ee234532358683201
STABLE_VERSION=bd12fb55c75f09b234d1b75b8671b7582ca533f3

echo "--- Build Java connector node"
git clone https://"$GITHUB_TOKEN"@github.com/risingwavelabs/risingwave-connector-node.git
Expand Down
6 changes: 6 additions & 0 deletions ci/scripts/deterministic-e2e-test.sh
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,12 @@ echo "--- Download artifacts"
buildkite-agent artifact download risingwave_simulation .
chmod +x ./risingwave_simulation

echo "--- Extract data for Kafka"
cd ./scripts/source/
mkdir -p ./test_data
unzip -o test_data.zip -d .
cd ../../

export RUST_LOG=info
export LOGDIR=.risingwave/log

Expand Down
6 changes: 6 additions & 0 deletions ci/scripts/e2e-source-test.sh
Original file line number Diff line number Diff line change
Expand Up @@ -66,6 +66,12 @@ nohup java -jar ./connector-service.jar --port 60061 > .risingwave/log/connector
# start risingwave cluster
cargo make ci-start ci-1cn-1fe-with-recovery
sleep 2

echo "---- mysql & postgres cdc validate test"
sqllogictest -p 4566 -d dev './e2e_test/source/cdc/cdc.validate.mysql.slt'
sqllogictest -p 4566 -d dev './e2e_test/source/cdc/cdc.validate.postgres.slt'

echo "---- mysql & postgres load and check"
sqllogictest -p 4566 -d dev './e2e_test/source/cdc/cdc.load.slt'
# wait for cdc loading
sleep 10
Expand Down
2 changes: 1 addition & 1 deletion ci/scripts/run-e2e-test.sh
Original file line number Diff line number Diff line change
Expand Up @@ -134,7 +134,7 @@ if [[ "$RUN_COMPACTION" -eq "1" ]]; then
chmod +x ./target/debug/compaction-test
# Use the config of ci-compaction-test for replay.
config_path=".risingwave/config/risingwave.toml"
./target/debug/compaction-test --ci-mode true --state-store hummock+minio://hummockadmin:hummockadmin@127.0.0.1:9301/hummock001 --config-path "${config_path}"
RUST_LOG="info,risingwave_stream=info,risingwave_batch=info,risingwave_storage=info" ./target/debug/compaction-test --ci-mode true --state-store hummock+minio://hummockadmin:hummockadmin@127.0.0.1:9301/hummock001 --config-path "${config_path}"

echo "--- Kill cluster"
cargo make ci-kill
Expand Down
2 changes: 1 addition & 1 deletion dashboard/pages/data_sources.tsx
Original file line number Diff line number Diff line change
Expand Up @@ -84,7 +84,7 @@ export default function DataSources() {
<Td>{source.id}</Td>
<Td>{source.name}</Td>
<Td>{source.owner}</Td>
<Td>{source.info?.$case}</Td>
<Td>{source.info}</Td>
<Td>
<Button
size="sm"
Expand Down
84 changes: 41 additions & 43 deletions dashboard/proto/gen/batch_plan.ts

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading

0 comments on commit fbb3c31

Please sign in to comment.