Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix velox circle ci issue caused by moving master to main branch #1

Closed
wants to merge 1 commit into from
Closed

Fix velox circle ci issue caused by moving master to main branch #1

wants to merge 1 commit into from

Conversation

mshang816
Copy link
Contributor

Differential Revision: D30185670

Differential Revision: D30185670

fbshipit-source-id: 38c848286dd6d6bd717a114a1965504a5b782146
@facebook-github-bot facebook-github-bot added CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported labels Aug 9, 2021
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D30185670

@facebook-github-bot
Copy link
Contributor

This pull request has been merged in 9a36858.

winningsix pushed a commit to winningsix/velox-1 that referenced this pull request Sep 27, 2021
winningsix pushed a commit to winningsix/velox-1 that referenced this pull request Sep 27, 2021
@pedroerp pedroerp mentioned this pull request Nov 8, 2021
facebook-github-bot pushed a commit that referenced this pull request Mar 29, 2022
Summary:
The actual fix is to replace

```
lookup.normalizedKeys.resize(numRows);
```

with

```
lookup.normalizedKeys.resize(lookup.rows.back() + 1);
```

Since this logic is duplicated in groupProbe and joinProbe, this PR extracts it into a helper function.

Added a test. Before the fix the test would crash in ASAN mode:

```
==1665828==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x631001514800 at pc 0x7fb451ebad8f bp 0x7ffe2d8b1130 sp 0x7ffe2d8b1128
WRITE of size 8 at 0x631001514800 thread T0
SCARINESS: 42 (8-byte-write-heap-buffer-overflow)
    #0 0x7fb451ebad8e in facebook::velox::exec::HashTable<false>::groupProbe(facebook::velox::exec::HashLookup&) velox/exec/HashTable.cpp:373
    #1 0x8bcb57 in HashTableTest::insertGroups(facebook::velox::RowVector const&, facebook::velox::SelectivityVector const&, facebook::velox::exec::HashLookup&, facebook::velox::exec::HashTable<false>&) velox/exec/tests/HashTableTest.cpp:186
```

Pull Request resolved: #1302

Reviewed By: kgpai

Differential Revision: D35205256

Pulled By: mbasmanova

fbshipit-source-id: b3719df55513192f64b01997d293807310ebad51
facebook-github-bot pushed a commit that referenced this pull request Apr 28, 2022
Summary:
Enhance printExprWithStats to identify common-sub expressions.

For example, `c0 + c1` is a common sub-expression in
`"(c0 + c1) % 5", " (c0 + c1) % 3"` expression set. It is evaluated only once and
there is a single Expr object that represents it. That object appears in the
expression tree twice. printExprWithStats does not show the runtime stats for
second instance of that expression and instead annotates it with `[CSE https://github.com/facebookincubator/velox/issues/2]`,
where CSE stands for common sub-expression and 2 refers to the first instance
of the expression.

```
mod [cpu time: 50.49us, rows: 1024] -> BIGINT [#1]
   cast(plus as BIGINT) [cpu time: 68.15us, rows: 1024] -> BIGINT [#2]
      plus [cpu time: 51.84us, rows: 1024] -> INTEGER [#3]
         c0 [cpu time: 0ns, rows: 0] -> INTEGER [#4]
         c1 [cpu time: 0ns, rows: 0] -> INTEGER [#5]
   5:BIGINT [cpu time: 0ns, rows: 0] -> BIGINT [#6]

mod [cpu time: 49.29us, rows: 1024] -> BIGINT [#7]
   cast((plus(c0, c1)) as BIGINT) -> BIGINT [CSE #2]
   3:BIGINT [cpu time: 0ns, rows: 0] -> BIGINT [#8]
```

Pull Request resolved: #1500

Reviewed By: Yuhta

Differential Revision: D35994836

Pulled By: mbasmanova

fbshipit-source-id: 6bacbbe61b68dad97ce2fd5f99610c4ad55897be
shiyu-bytedance pushed a commit to shiyu-bytedance/velox-1 that referenced this pull request Aug 18, 2022
Summary:
The actual fix is to replace

```
lookup.normalizedKeys.resize(numRows);
```

with

```
lookup.normalizedKeys.resize(lookup.rows.back() + 1);
```

Since this logic is duplicated in groupProbe and joinProbe, this PR extracts it into a helper function.

Added a test. Before the fix the test would crash in ASAN mode:

```
==1665828==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x631001514800 at pc 0x7fb451ebad8f bp 0x7ffe2d8b1130 sp 0x7ffe2d8b1128
WRITE of size 8 at 0x631001514800 thread T0
SCARINESS: 42 (8-byte-write-heap-buffer-overflow)
    #0 0x7fb451ebad8e in facebook::velox::exec::HashTable<false>::groupProbe(facebook::velox::exec::HashLookup&) velox/exec/HashTable.cpp:373
    facebookincubator#1 0x8bcb57 in HashTableTest::insertGroups(facebook::velox::RowVector const&, facebook::velox::SelectivityVector const&, facebook::velox::exec::HashLookup&, facebook::velox::exec::HashTable<false>&) velox/exec/tests/HashTableTest.cpp:186
```

Pull Request resolved: facebookincubator#1302

Reviewed By: kgpai

Differential Revision: D35205256

Pulled By: mbasmanova

fbshipit-source-id: b3719df55513192f64b01997d293807310ebad51
shiyu-bytedance pushed a commit to shiyu-bytedance/velox-1 that referenced this pull request Aug 18, 2022
…tor#1500)

Summary:
Enhance printExprWithStats to identify common-sub expressions.

For example, `c0 + c1` is a common sub-expression in
`"(c0 + c1) % 5", " (c0 + c1) % 3"` expression set. It is evaluated only once and
there is a single Expr object that represents it. That object appears in the
expression tree twice. printExprWithStats does not show the runtime stats for
second instance of that expression and instead annotates it with `[CSE https://github.com/facebookincubator/velox/issues/2]`,
where CSE stands for common sub-expression and 2 refers to the first instance
of the expression.

```
mod [cpu time: 50.49us, rows: 1024] -> BIGINT [facebookincubator#1]
   cast(plus as BIGINT) [cpu time: 68.15us, rows: 1024] -> BIGINT [facebookincubator#2]
      plus [cpu time: 51.84us, rows: 1024] -> INTEGER [facebookincubator#3]
         c0 [cpu time: 0ns, rows: 0] -> INTEGER [facebookincubator#4]
         c1 [cpu time: 0ns, rows: 0] -> INTEGER [facebookincubator#5]
   5:BIGINT [cpu time: 0ns, rows: 0] -> BIGINT [facebookincubator#6]

mod [cpu time: 49.29us, rows: 1024] -> BIGINT [facebookincubator#7]
   cast((plus(c0, c1)) as BIGINT) -> BIGINT [CSE facebookincubator#2]
   3:BIGINT [cpu time: 0ns, rows: 0] -> BIGINT [facebookincubator#8]
```

Pull Request resolved: facebookincubator#1500

Reviewed By: Yuhta

Differential Revision: D35994836

Pulled By: mbasmanova

fbshipit-source-id: 6bacbbe61b68dad97ce2fd5f99610c4ad55897be
liujiayi771 referenced this pull request in liujiayi771/velox Mar 9, 2023
mbasmanova added a commit to mbasmanova/velox-1 that referenced this pull request Sep 18, 2023
Summary:

FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment)

```
FlatVector.h

  void copyRanges(
      const BaseVector* source,
      const folly::Range<const BaseVector::CopyRange*>& ranges) override {
    for (auto& range : ranges) {
      copy(source, range.targetIndex, range.sourceIndex, range.count);
    }
  }
```

This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same.

An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps.

```
Before:

array_constructor_ARRAY_nullfree##2_const                  54.31ms     18.41
array_constructor_ARRAY_nulls#facebookincubator#1                           33.50ms     29.85
array_constructor_ARRAY_nulls#facebookincubator#2                           59.05ms     16.93
array_constructor_ARRAY_nulls#facebookincubator#3                           88.36ms     11.32
array_constructor_ARRAY_nulls##2_null                      74.53ms     13.42
array_constructor_ARRAY_nulls##2_const                    102.54ms      9.75

After:

array_constructor_ARRAY_nullfree##2_const                  41.36ms     24.18
array_constructor_ARRAY_nulls#facebookincubator#1                           30.51ms     32.78
array_constructor_ARRAY_nulls#facebookincubator#2                           55.13ms     18.14
array_constructor_ARRAY_nulls#facebookincubator#3                           77.93ms     12.83
array_constructor_ARRAY_nulls##2_null                      68.84ms     14.53
array_constructor_ARRAY_nulls##2_const                     83.91ms     11.92

Before:

array_constructor_MAP_nullfree##2_const                    67.44ms     14.83
array_constructor_MAP_nulls#facebookincubator#1                             37.00ms     27.03
array_constructor_MAP_nulls#facebookincubator#2                             67.76ms     14.76
array_constructor_MAP_nulls#facebookincubator#3                            100.88ms      9.91
array_constructor_MAP_nulls##2_null                        84.22ms     11.87
array_constructor_MAP_nulls##2_const                      122.55ms      8.16

After:

array_constructor_MAP_nullfree##2_const                    49.94ms     20.03
array_constructor_MAP_nulls#facebookincubator#1                             34.34ms     29.12
array_constructor_MAP_nulls#facebookincubator#2                             55.23ms     18.11
array_constructor_MAP_nulls#facebookincubator#3                             82.64ms     12.10
array_constructor_MAP_nulls##2_null                        70.74ms     14.14
array_constructor_MAP_nulls##2_const                       88.13ms     11.35
```

Differential Revision: D49269500
mbasmanova added a commit to mbasmanova/velox-1 that referenced this pull request Sep 18, 2023
Summary:
Pull Request resolved: facebookincubator#6566

FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment)

```
FlatVector.h

  void copyRanges(
      const BaseVector* source,
      const folly::Range<const BaseVector::CopyRange*>& ranges) override {
    for (auto& range : ranges) {
      copy(source, range.targetIndex, range.sourceIndex, range.count);
    }
  }
```

This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same.

An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps.

```
Before:

array_constructor_ARRAY_nullfree##2_const                  54.31ms     18.41
array_constructor_ARRAY_nulls#facebookincubator#1                           33.50ms     29.85
array_constructor_ARRAY_nulls#facebookincubator#2                           59.05ms     16.93
array_constructor_ARRAY_nulls#facebookincubator#3                           88.36ms     11.32
array_constructor_ARRAY_nulls##2_null                      74.53ms     13.42
array_constructor_ARRAY_nulls##2_const                    102.54ms      9.75

After:

array_constructor_ARRAY_nullfree##2_const                  41.36ms     24.18
array_constructor_ARRAY_nulls#facebookincubator#1                           30.51ms     32.78
array_constructor_ARRAY_nulls#facebookincubator#2                           55.13ms     18.14
array_constructor_ARRAY_nulls#facebookincubator#3                           77.93ms     12.83
array_constructor_ARRAY_nulls##2_null                      68.84ms     14.53
array_constructor_ARRAY_nulls##2_const                     83.91ms     11.92

Before:

array_constructor_MAP_nullfree##2_const                    67.44ms     14.83
array_constructor_MAP_nulls#facebookincubator#1                             37.00ms     27.03
array_constructor_MAP_nulls#facebookincubator#2                             67.76ms     14.76
array_constructor_MAP_nulls#facebookincubator#3                            100.88ms      9.91
array_constructor_MAP_nulls##2_null                        84.22ms     11.87
array_constructor_MAP_nulls##2_const                      122.55ms      8.16

After:

array_constructor_MAP_nullfree##2_const                    49.94ms     20.03
array_constructor_MAP_nulls#facebookincubator#1                             34.34ms     29.12
array_constructor_MAP_nulls#facebookincubator#2                             55.23ms     18.11
array_constructor_MAP_nulls#facebookincubator#3                             82.64ms     12.10
array_constructor_MAP_nulls##2_null                        70.74ms     14.14
array_constructor_MAP_nulls##2_const                       88.13ms     11.35
```

Differential Revision: D49269500

fbshipit-source-id: 8fc304cce31f782aef818e6d8ce052360af8e832
mbasmanova added a commit to mbasmanova/velox-1 that referenced this pull request Sep 18, 2023
Summary:
Pull Request resolved: facebookincubator#6566

FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment)

```
FlatVector.h

  void copyRanges(
      const BaseVector* source,
      const folly::Range<const BaseVector::CopyRange*>& ranges) override {
    for (auto& range : ranges) {
      copy(source, range.targetIndex, range.sourceIndex, range.count);
    }
  }
```

This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same.

An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps.

```
Before:

array_constructor_ARRAY_nullfree##2_const                  54.31ms     18.41
array_constructor_ARRAY_nulls#facebookincubator#1                           33.50ms     29.85
array_constructor_ARRAY_nulls#facebookincubator#2                           59.05ms     16.93
array_constructor_ARRAY_nulls#facebookincubator#3                           88.36ms     11.32
array_constructor_ARRAY_nulls##2_null                      74.53ms     13.42
array_constructor_ARRAY_nulls##2_const                    102.54ms      9.75

After:

array_constructor_ARRAY_nullfree##2_const                  41.36ms     24.18
array_constructor_ARRAY_nulls#facebookincubator#1                           30.51ms     32.78
array_constructor_ARRAY_nulls#facebookincubator#2                           55.13ms     18.14
array_constructor_ARRAY_nulls#facebookincubator#3                           77.93ms     12.83
array_constructor_ARRAY_nulls##2_null                      68.84ms     14.53
array_constructor_ARRAY_nulls##2_const                     83.91ms     11.92

Before:

array_constructor_MAP_nullfree##2_const                    67.44ms     14.83
array_constructor_MAP_nulls#facebookincubator#1                             37.00ms     27.03
array_constructor_MAP_nulls#facebookincubator#2                             67.76ms     14.76
array_constructor_MAP_nulls#facebookincubator#3                            100.88ms      9.91
array_constructor_MAP_nulls##2_null                        84.22ms     11.87
array_constructor_MAP_nulls##2_const                      122.55ms      8.16

After:

array_constructor_MAP_nullfree##2_const                    49.94ms     20.03
array_constructor_MAP_nulls#facebookincubator#1                             34.34ms     29.12
array_constructor_MAP_nulls#facebookincubator#2                             55.23ms     18.11
array_constructor_MAP_nulls#facebookincubator#3                             82.64ms     12.10
array_constructor_MAP_nulls##2_null                        70.74ms     14.14
array_constructor_MAP_nulls##2_const                       88.13ms     11.35
```

Differential Revision: D49269500

fbshipit-source-id: 054dcec94cc3768443440ffb57d58c96f9b1f006
mbasmanova added a commit to mbasmanova/velox-1 that referenced this pull request Sep 18, 2023
Summary:
Pull Request resolved: facebookincubator#6566

FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment)

```
FlatVector.h

  void copyRanges(
      const BaseVector* source,
      const folly::Range<const BaseVector::CopyRange*>& ranges) override {
    for (auto& range : ranges) {
      copy(source, range.targetIndex, range.sourceIndex, range.count);
    }
  }
```

This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same.

An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps.

```
Before:

array_constructor_ARRAY_nullfree##2_const                  54.31ms     18.41
array_constructor_ARRAY_nulls#facebookincubator#1                           33.50ms     29.85
array_constructor_ARRAY_nulls#facebookincubator#2                           59.05ms     16.93
array_constructor_ARRAY_nulls#facebookincubator#3                           88.36ms     11.32
array_constructor_ARRAY_nulls##2_null                      74.53ms     13.42
array_constructor_ARRAY_nulls##2_const                    102.54ms      9.75

After:

array_constructor_ARRAY_nullfree##2_const                  41.36ms     24.18
array_constructor_ARRAY_nulls#facebookincubator#1                           30.51ms     32.78
array_constructor_ARRAY_nulls#facebookincubator#2                           55.13ms     18.14
array_constructor_ARRAY_nulls#facebookincubator#3                           77.93ms     12.83
array_constructor_ARRAY_nulls##2_null                      68.84ms     14.53
array_constructor_ARRAY_nulls##2_const                     83.91ms     11.92

Before:

array_constructor_MAP_nullfree##2_const                    67.44ms     14.83
array_constructor_MAP_nulls#facebookincubator#1                             37.00ms     27.03
array_constructor_MAP_nulls#facebookincubator#2                             67.76ms     14.76
array_constructor_MAP_nulls#facebookincubator#3                            100.88ms      9.91
array_constructor_MAP_nulls##2_null                        84.22ms     11.87
array_constructor_MAP_nulls##2_const                      122.55ms      8.16

After:

array_constructor_MAP_nullfree##2_const                    49.94ms     20.03
array_constructor_MAP_nulls#facebookincubator#1                             34.34ms     29.12
array_constructor_MAP_nulls#facebookincubator#2                             55.23ms     18.11
array_constructor_MAP_nulls#facebookincubator#3                             82.64ms     12.10
array_constructor_MAP_nulls##2_null                        70.74ms     14.14
array_constructor_MAP_nulls##2_const                       88.13ms     11.35
```

Reviewed By: laithsakka

Differential Revision: D49269500

fbshipit-source-id: 2fdf6f2a717a3bc4bf54fb3d9bd970157a8c415b
mbasmanova added a commit to mbasmanova/velox-1 that referenced this pull request Sep 18, 2023
Summary:
Pull Request resolved: facebookincubator#6566

FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment)

```
FlatVector.h

  void copyRanges(
      const BaseVector* source,
      const folly::Range<const BaseVector::CopyRange*>& ranges) override {
    for (auto& range : ranges) {
      copy(source, range.targetIndex, range.sourceIndex, range.count);
    }
  }
```

This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same.

An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps.

```
Before:

array_constructor_ARRAY_nullfree##2_const                  54.31ms     18.41
array_constructor_ARRAY_nulls#facebookincubator#1                           33.50ms     29.85
array_constructor_ARRAY_nulls#facebookincubator#2                           59.05ms     16.93
array_constructor_ARRAY_nulls#facebookincubator#3                           88.36ms     11.32
array_constructor_ARRAY_nulls##2_null                      74.53ms     13.42
array_constructor_ARRAY_nulls##2_const                    102.54ms      9.75

After:

array_constructor_ARRAY_nullfree##2_const                  41.36ms     24.18
array_constructor_ARRAY_nulls#facebookincubator#1                           30.51ms     32.78
array_constructor_ARRAY_nulls#facebookincubator#2                           55.13ms     18.14
array_constructor_ARRAY_nulls#facebookincubator#3                           77.93ms     12.83
array_constructor_ARRAY_nulls##2_null                      68.84ms     14.53
array_constructor_ARRAY_nulls##2_const                     83.91ms     11.92

Before:

array_constructor_MAP_nullfree##2_const                    67.44ms     14.83
array_constructor_MAP_nulls#facebookincubator#1                             37.00ms     27.03
array_constructor_MAP_nulls#facebookincubator#2                             67.76ms     14.76
array_constructor_MAP_nulls#facebookincubator#3                            100.88ms      9.91
array_constructor_MAP_nulls##2_null                        84.22ms     11.87
array_constructor_MAP_nulls##2_const                      122.55ms      8.16

After:

array_constructor_MAP_nullfree##2_const                    49.94ms     20.03
array_constructor_MAP_nulls#facebookincubator#1                             34.34ms     29.12
array_constructor_MAP_nulls#facebookincubator#2                             55.23ms     18.11
array_constructor_MAP_nulls#facebookincubator#3                             82.64ms     12.10
array_constructor_MAP_nulls##2_null                        70.74ms     14.14
array_constructor_MAP_nulls##2_const                       88.13ms     11.35
```

Reviewed By: laithsakka

Differential Revision: D49269500

fbshipit-source-id: b7271a981d33885b2d4bc3219bc15f069bcd06d3
facebook-github-bot pushed a commit that referenced this pull request Sep 19, 2023
Summary:
Pull Request resolved: #6566

FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: #5958 (comment)

```
FlatVector.h

  void copyRanges(
      const BaseVector* source,
      const folly::Range<const BaseVector::CopyRange*>& ranges) override {
    for (auto& range : ranges) {
      copy(source, range.targetIndex, range.sourceIndex, range.count);
    }
  }
```

This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same.

An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps.

```
Before:

array_constructor_ARRAY_nullfree##2_const                  54.31ms     18.41
array_constructor_ARRAY_nulls##1                           33.50ms     29.85
array_constructor_ARRAY_nulls##2                           59.05ms     16.93
array_constructor_ARRAY_nulls##3                           88.36ms     11.32
array_constructor_ARRAY_nulls##2_null                      74.53ms     13.42
array_constructor_ARRAY_nulls##2_const                    102.54ms      9.75

After:

array_constructor_ARRAY_nullfree##2_const                  41.36ms     24.18
array_constructor_ARRAY_nulls##1                           30.51ms     32.78
array_constructor_ARRAY_nulls##2                           55.13ms     18.14
array_constructor_ARRAY_nulls##3                           77.93ms     12.83
array_constructor_ARRAY_nulls##2_null                      68.84ms     14.53
array_constructor_ARRAY_nulls##2_const                     83.91ms     11.92

Before:

array_constructor_MAP_nullfree##2_const                    67.44ms     14.83
array_constructor_MAP_nulls##1                             37.00ms     27.03
array_constructor_MAP_nulls##2                             67.76ms     14.76
array_constructor_MAP_nulls##3                            100.88ms      9.91
array_constructor_MAP_nulls##2_null                        84.22ms     11.87
array_constructor_MAP_nulls##2_const                      122.55ms      8.16

After:

array_constructor_MAP_nullfree##2_const                    49.94ms     20.03
array_constructor_MAP_nulls##1                             34.34ms     29.12
array_constructor_MAP_nulls##2                             55.23ms     18.11
array_constructor_MAP_nulls##3                             82.64ms     12.10
array_constructor_MAP_nulls##2_null                        70.74ms     14.14
array_constructor_MAP_nulls##2_const                       88.13ms     11.35
```

Reviewed By: laithsakka

Differential Revision: D49269500

fbshipit-source-id: 7b702921202f4bb8d10a252eb0ab20f0e5792ae6
codyschierbeck pushed a commit to codyschierbeck/velox that referenced this pull request Sep 27, 2023
Summary:
Pull Request resolved: facebookincubator#6568

array_constructor is very slow: facebookincubator#5958 (comment)

array_constructor uses BaseVector::copyRanges, which is somewhat fast for arrays and maps, but very slow for primitive types:

```
FlatVector.h

  void copyRanges(
      const BaseVector* source,
      const folly::Range<const BaseVector::CopyRange*>& ranges) override {
    for (auto& range : ranges) {
      copy(source, range.targetIndex, range.sourceIndex, range.count);
    }
  }
```

FlatVector<T>::copy(source, rows, toSourceRow) is faster.

Switching from copyRanges to copy speeds up array_constructor for primitive types and structs significantly. Yet, this change makes arrays and maps slower.

The slowness is due to ArrayVector and MapVector not having implementation for copy(source, rows, toSourceRow). They rely on BaseVector::copy to translate rows + toSourceRow to ranges. This extra processing causes perf regression.

Hence, we use copy for primitive types and structs of these and copyRanges for everything else.

```
Before:

array_constructor_ARRAY_nullfree#facebookincubator#1                        16.80ms     59.53
array_constructor_ARRAY_nullfree#facebookincubator#2                        27.02ms     37.01
array_constructor_ARRAY_nullfree#facebookincubator#3                        38.03ms     26.30
array_constructor_ARRAY_nullfree##2_null                   52.86ms     18.92
array_constructor_ARRAY_nullfree##2_const                  54.97ms     18.19
array_constructor_ARRAY_nulls#facebookincubator#1                           30.61ms     32.66
array_constructor_ARRAY_nulls#facebookincubator#2                           55.01ms     18.18
array_constructor_ARRAY_nulls#facebookincubator#3                           80.69ms     12.39
array_constructor_ARRAY_nulls##2_null                      69.10ms     14.47
array_constructor_ARRAY_nulls##2_const                    103.85ms      9.63

After:

array_constructor_ARRAY_nullfree#facebookincubator#1                        15.25ms     65.58
array_constructor_ARRAY_nullfree#facebookincubator#2                        25.11ms     39.82
array_constructor_ARRAY_nullfree#facebookincubator#3                        34.59ms     28.91
array_constructor_ARRAY_nullfree##2_null                   53.61ms     18.65
array_constructor_ARRAY_nullfree##2_const                  51.48ms     19.42
array_constructor_ARRAY_nulls#facebookincubator#1                           29.99ms     33.34
array_constructor_ARRAY_nulls#facebookincubator#2                           55.91ms     17.89
array_constructor_ARRAY_nulls#facebookincubator#3                           81.73ms     12.24
array_constructor_ARRAY_nulls##2_null                      66.97ms     14.93
array_constructor_ARRAY_nulls##2_const                     92.96ms     10.76

Before:

array_constructor_INTEGER_nullfree#facebookincubator#1                      19.72ms     50.71
array_constructor_INTEGER_nullfree#facebookincubator#2                      34.51ms     28.97
array_constructor_INTEGER_nullfree#facebookincubator#3                      47.95ms     20.86
array_constructor_INTEGER_nullfree##2_null                 58.68ms     17.04
array_constructor_INTEGER_nullfree##2_const                45.15ms     22.15
array_constructor_INTEGER_nulls#facebookincubator#1                         29.99ms     33.34
array_constructor_INTEGER_nulls#facebookincubator#2                         55.32ms     18.08
array_constructor_INTEGER_nulls#facebookincubator#3                         78.53ms     12.73
array_constructor_INTEGER_nulls##2_null                    72.24ms     13.84
array_constructor_INTEGER_nulls##2_const                   71.13ms     14.06

After:

array_constructor_INTEGER_nullfree#facebookincubator#1                       3.39ms    294.89
array_constructor_INTEGER_nullfree#facebookincubator#2                       7.35ms    136.10
array_constructor_INTEGER_nullfree#facebookincubator#3                      10.78ms     92.74
array_constructor_INTEGER_nullfree##2_null                 11.29ms     88.57
array_constructor_INTEGER_nullfree##2_const                10.14ms     98.65
array_constructor_INTEGER_nulls#facebookincubator#1                          4.49ms    222.53
array_constructor_INTEGER_nulls#facebookincubator#2                          9.78ms    102.29
array_constructor_INTEGER_nulls#facebookincubator#3                         14.69ms     68.08
array_constructor_INTEGER_nulls##2_null                    12.14ms     82.36
array_constructor_INTEGER_nulls##2_const                   12.27ms     81.53

Before:

array_constructor_MAP_nullfree#facebookincubator#1                          17.34ms     57.65
array_constructor_MAP_nullfree#facebookincubator#2                          29.84ms     33.51
array_constructor_MAP_nullfree#facebookincubator#3                          41.51ms     24.09
array_constructor_MAP_nullfree##2_null                     56.57ms     17.68
array_constructor_MAP_nullfree##2_const                    71.68ms     13.95
array_constructor_MAP_nulls#facebookincubator#1                             36.22ms     27.61
array_constructor_MAP_nulls#facebookincubator#2                             68.18ms     14.67
array_constructor_MAP_nulls#facebookincubator#3                             95.12ms     10.51
array_constructor_MAP_nulls##2_null                        86.42ms     11.57
array_constructor_MAP_nulls##2_const                      120.10ms      8.33

After:

array_constructor_MAP_nullfree#facebookincubator#1                          17.05ms     58.66
array_constructor_MAP_nullfree#facebookincubator#2                          28.42ms     35.18
array_constructor_MAP_nullfree#facebookincubator#3                          36.96ms     27.06
array_constructor_MAP_nullfree##2_null                     55.64ms     17.97
array_constructor_MAP_nullfree##2_const                    67.53ms     14.81
array_constructor_MAP_nulls#facebookincubator#1                             32.91ms     30.39
array_constructor_MAP_nulls#facebookincubator#2                             64.50ms     15.50
array_constructor_MAP_nulls#facebookincubator#3                             95.71ms     10.45
array_constructor_MAP_nulls##2_null                        77.22ms     12.95
array_constructor_MAP_nulls##2_const                      114.91ms      8.70

Before:

array_constructor_ROW_nullfree#facebookincubator#1                          33.88ms     29.52
array_constructor_ROW_nullfree#facebookincubator#2                          62.00ms     16.13
array_constructor_ROW_nullfree#facebookincubator#3                          89.54ms     11.17
array_constructor_ROW_nullfree##2_null                     78.46ms     12.75
array_constructor_ROW_nullfree##2_const                    95.53ms     10.47
array_constructor_ROW_nulls#facebookincubator#1                             44.11ms     22.67
array_constructor_ROW_nulls#facebookincubator#2                            115.43ms      8.66
array_constructor_ROW_nulls#facebookincubator#3                            173.61ms      5.76
array_constructor_ROW_nulls##2_null                       130.40ms      7.67
array_constructor_ROW_nulls##2_const                      169.97ms      5.88

After:

array_constructor_ROW_nullfree#facebookincubator#1                           5.55ms    180.15
array_constructor_ROW_nullfree#facebookincubator#2                          12.83ms     77.94
array_constructor_ROW_nullfree#facebookincubator#3                          18.89ms     52.95
array_constructor_ROW_nullfree##2_null                     18.74ms     53.36
array_constructor_ROW_nullfree##2_const                    18.16ms     55.07
array_constructor_ROW_nulls#facebookincubator#1                             11.29ms     88.61
array_constructor_ROW_nulls#facebookincubator#2                             18.57ms     53.86
array_constructor_ROW_nulls#facebookincubator#3                             34.20ms     29.24
array_constructor_ROW_nulls##2_null                        25.05ms     39.92
array_constructor_ROW_nulls##2_const                       25.15ms     39.77
```

Reviewed By: laithsakka

Differential Revision: D49272797

fbshipit-source-id: 55d83de7b69c7ae4b72b5a5ae62a7868f36b0e19
codyschierbeck pushed a commit to codyschierbeck/velox that referenced this pull request Sep 27, 2023
Summary:
Pull Request resolved: facebookincubator#6566

FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment)

```
FlatVector.h

  void copyRanges(
      const BaseVector* source,
      const folly::Range<const BaseVector::CopyRange*>& ranges) override {
    for (auto& range : ranges) {
      copy(source, range.targetIndex, range.sourceIndex, range.count);
    }
  }
```

This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same.

An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps.

```
Before:

array_constructor_ARRAY_nullfree##2_const                  54.31ms     18.41
array_constructor_ARRAY_nulls#facebookincubator#1                           33.50ms     29.85
array_constructor_ARRAY_nulls#facebookincubator#2                           59.05ms     16.93
array_constructor_ARRAY_nulls#facebookincubator#3                           88.36ms     11.32
array_constructor_ARRAY_nulls##2_null                      74.53ms     13.42
array_constructor_ARRAY_nulls##2_const                    102.54ms      9.75

After:

array_constructor_ARRAY_nullfree##2_const                  41.36ms     24.18
array_constructor_ARRAY_nulls#facebookincubator#1                           30.51ms     32.78
array_constructor_ARRAY_nulls#facebookincubator#2                           55.13ms     18.14
array_constructor_ARRAY_nulls#facebookincubator#3                           77.93ms     12.83
array_constructor_ARRAY_nulls##2_null                      68.84ms     14.53
array_constructor_ARRAY_nulls##2_const                     83.91ms     11.92

Before:

array_constructor_MAP_nullfree##2_const                    67.44ms     14.83
array_constructor_MAP_nulls#facebookincubator#1                             37.00ms     27.03
array_constructor_MAP_nulls#facebookincubator#2                             67.76ms     14.76
array_constructor_MAP_nulls#facebookincubator#3                            100.88ms      9.91
array_constructor_MAP_nulls##2_null                        84.22ms     11.87
array_constructor_MAP_nulls##2_const                      122.55ms      8.16

After:

array_constructor_MAP_nullfree##2_const                    49.94ms     20.03
array_constructor_MAP_nulls#facebookincubator#1                             34.34ms     29.12
array_constructor_MAP_nulls#facebookincubator#2                             55.23ms     18.11
array_constructor_MAP_nulls#facebookincubator#3                             82.64ms     12.10
array_constructor_MAP_nulls##2_null                        70.74ms     14.14
array_constructor_MAP_nulls##2_const                       88.13ms     11.35
```

Reviewed By: laithsakka

Differential Revision: D49269500

fbshipit-source-id: 7b702921202f4bb8d10a252eb0ab20f0e5792ae6
codyschierbeck pushed a commit to codyschierbeck/velox that referenced this pull request Sep 27, 2023
Summary:
Pull Request resolved: facebookincubator#6568

array_constructor is very slow: facebookincubator#5958 (comment)

array_constructor uses BaseVector::copyRanges, which is somewhat fast for arrays and maps, but very slow for primitive types:

```
FlatVector.h

  void copyRanges(
      const BaseVector* source,
      const folly::Range<const BaseVector::CopyRange*>& ranges) override {
    for (auto& range : ranges) {
      copy(source, range.targetIndex, range.sourceIndex, range.count);
    }
  }
```

FlatVector<T>::copy(source, rows, toSourceRow) is faster.

Switching from copyRanges to copy speeds up array_constructor for primitive types and structs significantly. Yet, this change makes arrays and maps slower.

The slowness is due to ArrayVector and MapVector not having implementation for copy(source, rows, toSourceRow). They rely on BaseVector::copy to translate rows + toSourceRow to ranges. This extra processing causes perf regression.

Hence, we use copy for primitive types and structs of these and copyRanges for everything else.

```
Before:

array_constructor_ARRAY_nullfree#facebookincubator#1                        16.80ms     59.53
array_constructor_ARRAY_nullfree#facebookincubator#2                        27.02ms     37.01
array_constructor_ARRAY_nullfree#facebookincubator#3                        38.03ms     26.30
array_constructor_ARRAY_nullfree##2_null                   52.86ms     18.92
array_constructor_ARRAY_nullfree##2_const                  54.97ms     18.19
array_constructor_ARRAY_nulls#facebookincubator#1                           30.61ms     32.66
array_constructor_ARRAY_nulls#facebookincubator#2                           55.01ms     18.18
array_constructor_ARRAY_nulls#facebookincubator#3                           80.69ms     12.39
array_constructor_ARRAY_nulls##2_null                      69.10ms     14.47
array_constructor_ARRAY_nulls##2_const                    103.85ms      9.63

After:

array_constructor_ARRAY_nullfree#facebookincubator#1                        15.25ms     65.58
array_constructor_ARRAY_nullfree#facebookincubator#2                        25.11ms     39.82
array_constructor_ARRAY_nullfree#facebookincubator#3                        34.59ms     28.91
array_constructor_ARRAY_nullfree##2_null                   53.61ms     18.65
array_constructor_ARRAY_nullfree##2_const                  51.48ms     19.42
array_constructor_ARRAY_nulls#facebookincubator#1                           29.99ms     33.34
array_constructor_ARRAY_nulls#facebookincubator#2                           55.91ms     17.89
array_constructor_ARRAY_nulls#facebookincubator#3                           81.73ms     12.24
array_constructor_ARRAY_nulls##2_null                      66.97ms     14.93
array_constructor_ARRAY_nulls##2_const                     92.96ms     10.76

Before:

array_constructor_INTEGER_nullfree#facebookincubator#1                      19.72ms     50.71
array_constructor_INTEGER_nullfree#facebookincubator#2                      34.51ms     28.97
array_constructor_INTEGER_nullfree#facebookincubator#3                      47.95ms     20.86
array_constructor_INTEGER_nullfree##2_null                 58.68ms     17.04
array_constructor_INTEGER_nullfree##2_const                45.15ms     22.15
array_constructor_INTEGER_nulls#facebookincubator#1                         29.99ms     33.34
array_constructor_INTEGER_nulls#facebookincubator#2                         55.32ms     18.08
array_constructor_INTEGER_nulls#facebookincubator#3                         78.53ms     12.73
array_constructor_INTEGER_nulls##2_null                    72.24ms     13.84
array_constructor_INTEGER_nulls##2_const                   71.13ms     14.06

After:

array_constructor_INTEGER_nullfree#facebookincubator#1                       3.39ms    294.89
array_constructor_INTEGER_nullfree#facebookincubator#2                       7.35ms    136.10
array_constructor_INTEGER_nullfree#facebookincubator#3                      10.78ms     92.74
array_constructor_INTEGER_nullfree##2_null                 11.29ms     88.57
array_constructor_INTEGER_nullfree##2_const                10.14ms     98.65
array_constructor_INTEGER_nulls#facebookincubator#1                          4.49ms    222.53
array_constructor_INTEGER_nulls#facebookincubator#2                          9.78ms    102.29
array_constructor_INTEGER_nulls#facebookincubator#3                         14.69ms     68.08
array_constructor_INTEGER_nulls##2_null                    12.14ms     82.36
array_constructor_INTEGER_nulls##2_const                   12.27ms     81.53

Before:

array_constructor_MAP_nullfree#facebookincubator#1                          17.34ms     57.65
array_constructor_MAP_nullfree#facebookincubator#2                          29.84ms     33.51
array_constructor_MAP_nullfree#facebookincubator#3                          41.51ms     24.09
array_constructor_MAP_nullfree##2_null                     56.57ms     17.68
array_constructor_MAP_nullfree##2_const                    71.68ms     13.95
array_constructor_MAP_nulls#facebookincubator#1                             36.22ms     27.61
array_constructor_MAP_nulls#facebookincubator#2                             68.18ms     14.67
array_constructor_MAP_nulls#facebookincubator#3                             95.12ms     10.51
array_constructor_MAP_nulls##2_null                        86.42ms     11.57
array_constructor_MAP_nulls##2_const                      120.10ms      8.33

After:

array_constructor_MAP_nullfree#facebookincubator#1                          17.05ms     58.66
array_constructor_MAP_nullfree#facebookincubator#2                          28.42ms     35.18
array_constructor_MAP_nullfree#facebookincubator#3                          36.96ms     27.06
array_constructor_MAP_nullfree##2_null                     55.64ms     17.97
array_constructor_MAP_nullfree##2_const                    67.53ms     14.81
array_constructor_MAP_nulls#facebookincubator#1                             32.91ms     30.39
array_constructor_MAP_nulls#facebookincubator#2                             64.50ms     15.50
array_constructor_MAP_nulls#facebookincubator#3                             95.71ms     10.45
array_constructor_MAP_nulls##2_null                        77.22ms     12.95
array_constructor_MAP_nulls##2_const                      114.91ms      8.70

Before:

array_constructor_ROW_nullfree#facebookincubator#1                          33.88ms     29.52
array_constructor_ROW_nullfree#facebookincubator#2                          62.00ms     16.13
array_constructor_ROW_nullfree#facebookincubator#3                          89.54ms     11.17
array_constructor_ROW_nullfree##2_null                     78.46ms     12.75
array_constructor_ROW_nullfree##2_const                    95.53ms     10.47
array_constructor_ROW_nulls#facebookincubator#1                             44.11ms     22.67
array_constructor_ROW_nulls#facebookincubator#2                            115.43ms      8.66
array_constructor_ROW_nulls#facebookincubator#3                            173.61ms      5.76
array_constructor_ROW_nulls##2_null                       130.40ms      7.67
array_constructor_ROW_nulls##2_const                      169.97ms      5.88

After:

array_constructor_ROW_nullfree#facebookincubator#1                           5.55ms    180.15
array_constructor_ROW_nullfree#facebookincubator#2                          12.83ms     77.94
array_constructor_ROW_nullfree#facebookincubator#3                          18.89ms     52.95
array_constructor_ROW_nullfree##2_null                     18.74ms     53.36
array_constructor_ROW_nullfree##2_const                    18.16ms     55.07
array_constructor_ROW_nulls#facebookincubator#1                             11.29ms     88.61
array_constructor_ROW_nulls#facebookincubator#2                             18.57ms     53.86
array_constructor_ROW_nulls#facebookincubator#3                             34.20ms     29.24
array_constructor_ROW_nulls##2_null                        25.05ms     39.92
array_constructor_ROW_nulls##2_const                       25.15ms     39.77
```

Reviewed By: laithsakka

Differential Revision: D49272797

fbshipit-source-id: 55d83de7b69c7ae4b72b5a5ae62a7868f36b0e19
codyschierbeck pushed a commit to codyschierbeck/velox that referenced this pull request Sep 27, 2023
Summary:
Pull Request resolved: facebookincubator#6566

FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment)

```
FlatVector.h

  void copyRanges(
      const BaseVector* source,
      const folly::Range<const BaseVector::CopyRange*>& ranges) override {
    for (auto& range : ranges) {
      copy(source, range.targetIndex, range.sourceIndex, range.count);
    }
  }
```

This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same.

An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps.

```
Before:

array_constructor_ARRAY_nullfree##2_const                  54.31ms     18.41
array_constructor_ARRAY_nulls#facebookincubator#1                           33.50ms     29.85
array_constructor_ARRAY_nulls#facebookincubator#2                           59.05ms     16.93
array_constructor_ARRAY_nulls#facebookincubator#3                           88.36ms     11.32
array_constructor_ARRAY_nulls##2_null                      74.53ms     13.42
array_constructor_ARRAY_nulls##2_const                    102.54ms      9.75

After:

array_constructor_ARRAY_nullfree##2_const                  41.36ms     24.18
array_constructor_ARRAY_nulls#facebookincubator#1                           30.51ms     32.78
array_constructor_ARRAY_nulls#facebookincubator#2                           55.13ms     18.14
array_constructor_ARRAY_nulls#facebookincubator#3                           77.93ms     12.83
array_constructor_ARRAY_nulls##2_null                      68.84ms     14.53
array_constructor_ARRAY_nulls##2_const                     83.91ms     11.92

Before:

array_constructor_MAP_nullfree##2_const                    67.44ms     14.83
array_constructor_MAP_nulls#facebookincubator#1                             37.00ms     27.03
array_constructor_MAP_nulls#facebookincubator#2                             67.76ms     14.76
array_constructor_MAP_nulls#facebookincubator#3                            100.88ms      9.91
array_constructor_MAP_nulls##2_null                        84.22ms     11.87
array_constructor_MAP_nulls##2_const                      122.55ms      8.16

After:

array_constructor_MAP_nullfree##2_const                    49.94ms     20.03
array_constructor_MAP_nulls#facebookincubator#1                             34.34ms     29.12
array_constructor_MAP_nulls#facebookincubator#2                             55.23ms     18.11
array_constructor_MAP_nulls#facebookincubator#3                             82.64ms     12.10
array_constructor_MAP_nulls##2_null                        70.74ms     14.14
array_constructor_MAP_nulls##2_const                       88.13ms     11.35
```

Reviewed By: laithsakka

Differential Revision: D49269500

fbshipit-source-id: 7b702921202f4bb8d10a252eb0ab20f0e5792ae6
codyschierbeck pushed a commit to codyschierbeck/velox that referenced this pull request Sep 27, 2023
Summary:
Pull Request resolved: facebookincubator#6568

array_constructor is very slow: facebookincubator#5958 (comment)

array_constructor uses BaseVector::copyRanges, which is somewhat fast for arrays and maps, but very slow for primitive types:

```
FlatVector.h

  void copyRanges(
      const BaseVector* source,
      const folly::Range<const BaseVector::CopyRange*>& ranges) override {
    for (auto& range : ranges) {
      copy(source, range.targetIndex, range.sourceIndex, range.count);
    }
  }
```

FlatVector<T>::copy(source, rows, toSourceRow) is faster.

Switching from copyRanges to copy speeds up array_constructor for primitive types and structs significantly. Yet, this change makes arrays and maps slower.

The slowness is due to ArrayVector and MapVector not having implementation for copy(source, rows, toSourceRow). They rely on BaseVector::copy to translate rows + toSourceRow to ranges. This extra processing causes perf regression.

Hence, we use copy for primitive types and structs of these and copyRanges for everything else.

```
Before:

array_constructor_ARRAY_nullfree#facebookincubator#1                        16.80ms     59.53
array_constructor_ARRAY_nullfree#facebookincubator#2                        27.02ms     37.01
array_constructor_ARRAY_nullfree#facebookincubator#3                        38.03ms     26.30
array_constructor_ARRAY_nullfree##2_null                   52.86ms     18.92
array_constructor_ARRAY_nullfree##2_const                  54.97ms     18.19
array_constructor_ARRAY_nulls#facebookincubator#1                           30.61ms     32.66
array_constructor_ARRAY_nulls#facebookincubator#2                           55.01ms     18.18
array_constructor_ARRAY_nulls#facebookincubator#3                           80.69ms     12.39
array_constructor_ARRAY_nulls##2_null                      69.10ms     14.47
array_constructor_ARRAY_nulls##2_const                    103.85ms      9.63

After:

array_constructor_ARRAY_nullfree#facebookincubator#1                        15.25ms     65.58
array_constructor_ARRAY_nullfree#facebookincubator#2                        25.11ms     39.82
array_constructor_ARRAY_nullfree#facebookincubator#3                        34.59ms     28.91
array_constructor_ARRAY_nullfree##2_null                   53.61ms     18.65
array_constructor_ARRAY_nullfree##2_const                  51.48ms     19.42
array_constructor_ARRAY_nulls#facebookincubator#1                           29.99ms     33.34
array_constructor_ARRAY_nulls#facebookincubator#2                           55.91ms     17.89
array_constructor_ARRAY_nulls#facebookincubator#3                           81.73ms     12.24
array_constructor_ARRAY_nulls##2_null                      66.97ms     14.93
array_constructor_ARRAY_nulls##2_const                     92.96ms     10.76

Before:

array_constructor_INTEGER_nullfree#facebookincubator#1                      19.72ms     50.71
array_constructor_INTEGER_nullfree#facebookincubator#2                      34.51ms     28.97
array_constructor_INTEGER_nullfree#facebookincubator#3                      47.95ms     20.86
array_constructor_INTEGER_nullfree##2_null                 58.68ms     17.04
array_constructor_INTEGER_nullfree##2_const                45.15ms     22.15
array_constructor_INTEGER_nulls#facebookincubator#1                         29.99ms     33.34
array_constructor_INTEGER_nulls#facebookincubator#2                         55.32ms     18.08
array_constructor_INTEGER_nulls#facebookincubator#3                         78.53ms     12.73
array_constructor_INTEGER_nulls##2_null                    72.24ms     13.84
array_constructor_INTEGER_nulls##2_const                   71.13ms     14.06

After:

array_constructor_INTEGER_nullfree#facebookincubator#1                       3.39ms    294.89
array_constructor_INTEGER_nullfree#facebookincubator#2                       7.35ms    136.10
array_constructor_INTEGER_nullfree#facebookincubator#3                      10.78ms     92.74
array_constructor_INTEGER_nullfree##2_null                 11.29ms     88.57
array_constructor_INTEGER_nullfree##2_const                10.14ms     98.65
array_constructor_INTEGER_nulls#facebookincubator#1                          4.49ms    222.53
array_constructor_INTEGER_nulls#facebookincubator#2                          9.78ms    102.29
array_constructor_INTEGER_nulls#facebookincubator#3                         14.69ms     68.08
array_constructor_INTEGER_nulls##2_null                    12.14ms     82.36
array_constructor_INTEGER_nulls##2_const                   12.27ms     81.53

Before:

array_constructor_MAP_nullfree#facebookincubator#1                          17.34ms     57.65
array_constructor_MAP_nullfree#facebookincubator#2                          29.84ms     33.51
array_constructor_MAP_nullfree#facebookincubator#3                          41.51ms     24.09
array_constructor_MAP_nullfree##2_null                     56.57ms     17.68
array_constructor_MAP_nullfree##2_const                    71.68ms     13.95
array_constructor_MAP_nulls#facebookincubator#1                             36.22ms     27.61
array_constructor_MAP_nulls#facebookincubator#2                             68.18ms     14.67
array_constructor_MAP_nulls#facebookincubator#3                             95.12ms     10.51
array_constructor_MAP_nulls##2_null                        86.42ms     11.57
array_constructor_MAP_nulls##2_const                      120.10ms      8.33

After:

array_constructor_MAP_nullfree#facebookincubator#1                          17.05ms     58.66
array_constructor_MAP_nullfree#facebookincubator#2                          28.42ms     35.18
array_constructor_MAP_nullfree#facebookincubator#3                          36.96ms     27.06
array_constructor_MAP_nullfree##2_null                     55.64ms     17.97
array_constructor_MAP_nullfree##2_const                    67.53ms     14.81
array_constructor_MAP_nulls#facebookincubator#1                             32.91ms     30.39
array_constructor_MAP_nulls#facebookincubator#2                             64.50ms     15.50
array_constructor_MAP_nulls#facebookincubator#3                             95.71ms     10.45
array_constructor_MAP_nulls##2_null                        77.22ms     12.95
array_constructor_MAP_nulls##2_const                      114.91ms      8.70

Before:

array_constructor_ROW_nullfree#facebookincubator#1                          33.88ms     29.52
array_constructor_ROW_nullfree#facebookincubator#2                          62.00ms     16.13
array_constructor_ROW_nullfree#facebookincubator#3                          89.54ms     11.17
array_constructor_ROW_nullfree##2_null                     78.46ms     12.75
array_constructor_ROW_nullfree##2_const                    95.53ms     10.47
array_constructor_ROW_nulls#facebookincubator#1                             44.11ms     22.67
array_constructor_ROW_nulls#facebookincubator#2                            115.43ms      8.66
array_constructor_ROW_nulls#facebookincubator#3                            173.61ms      5.76
array_constructor_ROW_nulls##2_null                       130.40ms      7.67
array_constructor_ROW_nulls##2_const                      169.97ms      5.88

After:

array_constructor_ROW_nullfree#facebookincubator#1                           5.55ms    180.15
array_constructor_ROW_nullfree#facebookincubator#2                          12.83ms     77.94
array_constructor_ROW_nullfree#facebookincubator#3                          18.89ms     52.95
array_constructor_ROW_nullfree##2_null                     18.74ms     53.36
array_constructor_ROW_nullfree##2_const                    18.16ms     55.07
array_constructor_ROW_nulls#facebookincubator#1                             11.29ms     88.61
array_constructor_ROW_nulls#facebookincubator#2                             18.57ms     53.86
array_constructor_ROW_nulls#facebookincubator#3                             34.20ms     29.24
array_constructor_ROW_nulls##2_null                        25.05ms     39.92
array_constructor_ROW_nulls##2_const                       25.15ms     39.77
```

Reviewed By: laithsakka

Differential Revision: D49272797

fbshipit-source-id: 55d83de7b69c7ae4b72b5a5ae62a7868f36b0e19
codyschierbeck pushed a commit to codyschierbeck/velox that referenced this pull request Sep 27, 2023
Summary:
Pull Request resolved: facebookincubator#6566

FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment)

```
FlatVector.h

  void copyRanges(
      const BaseVector* source,
      const folly::Range<const BaseVector::CopyRange*>& ranges) override {
    for (auto& range : ranges) {
      copy(source, range.targetIndex, range.sourceIndex, range.count);
    }
  }
```

This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same.

An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps.

```
Before:

array_constructor_ARRAY_nullfree##2_const                  54.31ms     18.41
array_constructor_ARRAY_nulls#facebookincubator#1                           33.50ms     29.85
array_constructor_ARRAY_nulls#facebookincubator#2                           59.05ms     16.93
array_constructor_ARRAY_nulls#facebookincubator#3                           88.36ms     11.32
array_constructor_ARRAY_nulls##2_null                      74.53ms     13.42
array_constructor_ARRAY_nulls##2_const                    102.54ms      9.75

After:

array_constructor_ARRAY_nullfree##2_const                  41.36ms     24.18
array_constructor_ARRAY_nulls#facebookincubator#1                           30.51ms     32.78
array_constructor_ARRAY_nulls#facebookincubator#2                           55.13ms     18.14
array_constructor_ARRAY_nulls#facebookincubator#3                           77.93ms     12.83
array_constructor_ARRAY_nulls##2_null                      68.84ms     14.53
array_constructor_ARRAY_nulls##2_const                     83.91ms     11.92

Before:

array_constructor_MAP_nullfree##2_const                    67.44ms     14.83
array_constructor_MAP_nulls#facebookincubator#1                             37.00ms     27.03
array_constructor_MAP_nulls#facebookincubator#2                             67.76ms     14.76
array_constructor_MAP_nulls#facebookincubator#3                            100.88ms      9.91
array_constructor_MAP_nulls##2_null                        84.22ms     11.87
array_constructor_MAP_nulls##2_const                      122.55ms      8.16

After:

array_constructor_MAP_nullfree##2_const                    49.94ms     20.03
array_constructor_MAP_nulls#facebookincubator#1                             34.34ms     29.12
array_constructor_MAP_nulls#facebookincubator#2                             55.23ms     18.11
array_constructor_MAP_nulls#facebookincubator#3                             82.64ms     12.10
array_constructor_MAP_nulls##2_null                        70.74ms     14.14
array_constructor_MAP_nulls##2_const                       88.13ms     11.35
```

Reviewed By: laithsakka

Differential Revision: D49269500

fbshipit-source-id: 7b702921202f4bb8d10a252eb0ab20f0e5792ae6
ericyuliu pushed a commit to ericyuliu/velox that referenced this pull request Oct 12, 2023
Summary:
Pull Request resolved: facebookincubator#6568

array_constructor is very slow: facebookincubator#5958 (comment)

array_constructor uses BaseVector::copyRanges, which is somewhat fast for arrays and maps, but very slow for primitive types:

```
FlatVector.h

  void copyRanges(
      const BaseVector* source,
      const folly::Range<const BaseVector::CopyRange*>& ranges) override {
    for (auto& range : ranges) {
      copy(source, range.targetIndex, range.sourceIndex, range.count);
    }
  }
```

FlatVector<T>::copy(source, rows, toSourceRow) is faster.

Switching from copyRanges to copy speeds up array_constructor for primitive types and structs significantly. Yet, this change makes arrays and maps slower.

The slowness is due to ArrayVector and MapVector not having implementation for copy(source, rows, toSourceRow). They rely on BaseVector::copy to translate rows + toSourceRow to ranges. This extra processing causes perf regression.

Hence, we use copy for primitive types and structs of these and copyRanges for everything else.

```
Before:

array_constructor_ARRAY_nullfree#facebookincubator#1                        16.80ms     59.53
array_constructor_ARRAY_nullfree#facebookincubator#2                        27.02ms     37.01
array_constructor_ARRAY_nullfree#facebookincubator#3                        38.03ms     26.30
array_constructor_ARRAY_nullfree##2_null                   52.86ms     18.92
array_constructor_ARRAY_nullfree##2_const                  54.97ms     18.19
array_constructor_ARRAY_nulls#facebookincubator#1                           30.61ms     32.66
array_constructor_ARRAY_nulls#facebookincubator#2                           55.01ms     18.18
array_constructor_ARRAY_nulls#facebookincubator#3                           80.69ms     12.39
array_constructor_ARRAY_nulls##2_null                      69.10ms     14.47
array_constructor_ARRAY_nulls##2_const                    103.85ms      9.63

After:

array_constructor_ARRAY_nullfree#facebookincubator#1                        15.25ms     65.58
array_constructor_ARRAY_nullfree#facebookincubator#2                        25.11ms     39.82
array_constructor_ARRAY_nullfree#facebookincubator#3                        34.59ms     28.91
array_constructor_ARRAY_nullfree##2_null                   53.61ms     18.65
array_constructor_ARRAY_nullfree##2_const                  51.48ms     19.42
array_constructor_ARRAY_nulls#facebookincubator#1                           29.99ms     33.34
array_constructor_ARRAY_nulls#facebookincubator#2                           55.91ms     17.89
array_constructor_ARRAY_nulls#facebookincubator#3                           81.73ms     12.24
array_constructor_ARRAY_nulls##2_null                      66.97ms     14.93
array_constructor_ARRAY_nulls##2_const                     92.96ms     10.76

Before:

array_constructor_INTEGER_nullfree#facebookincubator#1                      19.72ms     50.71
array_constructor_INTEGER_nullfree#facebookincubator#2                      34.51ms     28.97
array_constructor_INTEGER_nullfree#facebookincubator#3                      47.95ms     20.86
array_constructor_INTEGER_nullfree##2_null                 58.68ms     17.04
array_constructor_INTEGER_nullfree##2_const                45.15ms     22.15
array_constructor_INTEGER_nulls#facebookincubator#1                         29.99ms     33.34
array_constructor_INTEGER_nulls#facebookincubator#2                         55.32ms     18.08
array_constructor_INTEGER_nulls#facebookincubator#3                         78.53ms     12.73
array_constructor_INTEGER_nulls##2_null                    72.24ms     13.84
array_constructor_INTEGER_nulls##2_const                   71.13ms     14.06

After:

array_constructor_INTEGER_nullfree#facebookincubator#1                       3.39ms    294.89
array_constructor_INTEGER_nullfree#facebookincubator#2                       7.35ms    136.10
array_constructor_INTEGER_nullfree#facebookincubator#3                      10.78ms     92.74
array_constructor_INTEGER_nullfree##2_null                 11.29ms     88.57
array_constructor_INTEGER_nullfree##2_const                10.14ms     98.65
array_constructor_INTEGER_nulls#facebookincubator#1                          4.49ms    222.53
array_constructor_INTEGER_nulls#facebookincubator#2                          9.78ms    102.29
array_constructor_INTEGER_nulls#facebookincubator#3                         14.69ms     68.08
array_constructor_INTEGER_nulls##2_null                    12.14ms     82.36
array_constructor_INTEGER_nulls##2_const                   12.27ms     81.53

Before:

array_constructor_MAP_nullfree#facebookincubator#1                          17.34ms     57.65
array_constructor_MAP_nullfree#facebookincubator#2                          29.84ms     33.51
array_constructor_MAP_nullfree#facebookincubator#3                          41.51ms     24.09
array_constructor_MAP_nullfree##2_null                     56.57ms     17.68
array_constructor_MAP_nullfree##2_const                    71.68ms     13.95
array_constructor_MAP_nulls#facebookincubator#1                             36.22ms     27.61
array_constructor_MAP_nulls#facebookincubator#2                             68.18ms     14.67
array_constructor_MAP_nulls#facebookincubator#3                             95.12ms     10.51
array_constructor_MAP_nulls##2_null                        86.42ms     11.57
array_constructor_MAP_nulls##2_const                      120.10ms      8.33

After:

array_constructor_MAP_nullfree#facebookincubator#1                          17.05ms     58.66
array_constructor_MAP_nullfree#facebookincubator#2                          28.42ms     35.18
array_constructor_MAP_nullfree#facebookincubator#3                          36.96ms     27.06
array_constructor_MAP_nullfree##2_null                     55.64ms     17.97
array_constructor_MAP_nullfree##2_const                    67.53ms     14.81
array_constructor_MAP_nulls#facebookincubator#1                             32.91ms     30.39
array_constructor_MAP_nulls#facebookincubator#2                             64.50ms     15.50
array_constructor_MAP_nulls#facebookincubator#3                             95.71ms     10.45
array_constructor_MAP_nulls##2_null                        77.22ms     12.95
array_constructor_MAP_nulls##2_const                      114.91ms      8.70

Before:

array_constructor_ROW_nullfree#facebookincubator#1                          33.88ms     29.52
array_constructor_ROW_nullfree#facebookincubator#2                          62.00ms     16.13
array_constructor_ROW_nullfree#facebookincubator#3                          89.54ms     11.17
array_constructor_ROW_nullfree##2_null                     78.46ms     12.75
array_constructor_ROW_nullfree##2_const                    95.53ms     10.47
array_constructor_ROW_nulls#facebookincubator#1                             44.11ms     22.67
array_constructor_ROW_nulls#facebookincubator#2                            115.43ms      8.66
array_constructor_ROW_nulls#facebookincubator#3                            173.61ms      5.76
array_constructor_ROW_nulls##2_null                       130.40ms      7.67
array_constructor_ROW_nulls##2_const                      169.97ms      5.88

After:

array_constructor_ROW_nullfree#facebookincubator#1                           5.55ms    180.15
array_constructor_ROW_nullfree#facebookincubator#2                          12.83ms     77.94
array_constructor_ROW_nullfree#facebookincubator#3                          18.89ms     52.95
array_constructor_ROW_nullfree##2_null                     18.74ms     53.36
array_constructor_ROW_nullfree##2_const                    18.16ms     55.07
array_constructor_ROW_nulls#facebookincubator#1                             11.29ms     88.61
array_constructor_ROW_nulls#facebookincubator#2                             18.57ms     53.86
array_constructor_ROW_nulls#facebookincubator#3                             34.20ms     29.24
array_constructor_ROW_nulls##2_null                        25.05ms     39.92
array_constructor_ROW_nulls##2_const                       25.15ms     39.77
```

Reviewed By: laithsakka

Differential Revision: D49272797

fbshipit-source-id: 55d83de7b69c7ae4b72b5a5ae62a7868f36b0e19
ericyuliu pushed a commit to ericyuliu/velox that referenced this pull request Oct 12, 2023
Summary:
Pull Request resolved: facebookincubator#6566

FlatVector::copyRanges is slow when there are many small ranges. I noticed this while investigating slowness of array_constructor which used created 1-row ranges: facebookincubator#5958 (comment)

```
FlatVector.h

  void copyRanges(
      const BaseVector* source,
      const folly::Range<const BaseVector::CopyRange*>& ranges) override {
    for (auto& range : ranges) {
      copy(source, range.targetIndex, range.sourceIndex, range.count);
    }
  }
```

This change optimizes FlatVector<T>::copyRanges using code from copy(source, targetIndex, sourceIndex, count) which copies one range. This change also replaces the latter call with a call to copyRanges, hence, the overall amount of code is about the same.

An earlier change optimized array_constructor to not use copyRanges for primitive types, but it is still used by Array/MapVector::copyRanges, although, these do not create 1-row ranges. Still the array_constructor benchmark shows some wins for arrays and maps.

```
Before:

array_constructor_ARRAY_nullfree##2_const                  54.31ms     18.41
array_constructor_ARRAY_nulls#facebookincubator#1                           33.50ms     29.85
array_constructor_ARRAY_nulls#facebookincubator#2                           59.05ms     16.93
array_constructor_ARRAY_nulls#facebookincubator#3                           88.36ms     11.32
array_constructor_ARRAY_nulls##2_null                      74.53ms     13.42
array_constructor_ARRAY_nulls##2_const                    102.54ms      9.75

After:

array_constructor_ARRAY_nullfree##2_const                  41.36ms     24.18
array_constructor_ARRAY_nulls#facebookincubator#1                           30.51ms     32.78
array_constructor_ARRAY_nulls#facebookincubator#2                           55.13ms     18.14
array_constructor_ARRAY_nulls#facebookincubator#3                           77.93ms     12.83
array_constructor_ARRAY_nulls##2_null                      68.84ms     14.53
array_constructor_ARRAY_nulls##2_const                     83.91ms     11.92

Before:

array_constructor_MAP_nullfree##2_const                    67.44ms     14.83
array_constructor_MAP_nulls#facebookincubator#1                             37.00ms     27.03
array_constructor_MAP_nulls#facebookincubator#2                             67.76ms     14.76
array_constructor_MAP_nulls#facebookincubator#3                            100.88ms      9.91
array_constructor_MAP_nulls##2_null                        84.22ms     11.87
array_constructor_MAP_nulls##2_const                      122.55ms      8.16

After:

array_constructor_MAP_nullfree##2_const                    49.94ms     20.03
array_constructor_MAP_nulls#facebookincubator#1                             34.34ms     29.12
array_constructor_MAP_nulls#facebookincubator#2                             55.23ms     18.11
array_constructor_MAP_nulls#facebookincubator#3                             82.64ms     12.10
array_constructor_MAP_nulls##2_null                        70.74ms     14.14
array_constructor_MAP_nulls##2_const                       88.13ms     11.35
```

Reviewed By: laithsakka

Differential Revision: D49269500

fbshipit-source-id: 7b702921202f4bb8d10a252eb0ab20f0e5792ae6
laithsakka added a commit to laithsakka/velox that referenced this pull request Oct 12, 2023
Summary:
```
============================================================================
[...]hmarks/ExpressionBenchmarkBuilder.cpp     relative  time/iter   iters/s
============================================================================
map_subscript_MAP<ARRAY<VARCHAR>,INTEGER>#facebookincubator#1               71.48ms     13.99
map_subscript_MAP<ARRAY<VARCHAR>,INTEGER>#facebookincubator#2               76.58ms     13.06
map_subscript_MAP<ARRAY<VARCHAR>,INTEGER>#facebookincubator#3               85.31ms     11.72
map_subscript_MAP<ARRAY<VARCHAR>,INTEGER>#facebookincubator#4              121.56ms      8.23
map_subscript_MAP<INTEGER,INTEGER>#facebookincubator#1                      27.19ms     36.78
map_subscript_MAP<INTEGER,INTEGER>#facebookincubator#2                      33.10ms     30.21
map_subscript_MAP<INTEGER,INTEGER>#facebookincubator#3                      33.47ms     29.88
map_subscript_MAP<INTEGER,INTEGER>#facebookincubator#4                      31.70ms     31.55
map_subscript_MAP<VARCHAR,INTEGER>#facebookincubator#1                      26.92ms     37.14
map_subscript_MAP<VARCHAR,INTEGER>#facebookincubator#2                      36.62ms     27.31
map_subscript_MAP<VARCHAR,INTEGER>#facebookincubator#3                      34.19ms     29.24
map_subscript_MAP<VARCHAR,INTEGER>#facebookincubator#4                      33.76ms     29.62
```

Differential Revision: D50237919
laithsakka added a commit to laithsakka/velox that referenced this pull request Oct 12, 2023
Summary:

```
============================================================================
[...]hmarks/ExpressionBenchmarkBuilder.cpp     relative  time/iter   iters/s
============================================================================
map_subscript_MAP<ARRAY<VARCHAR>,INTEGER>#facebookincubator#1               71.48ms     13.99
map_subscript_MAP<ARRAY<VARCHAR>,INTEGER>#facebookincubator#2               76.58ms     13.06
map_subscript_MAP<ARRAY<VARCHAR>,INTEGER>#facebookincubator#3               85.31ms     11.72
map_subscript_MAP<ARRAY<VARCHAR>,INTEGER>#facebookincubator#4              121.56ms      8.23
map_subscript_MAP<INTEGER,INTEGER>#facebookincubator#1                      27.19ms     36.78
map_subscript_MAP<INTEGER,INTEGER>#facebookincubator#2                      33.10ms     30.21
map_subscript_MAP<INTEGER,INTEGER>#facebookincubator#3                      33.47ms     29.88
map_subscript_MAP<INTEGER,INTEGER>#facebookincubator#4                      31.70ms     31.55
map_subscript_MAP<VARCHAR,INTEGER>#facebookincubator#1                      26.92ms     37.14
map_subscript_MAP<VARCHAR,INTEGER>#facebookincubator#2                      36.62ms     27.31
map_subscript_MAP<VARCHAR,INTEGER>#facebookincubator#3                      34.19ms     29.24
map_subscript_MAP<VARCHAR,INTEGER>#facebookincubator#4                      33.76ms     29.62
```

Differential Revision: D50237919
laithsakka added a commit to laithsakka/velox that referenced this pull request Oct 12, 2023
Summary:

```
============================================================================
[...]hmarks/ExpressionBenchmarkBuilder.cpp     relative  time/iter   iters/s
============================================================================
map_subscript_MAP<ARRAY<VARCHAR>,INTEGER>#facebookincubator#1               71.48ms     13.99
map_subscript_MAP<ARRAY<VARCHAR>,INTEGER>#facebookincubator#2               76.58ms     13.06
map_subscript_MAP<ARRAY<VARCHAR>,INTEGER>#facebookincubator#3               85.31ms     11.72
map_subscript_MAP<ARRAY<VARCHAR>,INTEGER>#facebookincubator#4              121.56ms      8.23
map_subscript_MAP<INTEGER,INTEGER>#facebookincubator#1                      27.19ms     36.78
map_subscript_MAP<INTEGER,INTEGER>#facebookincubator#2                      33.10ms     30.21
map_subscript_MAP<INTEGER,INTEGER>#facebookincubator#3                      33.47ms     29.88
map_subscript_MAP<INTEGER,INTEGER>#facebookincubator#4                      31.70ms     31.55
map_subscript_MAP<VARCHAR,INTEGER>#facebookincubator#1                      26.92ms     37.14
map_subscript_MAP<VARCHAR,INTEGER>#facebookincubator#2                      36.62ms     27.31
map_subscript_MAP<VARCHAR,INTEGER>#facebookincubator#3                      34.19ms     29.24
map_subscript_MAP<VARCHAR,INTEGER>#facebookincubator#4                      33.76ms     29.62
```

Differential Revision: D50237919
laithsakka added a commit to laithsakka/velox that referenced this pull request Oct 12, 2023
Summary:

```
============================================================================
[...]hmarks/ExpressionBenchmarkBuilder.cpp     relative  time/iter   iters/s
============================================================================
map_subscript_MAP<ARRAY<VARCHAR>,INTEGER>#facebookincubator#1               71.48ms     13.99
map_subscript_MAP<ARRAY<VARCHAR>,INTEGER>#facebookincubator#2               76.58ms     13.06
map_subscript_MAP<ARRAY<VARCHAR>,INTEGER>#facebookincubator#3               85.31ms     11.72
map_subscript_MAP<ARRAY<VARCHAR>,INTEGER>#facebookincubator#4              121.56ms      8.23
map_subscript_MAP<INTEGER,INTEGER>#facebookincubator#1                      27.19ms     36.78
map_subscript_MAP<INTEGER,INTEGER>#facebookincubator#2                      33.10ms     30.21
map_subscript_MAP<INTEGER,INTEGER>#facebookincubator#3                      33.47ms     29.88
map_subscript_MAP<INTEGER,INTEGER>#facebookincubator#4                      31.70ms     31.55
map_subscript_MAP<VARCHAR,INTEGER>#facebookincubator#1                      26.92ms     37.14
map_subscript_MAP<VARCHAR,INTEGER>#facebookincubator#2                      36.62ms     27.31
map_subscript_MAP<VARCHAR,INTEGER>#facebookincubator#3                      34.19ms     29.24
map_subscript_MAP<VARCHAR,INTEGER>#facebookincubator#4                      33.76ms     29.62
```

Differential Revision: D50237919
TatianaJin pushed a commit to TatianaJin/velox that referenced this pull request Nov 13, 2023
facebook-github-bot pushed a commit that referenced this pull request Dec 19, 2023
Summary:
Velox GHA CI job `Run Benchmarks - Baseline` is now failing by this error:

```
terminate called after throwing an instance of 'facebook::velox::VeloxRuntimeError'
  what():  Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: The memory manager is not set
Retriable: False
Expression: instanceRef != nullptr
Function: getInstance
File: ../../velox/common/memory/Memory.cpp
Line: 129
Stack trace:
# 0  std::shared_ptr<facebook::velox::VeloxException::State const> facebook::velox::VeloxException::State::make<facebook::velox::VeloxException::make(char const*, unsigned long, char const*, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, bool, facebook::velox::VeloxException::Type, std::basic_string_view<char, std::char_traits<char> >)::{lambda(auto:1&)https://github.com/facebookincubator/velox/issues/1}>(facebook::velox::VeloxException::Type, facebook::velox::VeloxException::make(char const*, unsigned long, char const*, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, bool, facebook::velox::VeloxException::Type, std::basic_string_view<char, std::char_traits<char> >)::{lambda(auto:1&)#1})
# 1  facebook::velox::VeloxException::VeloxException(char const*, unsigned long, char const*, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, bool, facebook::velox::VeloxException::Type, std::basic_string_view<char, std::char_traits<char> >)
# 2  facebook::velox::VeloxRuntimeError::VeloxRuntimeError(char const*, unsigned long, char const*, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, bool, std::basic_string_view<char, std::char_traits<char> >)
# 3  void facebook::velox::detail::veloxCheckFail<facebook::velox::VeloxRuntimeError, char const*>(facebook::velox::detail::VeloxCheckFailArgs const&, char const*)
# 4  facebook::velox::memory::MemoryManager::getInstance()
# 5  __static_initialization_and_destruction_0(int, int)
# 6  _GLOBAL__sub_I__ZN5fLI6417FLAGS_fuzzer_seedE
# 7  __libc_csu_init
# 8  __libc_start_main
# 9  _start
```

The error should be fixed by this patch.

Follow up to #8079
Related to #7168

Pull Request resolved: #8098

Reviewed By: tanjialiang

Differential Revision: D52301375

Pulled By: xiaoxmeng

fbshipit-source-id: 01bbd4bed6b6b4aa5d018b9d13c126d4bee44cd4
facebook-github-bot pushed a commit that referenced this pull request Aug 23, 2024
Summary:
The default algorithm used is MD5. However, MD5 is not supported with
fips and can cause a SIGSEGV. Set CRC32 instead which is a standard for
checksum computation and is not restricted by fips.
crc32 is also faster than md5.

Internally at IBM, we hit the following SIGSEGV

```
0x0000000000000000 in ?? ()
Missing separate debuginfos, use: dnf debuginfo-install openssl-fips-provider-3.0.7-2.el9.x86_64 xz-libs-5.2.5-8.el9_0.x86_64
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x0000000004e5f89b in Aws::Utils::Crypto::MD5OpenSSLImpl::Calculate(std::istream&) ()
#2  0x0000000004efd298 in Aws::Utils::Crypto::MD5::Calculate(std::istream&) ()
#3  0x0000000004ef71b9 in Aws::Utils::HashingUtils::CalculateMD5(std::iostream&) ()
#4  0x0000000004e8ebe8 in Aws::Client::AWSClient::AddChecksumToRequest(std::shared_ptr<Aws::Http::HttpRequest> const&, Aws::AmazonWebServiceRequest const&) const ()
#5  0x0000000004e8ed15 in Aws::Client::AWSClient::BuildHttpRequest(Aws::AmazonWebServiceRequest const&, std::shared_ptr<Aws::Http::HttpRequest> const&) const ()
#6  0x0000000004e977f9 in Aws::Client::AWSClient::AttemptOneRequest(std::shared_ptr<Aws::Http::HttpRequest> const&, Aws::AmazonWebServiceRequest const&, char const*, char const*, char const*) const ()
#7  0x0000000004e9e1c0 in Aws::Client::AWSClient::AttemptExhaustively(Aws::Http::URI const&, Aws::AmazonWebServiceRequest const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const ()
#8  0x0000000004ea15e8 in Aws::Client::AWSXMLClient::MakeRequest(Aws::Http::URI const&, Aws::AmazonWebServiceRequest const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const ()
#9  0x0000000004ea1f70 in Aws::Client::AWSXMLClient::MakeRequest(Aws::AmazonWebServiceRequest const&, Aws::Endpoint::AWSEndpoint const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const ()
#10 0x0000000004de0933 in Aws::S3::S3Client::UploadPart(Aws::S3::Model::UploadPartRequest const&) const::{lambda()https://github.com/facebookincubator/velox/issues/1}::operator()() const ()
#11 0x0000000004de0b8c in std::_Function_handler<Aws::Utils::Outcome<Aws::S3::Model::UploadPartResult, Aws::S3::S3Error> (), Aws::S3::S3Client::UploadPart(Aws::S3::Model::UploadPartRequest const&) const::{lambda()https://github.com/facebookincubator/velox/issues/1}>::_M_invoke(std::_Any_data const&) ()
#12 0x0000000004e19317 in Aws::Utils::Outcome<Aws::S3::Model::UploadPartResult, Aws::S3::S3Error> smithy::components::tracing::TracingUtils::MakeCallWithTiming<Aws::Utils::Outcome<Aws::S3::Model::UploadPartResult, Aws::S3::S3Error> >(std::function<Aws::Utils::Outcome<Aws::S3::Model::UploadPartResult, Aws::S3::S3Error> ()>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, smithy::components::tracing::Meter const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#13 0x0000000004d7cdcf in Aws::S3::S3Client::UploadPart(Aws::S3::Model::UploadPartRequest const&) const ()
#14 0x0000000004ca4aa6 in facebook::velox::filesystems::S3WriteFile::Impl::uploadPart (this=0x7fffec2f09a0, part=..., isLast=true)
    at /root/velox/velox/connectors/hive/storage_adapters/s3fs/S3FileSystem.cpp:380
```

Pull Request resolved: #10801

Reviewed By: amitkdutta

Differential Revision: D61671574

Pulled By: kgpai

fbshipit-source-id: 34c7b777b3fde0659ef74c4fbfd93740fdfa3f7c
facebook-github-bot pushed a commit that referenced this pull request Aug 28, 2024
Summary:
We got the following exception when running window fuzzer test with `ROWS between k0 PRECEDING and -2157330091482954703 FOLLOWING` frame in ubuntu 20.04 os.
```
I0822 23:45:06.386909 2118850 WindowFuzzer.cpp:277] ==============================> Done with iteration 228
I0822 23:45:06.387009 2118850 WindowFuzzer.cpp:217] ==============================> Started iteration 229 (seed: 2266604503)
I0822 23:45:06.413928 2118850 AggregationFuzzerBase.cpp:427] Executing query plan:
-- Window[1][partition by [p0, p1] order by [s0 ASC NULLS FIRST, s1 DESC NULLS FIRST, row_number ASC NULLS LAST] w0 := approx_percentile(ROW["c0"],ROW["c1"],ROW["c2"]) ROWS between k0 FOLLOWING and -8268856359831993385 FOLLOWING] -> c0:DOUBLE, c1:BIGINT, c2:DOUBLE, s0:ROW<f0:BOOLEAN,f1:INTERVAL DAY TO SECOND,f2:BOOLEAN,f3:VARCHAR,f4:INTEGER>, s1:SMALLINT, p0:ARRAY<DOUBLE>, p1:ARRAY<DOUBLE>, k0:INTEGER, row_number:BIGINT, w0:DOUBLE
  -- Values[0][1000 rows in 10 vectors] -> c0:DOUBLE, c1:BIGINT, c2:DOUBLE, s0:ROW<f0:BOOLEAN,f1:INTERVAL DAY TO SECOND,f2:BOOLEAN,f3:VARCHAR,f4:INTEGER>, s1:SMALLINT, p0:ARRAY<DOUBLE>, p1:ARRAY<DOUBLE>, k0:INTEGER, row_number:BIGINT
I0822 23:45:06.414655 2121801 Task.cpp:1920] Terminating task test_cursor 1170 with state Failed after running for 0ms
I0822 23:45:06.414741 2121801 Task.cpp:1155] All drivers (1) finished for task test_cursor 1170 after running for 0ms
E0822 23:45:06.414860 2121801 MemoryPool.cpp:448] [MEM] Memory leak (Used memory): Memory Pool[op.1.0.0.Window LEAF root[query.TaskCursorQuery_1169.1619] parent[node.1] MALLOC track-usage thread-safe]<unlimited max capacity unlimited capacity used 128B available 1023.88KB reservation [used 128B, reserved 1.00MB, min 0B] counters [allocs 9, frees 8, reserves 0, releases 1, collisions 0])>
E0822 23:45:06.414930 2121801 Exceptions.h:67] Line: /mnt/DP_disk3/jk/projects/fb-velox/velox/velox/common/memory/MemoryArbitrator.cpp:100, Function:removePool, Expression: pool->reservedBytes() == 0 (1048576 vs. 0), Source: RUNTIME, ErrorCode: INVALID_STATE
terminate called after throwing an instance of 'facebook::velox::VeloxRuntimeError'
  what():  Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: (1048576 vs. 0)
Retriable: False
Expression: pool->reservedBytes() == 0
Function: removePool
File: /mnt/DP_disk3/jk/projects/fb-velox/velox/velox/common/memory/MemoryArbitrator.cpp
Line: 100
Stack trace:
# 0  facebook::velox::VeloxException::VeloxException(char const*, unsigned long, char const*, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, bool, facebook::velox::VeloxException::Type, std::basic_string_view<char, std::char_traits<char> >)
# 1  void facebook::velox::detail::veloxCheckFail<facebook::velox::VeloxRuntimeError, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>(facebook::velox::detail::VeloxCheckFailArgs const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
# 2  facebook::velox::memory::(anonymous namespace)::NoopArbitrator::removePool(facebook::velox::memory::MemoryPool*)
# 3  facebook::velox::memory::MemoryManager::dropPool(facebook::velox::memory::MemoryPool*)
# 4  facebook::velox::memory::MemoryPoolImpl::~MemoryPoolImpl()
# 5  std::_Sp_counted_ptr<facebook::velox::core::QueryCtx*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
# 6  std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() [clone .part.0]
# 7  facebook::velox::exec::Task::~Task()
# 8  std::_Sp_counted_ptr<facebook::velox::exec::Task*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
# 9  std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() [clone .part.0]
# 10 std::_Sp_counted_ptr<facebook::velox::exec::Driver*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
# 11 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() [clone .part.0]
# 12 unsigned long folly::detail::function::DispatchSmall::exec<facebook::velox::exec::Driver::enqueue(std::shared_ptr<facebook::velox::exec::Driver>)::{lambda()https://github.com/facebookincubator/velox/issues/1}>(folly::detail::function::Op, folly::detail::function::Data*, folly::detail::function::Data)
# 13 folly::ThreadPoolExecutor::runTask(std::shared_ptr<folly::ThreadPoolExecutor::Thread> const&, folly::ThreadPoolExecutor::Task&&)
# 14 folly::CPUThreadPoolExecutor::threadRun(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)
# 15 void folly::detail::function::call_<std::_Bind<void (folly::ThreadPoolExecutor::*(folly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)>, true, false, void>(, folly::detail::function::Data&)
# 16 0x00000000000d6df3
# 17 start_thread
# 18 clone

*** Aborted at 1724341506 (Unix time, try 'date -d 1724341506') ***
*** Signal 6 (SIGABRT) (0x2054c2) received by PID 2118850 (pthread TID 0x7f7e077fe700) (linux TID 2121801) (maybe from PID 2118850, UID 0) (code: -6), stack trace: ***
    @ 0000000007449637 folly::symbolizer::(anonymous namespace)::signalHandler(int, siginfo_t*, void*)
                       /mnt/DP_disk3/jk/projects/gluten/ep/build-velox/build/velox_ep/folly/folly/experimental/symbolizer/SignalHandler.cpp:453
    @ 000000000001441f (unknown)
    @ 000000000004300b gsignal
    @ 0000000000022858 abort
    @ 000000000009e8d0 (unknown)
    @ 00000000000aa37b (unknown)
    @ 00000000000a9358 (unknown)
    @ 00000000000a9d10 __gxx_personality_v0
    @ 0000000000003f88 __libunwind_Unwind_Resume
    @ 00000000023e7d28 facebook::velox::memory::MemoryManager::dropPool(facebook::velox::memory::MemoryPool*) [clone .cold]
    @ 0000000007395017 facebook::velox::memory::MemoryPoolImpl::~MemoryPoolImpl()
    @ 00000000071d6a98 std::_Sp_counted_ptr<facebook::velox::core::QueryCtx*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
    @ 00000000024332d9 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() [clone .part.0]
    @ 0000000006c75269 facebook::velox::exec::Task::~Task()
    @ 0000000006c92975 std::_Sp_counted_ptr<facebook::velox::exec::Task*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
    @ 00000000024332d9 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() [clone .part.0]
    @ 0000000006d0fdd3 std::_Sp_counted_ptr<facebook::velox::exec::Driver*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
    @ 00000000024332d9 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() [clone .part.0]
    @ 0000000006ce647f unsigned long folly::detail::function::DispatchSmall::exec<facebook::velox::exec::Driver::enqueue(std::shared_ptr<facebook::velox::exec::Driver>)::{lambda()https://github.com/facebookincubator/velox/issues/1}>(folly::detail::function::Op, folly::detail::function::Data*, folly::detail::function::Data)
    @ 000000000743f420 folly::ThreadPoolExecutor::runTask(std::shared_ptr<folly::ThreadPoolExecutor::Thread> const&, folly::ThreadPoolExecutor::Task&&)
                       /mnt/DP_disk3/jk/projects/gluten/ep/build-velox/build/velox_ep/folly/folly/Function.h:639
                       -> /mnt/DP_disk3/jk/projects/gluten/ep/build-velox/build/velox_ep/folly/folly/executors/ThreadPoolExecutor.cpp
    @ 0000000007427d25 folly::CPUThreadPoolExecutor::threadRun(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)
                       /mnt/DP_disk3/jk/projects/gluten/ep/build-velox/build/velox_ep/folly/folly/executors/CPUThreadPoolExecutor.cpp:350
    @ 0000000007445d2d void folly::detail::function::call_<std::_Bind<void (folly::ThreadPoolExecutor::*(folly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)>, true, false, void>(, folly::detail::function::Data&)
                       /usr/include/c++/9/bits/invoke.h:73
                       -> /mnt/DP_disk3/jk/projects/gluten/ep/build-velox/build/velox_ep/folly/folly/executors/ThreadPoolExecutor.cpp
    @ 00000000000d6df3 (unknown)
    @ 0000000000008608 start_thread
    @ 000000000011f352 clone
Aborted (core dumped)
```

We discovered that the memory leak occurs during the creation of the frame.

```
Detected total of 1 leaked allocations:
======== Leaked memory from 1 total allocations of 96B total size ========
# 0  facebook::velox::memory::MemoryPoolImpl::recordAllocDbg(void const*, unsigned
# 1  facebook::velox::memory::MemoryPoolImpl::allocate(long)
# 2  boost::intrusive_ptr<facebook::velox::Buffer> facebook::velox::AlignedBuffer::e<int>(unsigned long, facebook::velox::memory::MemoryPool*, std::optional<int> cons
# 3  std::shared_ptr<facebook::velox::BaseVector> facebook::velox::createEmpty<(facvelox::TypeKind)3>(int, facebook::velox::memory::MemoryPool*, std::shared_ptr<faceblox::Type const> const&)
# 4  facebook::velox::BaseVector::createInternal(std::shared_ptr<facebook::velox::Tst> const&, int, facebook::velox::memory::MemoryPool*)::{lambda()https://github.com/facebookincubator/velox/issues/1}::operator()() {lambda()https://github.com/facebookincubator/velox/issues/1}::operator()() const
# 5  facebook::velox::BaseVector::createInternal(std::shared_ptr<facebook::velox::Tst> const&, int, facebook::velox::memory::MemoryPool*)::{lambda()https://github.com/facebookincubator/velox/issues/1}::operator()()
# 6  facebook::velox::BaseVector::createInternal(std::shared_ptr<facebook::velox::Tst> const&, int, facebook::velox::memory::MemoryPool*)
# 7  std::shared_ptr<facebook::velox::BaseVector> facebook::velox::BaseVector::creabook::velox::BaseVector>(std::shared_ptr<facebook::velox::Type const> const&, int, k::velox::memory::MemoryPool*)
# 8  facebook::velox::exec::Window::createWindowFrame(std::shared_ptr<facebook::vele::WindowNode const> const&, facebook::velox::core::WindowNode::Frame const&, std::ptr<facebook::velox::RowType const> const&)::{lambda(std::shared_ptr<facebook::velo::ITypedExpr const> const&)https://github.com/facebookincubator/velox/issues/1}::operator()(std::shared_ptr<facebook::velox::core::Ipr const> const&) const
# 9  facebook::velox::exec::Window::createWindowFrame(std::shared_ptr<facebook::vele::WindowNode const> const&, facebook::velox::core::WindowNode::Frame const&, std::ptr<facebook::velox::RowType const> const&)
# 10 facebook::velox::exec::Window::createWindowFunctions()
# 11 facebook::velox::exec::Window::initialize()
# 12 facebook::velox::exec::Driver::initializeOperators()
# 13 facebook::velox::exec::Driver::runInternal(std::shared_ptr<facebook::velox::exver>&, std::shared_ptr<facebook::velox::exec::BlockingState>&, std::shared_ptr<faceelox::RowVector>&)
# 14 facebook::velox::exec::Driver::run(std::shared_ptr<facebook::velox::exec::Driv
# 15 facebook::velox::exec::Driver::enqueue(std::shared_ptr<facebook::velox::exec::)::{lambda()https://github.com/facebookincubator/velox/issues/1}::operator()() const
# 16 void folly::detail::function::call_<facebook::velox::exec::Driver::enqueue(stdd_ptr<facebook::velox::exec::Driver>)::{lambda()#1}, true, false, void>(, folly::deunction::Data&)
# 17 folly::ThreadPoolExecutor::runTask(std::shared_ptr<folly::ThreadPoolExecutor:: const&, folly::ThreadPoolExecutor::Task&&)
# 18 folly::CPUThreadPoolExecutor::threadRun(std::shared_ptr<folly::ThreadPoolExecuread>)
# 19 void folly::detail::function::call_<std::_Bind<void (folly::ThreadPoolExecutorly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::ptr<folly::ThreadPoolExecutor::Thread>)>, true, false, void>(, folly::detail::functta&)
# 20 folly::detail::function::FunctionTraits<void ()>::operator()()
# 21 folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()https://github.com/facebookincubator/velox/issues/1}tor()()
# 22 void std::__invoke_impl<void, folly::NamedThreadFactory::newThread(folly::Funcid ()>&&)::{lambda()https://github.com/facebookincubator/velox/issues/1}>(std::__invoke_other, folly::NamedThreadFactory::newThread(Function<void ()>&&)::{lambda()#1}&&)
# 23 std::__invoke_result<folly::NamedThreadFactory::newThread(folly::Function<void::{lambda()https://github.com/facebookincubator/velox/issues/1}>::type std::__invoke<folly::NamedThreadFactory::newThread(folly::Funoid ()>&&)::{lambda()https://github.com/facebookincubator/velox/issues/1}>(std::__invoke_result&&, (folly::NamedThreadFactory::newThlly::Function<void ()>&&)::{lambda()#1}&&)...)
# 24 void std::thread::_Invoker<std::tuple<folly::NamedThreadFactory::newThread(folction<void ()>&&)::{lambda()#1}> >::_M_invoke<0ul>(std::_Index_tuple<0ul>)
# 25 std::thread::_Invoker<std::tuple<folly::NamedThreadFactory::newThread(folly::F<void ()>&&)::{lambda()#1}> >::operator()()
# 26 std::thread::_State_impl<std::thread::_Invoker<std::tuple<folly::NamedThreadFanewThread(folly::Function<void ()>&&)::{lambda()#1}> > >::_M_run()
# 27 0x00000000000d6df3
# 28 start_thread
# 29 clone
```

The frame start has already been established and a buffer has been allocated [here](https://github.com/facebookincubator/velox/blob/main/velox/exec/Window.cpp#L144). However, when creating the frame end, the offset is negative, which triggers an error [here](https://github.com/facebookincubator/velox/blob/main/velox/exec/Window.cpp#L137). At this point, the buffer allocated for the frame start has not been released in time, resulting in the reported memory leak. This PR ensures that the expected error check for negative frame does not result in a memory leak.

Pull Request resolved: #10847

Reviewed By: Yuhta

Differential Revision: D61894144

Pulled By: xiaoxmeng

fbshipit-source-id: 859d95a37cf1481f1af7f8127e7544ef6ea4a402
Joe-Abraham pushed a commit to Joe-Abraham/velox that referenced this pull request Sep 2, 2024
…tor#10847)

Summary:
We got the following exception when running window fuzzer test with `ROWS between k0 PRECEDING and -2157330091482954703 FOLLOWING` frame in ubuntu 20.04 os.
```
I0822 23:45:06.386909 2118850 WindowFuzzer.cpp:277] ==============================> Done with iteration 228
I0822 23:45:06.387009 2118850 WindowFuzzer.cpp:217] ==============================> Started iteration 229 (seed: 2266604503)
I0822 23:45:06.413928 2118850 AggregationFuzzerBase.cpp:427] Executing query plan:
-- Window[1][partition by [p0, p1] order by [s0 ASC NULLS FIRST, s1 DESC NULLS FIRST, row_number ASC NULLS LAST] w0 := approx_percentile(ROW["c0"],ROW["c1"],ROW["c2"]) ROWS between k0 FOLLOWING and -8268856359831993385 FOLLOWING] -> c0:DOUBLE, c1:BIGINT, c2:DOUBLE, s0:ROW<f0:BOOLEAN,f1:INTERVAL DAY TO SECOND,f2:BOOLEAN,f3:VARCHAR,f4:INTEGER>, s1:SMALLINT, p0:ARRAY<DOUBLE>, p1:ARRAY<DOUBLE>, k0:INTEGER, row_number:BIGINT, w0:DOUBLE
  -- Values[0][1000 rows in 10 vectors] -> c0:DOUBLE, c1:BIGINT, c2:DOUBLE, s0:ROW<f0:BOOLEAN,f1:INTERVAL DAY TO SECOND,f2:BOOLEAN,f3:VARCHAR,f4:INTEGER>, s1:SMALLINT, p0:ARRAY<DOUBLE>, p1:ARRAY<DOUBLE>, k0:INTEGER, row_number:BIGINT
I0822 23:45:06.414655 2121801 Task.cpp:1920] Terminating task test_cursor 1170 with state Failed after running for 0ms
I0822 23:45:06.414741 2121801 Task.cpp:1155] All drivers (1) finished for task test_cursor 1170 after running for 0ms
E0822 23:45:06.414860 2121801 MemoryPool.cpp:448] [MEM] Memory leak (Used memory): Memory Pool[op.1.0.0.Window LEAF root[query.TaskCursorQuery_1169.1619] parent[node.1] MALLOC track-usage thread-safe]<unlimited max capacity unlimited capacity used 128B available 1023.88KB reservation [used 128B, reserved 1.00MB, min 0B] counters [allocs 9, frees 8, reserves 0, releases 1, collisions 0])>
E0822 23:45:06.414930 2121801 Exceptions.h:67] Line: /mnt/DP_disk3/jk/projects/fb-velox/velox/velox/common/memory/MemoryArbitrator.cpp:100, Function:removePool, Expression: pool->reservedBytes() == 0 (1048576 vs. 0), Source: RUNTIME, ErrorCode: INVALID_STATE
terminate called after throwing an instance of 'facebook::velox::VeloxRuntimeError'
  what():  Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: (1048576 vs. 0)
Retriable: False
Expression: pool->reservedBytes() == 0
Function: removePool
File: /mnt/DP_disk3/jk/projects/fb-velox/velox/velox/common/memory/MemoryArbitrator.cpp
Line: 100
Stack trace:
# 0  facebook::velox::VeloxException::VeloxException(char const*, unsigned long, char const*, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, bool, facebook::velox::VeloxException::Type, std::basic_string_view<char, std::char_traits<char> >)
# 1  void facebook::velox::detail::veloxCheckFail<facebook::velox::VeloxRuntimeError, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>(facebook::velox::detail::VeloxCheckFailArgs const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
# 2  facebook::velox::memory::(anonymous namespace)::NoopArbitrator::removePool(facebook::velox::memory::MemoryPool*)
# 3  facebook::velox::memory::MemoryManager::dropPool(facebook::velox::memory::MemoryPool*)
# 4  facebook::velox::memory::MemoryPoolImpl::~MemoryPoolImpl()
# 5  std::_Sp_counted_ptr<facebook::velox::core::QueryCtx*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
# 6  std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() [clone .part.0]
# 7  facebook::velox::exec::Task::~Task()
# 8  std::_Sp_counted_ptr<facebook::velox::exec::Task*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
# 9  std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() [clone .part.0]
# 10 std::_Sp_counted_ptr<facebook::velox::exec::Driver*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
# 11 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() [clone .part.0]
# 12 unsigned long folly::detail::function::DispatchSmall::exec<facebook::velox::exec::Driver::enqueue(std::shared_ptr<facebook::velox::exec::Driver>)::{lambda()https://github.com/facebookincubator/velox/issues/1}>(folly::detail::function::Op, folly::detail::function::Data*, folly::detail::function::Data)
# 13 folly::ThreadPoolExecutor::runTask(std::shared_ptr<folly::ThreadPoolExecutor::Thread> const&, folly::ThreadPoolExecutor::Task&&)
# 14 folly::CPUThreadPoolExecutor::threadRun(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)
# 15 void folly::detail::function::call_<std::_Bind<void (folly::ThreadPoolExecutor::*(folly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)>, true, false, void>(, folly::detail::function::Data&)
# 16 0x00000000000d6df3
# 17 start_thread
# 18 clone

*** Aborted at 1724341506 (Unix time, try 'date -d 1724341506') ***
*** Signal 6 (SIGABRT) (0x2054c2) received by PID 2118850 (pthread TID 0x7f7e077fe700) (linux TID 2121801) (maybe from PID 2118850, UID 0) (code: -6), stack trace: ***
    @ 0000000007449637 folly::symbolizer::(anonymous namespace)::signalHandler(int, siginfo_t*, void*)
                       /mnt/DP_disk3/jk/projects/gluten/ep/build-velox/build/velox_ep/folly/folly/experimental/symbolizer/SignalHandler.cpp:453
    @ 000000000001441f (unknown)
    @ 000000000004300b gsignal
    @ 0000000000022858 abort
    @ 000000000009e8d0 (unknown)
    @ 00000000000aa37b (unknown)
    @ 00000000000a9358 (unknown)
    @ 00000000000a9d10 __gxx_personality_v0
    @ 0000000000003f88 __libunwind_Unwind_Resume
    @ 00000000023e7d28 facebook::velox::memory::MemoryManager::dropPool(facebook::velox::memory::MemoryPool*) [clone .cold]
    @ 0000000007395017 facebook::velox::memory::MemoryPoolImpl::~MemoryPoolImpl()
    @ 00000000071d6a98 std::_Sp_counted_ptr<facebook::velox::core::QueryCtx*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
    @ 00000000024332d9 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() [clone .part.0]
    @ 0000000006c75269 facebook::velox::exec::Task::~Task()
    @ 0000000006c92975 std::_Sp_counted_ptr<facebook::velox::exec::Task*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
    @ 00000000024332d9 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() [clone .part.0]
    @ 0000000006d0fdd3 std::_Sp_counted_ptr<facebook::velox::exec::Driver*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
    @ 00000000024332d9 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() [clone .part.0]
    @ 0000000006ce647f unsigned long folly::detail::function::DispatchSmall::exec<facebook::velox::exec::Driver::enqueue(std::shared_ptr<facebook::velox::exec::Driver>)::{lambda()https://github.com/facebookincubator/velox/issues/1}>(folly::detail::function::Op, folly::detail::function::Data*, folly::detail::function::Data)
    @ 000000000743f420 folly::ThreadPoolExecutor::runTask(std::shared_ptr<folly::ThreadPoolExecutor::Thread> const&, folly::ThreadPoolExecutor::Task&&)
                       /mnt/DP_disk3/jk/projects/gluten/ep/build-velox/build/velox_ep/folly/folly/Function.h:639
                       -> /mnt/DP_disk3/jk/projects/gluten/ep/build-velox/build/velox_ep/folly/folly/executors/ThreadPoolExecutor.cpp
    @ 0000000007427d25 folly::CPUThreadPoolExecutor::threadRun(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)
                       /mnt/DP_disk3/jk/projects/gluten/ep/build-velox/build/velox_ep/folly/folly/executors/CPUThreadPoolExecutor.cpp:350
    @ 0000000007445d2d void folly::detail::function::call_<std::_Bind<void (folly::ThreadPoolExecutor::*(folly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)>, true, false, void>(, folly::detail::function::Data&)
                       /usr/include/c++/9/bits/invoke.h:73
                       -> /mnt/DP_disk3/jk/projects/gluten/ep/build-velox/build/velox_ep/folly/folly/executors/ThreadPoolExecutor.cpp
    @ 00000000000d6df3 (unknown)
    @ 0000000000008608 start_thread
    @ 000000000011f352 clone
Aborted (core dumped)
```

We discovered that the memory leak occurs during the creation of the frame.

```
Detected total of 1 leaked allocations:
======== Leaked memory from 1 total allocations of 96B total size ========
# 0  facebook::velox::memory::MemoryPoolImpl::recordAllocDbg(void const*, unsigned
# 1  facebook::velox::memory::MemoryPoolImpl::allocate(long)
# 2  boost::intrusive_ptr<facebook::velox::Buffer> facebook::velox::AlignedBuffer::e<int>(unsigned long, facebook::velox::memory::MemoryPool*, std::optional<int> cons
# 3  std::shared_ptr<facebook::velox::BaseVector> facebook::velox::createEmpty<(facvelox::TypeKind)3>(int, facebook::velox::memory::MemoryPool*, std::shared_ptr<faceblox::Type const> const&)
# 4  facebook::velox::BaseVector::createInternal(std::shared_ptr<facebook::velox::Tst> const&, int, facebook::velox::memory::MemoryPool*)::{lambda()https://github.com/facebookincubator/velox/issues/1}::operator()() {lambda()https://github.com/facebookincubator/velox/issues/1}::operator()() const
# 5  facebook::velox::BaseVector::createInternal(std::shared_ptr<facebook::velox::Tst> const&, int, facebook::velox::memory::MemoryPool*)::{lambda()https://github.com/facebookincubator/velox/issues/1}::operator()()
# 6  facebook::velox::BaseVector::createInternal(std::shared_ptr<facebook::velox::Tst> const&, int, facebook::velox::memory::MemoryPool*)
# 7  std::shared_ptr<facebook::velox::BaseVector> facebook::velox::BaseVector::creabook::velox::BaseVector>(std::shared_ptr<facebook::velox::Type const> const&, int, k::velox::memory::MemoryPool*)
# 8  facebook::velox::exec::Window::createWindowFrame(std::shared_ptr<facebook::vele::WindowNode const> const&, facebook::velox::core::WindowNode::Frame const&, std::ptr<facebook::velox::RowType const> const&)::{lambda(std::shared_ptr<facebook::velo::ITypedExpr const> const&)https://github.com/facebookincubator/velox/issues/1}::operator()(std::shared_ptr<facebook::velox::core::Ipr const> const&) const
# 9  facebook::velox::exec::Window::createWindowFrame(std::shared_ptr<facebook::vele::WindowNode const> const&, facebook::velox::core::WindowNode::Frame const&, std::ptr<facebook::velox::RowType const> const&)
# 10 facebook::velox::exec::Window::createWindowFunctions()
# 11 facebook::velox::exec::Window::initialize()
# 12 facebook::velox::exec::Driver::initializeOperators()
# 13 facebook::velox::exec::Driver::runInternal(std::shared_ptr<facebook::velox::exver>&, std::shared_ptr<facebook::velox::exec::BlockingState>&, std::shared_ptr<faceelox::RowVector>&)
# 14 facebook::velox::exec::Driver::run(std::shared_ptr<facebook::velox::exec::Driv
# 15 facebook::velox::exec::Driver::enqueue(std::shared_ptr<facebook::velox::exec::)::{lambda()https://github.com/facebookincubator/velox/issues/1}::operator()() const
# 16 void folly::detail::function::call_<facebook::velox::exec::Driver::enqueue(stdd_ptr<facebook::velox::exec::Driver>)::{lambda()facebookincubator#1}, true, false, void>(, folly::deunction::Data&)
# 17 folly::ThreadPoolExecutor::runTask(std::shared_ptr<folly::ThreadPoolExecutor:: const&, folly::ThreadPoolExecutor::Task&&)
# 18 folly::CPUThreadPoolExecutor::threadRun(std::shared_ptr<folly::ThreadPoolExecuread>)
# 19 void folly::detail::function::call_<std::_Bind<void (folly::ThreadPoolExecutorly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::ptr<folly::ThreadPoolExecutor::Thread>)>, true, false, void>(, folly::detail::functta&)
# 20 folly::detail::function::FunctionTraits<void ()>::operator()()
# 21 folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()https://github.com/facebookincubator/velox/issues/1}tor()()
# 22 void std::__invoke_impl<void, folly::NamedThreadFactory::newThread(folly::Funcid ()>&&)::{lambda()https://github.com/facebookincubator/velox/issues/1}>(std::__invoke_other, folly::NamedThreadFactory::newThread(Function<void ()>&&)::{lambda()facebookincubator#1}&&)
# 23 std::__invoke_result<folly::NamedThreadFactory::newThread(folly::Function<void::{lambda()https://github.com/facebookincubator/velox/issues/1}>::type std::__invoke<folly::NamedThreadFactory::newThread(folly::Funoid ()>&&)::{lambda()https://github.com/facebookincubator/velox/issues/1}>(std::__invoke_result&&, (folly::NamedThreadFactory::newThlly::Function<void ()>&&)::{lambda()facebookincubator#1}&&)...)
# 24 void std::thread::_Invoker<std::tuple<folly::NamedThreadFactory::newThread(folction<void ()>&&)::{lambda()facebookincubator#1}> >::_M_invoke<0ul>(std::_Index_tuple<0ul>)
# 25 std::thread::_Invoker<std::tuple<folly::NamedThreadFactory::newThread(folly::F<void ()>&&)::{lambda()facebookincubator#1}> >::operator()()
# 26 std::thread::_State_impl<std::thread::_Invoker<std::tuple<folly::NamedThreadFanewThread(folly::Function<void ()>&&)::{lambda()facebookincubator#1}> > >::_M_run()
# 27 0x00000000000d6df3
# 28 start_thread
# 29 clone
```

The frame start has already been established and a buffer has been allocated [here](https://github.com/facebookincubator/velox/blob/main/velox/exec/Window.cpp#L144). However, when creating the frame end, the offset is negative, which triggers an error [here](https://github.com/facebookincubator/velox/blob/main/velox/exec/Window.cpp#L137). At this point, the buffer allocated for the frame start has not been released in time, resulting in the reported memory leak. This PR ensures that the expected error check for negative frame does not result in a memory leak.

Pull Request resolved: facebookincubator#10847

Reviewed By: Yuhta

Differential Revision: D61894144

Pulled By: xiaoxmeng

fbshipit-source-id: 859d95a37cf1481f1af7f8127e7544ef6ea4a402
Joe-Abraham pushed a commit to Joe-Abraham/velox that referenced this pull request Sep 3, 2024
…tor#10847)

Summary:
We got the following exception when running window fuzzer test with `ROWS between k0 PRECEDING and -2157330091482954703 FOLLOWING` frame in ubuntu 20.04 os.
```
I0822 23:45:06.386909 2118850 WindowFuzzer.cpp:277] ==============================> Done with iteration 228
I0822 23:45:06.387009 2118850 WindowFuzzer.cpp:217] ==============================> Started iteration 229 (seed: 2266604503)
I0822 23:45:06.413928 2118850 AggregationFuzzerBase.cpp:427] Executing query plan:
-- Window[1][partition by [p0, p1] order by [s0 ASC NULLS FIRST, s1 DESC NULLS FIRST, row_number ASC NULLS LAST] w0 := approx_percentile(ROW["c0"],ROW["c1"],ROW["c2"]) ROWS between k0 FOLLOWING and -8268856359831993385 FOLLOWING] -> c0:DOUBLE, c1:BIGINT, c2:DOUBLE, s0:ROW<f0:BOOLEAN,f1:INTERVAL DAY TO SECOND,f2:BOOLEAN,f3:VARCHAR,f4:INTEGER>, s1:SMALLINT, p0:ARRAY<DOUBLE>, p1:ARRAY<DOUBLE>, k0:INTEGER, row_number:BIGINT, w0:DOUBLE
  -- Values[0][1000 rows in 10 vectors] -> c0:DOUBLE, c1:BIGINT, c2:DOUBLE, s0:ROW<f0:BOOLEAN,f1:INTERVAL DAY TO SECOND,f2:BOOLEAN,f3:VARCHAR,f4:INTEGER>, s1:SMALLINT, p0:ARRAY<DOUBLE>, p1:ARRAY<DOUBLE>, k0:INTEGER, row_number:BIGINT
I0822 23:45:06.414655 2121801 Task.cpp:1920] Terminating task test_cursor 1170 with state Failed after running for 0ms
I0822 23:45:06.414741 2121801 Task.cpp:1155] All drivers (1) finished for task test_cursor 1170 after running for 0ms
E0822 23:45:06.414860 2121801 MemoryPool.cpp:448] [MEM] Memory leak (Used memory): Memory Pool[op.1.0.0.Window LEAF root[query.TaskCursorQuery_1169.1619] parent[node.1] MALLOC track-usage thread-safe]<unlimited max capacity unlimited capacity used 128B available 1023.88KB reservation [used 128B, reserved 1.00MB, min 0B] counters [allocs 9, frees 8, reserves 0, releases 1, collisions 0])>
E0822 23:45:06.414930 2121801 Exceptions.h:67] Line: /mnt/DP_disk3/jk/projects/fb-velox/velox/velox/common/memory/MemoryArbitrator.cpp:100, Function:removePool, Expression: pool->reservedBytes() == 0 (1048576 vs. 0), Source: RUNTIME, ErrorCode: INVALID_STATE
terminate called after throwing an instance of 'facebook::velox::VeloxRuntimeError'
  what():  Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: (1048576 vs. 0)
Retriable: False
Expression: pool->reservedBytes() == 0
Function: removePool
File: /mnt/DP_disk3/jk/projects/fb-velox/velox/velox/common/memory/MemoryArbitrator.cpp
Line: 100
Stack trace:
# 0  facebook::velox::VeloxException::VeloxException(char const*, unsigned long, char const*, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, bool, facebook::velox::VeloxException::Type, std::basic_string_view<char, std::char_traits<char> >)
# 1  void facebook::velox::detail::veloxCheckFail<facebook::velox::VeloxRuntimeError, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>(facebook::velox::detail::VeloxCheckFailArgs const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
# 2  facebook::velox::memory::(anonymous namespace)::NoopArbitrator::removePool(facebook::velox::memory::MemoryPool*)
# 3  facebook::velox::memory::MemoryManager::dropPool(facebook::velox::memory::MemoryPool*)
# 4  facebook::velox::memory::MemoryPoolImpl::~MemoryPoolImpl()
# 5  std::_Sp_counted_ptr<facebook::velox::core::QueryCtx*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
# 6  std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() [clone .part.0]
# 7  facebook::velox::exec::Task::~Task()
# 8  std::_Sp_counted_ptr<facebook::velox::exec::Task*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
# 9  std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() [clone .part.0]
# 10 std::_Sp_counted_ptr<facebook::velox::exec::Driver*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
# 11 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() [clone .part.0]
# 12 unsigned long folly::detail::function::DispatchSmall::exec<facebook::velox::exec::Driver::enqueue(std::shared_ptr<facebook::velox::exec::Driver>)::{lambda()https://github.com/facebookincubator/velox/issues/1}>(folly::detail::function::Op, folly::detail::function::Data*, folly::detail::function::Data)
# 13 folly::ThreadPoolExecutor::runTask(std::shared_ptr<folly::ThreadPoolExecutor::Thread> const&, folly::ThreadPoolExecutor::Task&&)
# 14 folly::CPUThreadPoolExecutor::threadRun(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)
# 15 void folly::detail::function::call_<std::_Bind<void (folly::ThreadPoolExecutor::*(folly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)>, true, false, void>(, folly::detail::function::Data&)
# 16 0x00000000000d6df3
# 17 start_thread
# 18 clone

*** Aborted at 1724341506 (Unix time, try 'date -d 1724341506') ***
*** Signal 6 (SIGABRT) (0x2054c2) received by PID 2118850 (pthread TID 0x7f7e077fe700) (linux TID 2121801) (maybe from PID 2118850, UID 0) (code: -6), stack trace: ***
    @ 0000000007449637 folly::symbolizer::(anonymous namespace)::signalHandler(int, siginfo_t*, void*)
                       /mnt/DP_disk3/jk/projects/gluten/ep/build-velox/build/velox_ep/folly/folly/experimental/symbolizer/SignalHandler.cpp:453
    @ 000000000001441f (unknown)
    @ 000000000004300b gsignal
    @ 0000000000022858 abort
    @ 000000000009e8d0 (unknown)
    @ 00000000000aa37b (unknown)
    @ 00000000000a9358 (unknown)
    @ 00000000000a9d10 __gxx_personality_v0
    @ 0000000000003f88 __libunwind_Unwind_Resume
    @ 00000000023e7d28 facebook::velox::memory::MemoryManager::dropPool(facebook::velox::memory::MemoryPool*) [clone .cold]
    @ 0000000007395017 facebook::velox::memory::MemoryPoolImpl::~MemoryPoolImpl()
    @ 00000000071d6a98 std::_Sp_counted_ptr<facebook::velox::core::QueryCtx*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
    @ 00000000024332d9 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() [clone .part.0]
    @ 0000000006c75269 facebook::velox::exec::Task::~Task()
    @ 0000000006c92975 std::_Sp_counted_ptr<facebook::velox::exec::Task*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
    @ 00000000024332d9 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() [clone .part.0]
    @ 0000000006d0fdd3 std::_Sp_counted_ptr<facebook::velox::exec::Driver*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()
    @ 00000000024332d9 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() [clone .part.0]
    @ 0000000006ce647f unsigned long folly::detail::function::DispatchSmall::exec<facebook::velox::exec::Driver::enqueue(std::shared_ptr<facebook::velox::exec::Driver>)::{lambda()https://github.com/facebookincubator/velox/issues/1}>(folly::detail::function::Op, folly::detail::function::Data*, folly::detail::function::Data)
    @ 000000000743f420 folly::ThreadPoolExecutor::runTask(std::shared_ptr<folly::ThreadPoolExecutor::Thread> const&, folly::ThreadPoolExecutor::Task&&)
                       /mnt/DP_disk3/jk/projects/gluten/ep/build-velox/build/velox_ep/folly/folly/Function.h:639
                       -> /mnt/DP_disk3/jk/projects/gluten/ep/build-velox/build/velox_ep/folly/folly/executors/ThreadPoolExecutor.cpp
    @ 0000000007427d25 folly::CPUThreadPoolExecutor::threadRun(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)
                       /mnt/DP_disk3/jk/projects/gluten/ep/build-velox/build/velox_ep/folly/folly/executors/CPUThreadPoolExecutor.cpp:350
    @ 0000000007445d2d void folly::detail::function::call_<std::_Bind<void (folly::ThreadPoolExecutor::*(folly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)>, true, false, void>(, folly::detail::function::Data&)
                       /usr/include/c++/9/bits/invoke.h:73
                       -> /mnt/DP_disk3/jk/projects/gluten/ep/build-velox/build/velox_ep/folly/folly/executors/ThreadPoolExecutor.cpp
    @ 00000000000d6df3 (unknown)
    @ 0000000000008608 start_thread
    @ 000000000011f352 clone
Aborted (core dumped)
```

We discovered that the memory leak occurs during the creation of the frame.

```
Detected total of 1 leaked allocations:
======== Leaked memory from 1 total allocations of 96B total size ========
# 0  facebook::velox::memory::MemoryPoolImpl::recordAllocDbg(void const*, unsigned
# 1  facebook::velox::memory::MemoryPoolImpl::allocate(long)
# 2  boost::intrusive_ptr<facebook::velox::Buffer> facebook::velox::AlignedBuffer::e<int>(unsigned long, facebook::velox::memory::MemoryPool*, std::optional<int> cons
# 3  std::shared_ptr<facebook::velox::BaseVector> facebook::velox::createEmpty<(facvelox::TypeKind)3>(int, facebook::velox::memory::MemoryPool*, std::shared_ptr<faceblox::Type const> const&)
# 4  facebook::velox::BaseVector::createInternal(std::shared_ptr<facebook::velox::Tst> const&, int, facebook::velox::memory::MemoryPool*)::{lambda()https://github.com/facebookincubator/velox/issues/1}::operator()() {lambda()https://github.com/facebookincubator/velox/issues/1}::operator()() const
# 5  facebook::velox::BaseVector::createInternal(std::shared_ptr<facebook::velox::Tst> const&, int, facebook::velox::memory::MemoryPool*)::{lambda()https://github.com/facebookincubator/velox/issues/1}::operator()()
# 6  facebook::velox::BaseVector::createInternal(std::shared_ptr<facebook::velox::Tst> const&, int, facebook::velox::memory::MemoryPool*)
# 7  std::shared_ptr<facebook::velox::BaseVector> facebook::velox::BaseVector::creabook::velox::BaseVector>(std::shared_ptr<facebook::velox::Type const> const&, int, k::velox::memory::MemoryPool*)
# 8  facebook::velox::exec::Window::createWindowFrame(std::shared_ptr<facebook::vele::WindowNode const> const&, facebook::velox::core::WindowNode::Frame const&, std::ptr<facebook::velox::RowType const> const&)::{lambda(std::shared_ptr<facebook::velo::ITypedExpr const> const&)https://github.com/facebookincubator/velox/issues/1}::operator()(std::shared_ptr<facebook::velox::core::Ipr const> const&) const
# 9  facebook::velox::exec::Window::createWindowFrame(std::shared_ptr<facebook::vele::WindowNode const> const&, facebook::velox::core::WindowNode::Frame const&, std::ptr<facebook::velox::RowType const> const&)
# 10 facebook::velox::exec::Window::createWindowFunctions()
# 11 facebook::velox::exec::Window::initialize()
# 12 facebook::velox::exec::Driver::initializeOperators()
# 13 facebook::velox::exec::Driver::runInternal(std::shared_ptr<facebook::velox::exver>&, std::shared_ptr<facebook::velox::exec::BlockingState>&, std::shared_ptr<faceelox::RowVector>&)
# 14 facebook::velox::exec::Driver::run(std::shared_ptr<facebook::velox::exec::Driv
# 15 facebook::velox::exec::Driver::enqueue(std::shared_ptr<facebook::velox::exec::)::{lambda()https://github.com/facebookincubator/velox/issues/1}::operator()() const
# 16 void folly::detail::function::call_<facebook::velox::exec::Driver::enqueue(stdd_ptr<facebook::velox::exec::Driver>)::{lambda()facebookincubator#1}, true, false, void>(, folly::deunction::Data&)
# 17 folly::ThreadPoolExecutor::runTask(std::shared_ptr<folly::ThreadPoolExecutor:: const&, folly::ThreadPoolExecutor::Task&&)
# 18 folly::CPUThreadPoolExecutor::threadRun(std::shared_ptr<folly::ThreadPoolExecuread>)
# 19 void folly::detail::function::call_<std::_Bind<void (folly::ThreadPoolExecutorly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::ptr<folly::ThreadPoolExecutor::Thread>)>, true, false, void>(, folly::detail::functta&)
# 20 folly::detail::function::FunctionTraits<void ()>::operator()()
# 21 folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()https://github.com/facebookincubator/velox/issues/1}tor()()
# 22 void std::__invoke_impl<void, folly::NamedThreadFactory::newThread(folly::Funcid ()>&&)::{lambda()https://github.com/facebookincubator/velox/issues/1}>(std::__invoke_other, folly::NamedThreadFactory::newThread(Function<void ()>&&)::{lambda()facebookincubator#1}&&)
# 23 std::__invoke_result<folly::NamedThreadFactory::newThread(folly::Function<void::{lambda()https://github.com/facebookincubator/velox/issues/1}>::type std::__invoke<folly::NamedThreadFactory::newThread(folly::Funoid ()>&&)::{lambda()https://github.com/facebookincubator/velox/issues/1}>(std::__invoke_result&&, (folly::NamedThreadFactory::newThlly::Function<void ()>&&)::{lambda()facebookincubator#1}&&)...)
# 24 void std::thread::_Invoker<std::tuple<folly::NamedThreadFactory::newThread(folction<void ()>&&)::{lambda()facebookincubator#1}> >::_M_invoke<0ul>(std::_Index_tuple<0ul>)
# 25 std::thread::_Invoker<std::tuple<folly::NamedThreadFactory::newThread(folly::F<void ()>&&)::{lambda()facebookincubator#1}> >::operator()()
# 26 std::thread::_State_impl<std::thread::_Invoker<std::tuple<folly::NamedThreadFanewThread(folly::Function<void ()>&&)::{lambda()facebookincubator#1}> > >::_M_run()
# 27 0x00000000000d6df3
# 28 start_thread
# 29 clone
```

The frame start has already been established and a buffer has been allocated [here](https://github.com/facebookincubator/velox/blob/main/velox/exec/Window.cpp#L144). However, when creating the frame end, the offset is negative, which triggers an error [here](https://github.com/facebookincubator/velox/blob/main/velox/exec/Window.cpp#L137). At this point, the buffer allocated for the frame start has not been released in time, resulting in the reported memory leak. This PR ensures that the expected error check for negative frame does not result in a memory leak.

Pull Request resolved: facebookincubator#10847

Reviewed By: Yuhta

Differential Revision: D61894144

Pulled By: xiaoxmeng

fbshipit-source-id: 859d95a37cf1481f1af7f8127e7544ef6ea4a402
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants