Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Don't instantiate the scalar composition code quadratically for dictionaries #2391

Merged
merged 2 commits into from
Aug 11, 2022

Conversation

Marwes
Copy link
Contributor

@Marwes Marwes commented Aug 10, 2022

Instead, re-use the ones normal function. Reduces how much code datafusion-physical-expr generated significantly (since the functions are generic, and not instantiated in arrow itself, it only shows up downstream).

https://github.com/apache/arrow-datafusion

There is technically an extra indirect call now as the recursive call to eq_dyn_scalar etc coerces to a dyn Array again but that seems unlikely to matter.

cargo llvm-lines -p datafusion-physical-expr

Before

 Lines           Copies        Function name
  -----           ------        -------------
  2270242 (100%)  38377 (100%)  (TOTAL)
   245854 (10.8%)  5580 (14.5%) core::option::Option<T>::ok_or_else
    58690 (2.6%)     10 (0.0%)  arrow::compute::kernels::comparison::eq_dyn_scalar
    58690 (2.6%)     10 (0.0%)  arrow::compute::kernels::comparison::gt_dyn_scalar
    58690 (2.6%)     10 (0.0%)  arrow::compute::kernels::comparison::gt_eq_dyn_scalar
    58690 (2.6%)     10 (0.0%)  arrow::compute::kernels::comparison::lt_dyn_scalar
    58690 (2.6%)     10 (0.0%)  arrow::compute::kernels::comparison::lt_eq_dyn_scalar
    58690 (2.6%)     10 (0.0%)  arrow::compute::kernels::comparison::neq_dyn_scalar
    55800 (2.5%)    900 (2.3%)  arrow::compute::kernels::comparison::eq_dyn_scalar::{{closure}}
    55800 (2.5%)    900 (2.3%)  arrow::compute::kernels::comparison::gt_dyn_scalar::{{closure}}
    55800 (2.5%)    900 (2.3%)  arrow::compute::kernels::comparison::gt_eq_dyn_scalar::{{closure}}
    55800 (2.5%)    900 (2.3%)  arrow::compute::kernels::comparison::lt_dyn_scalar::{{closure}}
    55800 (2.5%)    900 (2.3%)  arrow::compute::kernels::comparison::lt_eq_dyn_scalar::{{closure}}
    55800 (2.5%)    900 (2.3%)  arrow::compute::kernels::comparison::neq_dyn_scalar::{{closure}}
    44929 (2.0%)    900 (2.3%)  core::option::Option<T>::map
    40986 (1.8%)    162 (0.4%)  <arrow::array::array_boolean::BooleanArray as core::iter::traits::collect::FromIterator<Ptr>>::from_iter
    37528 (1.7%)    508 (1.3%)  core::iter::traits::iterator::Iterator::fold
    30595 (1.3%)    245 (0.6%)  <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter
    29272 (1.3%)     46 (0.1%)  <core::iter::adapters::flatten::FlattenCompat<I,U> as core::iter::traits::iterator::Iterator>::size_hint
    27815 (1.2%)    285 (0.7%)  core::iter::traits::iterator::Iterator::try_fold
    26014 (1.1%)      1 (0.0%)  datafusion_physical_expr::expressions::binary::BinaryExpr::evaluate_array_scalar
    25095 (1.1%)    441 (1.1%)  core::iter::adapters::map::map_fold::{{closure}}
    22849 (1.0%)    174 (0.5%)  <core::iter::adapters::GenericShunt<I,R> as core::iter::traits::iterator::Iterator>::try_fold::{{closure}}
    21888 (1.0%)     96 (0.3%)  arrow::compute::kernels::comparison::compare_op_scalar
    21464 (0.9%)     56 (0.1%)  <arrow::array::array_string::GenericStringArray<OffsetSize> as core::iter::traits::collect::FromIterator<core::option::Option<Ptr>>>::from_iter
    21461 (0.9%)    441 (1.1%)  <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::fold
    19918 (0.9%)    118 (0.3%)  arrow::buffer::mutable::MutableBuffer::from_trusted_len_iter
    16916 (0.7%)    246 (0.6%)  <alloc::vec::Vec<T,A> as alloc::vec::spec_extend::SpecExtend<T,I>>::spec_extend

After

  Lines           Copies        Function name
  -----           ------        -------------
  1475122 (100%)  28777 (100%)  (TOTAL)
    44929 (3.0%)    900 (3.1%)  core::option::Option<T>::map
    40986 (2.8%)    162 (0.6%)  <arrow::array::array_boolean::BooleanArray as core::iter::traits::collect::FromIterator<Ptr>>::from_iter
    37528 (2.5%)    508 (1.8%)  core::iter::traits::iterator::Iterator::fold
    34174 (2.3%)    780 (2.7%)  core::option::Option<T>::ok_or_else
    30595 (2.1%)    245 (0.9%)  <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter
    29272 (2.0%)     46 (0.2%)  <core::iter::adapters::flatten::FlattenCompat<I,U> as core::iter::traits::iterator::Iterator>::size_hint
    27815 (1.9%)    285 (1.0%)  core::iter::traits::iterator::Iterator::try_fold
    26014 (1.8%)      1 (0.0%)  datafusion_physical_expr::expressions::binary::BinaryExpr::evaluate_array_scalar
    25095 (1.7%)    441 (1.5%)  core::iter::adapters::map::map_fold::{{closure}}
    22849 (1.5%)    174 (0.6%)  <core::iter::adapters::GenericShunt<I,R> as core::iter::traits::iterator::Iterator>::try_fold::{{closure}}
    21888 (1.5%)     96 (0.3%)  arrow::compute::kernels::comparison::compare_op_scalar
    21464 (1.5%)     56 (0.2%)  <arrow::array::array_string::GenericStringArray<OffsetSize> as core::iter::traits::collect::FromIterator<core::option::Option<Ptr>>>::from_iter
    21461 (1.5%)    441 (1.5%)  <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::fold
    19918 (1.4%)    118 (0.4%)  arrow::buffer::mutable::MutableBuffer::from_trusted_len_iter
    16916 (1.1%)    246 (0.9%)  <alloc::vec::Vec<T,A> as alloc::vec::spec_extend::SpecExtend<T,I>>::spec_extend
    16146 (1.1%)    960 (3.3%)  core::iter::adapters::map::Map<I,F>::new
    15492 (1.1%)    427 (1.5%)  core::iter::traits::iterator::Iterator::for_each
    14921 (1.0%)    111 (0.4%)  alloc::vec::Vec<T,A>::extend_desugared
    14670 (1.0%)    126 (0.4%)  core::iter::adapters::try_process
    13918 (0.9%)      1 (0.0%)  datafusion_physical_expr::expressions::binary::BinaryExpr::evaluate_scalar_array
    13120 (0.9%)     64 (0.2%)  <arrow::array::array_primitive::PrimitiveArray<T> as core::iter::traits::collect::FromIterator<Ptr>>::from_iter
    12963 (0.9%)     52 (0.2%)  <core::iter::adapters::flatten::FlattenCompat<I,U> as core::iter::traits::iterator::Iterator>::try_fold
    12245 (0.8%)    180 (0.6%)  <core::iter::adapters::enumerate::Enumerate<I> as core::iter::traits::iterator::Iterator>::fold::enumerate::{{closure}}
    12201 (0.8%)     81 (0.3%)  arrow::buffer::mutable::MutableBuffer::extend_from_iter
    11826 (0.8%)    162 (0.6%)  <arrow::array::array_boolean::BooleanArray as core::iter::traits::collect::FromIterator<Ptr>>::from_iter::{{closure}}
    11536 (0.8%)    960 (3.3%)  core::iter::traits::iterator::Iterator::map
    11200 (0.8%)     32 (0.1%)  alloc::raw_vec::RawVec<T,A>::grow_amortized

Rationale for this change

Cuts down (release) compile times when these functions are used (in datafusion-physical-expr)

Before

________________________________________________________
Executed in  147,60 secs   fish           external
   usr time  323,32 secs  693,00 micros  323,32 secs
   sys time    9,43 secs  187,00 micros    9,43 secs

After

Executed in  133,27 secs   fish           external
   usr time  312,95 secs  498,00 micros  312,95 secs
   sys time    9,94 secs  134,00 micros    9,94 secs

Debug compile times remain pretty much unchanged, phyiscal-expr itself gets ~4s faster (21 -> 17) but it ends up having to wait for arrow to complete instead.

Markus Westerlind added 2 commits August 10, 2022 13:59
…dictionaries

Instead, re-use the ones normal function. Reduces how much code `datafusion-physical-expr` generated significantly (since the functions are generic, and not instantiated in `arrow` itself, it only shows up downstream).

https://github.com/apache/arrow-datafusion

There is technically an extra indirect call now as the recursive call to `eq_dyn_scalar` etc coerces to a `dyn Array` again but that seems unlikely to matter.

## cargo llvm-lines -p datafusion-physical-expr

### Before

```
 Lines           Copies        Function name
  -----           ------        -------------
  2270242 (100%)  38377 (100%)  (TOTAL)
   245854 (10.8%)  5580 (14.5%) core::option::Option<T>::ok_or_else
    58690 (2.6%)     10 (0.0%)  arrow::compute::kernels::comparison::eq_dyn_scalar
    58690 (2.6%)     10 (0.0%)  arrow::compute::kernels::comparison::gt_dyn_scalar
    58690 (2.6%)     10 (0.0%)  arrow::compute::kernels::comparison::gt_eq_dyn_scalar
    58690 (2.6%)     10 (0.0%)  arrow::compute::kernels::comparison::lt_dyn_scalar
    58690 (2.6%)     10 (0.0%)  arrow::compute::kernels::comparison::lt_eq_dyn_scalar
    58690 (2.6%)     10 (0.0%)  arrow::compute::kernels::comparison::neq_dyn_scalar
    55800 (2.5%)    900 (2.3%)  arrow::compute::kernels::comparison::eq_dyn_scalar::{{closure}}
    55800 (2.5%)    900 (2.3%)  arrow::compute::kernels::comparison::gt_dyn_scalar::{{closure}}
    55800 (2.5%)    900 (2.3%)  arrow::compute::kernels::comparison::gt_eq_dyn_scalar::{{closure}}
    55800 (2.5%)    900 (2.3%)  arrow::compute::kernels::comparison::lt_dyn_scalar::{{closure}}
    55800 (2.5%)    900 (2.3%)  arrow::compute::kernels::comparison::lt_eq_dyn_scalar::{{closure}}
    55800 (2.5%)    900 (2.3%)  arrow::compute::kernels::comparison::neq_dyn_scalar::{{closure}}
    44929 (2.0%)    900 (2.3%)  core::option::Option<T>::map
    40986 (1.8%)    162 (0.4%)  <arrow::array::array_boolean::BooleanArray as core::iter::traits::collect::FromIterator<Ptr>>::from_iter
    37528 (1.7%)    508 (1.3%)  core::iter::traits::iterator::Iterator::fold
    30595 (1.3%)    245 (0.6%)  <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter
    29272 (1.3%)     46 (0.1%)  <core::iter::adapters::flatten::FlattenCompat<I,U> as core::iter::traits::iterator::Iterator>::size_hint
    27815 (1.2%)    285 (0.7%)  core::iter::traits::iterator::Iterator::try_fold
    26014 (1.1%)      1 (0.0%)  datafusion_physical_expr::expressions::binary::BinaryExpr::evaluate_array_scalar
    25095 (1.1%)    441 (1.1%)  core::iter::adapters::map::map_fold::{{closure}}
    22849 (1.0%)    174 (0.5%)  <core::iter::adapters::GenericShunt<I,R> as core::iter::traits::iterator::Iterator>::try_fold::{{closure}}
    21888 (1.0%)     96 (0.3%)  arrow::compute::kernels::comparison::compare_op_scalar
    21464 (0.9%)     56 (0.1%)  <arrow::array::array_string::GenericStringArray<OffsetSize> as core::iter::traits::collect::FromIterator<core::option::Option<Ptr>>>::from_iter
    21461 (0.9%)    441 (1.1%)  <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::fold
    19918 (0.9%)    118 (0.3%)  arrow::buffer::mutable::MutableBuffer::from_trusted_len_iter
    16916 (0.7%)    246 (0.6%)  <alloc::vec::Vec<T,A> as alloc::vec::spec_extend::SpecExtend<T,I>>::spec_extend
```

### After

```
  Lines           Copies        Function name
  -----           ------        -------------
  1475122 (100%)  28777 (100%)  (TOTAL)
    44929 (3.0%)    900 (3.1%)  core::option::Option<T>::map
    40986 (2.8%)    162 (0.6%)  <arrow::array::array_boolean::BooleanArray as core::iter::traits::collect::FromIterator<Ptr>>::from_iter
    37528 (2.5%)    508 (1.8%)  core::iter::traits::iterator::Iterator::fold
    34174 (2.3%)    780 (2.7%)  core::option::Option<T>::ok_or_else
    30595 (2.1%)    245 (0.9%)  <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter
    29272 (2.0%)     46 (0.2%)  <core::iter::adapters::flatten::FlattenCompat<I,U> as core::iter::traits::iterator::Iterator>::size_hint
    27815 (1.9%)    285 (1.0%)  core::iter::traits::iterator::Iterator::try_fold
    26014 (1.8%)      1 (0.0%)  datafusion_physical_expr::expressions::binary::BinaryExpr::evaluate_array_scalar
    25095 (1.7%)    441 (1.5%)  core::iter::adapters::map::map_fold::{{closure}}
    22849 (1.5%)    174 (0.6%)  <core::iter::adapters::GenericShunt<I,R> as core::iter::traits::iterator::Iterator>::try_fold::{{closure}}
    21888 (1.5%)     96 (0.3%)  arrow::compute::kernels::comparison::compare_op_scalar
    21464 (1.5%)     56 (0.2%)  <arrow::array::array_string::GenericStringArray<OffsetSize> as core::iter::traits::collect::FromIterator<core::option::Option<Ptr>>>::from_iter
    21461 (1.5%)    441 (1.5%)  <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::fold
    19918 (1.4%)    118 (0.4%)  arrow::buffer::mutable::MutableBuffer::from_trusted_len_iter
    16916 (1.1%)    246 (0.9%)  <alloc::vec::Vec<T,A> as alloc::vec::spec_extend::SpecExtend<T,I>>::spec_extend
    16146 (1.1%)    960 (3.3%)  core::iter::adapters::map::Map<I,F>::new
    15492 (1.1%)    427 (1.5%)  core::iter::traits::iterator::Iterator::for_each
    14921 (1.0%)    111 (0.4%)  alloc::vec::Vec<T,A>::extend_desugared
    14670 (1.0%)    126 (0.4%)  core::iter::adapters::try_process
    13918 (0.9%)      1 (0.0%)  datafusion_physical_expr::expressions::binary::BinaryExpr::evaluate_scalar_array
    13120 (0.9%)     64 (0.2%)  <arrow::array::array_primitive::PrimitiveArray<T> as core::iter::traits::collect::FromIterator<Ptr>>::from_iter
    12963 (0.9%)     52 (0.2%)  <core::iter::adapters::flatten::FlattenCompat<I,U> as core::iter::traits::iterator::Iterator>::try_fold
    12245 (0.8%)    180 (0.6%)  <core::iter::adapters::enumerate::Enumerate<I> as core::iter::traits::iterator::Iterator>::fold::enumerate::{{closure}}
    12201 (0.8%)     81 (0.3%)  arrow::buffer::mutable::MutableBuffer::extend_from_iter
    11826 (0.8%)    162 (0.6%)  <arrow::array::array_boolean::BooleanArray as core::iter::traits::collect::FromIterator<Ptr>>::from_iter::{{closure}}
    11536 (0.8%)    960 (3.3%)  core::iter::traits::iterator::Iterator::map
    11200 (0.8%)     32 (0.1%)  alloc::raw_vec::RawVec<T,A>::grow_amortized

```
…ry_to_type in comparisons (-4%)

Reduces the number of llvm-lines in datafusion-physical-expr by another 4%
@github-actions github-actions bot added the arrow Changes to the arrow crate label Aug 10, 2022
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM -- thanks @Marwes !

@@ -1186,7 +1163,7 @@ where
{
match left.data_type() {
DataType::Dictionary(key_type, _value_type) => {
dyn_compare_scalar!(left, right, key_type, eq_scalar)
dyn_compare_scalar!(left, right, key_type, eq_dyn_scalar)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

@alamb alamb merged commit 6948373 into apache:master Aug 11, 2022
@ursabot
Copy link

ursabot commented Aug 11, 2022

Benchmark runs are scheduled for baseline = 6b1369e and contender = 6948373. 6948373 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants