Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

<numeric>: Can we improve throughput or reduce repetition with is_constant_evaluated()? #414

Closed
StephanTLavavej opened this issue Jan 9, 2020 · 0 comments · Fixed by #425
Assignees
Labels
fixed Something works now, yay! throughput Must compile faster

Comments

@StephanTLavavej
Copy link
Member

After @Neargye implemented constexpr <numeric> in #399, @miscco observed in #399 (comment) that there is a potential throughput concern here (and another occurrence below):

STL/stl/inc/numeric

Lines 78 to 97 in 7ad0d63

#ifdef __cpp_lib_is_constant_evaluated
// TRANSITION, DevCom-878972
if (_STD is_constant_evaluated()) {
for (; _UFirst != _ULast; ++_UFirst) {
_Val = _Reduce_op(_STD move(_Val), *_UFirst); // Requirement missing from N4713
}
return _Val;
} else
#endif // __cpp_lib_is_constant_evaluated
{
if constexpr (_Plus_on_arithmetic_ranges_reduction_v<_Unwrapped_t<const _InIt&>, _Ty, _BinOp>) {
(void) _Reduce_op; // TRANSITION, VSO-486357
return _Reduce_plus_arithmetic_ranges(_UFirst, _ULast, _Val);
} else {
for (; _UFirst != _ULast; ++_UFirst) {
_Val = _Reduce_op(_STD move(_Val), *_UFirst); // Requirement missing from N4713
}
return _Val;
}
}

On a related note, I observe that there is code repetition between the is_constant_evaluated and !_Plus_on_arithmetic_ranges_reduction_v paths.

I suspect that we can't get both optimal throughput and zero repetition (because of the non-if constexpr nature of is_constant_evaluated). So perhaps the best pattern is:

if constexpr (_Plus_on_arithmetic_ranges_reduction_v) {
    if (!_STD is_constant_evaluated()) {
        return _Reduce_plus_arithmetic_ranges(ARGS);
    }
}

return CLASSIC_IMPLEMENTATION;
@StephanTLavavej StephanTLavavej added enhancement Something can be improved throughput Must compile faster labels Jan 9, 2020
@BillyONeal BillyONeal self-assigned this Jan 13, 2020
BillyONeal added a commit to BillyONeal/STL that referenced this issue Jan 16, 2020
Resolves microsoftGH-6, microsoftGH-38, and drive-by fixes microsoftGH-414.

Constexprizes the following algorithms and enables all relevant libc++ tests:

* adjacent_find
* all_of
* any_of
* binary_search
* copy
* copy_backward
* copy_if
* copy_n
* count
* count_if
* equal
* equal_range
* exchange
* fill
* fill_n
* find
* find_end
* find_first_of
* find_if
* find_if_not
* for_each
* for_each_n
* generate
* generate_n
* includes
* is_heap
* is_heap
* is_heap_until
* is_partitioned
* is_permutation
* is_sorted
* is_sorted_until
* iter_swap
* lexicographical_compare
* lower_bound
* make_heap
* merge
* mismatch
* move
* move_backward
* next_permutation
* none_of
* nth_element
* partial_sort
* partial_sort_copy
* partition
* partition_copy
* partition_point
* pop_heap
* prev_permutation
* push_heap
* remove
* remove_copy
* remove_copy_if
* remove_if
* replace
* replace_copy
* replace_copy_if
* replace_if
* reverse_copy
* revese
* rotate
* rotate_copy
* search
* search_n
* set_difference
* set_intersection
* set_symmetric_difference
* set_union
* sort
* sort_heap
* swap
* swap_ranges
* transform
* unique
* unique_copy
* upper_bound

This commit also contains the contents of microsoft#423 to workaround DevCom-883631, it will be rebased and that content removed once that is merged (the other one needs to be merged first).

Specific notes:

`<xutility>`:
* The `_Ptr_copy_cat` family are moved down next to std::copy as that is their first consumer.
* The core language loop for `copy_n` is fairly long and so it was extracted into its own function, `_Copy_n_core` (note similar name schema to `_Copy_memmove`)
* `reverse` was changed to use early-returns for its optimization passes; this removes the nice thing of having if constexpr not instantiate the loop. However, this form allows the loop to not be duplicated.
BillyONeal added a commit to BillyONeal/STL that referenced this issue Jan 21, 2020
Resolves microsoftGH-6 ( P0202R3 ), microsoftGH-38 ( P0879R0 ), and drive-by fixes microsoftGH-414.

Everywhere: Add constexpr, _CONSTEXPR20, and _CONSTEXPR20_ICE to things.

skipped_tests.txt: Turn on all tests previously blocked by missing constexpr algorithms (and exchange and swap). Mark those algorithms that cannot be turned on that we have outstanding PRs for with their associated PRs.
yvals_core.h: Turn on feature test macros.
xutility:
* Move the _Ptr_cat family down to copy, and fix associated SHOUTY comments to indicate that this is really an implementation detail of copy, not something the rest of the standard library intends to use directly. Removed and clarified some of the comments as requested by Casey Carter.
* Extract _Copy_n_core which implements copy_n using only the core language (rather than memcpy-as-an-intrinsic). Note that we cannot use __builtin_memcpy or similar to avoid the is_constant_evaluated check here; builtin_memcpy only works in constexpr contexts when the inputs are of type char.
numeric: Refactor as suggested by microsoftGH-414.
BillyONeal added a commit to BillyONeal/STL that referenced this issue Jan 22, 2020
StephanTLavavej pushed a commit that referenced this issue Jan 23, 2020
* Implement constexpr algorithms.

Resolves GH-6 ( P0202R3 ), resolves GH-38 ( P0879R0 ), and drive-by fixes GH-414.

Everywhere: Add constexpr, _CONSTEXPR20, and _CONSTEXPR20_ICE to things.

skipped_tests.txt: Turn on all tests previously blocked by missing constexpr algorithms (and exchange and swap). Mark those algorithms that cannot be turned on that we have outstanding PRs for with their associated PRs.
yvals_core.h: Turn on feature test macros.
xutility:
* Move the _Ptr_cat family down to copy, and fix associated SHOUTY comments to indicate that this is really an implementation detail of copy, not something the rest of the standard library intends to use directly. Removed and clarified some of the comments as requested by Casey Carter.
* Extract _Copy_n_core which implements copy_n using only the core language (rather than memcpy-as-an-intrinsic). Note that we cannot use __builtin_memcpy or similar to avoid the is_constant_evaluated check here; builtin_memcpy only works in constexpr contexts when the inputs are of type char.
numeric: Refactor as suggested by GH-414.

* Attempt alternate fix of GH-414 suggested by Stephan.

* Stephan product code PR comments:

* _Swap_ranges_unchecked => _CONSTEXPR20
* _Idl_dist_add => _NODISCARD (and remove comments)
* is_permutation => _NODISCARD
* Add yvals_core.h comments.

* Delete unused _Copy_n_core and TRANSITION, DevCom-889321 comment.

* Put the comments in the right place and remove phantom braces.
@StephanTLavavej StephanTLavavej added the fixed Something works now, yay! label Jan 23, 2020
@StephanTLavavej StephanTLavavej removed the enhancement Something can be improved label Feb 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fixed Something works now, yay! throughput Must compile faster
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants