std::linalg accessors and transposed_layout #2962
Merged
Commits (changes from 62 of 70 commits)
dee9385 draft of scaled accessor (fbusato)
0a33496 add scaled unit test (fbusato)
fb03ff2 Merge branch 'main' into linalg-accessors (fbusato)
c5c6fa4 add conjugated accessor (fbusato)
ce2cc84 refined scaled accessor implementation (fbusato)
1bf05d0 add [[nodiscard]] (fbusato)
1a261ff add transposed function (fbusato)
8db4036 add conjugate_transposed (fbusato)
f48792a fix internal names (fbusato)
67c59f9 replace inline lambda with function object (fbusato)
9d913be add tests (fbusato)
77c2d66 Merge branch 'main' into linalg-accessors (fbusato)
3d06282 fix c++20 requires clause (fbusato)
c3d1464 Merge branch 'main' into linalg-accessors (fbusato)
5390bd1 add __cccl_lib_mdspan check (fbusato)
626f63a prevent to include headers if the compiler is not supported (fbusato)
baa26f6 skip double noexcept for old compilers (fbusato)
4cce6e6 avoid deduction guides to prevent errors with old gcc versions (fbusato)
5031fb6 fix #endif position (fbusato)
8d20f84 Merge branch 'main' into linalg-accessors (fbusato)
bdee5a3 fix clang9/gcc9 compatibility (fbusato)
10c0f0a remove redundant header (fbusato)
feed72e avoid deduction guides for conjugate_transposed (fbusato)
30cf216 fix variable shadowing (fbusato)
a39d27a fix nvrtc header bug (fbusato)
9699ce5 Merge branch 'main' into linalg-accessors (fbusato)
2da684d relax noexcept(noexcept()) compiler filtering (fbusato)
8746de6 adopt concept for conj_if_needed (fbusato)
6445b49 add documentation (fbusato)
cf34003 fix compiler identification macro (fbusato)
ad47459 Update libcudacxx/include/cuda/std/__linalg/conjugate_transposed.h (fbusato)
9c621a5 Update libcudacxx/include/cuda/std/__linalg/conjugated.h (fbusato)
6d521d8 Update libcudacxx/include/cuda/std/__linalg/conjugated.h (fbusato)
ea0038f Update libcudacxx/include/cuda/std/__linalg/conjugated.h (fbusato)
6f18bee Update libcudacxx/include/cuda/std/__linalg/conjugate_transposed.h (fbusato)
4411e5d add linalg reference in docs (fbusato)
22c37f6 fix documentation (fbusato)
fe7d176 add linalg header (fbusato)
48483f4 change test header (fbusato)
7fb5d60 remove redundant namespace specifications (fbusato)
627d354 Merge branch 'main' into linalg-accessors (fbusato)
b565ad6 add operator!= (fbusato)
5af13ad fix new linalg documentation position (fbusato)
103f1b6 fix c++20 require clause (fbusato)
377bf53 fix requires expression again (fbusato)
eef1e17 remove forward_like duplication in docs (fbusato)
de30054 license update (fbusato)
8093083 Merge branch 'main' into linalg-accessors (fbusato)
140c843 add `unreachable` in standard_api.rst (fbusato)
ec199d0 Merge branch 'main' into linalg-accessors (fbusato)
71c7211 add missing constexpr (fbusato)
7b529bd remove duplicate line in docs (fbusato)
03e7387 split scaled accessor constructors (fbusato)
67686de Merge branch 'NVIDIA:main' into linalg-accessors (fbusato)
94d95c8 license update to Kokkos v4 (fbusato)
e3ce965 Update libcudacxx/test/libcudacxx/std/linalg/layouts/transposed.pass.cpp (fbusato)
962c513 Update libcudacxx/test/libcudacxx/std/linalg/accessors/conjugate_tran… (fbusato)
b618e27 Update libcudacxx/test/libcudacxx/std/linalg/accessors/conjugated.pas… (fbusato)
d29a17d Update libcudacxx/test/libcudacxx/std/linalg/accessors/scaled.pass.cpp (fbusato)
6cf30e7 replace [[maybe_unused]] (fbusato)
91201f6 moving noexcept to static constexpr functions (fbusato)
2186a28 moving private functions on top (fbusato)
1a1e6d1 Update libcudacxx/test/libcudacxx/std/linalg/layouts/transposed.pass.cpp (fbusato)
699d168 Update libcudacxx/include/cuda/std/__linalg/transposed.h (fbusato)
46ea14d Update libcudacxx/include/cuda/std/__linalg/conjugated.h (fbusato)
53083c7 Update libcudacxx/include/cuda/std/__linalg/conj_if_needed.h (fbusato)
3fb575e license update (fbusato)
be1354f move noexcept checks to constexpr variables (fbusato)
7ed2d04 fix linalg header position (fbusato)
575aa52 fix wrong type for noexcept variables (fbusato)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
.. _libcudacxx-standard-api-numerics-linalg: | ||
|
||
``<cuda/std/linalg>`` | ||
============================================ | ||
|
||
Provided functionalities | ||
------------------------ | ||
|
||
- ``scaled()`` `std::linalg::scaled <https://en.cppreference.com/w/cpp/numeric/linalg/scaled>`_ | ||
Not sure if we want to add our own short documentation for these (sounds taxing), but cppref has no information whatsoever yet, so a new user would have to dig on their own to find out what they are doing. |
||
- ``scaled_accessor`` `std::linalg::scaled_accessor <https://en.cppreference.com/w/cpp/numeric/linalg/scaled_accessor>`_ | ||
- ``conjugated()`` `std::linalg::conjugated <https://en.cppreference.com/w/cpp/numeric/linalg/conjugated>`_ | ||
- ``conjugated_accessor`` `std::linalg::conjugated_accessor <https://en.cppreference.com/w/cpp/numeric/linalg/conjugated_accessor>`_ | ||
- ``transposed()`` `std::linalg::transposed <https://en.cppreference.com/w/cpp/numeric/linalg/transposed>`_ | ||
- ``layout_transpose`` `std::linalg::layout_transpose <https://en.cppreference.com/w/cpp/numeric/linalg/layout_transpose>`_ | ||
- ``conjugate_transposed()`` `std::linalg::conjugate_transposed <https://en.cppreference.com/w/cpp/numeric/linalg/conjugate_transposed>`_ | ||
|
||
Extensions | ||
---------- | ||
|
||
- C++26 ``std::linalg`` accessors, transposed layout, and related functions are available in C++17 | ||
|
||
Omissions | ||
--------- | ||
|
||
- Currently we do not expose any BLAS functions or layouts. | ||
|
||
Restrictions | ||
------------ | ||
|
||
- On device no exceptions are thrown in case of a bad access. | ||
- MSVC is only supported with C++20 |
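As a rough illustration of what the listed views compute, the following is a minimal sketch in plain C++17. It does not use `<cuda/std/linalg>` or `mdspan`; `MatrixView`, `Transposed`, `Scaled`, and the free functions in namespace `sketch` are hypothetical stand-ins chosen for this example. The only behavior taken from the C++26 specification is that a scaled view yields `alpha * x(i, j)` and a transposed view yields `x(j, i)` with swapped extents.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Non-owning, read-only view of a row-major matrix of doubles (hypothetical).
struct MatrixView {
  const double* data;
  std::size_t num_rows;
  std::size_t num_cols;

  std::size_t rows() const { return num_rows; }
  std::size_t cols() const { return num_cols; }
  double operator()(std::size_t i, std::size_t j) const { return data[i * num_cols + j]; }
};

// View whose element (i, j) is the underlying element (j, i).
template <class M>
struct Transposed {
  M m;
  std::size_t rows() const { return m.cols(); }
  std::size_t cols() const { return m.rows(); }
  double operator()(std::size_t i, std::size_t j) const { return m(j, i); }
};

// View whose element (i, j) is alpha times the underlying element.
template <class M>
struct Scaled {
  double alpha;
  M m;
  std::size_t rows() const { return m.rows(); }
  std::size_t cols() const { return m.cols(); }
  double operator()(std::size_t i, std::size_t j) const { return alpha * m(i, j); }
};

template <class M>
Transposed<M> transposed(M m) { return Transposed<M>{m}; }

template <class M>
Scaled<M> scaled(double alpha, M m) { return Scaled<M>{alpha, m}; }

inline void usage_example() {
  const double storage[6] = {1, 2, 3, 4, 5, 6};  // rows {1, 2, 3} and {4, 5, 6}
  MatrixView a{storage, 2, 3};

  auto at = transposed(a);                       // 3x2 view, no data is copied
  assert(at.rows() == 3 && at.cols() == 2);
  assert(at(2, 1) == a(1, 2));                   // both read storage[5]

  auto s = scaled(2.0, at);                      // views compose
  assert(s(0, 1) == 8.0);                        // 2.0 * a(1, 0) == 2.0 * 4.0
}

} // namespace sketch
```

In the real library the same effect is obtained through an `mdspan` accessor (scaling, conjugation) or layout (transposition) rather than wrapper structs, so the element data is never copied or modified.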
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,78 @@ | ||
//@HEADER | ||
// ************************************************************************ | ||
// | ||
// Kokkos v. 4.0 | ||
// Copyright (2022) National Technology & Engineering | ||
// Solutions of Sandia, LLC (NTESS). | ||
// | ||
// Under the terms of Contract DE-NA0003525 with NTESS, | ||
// the U.S. Government retains certain rights in this software. | ||
// | ||
// Part of Kokkos, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://kokkos.org/LICENSE for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
|
||
// ************************************************************************ | ||
//@HEADER | ||
|
||
#ifndef _LIBCUDACXX___LINALG_CONJUGATE_IF_NEEDED_HPP | ||
#define _LIBCUDACXX___LINALG_CONJUGATE_IF_NEEDED_HPP | ||
|
||
#include <cuda/std/detail/__config> | ||
|
||
#if defined(_CCCL_IMPLICIT_SYSTEM_HEADER_GCC) | ||
# pragma GCC system_header | ||
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_CLANG) | ||
# pragma clang system_header | ||
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_MSVC) | ||
# pragma system_header | ||
#endif // no system header | ||
|
||
#include <cuda/std/version> | ||
|
||
#if defined(__cccl_lib_mdspan) && _CCCL_STD_VER >= 2017 | ||
|
||
# include <cuda/std/__concepts/concept_macros.h> | ||
# include <cuda/std/__type_traits/is_arithmetic.h> | ||
# include <cuda/std/complex> | ||
|
||
_LIBCUDACXX_BEGIN_NAMESPACE_STD | ||
|
||
namespace linalg | ||
{ | ||
|
||
_LIBCUDACXX_BEGIN_NAMESPACE_CPO(__conj_if_needed) | ||
|
||
template <class _Type> | ||
_CCCL_CONCEPT _HasConj = _CCCL_REQUIRES_EXPR((_Type), _Type __a)(static_cast<void>(_CUDA_VSTD::conj(__a))); | ||
|
||
struct __conj_if_needed | ||
{ | ||
template <class _Type> | ||
_LIBCUDACXX_HIDE_FROM_ABI constexpr auto operator()(const _Type& __t) const | ||
{ | ||
if constexpr (is_arithmetic_v<_Type> || !_HasConj<_Type>) | ||
{ | ||
return __t; | ||
} | ||
else | ||
{ | ||
return _CUDA_VSTD::conj(__t); | ||
} | ||
_CCCL_UNREACHABLE(); | ||
} | ||
}; | ||
|
||
_LIBCUDACXX_END_NAMESPACE_CPO | ||
|
||
inline namespace __cpo | ||
{ | ||
_CCCL_GLOBAL_CONSTANT auto conj_if_needed = __conj_if_needed::__conj_if_needed{}; | ||
|
||
} // namespace __cpo | ||
} // end namespace linalg | ||
|
||
_LIBCUDACXX_END_NAMESPACE_STD | ||
|
||
#endif // defined(__cccl_lib_mdspan) && _CCCL_STD_VER >= 2017 | ||
#endif // _LIBCUDACXX___LINALG_CONJUGATE_IF_NEEDED_HPP |
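The dispatch in `__conj_if_needed` can be approximated with standard C++17 only. The sketch below is an assumption-laden stand-in, not the header's implementation: `has_adl_conj`, `Plain`, and the `sketch` namespace are hypothetical, and detection here relies on argument-dependent lookup alone, whereas the header uses the `_HasConj` concept macro. Note that `std::conj` also accepts arithmetic arguments and returns a `std::complex`, which is presumably why the arithmetic check is needed to keep `int` and `double` unchanged.

```cpp
#include <cassert>
#include <complex>
#include <type_traits>
#include <utility>

namespace sketch {

// True when an unqualified call conj(x) is found for T (via ADL in practice).
template <class T, class = void>
struct has_adl_conj : std::false_type {};

template <class T>
struct has_adl_conj<T, std::void_t<decltype(conj(std::declval<const T&>()))>>
    : std::true_type {};

// Returns t unchanged for arithmetic types and for types without a usable
// conj overload; otherwise returns conj(t).
template <class T>
constexpr auto conj_if_needed(const T& t) {
  if constexpr (std::is_arithmetic_v<T> || !has_adl_conj<T>::value) {
    return t;
  } else {
    return conj(t);
  }
}

// A user-defined type with no conj overload (hypothetical, for illustration).
struct Plain {
  int value;
};

inline void usage_example() {
  // Arithmetic input: value and type are preserved.
  assert(conj_if_needed(3.5) == 3.5);
  // Complex input: the imaginary part changes sign.
  assert(conj_if_needed(std::complex<double>(1.0, 2.0)) == std::complex<double>(1.0, -2.0));
  // No conj overload: the object is passed through as-is.
  assert(conj_if_needed(Plain{7}).value == 7);
}

} // namespace sketch
```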
55 changes: 55 additions & 0 deletions
libcudacxx/include/cuda/std/__linalg/conjugate_transposed.h
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
//@HEADER | ||
// ************************************************************************ | ||
// | ||
// Kokkos v. 4.0 | ||
// Copyright (2022) National Technology & Engineering | ||
// Solutions of Sandia, LLC (NTESS). | ||
// | ||
// Under the terms of Contract DE-NA0003525 with NTESS, | ||
// the U.S. Government retains certain rights in this software. | ||
// | ||
// Part of Kokkos, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://kokkos.org/LICENSE for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
// ************************************************************************ | ||
//@HEADER | ||
|
||
#ifndef _LIBCUDACXX___LINALG_CONJUGATE_TRANSPOSED_HPP | ||
#define _LIBCUDACXX___LINALG_CONJUGATE_TRANSPOSED_HPP | ||
|
||
#include <cuda/std/detail/__config> | ||
|
||
#if defined(_CCCL_IMPLICIT_SYSTEM_HEADER_GCC) | ||
# pragma GCC system_header | ||
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_CLANG) | ||
# pragma clang system_header | ||
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_MSVC) | ||
# pragma system_header | ||
#endif // no system header | ||
|
||
#include <cuda/std/version> | ||
|
||
#if defined(__cccl_lib_mdspan) && _CCCL_STD_VER >= 2017 | ||
|
||
# include <cuda/std/__linalg/conjugated.h> | ||
# include <cuda/std/__linalg/transposed.h> | ||
|
||
_LIBCUDACXX_BEGIN_NAMESPACE_STD | ||
|
||
namespace linalg | ||
{ | ||
|
||
template <class _ElementType, class _Extents, class _Layout, class _Accessor> | ||
_CCCL_NODISCARD _LIBCUDACXX_HIDE_FROM_ABI constexpr auto | ||
conjugate_transposed(mdspan<_ElementType, _Extents, _Layout, _Accessor> __a) | ||
{ | ||
return conjugated(transposed(__a)); | ||
} | ||
|
||
} // end namespace linalg | ||
|
||
_LIBCUDACXX_END_NAMESPACE_STD | ||
|
||
#endif // defined(__cccl_lib_mdspan) && _CCCL_STD_VER >= 2017 | ||
#endif // _LIBCUDACXX___LINALG_CONJUGATE_TRANSPOSED_HPP |
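The composition `conjugated(transposed(__a))` means that element `(i, j)` of the result is the complex conjugate of element `(j, i)` of the input. A minimal sketch of that composition, assuming hypothetical wrapper types (`MatrixView`, `Transposed`, `Conjugated` in namespace `sketch`) in place of `mdspan` layouts and accessors:

```cpp
#include <cassert>
#include <complex>
#include <cstddef>

namespace sketch {

using cplx = std::complex<double>;

// Non-owning, read-only view of a row-major complex matrix (hypothetical).
struct MatrixView {
  const cplx* data;
  std::size_t num_rows;
  std::size_t num_cols;
  cplx operator()(std::size_t i, std::size_t j) const { return data[i * num_cols + j]; }
};

// Element (i, j) reads the underlying element (j, i).
template <class M>
struct Transposed {
  M m;
  cplx operator()(std::size_t i, std::size_t j) const { return m(j, i); }
};

// Element (i, j) is the conjugate of the underlying element (i, j).
template <class M>
struct Conjugated {
  M m;
  cplx operator()(std::size_t i, std::size_t j) const { return std::conj(m(i, j)); }
};

template <class M>
Transposed<M> transposed(M m) { return Transposed<M>{m}; }

template <class M>
Conjugated<M> conjugated(M m) { return Conjugated<M>{m}; }

// Same shape as the function in the diff: conjugate the transposed view.
template <class M>
auto conjugate_transposed(M m) { return conjugated(transposed(m)); }

// A square matrix is Hermitian when it equals its own conjugate transpose.
inline bool is_hermitian(MatrixView a) {
  auto h = conjugate_transposed(a);
  for (std::size_t i = 0; i < a.num_rows; ++i) {
    for (std::size_t j = 0; j < a.num_cols; ++j) {
      if (h(i, j) != a(i, j)) {
        return false;
      }
    }
  }
  return true;
}

} // namespace sketch
```

For example, the 2x2 matrix with rows `{2, 1+3i}` and `{1-3i, 5}` passes `is_hermitian`, while any matrix with a non-real diagonal entry does not.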
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,141 @@ | ||
//@HEADER | ||
// ************************************************************************ | ||
// | ||
// Kokkos v. 4.0 | ||
// Copyright (2022) National Technology & Engineering | ||
// Solutions of Sandia, LLC (NTESS). | ||
// | ||
// Under the terms of Contract DE-NA0003525 with NTESS, | ||
// the U.S. Government retains certain rights in this software. | ||
// | ||
// Part of Kokkos, under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://kokkos.org/LICENSE for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
// ************************************************************************ | ||
//@HEADER | ||
|
||
#ifndef _LIBCUDACXX___LINALG_CONJUGATED_HPP | ||
#define _LIBCUDACXX___LINALG_CONJUGATED_HPP | ||
|
||
#include <cuda/std/detail/__config> | ||
|
||
#if defined(_CCCL_IMPLICIT_SYSTEM_HEADER_GCC) | ||
# pragma GCC system_header | ||
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_CLANG) | ||
# pragma clang system_header | ||
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_MSVC) | ||
# pragma system_header | ||
#endif // no system header | ||
|
||
#include <cuda/std/version> | ||
|
||
#if defined(__cccl_lib_mdspan) && _CCCL_STD_VER >= 2017 | ||
|
||
# include <cuda/std/__linalg/conj_if_needed.h> | ||
# include <cuda/std/__type_traits/add_const.h> | ||
# include <cuda/std/__type_traits/is_arithmetic.h> | ||
# include <cuda/std/__type_traits/remove_const.h> | ||
# include <cuda/std/__utility/declval.h> | ||
# include <cuda/std/mdspan> | ||
|
||
_LIBCUDACXX_BEGIN_NAMESPACE_STD | ||
|
||
namespace linalg | ||
{ | ||
|
||
template <class _NestedAccessor> | ||
class conjugated_accessor | ||
{ | ||
private: | ||
using __nested_element_type = typename _NestedAccessor::element_type; | ||
using __nc_result_type = decltype(conj_if_needed(_CUDA_VSTD::declval<__nested_element_type>())); | ||
|
||
public: | ||
using element_type = add_const_t<__nc_result_type>; | ||
using reference = remove_const_t<element_type>; | ||
using data_handle_type = typename _NestedAccessor::data_handle_type; | ||
using offset_policy = conjugated_accessor<typename _NestedAccessor::offset_policy>; | ||
|
||
_CCCL_HIDE_FROM_ABI constexpr conjugated_accessor() = default; | ||
|
||
_LIBCUDACXX_HIDE_FROM_ABI constexpr conjugated_accessor(const _NestedAccessor& __acc) | ||
: __nested_accessor_(__acc) | ||
{} | ||
|
||
_CCCL_TEMPLATE(class _OtherNestedAccessor) | ||
_CCCL_REQUIRES(_CCCL_TRAIT(is_constructible, _NestedAccessor, const _OtherNestedAccessor&) | ||
_CCCL_AND _CCCL_TRAIT(is_convertible, _OtherNestedAccessor, _NestedAccessor)) | ||
_LIBCUDACXX_HIDE_FROM_ABI constexpr conjugated_accessor(const conjugated_accessor<_OtherNestedAccessor>& __other) | ||
: __nested_accessor_(__other.nested_accessor()) | ||
{} | ||
|
||
_CCCL_TEMPLATE(class _OtherNestedAccessor) | ||
_CCCL_REQUIRES(_CCCL_TRAIT(is_constructible, _NestedAccessor, const _OtherNestedAccessor&) | ||
_CCCL_AND(!_CCCL_TRAIT(is_convertible, _OtherNestedAccessor, _NestedAccessor))) | ||
_LIBCUDACXX_HIDE_FROM_ABI explicit constexpr conjugated_accessor( | ||
const conjugated_accessor<_OtherNestedAccessor>& __other) | ||
: __nested_accessor_(__other.nested_accessor()) | ||
{} | ||
|
||
_LIBCUDACXX_HIDE_FROM_ABI constexpr reference access(data_handle_type __p, size_t __i) const noexcept | ||
{ | ||
return conj_if_needed(__nested_element_type(__nested_accessor_.access(__p, __i))); | ||
} | ||
|
||
_CCCL_NODISCARD _LIBCUDACXX_HIDE_FROM_ABI constexpr typename offset_policy::data_handle_type | ||
offset(data_handle_type __p, size_t __i) const noexcept | ||
{ | ||
return __nested_accessor_.offset(__p, __i); | ||
} | ||
|
||
_CCCL_NODISCARD _LIBCUDACXX_HIDE_FROM_ABI constexpr const _NestedAccessor& nested_accessor() const noexcept | ||
{ | ||
return __nested_accessor_; | ||
} | ||
|
||
private: | ||
_NestedAccessor __nested_accessor_; | ||
}; | ||
|
||
template <class _ElementType, class _Extents, class _Layout, class _Accessor> | ||
_CCCL_NODISCARD _LIBCUDACXX_HIDE_FROM_ABI constexpr auto | ||
conjugated(mdspan<_ElementType, _Extents, _Layout, _Accessor> __a) | ||
{ | ||
using __value_type = typename decltype(__a)::value_type; | ||
// Current status of [linalg] only optimizes if _Accessor is conjugated_accessor<_NestedAccessor> for some _NestedAccessor. | ||
// There's a separate overload for that case below. | ||
|
||
|
||
// P3050 optimizes conjugated's accessor type for when we know that it can't be complex: arithmetic types, | ||
// and types for which `conj` is not ADL-findable. | ||
if constexpr (is_arithmetic_v<__value_type> || !__conj_if_needed::_HasConj<__value_type>) | ||
{ | ||
return mdspan<_ElementType, _Extents, _Layout, _Accessor>(__a.data_handle(), __a.mapping(), __a.accessor()); | ||
} | ||
else | ||
{ | ||
using __return_element_type = typename conjugated_accessor<_Accessor>::element_type; | ||
using __return_accessor_type = conjugated_accessor<_Accessor>; | ||
return mdspan<__return_element_type, _Extents, _Layout, __return_accessor_type>{ | ||
__a.data_handle(), __a.mapping(), __return_accessor_type(__a.accessor())}; | ||
} | ||
_CCCL_UNREACHABLE(); | ||
} | ||
|
||
// Conjugation is self-annihilating | ||
template <class _ElementType, class _Extents, class _Layout, class _NestedAccessor> | ||
_CCCL_NODISCARD _LIBCUDACXX_HIDE_FROM_ABI constexpr auto | ||
conjugated(mdspan<_ElementType, _Extents, _Layout, conjugated_accessor<_NestedAccessor>> __a) | ||
{ | ||
using __return_element_type = typename _NestedAccessor::element_type; | ||
using __return_accessor_type = _NestedAccessor; | ||
return mdspan<__return_element_type, _Extents, _Layout, __return_accessor_type>( | ||
__a.data_handle(), __a.mapping(), __a.accessor().nested_accessor()); | ||
} | ||
|
||
} // end namespace linalg | ||
|
||
_LIBCUDACXX_END_NAMESPACE_STD | ||
|
||
#endif // defined(__cccl_lib_mdspan) && _CCCL_STD_VER >= 2017 | ||
#endif // _LIBCUDACXX___LINALG_CONJUGATED_HPP |
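The accessor-wrapping idea and the "self-annihilating" overload can be illustrated without `mdspan`. The following is a simplified sketch under stated assumptions: `BasicAccessor` and `ConjugatedAccessor` are hypothetical stand-ins for `default_accessor` and `conjugated_accessor`, the `conjugated` overloads here take accessors rather than `mdspan` objects, and the element type is assumed to be a `std::complex` specialization.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <type_traits>

namespace sketch {

// Minimal stand-in for an mdspan accessor: maps (handle, offset) to an element.
template <class T>
struct BasicAccessor {
  using element_type     = T;
  using data_handle_type = T*;
  using reference        = T&;

  constexpr reference access(data_handle_type p, std::size_t i) const noexcept { return p[i]; }
};

// Wraps another accessor and conjugates every element on read.
template <class Nested>
class ConjugatedAccessor {
public:
  using value_type       = std::remove_const_t<typename Nested::element_type>;
  using element_type     = const value_type;  // read-only: the result is computed
  using reference        = value_type;        // returned by value, not by reference
  using data_handle_type = typename Nested::data_handle_type;

  constexpr explicit ConjugatedAccessor(const Nested& nested) : nested_(nested) {}

  reference access(data_handle_type p, std::size_t i) const {
    return std::conj(value_type(nested_.access(p, i)));
  }

  constexpr const Nested& nested_accessor() const noexcept { return nested_; }

private:
  Nested nested_;
};

// Conjugating a plain accessor wraps it.
template <class Nested>
ConjugatedAccessor<Nested> conjugated(const Nested& acc) {
  return ConjugatedAccessor<Nested>(acc);
}

// Conjugating an already conjugated accessor unwraps it: conj(conj(z)) == z.
template <class Nested>
Nested conjugated(const ConjugatedAccessor<Nested>& acc) {
  return acc.nested_accessor();
}

inline void usage_example() {
  using C = std::complex<double>;
  C data[2] = {C(1.0, 2.0), C(3.0, -4.0)};

  BasicAccessor<C> plain;
  auto conj_acc = conjugated(plain);          // wraps
  assert(conj_acc.access(data, 0) == C(1.0, -2.0));

  auto back = conjugated(conj_acc);           // unwraps instead of double-wrapping
  static_assert(std::is_same_v<decltype(back), BasicAccessor<C>>);
  assert(back.access(data, 0) == C(1.0, 2.0));
}

} // namespace sketch
```

Unwrapping rather than nesting two conjugating accessors avoids paying for two conjugations on every element access and restores write access through the original accessor's `reference` type.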
If we were to maintain chronological order, maybe the C++26 segments should all go below the last C++23 segment in #L124, unless there is a specific reason why we grouped them like that.
Yeah, that would require grabbing C++26 std::dim along with it, which is not part of that PR. As always, not a hard suggestion.