mirror of
https://github.com/gcc-mirror/gcc.git
synced 2026-05-06 06:49:09 +02:00
libstdc++: Implement [simd] for C++26
This implementation differs significantly from the
std::experimental::simd implementation. One goal was a reduction in
template instantiations wrt. what std::experimental::simd did.
Design notes:
- bits/vec_ops.h contains concepts, traits, and functions for working
with GNU vector builtins that are mostly independent from std::simd.
These could move from std::simd:: to std::__vec (or similar). However,
we would then need to revisit naming. For now we kept everything in
the std::simd namespace with __vec_ prefix in the names. The __vec_*
functions can be called unqualified because they can never be called
on user-defined types (no ADL). If we ever get simd<UDT> support this
will be implemented via bit_cast to/from integral vector
builtins/intrinsics.
- bits/simd_x86.h extends vec_ops.h with calls to __builtin_ia32_* that
can only be used after uttering the right GCC target pragma.
- basic_vec and basic_mask are built on top of register-size GNU vector
builtins (for now / x86). Any larger vec/mask is a tree of power-of-2
#elements on the "first" branch. Anything non-power-of-2 that is
smaller than register size uses padding elements that participate in
element-wise operations. The library ensures that padding elements
lead to no side effects. The implementation makes no assumption on the
values of these padding elements since the user can bit_cast to
basic_vec/basic_mask.
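A rough illustration of the padding idea from the last design note, not code
from this patch: the sketch below uses the GNU vector extension directly and
assumes GCC or Clang. The namespace sketch and the names v4sf, add, sum3, and
demo are hypothetical. A logical 3-element vector is stored in a 4-lane
builtin. The fourth lane takes part in the element-wise addition but is never
read when a result is produced, so its value does not matter.

```cpp
#include <cassert>

namespace sketch
{
  // GNU vector extension (GCC and Clang): four floats in one 16-byte vector.
  typedef float v4sf __attribute__((vector_size(16)));

  // Element-wise addition. All four lanes are added, including the padding lane.
  inline v4sf
  add(v4sf a, v4sf b)
  { return a + b; }

  // Horizontal sum of the three logical elements. Lane 3 is padding and is
  // deliberately ignored, so no assumption is made about its value.
  inline float
  sum3(v4sf v)
  { return v[0] + v[1] + v[2]; }

  // Hypothetical usage: the padding lanes (index 3) hold arbitrary values.
  inline float
  demo()
  {
    v4sf a = {1.f, 2.f, 3.f, 99.f};
    v4sf b = {10.f, 20.f, 30.f, -7.f};
    float r = sum3(add(a, b));
    assert(r == 66.f); // 11 + 22 + 33, independent of the padding lane
    return r;
  }
}
```

Whatever the padding lanes contain, the result depends only on the three
logical elements, which is the property the design note asks the library to
guarantee.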
Implementation status:
- The implementation is prepared for more than x86 but is x86-only for
now.
- Parts of [simd] *not* implemented in this patch:
  - std::complex<floating-point> as vectorizable types
  - [simd.permute.dynamic]
  - [simd.permute.mask]
  - [simd.permute.memory]
  - [simd.bit]
  - [simd.math]
  - mixed operations with vec-mask and bit-mask types
  - some conversion optimizations (open questions wrt. missed
    optimizations in the compiler)
- This patch implements P3844R3 "Restore simd::vec broadcast from int",
which is not part of the C++26 working draft yet. If the paper is not
accepted, the feature will be reverted.
- This patch implements D4042R0 "incorrect cast between simd::vec and
simd::mask via conversion to and from impl-defined vector types" (to be
published once the reported LWG issue gets a number).
- The standard feature test macro __cpp_lib_simd is not defined yet.
Tests:
- Full coverage requires testing
  1. constexpr,
  2. constant-propagating inputs, and
  3. unknown (to the optimizer) inputs
  - for all vectorizable types
    * for every supported width (1–64 and higher)
      + for all possible ISA extensions (combinations)
        = with different fast-math flags
  ... leading to a test matrix that's far out of reach for regular
  testsuite builds.
- The tests in testsuite/std/simd/ try to cover all of the API. The
tests can be built in every combination listed above. By default, only
a small subset is built and tested.
- Use GCC_TEST_RUN_EXPENSIVE=something to compile the more expensive
tests (constexpr and const-prop testing) and to enable more /
different widths for the test type.
- Tests can still emit bogus -Wpsabi warnings (see PR98734) which are
filtered out via dg-prune-output.
Benchmarks:
- The current implementation has been benchmarked in some aspects on
x86_64 hardware. There is more optimization potential. However, it is
not always clear whether optimizations should be part of the library
if they can be implemented in the compiler.
- No benchmark code is included in this patch.
libstdc++-v3/ChangeLog:
* include/Makefile.am: Add simd headers.
* include/Makefile.in: Regenerate.
* include/bits/version.def (simd): New.
* include/bits/version.h: Regenerate.
* include/bits/simd_alg.h: New file.
* include/bits/simd_details.h: New file.
* include/bits/simd_flags.h: New file.
* include/bits/simd_iterator.h: New file.
* include/bits/simd_loadstore.h: New file.
* include/bits/simd_mask.h: New file.
* include/bits/simd_mask_reductions.h: New file.
* include/bits/simd_reductions.h: New file.
* include/bits/simd_vec.h: New file.
* include/bits/simd_x86.h: New file.
* include/bits/vec_ops.h: New file.
* include/std/simd: New file.
* testsuite/std/simd/arithmetic.cc: New test.
* testsuite/std/simd/arithmetic_expensive.cc: New test.
* testsuite/std/simd/create_tests.h: New file.
* testsuite/std/simd/creation.cc: New test.
* testsuite/std/simd/creation_expensive.cc: New test.
* testsuite/std/simd/loads.cc: New test.
* testsuite/std/simd/loads_expensive.cc: New test.
* testsuite/std/simd/mask2.cc: New test.
* testsuite/std/simd/mask2_expensive.cc: New test.
* testsuite/std/simd/mask.cc: New test.
* testsuite/std/simd/mask_expensive.cc: New test.
* testsuite/std/simd/reductions.cc: New test.
* testsuite/std/simd/reductions_expensive.cc: New test.
* testsuite/std/simd/shift_left.cc: New test.
* testsuite/std/simd/shift_left_expensive.cc: New test.
* testsuite/std/simd/shift_right.cc: New test.
* testsuite/std/simd/shift_right_expensive.cc: New test.
* testsuite/std/simd/simd_alg.cc: New test.
* testsuite/std/simd/simd_alg_expensive.cc: New test.
* testsuite/std/simd/sse_intrin.cc: New test.
* testsuite/std/simd/stores.cc: New test.
* testsuite/std/simd/stores_expensive.cc: New test.
* testsuite/std/simd/test_setup.h: New file.
* testsuite/std/simd/traits_common.cc: New test.
* testsuite/std/simd/traits_impl.cc: New test.
* testsuite/std/simd/traits_math.cc: New test.
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
@@ -100,6 +100,7 @@ std_headers = \
${std_srcdir}/semaphore \
${std_srcdir}/set \
${std_srcdir}/shared_mutex \
${std_srcdir}/simd \
${std_srcdir}/spanstream \
${std_srcdir}/sstream \
${std_srcdir}/syncstream \
@@ -264,6 +265,16 @@ bits_headers = \
${bits_srcdir}/shared_ptr.h \
${bits_srcdir}/shared_ptr_atomic.h \
${bits_srcdir}/shared_ptr_base.h \
${bits_srcdir}/simd_alg.h \
${bits_srcdir}/simd_details.h \
${bits_srcdir}/simd_flags.h \
${bits_srcdir}/simd_iterator.h \
${bits_srcdir}/simd_loadstore.h \
${bits_srcdir}/simd_mask.h \
${bits_srcdir}/simd_mask_reductions.h \
${bits_srcdir}/simd_reductions.h \
${bits_srcdir}/simd_vec.h \
${bits_srcdir}/simd_x86.h \
${bits_srcdir}/slice_array.h \
${bits_srcdir}/specfun.h \
${bits_srcdir}/sstream.tcc \
@@ -296,6 +307,7 @@ bits_headers = \
${bits_srcdir}/valarray_array.tcc \
${bits_srcdir}/valarray_before.h \
${bits_srcdir}/valarray_after.h \
${bits_srcdir}/vec_ops.h \
${bits_srcdir}/vector.tcc
endif GLIBCXX_HOSTED

@@ -459,6 +459,7 @@ std_freestanding = \
@GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/semaphore \
@GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/set \
@GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/shared_mutex \
@GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/simd \
@GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/spanstream \
@GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/sstream \
@GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/syncstream \
@@ -620,6 +621,16 @@ bits_freestanding = \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/shared_ptr.h \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/shared_ptr_atomic.h \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/shared_ptr_base.h \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/simd_alg.h \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/simd_details.h \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/simd_flags.h \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/simd_iterator.h \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/simd_loadstore.h \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/simd_mask.h \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/simd_mask_reductions.h \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/simd_reductions.h \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/simd_vec.h \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/simd_x86.h \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/slice_array.h \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/specfun.h \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/sstream.tcc \
@@ -652,6 +663,7 @@ bits_freestanding = \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/valarray_array.tcc \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/valarray_before.h \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/valarray_after.h \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/vec_ops.h \
@GLIBCXX_HOSTED_TRUE@ ${bits_srcdir}/vector.tcc

bits_host_headers = \
98
libstdc++-v3/include/bits/simd_alg.h
Normal file
@@ -0,0 +1,98 @@
// Implementation of <simd> -*- C++ -*-

// Copyright The GNU Toolchain Authors.
//
// This file is part of the GNU ISO C++ Library. This library is free
// software; you can redistribute it and/or modify it under the
// terms of the GNU General Public License as published by the
// Free Software Foundation; either version 3, or (at your option)
// any later version.

// This library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.

// Under Section 7 of GPL version 3, you are granted additional
// permissions described in the GCC Runtime Library Exception, version
// 3.1, as published by the Free Software Foundation.

// You should have received a copy of the GNU General Public License and
// a copy of the GCC Runtime Library Exception along with this program;
// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
// <http://www.gnu.org/licenses/>.

#ifndef _GLIBCXX_SIMD_ALG_H
#define _GLIBCXX_SIMD_ALG_H 1

#ifdef _GLIBCXX_SYSHDR
#pragma GCC system_header
#endif

#if __cplusplus >= 202400L

#include "simd_vec.h"

// psabi warnings are bogus because the ABI of the internal types never leaks into user code
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wpsabi"

// [simd.alg] -----------------------------------------------------------------
namespace std _GLIBCXX_VISIBILITY(default)
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION
namespace simd
{
  template<typename _Tp, typename _Ap>
    [[__gnu__::__always_inline__]]
    constexpr basic_vec<_Tp, _Ap>
    min(const basic_vec<_Tp, _Ap>& __a, const basic_vec<_Tp, _Ap>& __b) noexcept
    { return __select_impl(__a < __b, __a, __b); }

  template<typename _Tp, typename _Ap>
    [[__gnu__::__always_inline__]]
    constexpr basic_vec<_Tp, _Ap>
    max(const basic_vec<_Tp, _Ap>& __a, const basic_vec<_Tp, _Ap>& __b) noexcept
    { return __select_impl(__a < __b, __b, __a); }

  template<typename _Tp, typename _Ap>
    [[__gnu__::__always_inline__]]
    constexpr pair<basic_vec<_Tp, _Ap>, basic_vec<_Tp, _Ap>>
    minmax(const basic_vec<_Tp, _Ap>& __a, const basic_vec<_Tp, _Ap>& __b) noexcept
    { return {min(__a, __b), max(__a, __b)}; }

  template<typename _Tp, typename _Ap>
    [[__gnu__::__always_inline__]]
    constexpr basic_vec<_Tp, _Ap>
    clamp(const basic_vec<_Tp, _Ap>& __v, const basic_vec<_Tp, _Ap>& __lo,
          const basic_vec<_Tp, _Ap>& __hi)
    {
      __glibcxx_simd_precondition(none_of(__lo > __hi), "lower bound is larger than upper bound");
      return max(__lo, min(__hi, __v));
    }

  template<typename _Tp, typename _Up>
    constexpr auto
    select(bool __c, const _Tp& __a, const _Up& __b)
    -> remove_cvref_t<decltype(__c ? __a : __b)>
    { return __c ? __a : __b; }

  template<size_t _Bytes, typename _Ap, typename _Tp, typename _Up>
    [[__gnu__::__always_inline__]]
    constexpr auto
    select(const basic_mask<_Bytes, _Ap>& __c, const _Tp& __a, const _Up& __b)
    noexcept -> decltype(__select_impl(__c, __a, __b))
    { return __select_impl(__c, __a, __b); }
} // namespace simd

  using simd::min;
  using simd::max;
  using simd::minmax;
  using simd::clamp;

_GLIBCXX_END_NAMESPACE_VERSION
} // namespace std

#pragma GCC diagnostic pop
#endif // C++26
#endif // _GLIBCXX_SIMD_ALG_H
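
The header above expresses min, max, and clamp through a mask-based select.
The following is a minimal scalar sketch of that pattern, assuming std::array
as a stand-in for basic_vec and basic_mask. The namespace sketch and the names
vec, mask, vless, vselect, vmin, vmax, and vclamp are hypothetical and are not
part of the patch. It is written for C++17 and makes no claim about how
__select_impl is actually implemented.

```cpp
#include <array>
#include <cstddef>

namespace sketch
{
  // Stand-ins for basic_vec / basic_mask (assumption: plain arrays).
  template <typename T, std::size_t N>
    using vec = std::array<T, N>;

  template <std::size_t N>
    using mask = std::array<bool, N>;

  // Element-wise a < b, producing a mask.
  template <typename T, std::size_t N>
    constexpr mask<N>
    vless(const vec<T, N>& a, const vec<T, N>& b)
    {
      mask<N> m{};
      for (std::size_t i = 0; i < N; ++i)
        m[i] = a[i] < b[i];
      return m;
    }

  // Element-wise c ? a : b.
  template <typename T, std::size_t N>
    constexpr vec<T, N>
    vselect(const mask<N>& c, const vec<T, N>& a, const vec<T, N>& b)
    {
      vec<T, N> r{};
      for (std::size_t i = 0; i < N; ++i)
        r[i] = c[i] ? a[i] : b[i];
      return r;
    }

  // Same argument order as min/max in the header: select on (a < b).
  template <typename T, std::size_t N>
    constexpr vec<T, N>
    vmin(const vec<T, N>& a, const vec<T, N>& b)
    { return vselect(vless(a, b), a, b); }

  template <typename T, std::size_t N>
    constexpr vec<T, N>
    vmax(const vec<T, N>& a, const vec<T, N>& b)
    { return vselect(vless(a, b), b, a); }

  // Mirrors clamp in the header: max(lo, min(hi, v)).
  template <typename T, std::size_t N>
    constexpr vec<T, N>
    vclamp(const vec<T, N>& v, const vec<T, N>& lo, const vec<T, N>& hi)
    { return vmax(lo, vmin(hi, v)); }
}
```

Clamping {5, -3, 12, 7} to the range [0, 10] with this sketch yields
{5, 0, 10, 7}: each element is first limited from above by hi, then from
below by lo.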
1394
libstdc++-v3/include/bits/simd_details.h
Normal file
File diff suppressed because it is too large
187
libstdc++-v3/include/bits/simd_flags.h
Normal file
@@ -0,0 +1,187 @@
// Implementation of <simd> -*- C++ -*-

// Copyright The GNU Toolchain Authors.
//
// This file is part of the GNU ISO C++ Library. This library is free
// software; you can redistribute it and/or modify it under the
// terms of the GNU General Public License as published by the
// Free Software Foundation; either version 3, or (at your option)
// any later version.

// This library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.

// Under Section 7 of GPL version 3, you are granted additional
// permissions described in the GCC Runtime Library Exception, version
// 3.1, as published by the Free Software Foundation.

// You should have received a copy of the GNU General Public License and
// a copy of the GCC Runtime Library Exception along with this program;
// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
// <http://www.gnu.org/licenses/>.

#ifndef _GLIBCXX_SIMD_FLAGS_H
#define _GLIBCXX_SIMD_FLAGS_H 1

#ifdef _GLIBCXX_SYSHDR
#pragma GCC system_header
#endif

#if __cplusplus >= 202400L

#include "simd_details.h"
#include <bits/align.h> // assume_aligned

namespace std _GLIBCXX_VISIBILITY(default)
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION
namespace simd
{
  // [simd.traits]
  // --- alignment ---
  template <typename _Tp, typename _Up = typename _Tp::value_type>
    struct alignment
    {};

  template <typename _Tp, typename _Ap, __vectorizable _Up>
    struct alignment<basic_vec<_Tp, _Ap>, _Up>
    : integral_constant<size_t, alignof(basic_vec<_Tp, _Ap>)>
    {};

  template <typename _Tp, typename _Up = typename _Tp::value_type>
    constexpr size_t alignment_v = alignment<_Tp, _Up>::value;

  // [simd.flags] -------------------------------------------------------------
  struct _LoadStoreTag
  {};

  /** @internal
   * `struct convert-flag`
   *
   * C++26 [simd.expos] / [simd.flags]
   */
  struct __convert_flag
  : _LoadStoreTag
  {};

  /** @internal
   * `struct aligned-flag`
   *
   * C++26 [simd.expos] / [simd.flags]
   */
  struct __aligned_flag
  : _LoadStoreTag
  {
    template <typename _Tp, typename _Up>
      [[__gnu__::__always_inline__]]
      static constexpr _Up*
      _S_adjust_pointer(_Up* __ptr)
      { return assume_aligned<simd::alignment_v<_Tp, remove_cv_t<_Up>>>(__ptr); }
  };

  /** @internal
   * `template<size_t N> struct overaligned-flag`
   *
   * @tparam _Np alignment in bytes
   *
   * C++26 [simd.expos] / [simd.flags]
   */
  template <size_t _Np>
    struct __overaligned_flag
    : _LoadStoreTag
    {
      static_assert(__has_single_bit(_Np));

      template <typename, typename _Up>
        [[__gnu__::__always_inline__]]
        static constexpr _Up*
        _S_adjust_pointer(_Up* __ptr)
        { return assume_aligned<_Np>(__ptr); }
    };

  struct __partial_loadstore_flag
  : _LoadStoreTag
  {};


  template <typename _Tp>
    concept __loadstore_tag = is_base_of_v<_LoadStoreTag, _Tp>;

  template <typename...>
    struct flags;

  template <typename... _Flags>
    requires (__loadstore_tag<_Flags> && ...)
    struct flags<_Flags...>
    {
      /** @internal
       * Returns @c true if the given argument is part of this specialization, otherwise returns @c
       * false.
       */
      template <typename _F0>
        static consteval bool
        _S_test(flags<_F0>)
        { return (is_same_v<_Flags, _F0> || ...); }

      friend consteval flags
      operator|(flags, flags<>)
      { return flags{}; }

      template <typename _T0, typename... _More>
        friend consteval auto
        operator|(flags, flags<_T0, _More...>)
        {
          if constexpr ((same_as<_Flags, _T0> || ...))
            return flags<_Flags...>{} | flags<_More...>{};
          else
            return flags<_Flags..., _T0>{} | flags<_More...>{};
        }

      /** @internal
       * Adjusts a pointer according to the alignment requirements of the flags.
       *
       * This function iterates over all flags in the pack and applies each flag's
       * `_S_adjust_pointer` method to the input pointer. Flags that don't provide
       * this method are ignored.
       *
       * @tparam _Tp A basic_vec type for which a load/store pointer is adjusted
       * @tparam _Up The value-type of the input/output range
       * @param __ptr The pointer to the range
       * @return The adjusted pointer
       */
      template <typename _Tp, typename _Up>
        static constexpr _Up*
        _S_adjust_pointer(_Up* __ptr)
        {
          template for ([[maybe_unused]] constexpr auto __f : {_Flags()...})
            {
              if constexpr (requires {__f.template _S_adjust_pointer<_Tp>(__ptr); })
                __ptr = __f.template _S_adjust_pointer<_Tp>(__ptr);
            }
          return __ptr;
        }
    };

  inline constexpr flags<> flag_default {};

  inline constexpr flags<__convert_flag> flag_convert {};

  inline constexpr flags<__aligned_flag> flag_aligned {};

  template <size_t _Np>
    requires(__has_single_bit(_Np))
    inline constexpr flags<__overaligned_flag<_Np>> flag_overaligned {};

  /** @internal
   * Pass to unchecked_load or unchecked_store to make it behave like partial_load / partial_store.
   */
  inline constexpr flags<__partial_loadstore_flag> __allow_partial_loadstore {};

} // namespace simd
_GLIBCXX_END_NAMESPACE_VERSION
} // namespace std

#endif // C++26
#endif // _GLIBCXX_SIMD_FLAGS_H
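
The operator| overloads in the header above merge two flag packs while
dropping duplicates. The sketch below reproduces that recursion in isolation
so it can be compiled on its own. It is an approximation under stated
assumptions: it targets C++17, so it uses constexpr free functions instead of
consteval hidden friends and omits the __loadstore_tag constraint. The
namespace sketch and the tag names convert_tag and aligned_tag are
hypothetical.

```cpp
#include <type_traits>

namespace sketch
{
  // Hypothetical stand-ins for __convert_flag and __aligned_flag.
  struct convert_tag {};
  struct aligned_tag {};

  template <typename... Fs>
    struct flags
    {
      // True if F is one of the tags in this pack (cf. _S_test).
      template <typename F>
        static constexpr bool
        test(flags<F>)
        { return (std::is_same_v<Fs, F> || ...); }
    };

  // Base case: nothing left to merge.
  template <typename... Fs>
    constexpr flags<Fs...>
    operator|(flags<Fs...>, flags<>)
    { return {}; }

  // Recursive case: append T0 only if it is not already present,
  // then continue with the remaining tags.
  template <typename... Fs, typename T0, typename... More>
    constexpr auto
    operator|(flags<Fs...>, flags<T0, More...>)
    {
      if constexpr ((std::is_same_v<Fs, T0> || ...))
        return flags<Fs...>{} | flags<More...>{};
      else
        return flags<Fs..., T0>{} | flags<More...>{};
    }
}
```

With this sketch, flags<convert_tag>{} | flags<aligned_tag, convert_tag>{}
has the type flags<convert_tag, aligned_tag>: the duplicate convert_tag is
dropped, and the left operand's order is kept.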
177
libstdc++-v3/include/bits/simd_iterator.h
Normal file
@@ -0,0 +1,177 @@
// Implementation of <simd> -*- C++ -*-

// Copyright The GNU Toolchain Authors.
//
// This file is part of the GNU ISO C++ Library. This library is free
// software; you can redistribute it and/or modify it under the
// terms of the GNU General Public License as published by the
// Free Software Foundation; either version 3, or (at your option)
// any later version.

// This library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.

// Under Section 7 of GPL version 3, you are granted additional
// permissions described in the GCC Runtime Library Exception, version
// 3.1, as published by the Free Software Foundation.

// You should have received a copy of the GNU General Public License and
// a copy of the GCC Runtime Library Exception along with this program;
// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
// <http://www.gnu.org/licenses/>.

#ifndef _GLIBCXX_SIMD_ITERATOR_H
#define _GLIBCXX_SIMD_ITERATOR_H 1

#ifdef _GLIBCXX_SYSHDR
#pragma GCC system_header
#endif

#if __cplusplus >= 202400L

#include "simd_details.h"

namespace std _GLIBCXX_VISIBILITY(default)
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION
namespace simd
{
  /** @internal
   * Iterator type for basic_vec and basic_mask.
   *
   * C++26 [simd.iterator]
   */
  template <typename _Vp>
    class __iterator
    {
      friend class __iterator<const _Vp>;

      template <typename, typename>
        friend class _VecBase;

      template <size_t, typename>
        friend class _MaskBase;

      _Vp* _M_data = nullptr;

      __simd_size_type _M_offset = 0;

      constexpr
      __iterator(_Vp& __d, __simd_size_type __off)
      : _M_data(&__d), _M_offset(__off)
      {}

    public:
      using value_type = typename _Vp::value_type;

      using iterator_category = input_iterator_tag;

      using iterator_concept = random_access_iterator_tag;

      using difference_type = __simd_size_type;

      constexpr __iterator() = default;

      constexpr
      __iterator(const __iterator &) = default;

      constexpr __iterator&
      operator=(const __iterator &) = default;

      constexpr
      __iterator(const __iterator<remove_const_t<_Vp>> &__i) requires is_const_v<_Vp>
      : _M_data(__i._M_data), _M_offset(__i._M_offset)
      {}

      constexpr value_type
      operator*() const
      { return (*_M_data)[_M_offset]; } // checked in operator[]

      constexpr __iterator&
      operator++()
      {
        ++_M_offset;
        return *this;
      }

      constexpr __iterator
      operator++(int)
      {
        __iterator r = *this;
        ++_M_offset;
        return r;
      }

      constexpr __iterator&
      operator--()
      {
        --_M_offset;
        return *this;
      }

      constexpr __iterator
      operator--(int)
      {
        __iterator r = *this;
        --_M_offset;
        return r;
      }

      constexpr __iterator&
      operator+=(difference_type __x)
      {
        _M_offset += __x;
        return *this;
      }

      constexpr __iterator&
      operator-=(difference_type __x)
      {
        _M_offset -= __x;
        return *this;
      }

      constexpr value_type
      operator[](difference_type __i) const
      { return (*_M_data)[_M_offset + __i]; } // checked in operator[]

      constexpr friend bool operator==(__iterator __a, __iterator __b) = default;

      constexpr friend bool operator==(__iterator __a, std::default_sentinel_t) noexcept
      { return __a._M_offset == _Vp::size.value; }

      constexpr friend auto operator<=>(__iterator __a, __iterator __b)
      { return __a._M_offset <=> __b._M_offset; }

      constexpr friend __iterator
      operator+(const __iterator& __it, difference_type __x)
      { return __iterator(*__it._M_data, __it._M_offset + __x); }

      constexpr friend __iterator
      operator+(difference_type __x, const __iterator& __it)
      { return __iterator(*__it._M_data, __it._M_offset + __x); }

      constexpr friend __iterator
      operator-(const __iterator& __it, difference_type __x)
      { return __iterator(*__it._M_data, __it._M_offset - __x); }

      constexpr friend difference_type
      operator-(__iterator __a, __iterator __b)
      { return __a._M_offset - __b._M_offset; }

      constexpr friend difference_type
      operator-(__iterator __it, std::default_sentinel_t) noexcept
      { return __it._M_offset - difference_type(_Vp::size.value); }

      constexpr friend difference_type
      operator-(std::default_sentinel_t, __iterator __it) noexcept
      { return difference_type(_Vp::size.value) - __it._M_offset; }
    };
} // namespace simd
_GLIBCXX_END_NAMESPACE_VERSION
} // namespace std

#endif // C++26
#endif // _GLIBCXX_SIMD_ITERATOR_H
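
The iterator in the header above stores a pointer to the container plus an
element offset, returns elements by value, and compares against a sentinel by
checking the offset against the static size. The reduced sketch below shows
that design with std::array as the container. It is an illustration only and
assumes C++17: end_sentinel is a hypothetical stand-in for
std::default_sentinel_t, the constructor is public here (it is private in the
patch), and the names index_iterator and sum are likewise hypothetical.

```cpp
#include <array>
#include <cstddef>

namespace sketch
{
  // Stand-in for std::default_sentinel_t (which requires C++20).
  struct end_sentinel {};

  template <typename V>
    class index_iterator
    {
      const V* data_ = nullptr; // container being iterated
      int offset_ = 0;          // current element index

      static constexpr int size = int(std::tuple_size<V>::value);

    public:
      using value_type = typename V::value_type;
      using difference_type = int;

      constexpr index_iterator() = default;

      constexpr
      index_iterator(const V& d, int off)
      : data_(&d), offset_(off)
      {}

      // Elements are returned by value, not by reference.
      constexpr value_type
      operator*() const
      { return (*data_)[offset_]; }

      constexpr index_iterator&
      operator++()
      {
        ++offset_;
        return *this;
      }

      // The end is reached when the offset equals the static size.
      friend constexpr bool
      operator==(index_iterator a, end_sentinel)
      { return a.offset_ == size; }

      friend constexpr bool
      operator!=(index_iterator a, end_sentinel s)
      { return !(a == s); }

      // Number of elements left before the sentinel.
      friend constexpr difference_type
      operator-(end_sentinel, index_iterator it)
      { return size - it.offset_; }
    };

  // Hypothetical usage: sum all elements by iterating up to the sentinel.
  template <typename V>
    constexpr typename V::value_type
    sum(const V& v)
    {
      typename V::value_type s{};
      for (index_iterator<V> it(v, 0); it != end_sentinel{}; ++it)
        s += *it;
      return s;
    }
}
```

Because the end position is known from the type, the sentinel carries no
state, and no second iterator has to be constructed to terminate the loop.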
408
libstdc++-v3/include/bits/simd_loadstore.h
Normal file
@@ -0,0 +1,408 @@
// Implementation of <simd> -*- C++ -*-

// Copyright The GNU Toolchain Authors.
//
// This file is part of the GNU ISO C++ Library. This library is free
// software; you can redistribute it and/or modify it under the
// terms of the GNU General Public License as published by the
// Free Software Foundation; either version 3, or (at your option)
// any later version.

// This library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.

// Under Section 7 of GPL version 3, you are granted additional
// permissions described in the GCC Runtime Library Exception, version
// 3.1, as published by the Free Software Foundation.

// You should have received a copy of the GNU General Public License and
// a copy of the GCC Runtime Library Exception along with this program;
// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
// <http://www.gnu.org/licenses/>.

#ifndef _GLIBCXX_SIMD_LOADSTORE_H
#define _GLIBCXX_SIMD_LOADSTORE_H 1

#ifdef _GLIBCXX_SYSHDR
#pragma GCC system_header
#endif

#if __cplusplus >= 202400L

#include "simd_vec.h"

// psabi warnings are bogus because the ABI of the internal types never leaks into user code
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wpsabi"

// [simd.loadstore] -----------------------------------------------------------
namespace std _GLIBCXX_VISIBILITY(default)
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION
namespace simd
{
  template <typename _Vp, typename _Tp>
    struct __vec_load_return
    { using type = _Vp; };

  template <typename _Tp>
    struct __vec_load_return<void, _Tp>
    { using type = basic_vec<_Tp>; };

  template <typename _Vp, typename _Tp>
    using __vec_load_return_t = typename __vec_load_return<_Vp, _Tp>::type;

  template <typename _Vp, typename _Tp>
    using __load_mask_type_t = typename __vec_load_return_t<_Vp, _Tp>::mask_type;

  template <typename _Tp>
    concept __sized_contiguous_range
      = ranges::contiguous_range<_Tp> && ranges::sized_range<_Tp>;

  template <typename _Vp = void, __sized_contiguous_range _Rg, typename... _Flags>
    [[__gnu__::__always_inline__]]
    constexpr __vec_load_return_t<_Vp, ranges::range_value_t<_Rg>>
    unchecked_load(_Rg&& __r, flags<_Flags...> __f = {})
    {
      using _Tp = ranges::range_value_t<_Rg>;
      using _RV = __vec_load_return_t<_Vp, _Tp>;
      using _Rp = typename _RV::value_type;
      static_assert(__loadstore_convertible_to<ranges::range_value_t<_Rg>, _Rp, _Flags...>,
                    "'flag_convert' must be used for conversions that are not value-preserving");

      constexpr bool __allow_out_of_bounds = __f._S_test(__allow_partial_loadstore);
      constexpr size_t __static_size = __static_range_size(__r);

      if constexpr (!__allow_out_of_bounds && __static_sized_range<_Rg>)
        static_assert(ranges::size(__r) >= _RV::size(), "given range must have sufficient size");

      const auto* __ptr = __f.template _S_adjust_pointer<_RV>(ranges::data(__r));
      const auto __rg_size = std::ranges::size(__r);
      if constexpr (!__allow_out_of_bounds)
        __glibcxx_simd_precondition(
          std::ranges::size(__r) >= _RV::size(),
          "Input range is too small. Did you mean to use 'partial_load'?");

      if consteval
        {
          return _RV([&](size_t __i) -> _Rp {
                   if (__i >= __rg_size)
                     return _Rp();
                   else
                     return static_cast<_Rp>(__r[__i]);
                 });
        }
      else
        {
          if constexpr ((__static_size != dynamic_extent && __static_size >= size_t(_RV::size()))
                          || !__allow_out_of_bounds)
            return _RV(_LoadCtorTag(), __ptr);
          else
            return _RV::_S_partial_load(__ptr, __rg_size);
        }
    }

  template <typename _Vp = void, __sized_contiguous_range _Rg, typename... _Flags>
    [[__gnu__::__always_inline__]]
    constexpr __vec_load_return_t<_Vp, ranges::range_value_t<_Rg>>
    unchecked_load(_Rg&& __r, const __load_mask_type_t<_Vp, ranges::range_value_t<_Rg>>& __mask,
                   flags<_Flags...> __f = {})
    {
      using _Tp = ranges::range_value_t<_Rg>;
      using _RV = __vec_load_return_t<_Vp, _Tp>;
      using _Rp = typename _RV::value_type;
      static_assert(__vectorizable<_Tp>);
      static_assert(__explicitly_convertible_to<_Tp, _Rp>);
      static_assert(__loadstore_convertible_to<_Tp, _Rp, _Flags...>,
                    "'flag_convert' must be used for conversions that are not value-preserving");

      constexpr bool __allow_out_of_bounds = __f._S_test(__allow_partial_loadstore);
      constexpr auto __static_size = __static_range_size(__r);

      if constexpr (!__allow_out_of_bounds && __static_sized_range<_Rg>)
        static_assert(ranges::size(__r) >= _RV::size(), "given range must have sufficient size");

      const auto* __ptr = __f.template _S_adjust_pointer<_RV>(ranges::data(__r));

      if constexpr (!__allow_out_of_bounds)
        __glibcxx_simd_precondition(
          ranges::size(__r) >= size_t(_RV::size()),
          "Input range is too small. Did you mean to use 'partial_load'?");

      const size_t __rg_size = ranges::size(__r);
      if consteval
        {
          return _RV([&](size_t __i) -> _Rp {
                   if (__i >= __rg_size || !__mask[int(__i)])
                     return _Rp();
                   else
                     return static_cast<_Rp>(__r[__i]);
                 });
        }
      else
        {
          constexpr bool __no_size_check
            = !__allow_out_of_bounds
                || (__static_size != dynamic_extent
                      && __static_size >= size_t(_RV::size.value));
          if constexpr (_RV::size() == 1)
            return __mask[0] && (__no_size_check || __rg_size > 0) ? _RV(_LoadCtorTag(), __ptr)
                                                                   : _RV();
          else if constexpr (__no_size_check)
            return _RV::_S_masked_load(__ptr, __mask);
          else if (__rg_size >= size_t(_RV::size()))
            return _RV::_S_masked_load(__ptr, __mask);
          else if (__rg_size > 0)
            return _RV::_S_masked_load(
                     __ptr, __mask && _RV::mask_type::_S_partial_mask_of_n(int(__rg_size)));
          else
            return _RV();
        }
    }

  template <typename _Vp = void, contiguous_iterator _It, typename... _Flags>
    [[__gnu__::__always_inline__]]
    constexpr __vec_load_return_t<_Vp, iter_value_t<_It>>
    unchecked_load(_It __first, iter_difference_t<_It> __n, flags<_Flags...> __f = {})
    { return simd::unchecked_load<_Vp>(span<const iter_value_t<_It>>(__first, __n), __f); }

  template <typename _Vp = void, contiguous_iterator _It, typename... _Flags>
    [[__gnu__::__always_inline__]]
    constexpr __vec_load_return_t<_Vp, iter_value_t<_It>>
    unchecked_load(_It __first, iter_difference_t<_It> __n,
                   const __load_mask_type_t<_Vp, iter_value_t<_It>>& __mask,
                   flags<_Flags...> __f = {})
    { return simd::unchecked_load<_Vp>(span<const iter_value_t<_It>>(__first, __n), __mask, __f); }

  template <typename _Vp = void, contiguous_iterator _It, sized_sentinel_for<_It> _Sp,
            typename... _Flags>
    [[__gnu__::__always_inline__]]
    constexpr __vec_load_return_t<_Vp, iter_value_t<_It>>
    unchecked_load(_It __first, _Sp __last, flags<_Flags...> __f = {})
    { return simd::unchecked_load<_Vp>(span<const iter_value_t<_It>>(__first, __last), __f); }

  template <typename _Vp = void, contiguous_iterator _It, sized_sentinel_for<_It> _Sp,
            typename... _Flags>
    [[__gnu__::__always_inline__]]
    constexpr __vec_load_return_t<_Vp, iter_value_t<_It>>
    unchecked_load(_It __first, _Sp __last,
                   const __load_mask_type_t<_Vp, iter_value_t<_It>>& __mask,
                   flags<_Flags...> __f = {})
    {
      return simd::unchecked_load<_Vp>(span<const iter_value_t<_It>>(__first, __last), __mask, __f);
    }

  template <typename _Vp = void, __sized_contiguous_range _Rg, typename... _Flags>
    [[__gnu__::__always_inline__]]
    constexpr __vec_load_return_t<_Vp, ranges::range_value_t<_Rg>>
    partial_load(_Rg&& __r, flags<_Flags...> __f = {})
    { return simd::unchecked_load<_Vp>(__r, __f | __allow_partial_loadstore); }

  template <typename _Vp = void, __sized_contiguous_range _Rg, typename... _Flags>
    [[__gnu__::__always_inline__]]
    constexpr __vec_load_return_t<_Vp, ranges::range_value_t<_Rg>>
    partial_load(_Rg&& __r, const __load_mask_type_t<_Vp, ranges::range_value_t<_Rg>>& __mask,
                 flags<_Flags...> __f = {})
    { return simd::unchecked_load<_Vp>(__r, __mask, __f | __allow_partial_loadstore); }
|
||||
|
||||
template <typename _Vp = void, contiguous_iterator _It, typename... _Flags>
|
||||
[[__gnu__::__always_inline__]]
|
||||
constexpr __vec_load_return_t<_Vp, iter_value_t<_It>>
|
||||
partial_load(_It __first, iter_difference_t<_It> __n, flags<_Flags...> __f = {})
|
||||
{ return partial_load<_Vp>(span<const iter_value_t<_It>>(__first, __n), __f); }
|
||||
|
||||
template <typename _Vp = void, contiguous_iterator _It, typename... _Flags>
|
||||
[[__gnu__::__always_inline__]]
|
||||
constexpr __vec_load_return_t<_Vp, iter_value_t<_It>>
|
||||
partial_load(_It __first, iter_difference_t<_It> __n,
|
||||
const __load_mask_type_t<_Vp, iter_value_t<_It>>& __mask,
|
||||
flags<_Flags...> __f = {})
|
||||
{ return partial_load<_Vp>(span<const iter_value_t<_It>>(__first, __n), __mask, __f); }
|
||||
|
||||
template <typename _Vp = void, contiguous_iterator _It, sized_sentinel_for<_It> _Sp,
|
||||
typename... _Flags>
|
||||
[[__gnu__::__always_inline__]]
|
||||
constexpr __vec_load_return_t<_Vp, iter_value_t<_It>>
|
||||
partial_load(_It __first, _Sp __last, flags<_Flags...> __f = {})
|
||||
{ return partial_load<_Vp>(span<const iter_value_t<_It>>(__first, __last), __f); }
|
||||
|
||||
template <typename _Vp = void, contiguous_iterator _It, sized_sentinel_for<_It> _Sp,
|
||||
typename... _Flags>
|
||||
[[__gnu__::__always_inline__]]
|
||||
constexpr __vec_load_return_t<_Vp, iter_value_t<_It>>
|
||||
partial_load(_It __first, _Sp __last, const __load_mask_type_t<_Vp, iter_value_t<_It>>& __mask,
|
||||
flags<_Flags...> __f = {})
|
||||
{ return partial_load<_Vp>(span<const iter_value_t<_It>>(__first, __last), __mask, __f); }
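
Reviewer note: the constant-evaluation branch of the masked load above spells out the element-wise contract that the vectorized branch has to match. The following is a minimal scalar sketch of that contract, assuming nothing beyond standard C++17. The name `model_partial_load` and the demo objects are hypothetical and not part of the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical scalar model, not libstdc++ API: element i of the result is
// first[i] when i is inside the range and mask[i] is set, otherwise T().
template <typename T, std::size_t N>
constexpr std::array<T, N>
model_partial_load(const T* first, std::size_t count, const std::array<bool, N>& mask)
{
  std::array<T, N> result{};   // every element starts out value-initialized
  for (std::size_t i = 0; i < N; ++i)
    if (i < count && mask[i])
      result[i] = first[i];    // only in-range, selected elements are read
  return result;
}

// A 4-element load from a 3-element range with element 1 masked off.
constexpr int model_data[3] = {10, 20, 30};
constexpr std::array<bool, 4> model_mask = {true, false, true, true};
constexpr auto model_loaded = model_partial_load<int, 4>(model_data, 3, model_mask);
static_assert(model_loaded[0] == 10 && model_loaded[1] == 0);
static_assert(model_loaded[2] == 30 && model_loaded[3] == 0);
```

In this model an unmasked `partial_load` corresponds to an all-true mask, and `unchecked_load` corresponds to the case where `count >= N` is a precondition rather than something handled at run time.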

  template <typename _Tp, typename _Ap, __sized_contiguous_range _Rg, typename... _Flags>
    requires indirectly_writable<ranges::iterator_t<_Rg>, _Tp>
    [[__gnu__::__always_inline__]]
    constexpr void
    unchecked_store(const basic_vec<_Tp, _Ap>& __v, _Rg&& __r, flags<_Flags...> __f = {})
    {
      using _TV = basic_vec<_Tp, _Ap>;
      static_assert(destructible<_TV>);
      static_assert(__loadstore_convertible_to<_Tp, ranges::range_value_t<_Rg>, _Flags...>,
                    "'flag_convert' must be used for conversions that are not value-preserving");

      constexpr bool __allow_out_of_bounds = __f._S_test(__allow_partial_loadstore);
      if constexpr (!__allow_out_of_bounds && __static_sized_range<_Rg>)
        static_assert(ranges::size(__r) >= _TV::size(), "given range must have sufficient size");

      auto* __ptr = __f.template _S_adjust_pointer<_TV>(ranges::data(__r));
      const auto __rg_size = ranges::size(__r);
      if constexpr (!__allow_out_of_bounds)
        __glibcxx_simd_precondition(
          ranges::size(__r) >= _TV::size(),
          "output range is too small. Did you mean to use 'partial_store'?");

      if consteval
        {
          for (unsigned __i = 0; __i < __rg_size && __i < _TV::size(); ++__i)
            __ptr[__i] = static_cast<ranges::range_value_t<_Rg>>(__v[__i]);
        }
      else
        {
          if constexpr (!__allow_out_of_bounds)
            __v._M_store(__ptr);
          else
            _TV::_S_partial_store(__v, __ptr, __rg_size);
        }
    }

  template <typename _Tp, typename _Ap, __sized_contiguous_range _Rg, typename... _Flags>
    requires indirectly_writable<ranges::iterator_t<_Rg>, _Tp>
    [[__gnu__::__always_inline__]]
    constexpr void
    unchecked_store(const basic_vec<_Tp, _Ap>& __v, _Rg&& __r,
                    const typename basic_vec<_Tp, _Ap>::mask_type& __mask,
                    flags<_Flags...> __f = {})
    {
      using _TV = basic_vec<_Tp, _Ap>;
      static_assert(__loadstore_convertible_to<_Tp, ranges::range_value_t<_Rg>, _Flags...>,
                    "'flag_convert' must be used for conversions that are not value-preserving");

      constexpr bool __allow_out_of_bounds = __f._S_test(__allow_partial_loadstore);
      if constexpr (!__allow_out_of_bounds && __static_sized_range<_Rg>)
        static_assert(ranges::size(__r) >= _TV::size(), "given range must have sufficient size");

      auto* __ptr = __f.template _S_adjust_pointer<_TV>(ranges::data(__r));

      if constexpr (!__allow_out_of_bounds)
        __glibcxx_simd_precondition(
          ranges::size(__r) >= size_t(_TV::size()),
          "output range is too small. Did you mean to use 'partial_store'?");

      const size_t __rg_size = ranges::size(__r);
      if consteval
        {
          for (int __i = 0; __i < _TV::size(); ++__i)
            {
              if (__mask[__i] && (!__allow_out_of_bounds || size_t(__i) < __rg_size))
                __ptr[__i] = static_cast<ranges::range_value_t<_Rg>>(__v[__i]);
            }
        }
      else
        {
          if (__allow_out_of_bounds && __rg_size < size_t(_TV::size()))
            _TV::_S_masked_store(__v, __ptr,
                                 __mask && _TV::mask_type::_S_partial_mask_of_n(int(__rg_size)));
          else
            _TV::_S_masked_store(__v, __ptr, __mask);
        }
    }

  template <typename _Tp, typename _Ap, contiguous_iterator _It, typename... _Flags>
    requires indirectly_writable<_It, _Tp>
    [[__gnu__::__always_inline__]]
    constexpr void
    unchecked_store(const basic_vec<_Tp, _Ap>& __v, _It __first,
                    iter_difference_t<_It> __n, flags<_Flags...> __f = {})
    { simd::unchecked_store(__v, std::span<iter_value_t<_It>>(__first, __n), __f); }

  template <typename _Tp, typename _Ap, contiguous_iterator _It, typename... _Flags>
    requires indirectly_writable<_It, _Tp>
    [[__gnu__::__always_inline__]]
    constexpr void
    unchecked_store(const basic_vec<_Tp, _Ap>& __v, _It __first, iter_difference_t<_It> __n,
                    const typename basic_vec<_Tp, _Ap>::mask_type& __mask,
                    flags<_Flags...> __f = {})
    { simd::unchecked_store(__v, std::span<iter_value_t<_It>>(__first, __n), __mask, __f); }

  template <typename _Tp, typename _Ap, contiguous_iterator _It, sized_sentinel_for<_It> _Sp,
            typename... _Flags>
    requires indirectly_writable<_It, _Tp>
    [[__gnu__::__always_inline__]]
    constexpr void
    unchecked_store(const basic_vec<_Tp, _Ap>& __v, _It __first, _Sp __last,
                    flags<_Flags...> __f = {})
    { simd::unchecked_store(__v, std::span<iter_value_t<_It>>(__first, __last), __f); }

  template <typename _Tp, typename _Ap, contiguous_iterator _It, sized_sentinel_for<_It> _Sp,
            typename... _Flags>
    requires indirectly_writable<_It, _Tp>
    [[__gnu__::__always_inline__]]
    constexpr void
    unchecked_store(const basic_vec<_Tp, _Ap>& __v, _It __first, _Sp __last,
                    const typename basic_vec<_Tp, _Ap>::mask_type& __mask,
                    flags<_Flags...> __f = {})
    { simd::unchecked_store(__v, std::span<iter_value_t<_It>>(__first, __last), __mask, __f); }

  template <typename _Tp, typename _Ap, __sized_contiguous_range _Rg, typename... _Flags>
    requires indirectly_writable<ranges::iterator_t<_Rg>, _Tp>
    [[__gnu__::__always_inline__]]
    constexpr void
    partial_store(const basic_vec<_Tp, _Ap>& __v, _Rg&& __r, flags<_Flags...> __f = {})
    { simd::unchecked_store(__v, __r, __f | __allow_partial_loadstore); }

  template <typename _Tp, typename _Ap, __sized_contiguous_range _Rg, typename... _Flags>
    requires indirectly_writable<ranges::iterator_t<_Rg>, _Tp>
    [[__gnu__::__always_inline__]]
    constexpr void
    partial_store(const basic_vec<_Tp, _Ap>& __v, _Rg&& __r,
                  const typename basic_vec<_Tp, _Ap>::mask_type& __mask,
                  flags<_Flags...> __f = {})
    { simd::unchecked_store(__v, __r, __mask, __f | __allow_partial_loadstore); }

  template <typename _Tp, typename _Ap, contiguous_iterator _It, typename... _Flags>
    requires indirectly_writable<_It, _Tp>
    [[__gnu__::__always_inline__]]
    constexpr void
    partial_store(const basic_vec<_Tp, _Ap>& __v, _It __first, iter_difference_t<_It> __n,
                  flags<_Flags...> __f = {})
    { partial_store(__v, span(__first, __n), __f); }

  template <typename _Tp, typename _Ap, contiguous_iterator _It, typename... _Flags>
    requires indirectly_writable<_It, _Tp>
    [[__gnu__::__always_inline__]]
    constexpr void
    partial_store(const basic_vec<_Tp, _Ap>& __v, _It __first, iter_difference_t<_It> __n,
                  const typename basic_vec<_Tp, _Ap>::mask_type& __mask, flags<_Flags...> __f = {})
    { partial_store(__v, span(__first, __n), __mask, __f); }

  template <typename _Tp, typename _Ap, contiguous_iterator _It, sized_sentinel_for<_It> _Sp,
            typename... _Flags>
    requires indirectly_writable<_It, _Tp>
    [[__gnu__::__always_inline__]]
    constexpr void
    partial_store(const basic_vec<_Tp, _Ap>& __v, _It __first, _Sp __last,
                  flags<_Flags...> __f = {})
    { partial_store(__v, span(__first, __last), __f); }

  template <typename _Tp, typename _Ap, contiguous_iterator _It, sized_sentinel_for<_It> _Sp,
            typename... _Flags>
    requires indirectly_writable<_It, _Tp>
    [[__gnu__::__always_inline__]]
    constexpr void
    partial_store(const basic_vec<_Tp, _Ap>& __v, _It __first, _Sp __last,
                  const typename basic_vec<_Tp, _Ap>::mask_type& __mask, flags<_Flags...> __f = {})
    { partial_store(__v, span(__first, __last), __mask, __f); }
} // namespace simd
_GLIBCXX_END_NAMESPACE_VERSION
} // namespace std

#pragma GCC diagnostic pop
#endif // C++26
#endif // _GLIBCXX_SIMD_LOADSTORE_H
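
Reviewer note: the store overloads in this file mirror the loads. Going by the constant-evaluation branch of the masked `unchecked_store`, an element is written only when its mask bit is set and, for partial stores, when its index is inside the output range. A minimal scalar sketch of that rule in standard C++17 follows. `model_partial_store` and `model_store_demo` are hypothetical names used only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical scalar model, not libstdc++ API: write v[i] to out[i] only
// when mask[i] is set and i lies inside the output range of `count` elements.
template <typename T, std::size_t N>
constexpr void
model_partial_store(const std::array<T, N>& v, T* out, std::size_t count,
                    const std::array<bool, N>& mask)
{
  for (std::size_t i = 0; i < N; ++i)
    if (mask[i] && i < count)
      out[i] = v[i];   // all other output elements stay untouched
}

// Store a 4-element vector into a 3-element buffer with element 1 masked off.
constexpr std::array<int, 3>
model_store_demo()
{
  std::array<int, 3> out = {-1, -1, -1};
  const std::array<int, 4> v = {1, 2, 3, 4};
  const std::array<bool, 4> mask = {true, false, true, true};
  model_partial_store(v, out.data(), out.size(), mask);
  return out;
}
static_assert(model_store_demo()[0] == 1);
static_assert(model_store_demo()[1] == -1);
static_assert(model_store_demo()[2] == 3);
```

The run-time branch reaches the same result by and-ing the user mask with a mask of the first `__rg_size` elements (`_S_partial_mask_of_n`) before a single masked store.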
1972
libstdc++-v3/include/bits/simd_mask.h
Normal file
File diff suppressed because it is too large
118
libstdc++-v3/include/bits/simd_mask_reductions.h
Normal file
@@ -0,0 +1,118 @@
// Implementation of <simd> -*- C++ -*-

// Copyright The GNU Toolchain Authors.
//
// This file is part of the GNU ISO C++ Library. This library is free
// software; you can redistribute it and/or modify it under the
// terms of the GNU General Public License as published by the
// Free Software Foundation; either version 3, or (at your option)
// any later version.

// This library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.

// Under Section 7 of GPL version 3, you are granted additional
// permissions described in the GCC Runtime Library Exception, version
// 3.1, as published by the Free Software Foundation.

// You should have received a copy of the GNU General Public License and
// a copy of the GCC Runtime Library Exception along with this program;
// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
// <http://www.gnu.org/licenses/>.

#ifndef _GLIBCXX_SIMD_MASK_REDUCTIONS_H
#define _GLIBCXX_SIMD_MASK_REDUCTIONS_H 1

#ifdef _GLIBCXX_SYSHDR
#pragma GCC system_header
#endif

#if __cplusplus >= 202400L

#include "simd_mask.h"

// psabi warnings are bogus because the ABI of the internal types never leaks into user code
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wpsabi"

// [simd.mask.reductions] -----------------------------------------------------
namespace std _GLIBCXX_VISIBILITY(default)
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION
namespace simd
{
  template <size_t _Bytes, typename _Ap>
    [[__gnu__::__always_inline__]]
    constexpr bool
    all_of(const basic_mask<_Bytes, _Ap>& __k) noexcept
    { return __k._M_all_of(); }

  template <size_t _Bytes, typename _Ap>
    [[__gnu__::__always_inline__]]
    constexpr bool
    any_of(const basic_mask<_Bytes, _Ap>& __k) noexcept
    { return __k._M_any_of(); }

  template <size_t _Bytes, typename _Ap>
    [[__gnu__::__always_inline__]]
    constexpr bool
    none_of(const basic_mask<_Bytes, _Ap>& __k) noexcept
    { return __k._M_none_of(); }

  template <size_t _Bytes, typename _Ap>
    [[__gnu__::__always_inline__]]
    constexpr __simd_size_type
    reduce_count(const basic_mask<_Bytes, _Ap>& __k) noexcept
    {
      if constexpr (_Ap::_S_size == 1)
        return +__k[0];
      else if constexpr (_Ap::_S_is_vecmask)
        return -reduce(-__k);
      else
        return __k._M_reduce_count();
    }

  template <size_t _Bytes, typename _Ap>
    [[__gnu__::__always_inline__]]
    constexpr __simd_size_type
    reduce_min_index(const basic_mask<_Bytes, _Ap>& __k)
    { return __k._M_reduce_min_index(); }

  template <size_t _Bytes, typename _Ap>
    [[__gnu__::__always_inline__]]
    constexpr __simd_size_type
    reduce_max_index(const basic_mask<_Bytes, _Ap>& __k)
    { return __k._M_reduce_max_index(); }

  constexpr bool
  all_of(same_as<bool> auto __x) noexcept
  { return __x; }

  constexpr bool
  any_of(same_as<bool> auto __x) noexcept
  { return __x; }

  constexpr bool
  none_of(same_as<bool> auto __x) noexcept
  { return !__x; }

  constexpr __simd_size_type
  reduce_count(same_as<bool> auto __x) noexcept
  { return __x; }

  constexpr __simd_size_type
  reduce_min_index(same_as<bool> auto __x)
  { return 0; }

  constexpr __simd_size_type
  reduce_max_index(same_as<bool> auto __x)
  { return 0; }
} // namespace simd
_GLIBCXX_END_NAMESPACE_VERSION
} // namespace std

#pragma GCC diagnostic pop
#endif // C++26
#endif // _GLIBCXX_SIMD_MASK_REDUCTIONS_H
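
Reviewer note: for vector masks, `reduce_count` is written as `-reduce(-__k)`. As I read it, negating a mask yields an integer vector holding -1 for each true element, so the sum is the negated count. The scalar sketch below models that trick and the index reductions in standard C++17. All `model_*` names are hypothetical, and the `-1` return for an all-false mask is a choice of this sketch only, since the header treats that case as a precondition violation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical scalar model of `-reduce(-k)`: every true element contributes
// -1 to the sum, so negating the sum gives the number of true elements.
template <std::size_t N>
constexpr int
model_reduce_count(const std::array<bool, N>& k)
{
  int sum = 0;
  for (std::size_t i = 0; i < N; ++i)
    sum += -static_cast<int>(k[i]);
  return -sum;
}

// Index of the first true element; -1 is a sentinel used only by this model.
template <std::size_t N>
constexpr int
model_reduce_min_index(const std::array<bool, N>& k)
{
  for (std::size_t i = 0; i < N; ++i)
    if (k[i])
      return static_cast<int>(i);
  return -1;
}

// Index of the last true element; -1 is a sentinel used only by this model.
template <std::size_t N>
constexpr int
model_reduce_max_index(const std::array<bool, N>& k)
{
  for (std::size_t i = N; i > 0; --i)
    if (k[i - 1])
      return static_cast<int>(i - 1);
  return -1;
}

constexpr std::array<bool, 4> model_k = {false, true, true, false};
static_assert(model_reduce_count(model_k) == 2);
static_assert(model_reduce_min_index(model_k) == 1);
static_assert(model_reduce_max_index(model_k) == 2);
```

The `same_as<bool>` overloads in the header are the one-element special case of the same functions: the count is the value itself and both indices are 0.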
109
libstdc++-v3/include/bits/simd_reductions.h
Normal file
@@ -0,0 +1,109 @@
// Implementation of <simd> -*- C++ -*-

// Copyright The GNU Toolchain Authors.
//
// This file is part of the GNU ISO C++ Library. This library is free
// software; you can redistribute it and/or modify it under the
// terms of the GNU General Public License as published by the
// Free Software Foundation; either version 3, or (at your option)
// any later version.

// This library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.

// Under Section 7 of GPL version 3, you are granted additional
// permissions described in the GCC Runtime Library Exception, version
// 3.1, as published by the Free Software Foundation.

// You should have received a copy of the GNU General Public License and
// a copy of the GCC Runtime Library Exception along with this program;
// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
// <http://www.gnu.org/licenses/>.

#ifndef _GLIBCXX_SIMD_REDUCTIONS_H
#define _GLIBCXX_SIMD_REDUCTIONS_H 1

#ifdef _GLIBCXX_SYSHDR
#pragma GCC system_header
#endif

#if __cplusplus >= 202400L

#include "simd_vec.h"

// psabi warnings are bogus because the ABI of the internal types never leaks into user code
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wpsabi"

// [simd.reductions] ----------------------------------------------------------
namespace std _GLIBCXX_VISIBILITY(default)
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION
namespace simd
{
  template <typename _Tp, typename _Ap, __reduction_binary_operation<_Tp> _BinaryOperation = plus<>>
    [[__gnu__::__always_inline__]]
    constexpr _Tp
    reduce(const basic_vec<_Tp, _Ap>& __x, _BinaryOperation __binary_op = {})
    { return __x._M_reduce(__binary_op); }

  template <typename _Tp, typename _Ap, __reduction_binary_operation<_Tp> _BinaryOperation = plus<>>
    [[__gnu__::__always_inline__]]
    constexpr _Tp
    reduce(const basic_vec<_Tp, _Ap>& __x, const typename basic_vec<_Tp, _Ap>::mask_type& __mask,
           _BinaryOperation __binary_op = {}, type_identity_t<_Tp> __identity_element
             = __default_identity_element<_Tp, _BinaryOperation>())
    { return reduce(__select_impl(__mask, __x, __identity_element), __binary_op); }

  template <totally_ordered _Tp, typename _Ap>
    [[__gnu__::__always_inline__]]
    constexpr _Tp
    reduce_min(const basic_vec<_Tp, _Ap>& __x) noexcept
    {
      return reduce(__x, []<typename _UV>(const _UV& __a, const _UV& __b) {
               return __select_impl(__a < __b, __a, __b);
             });
    }

  template <totally_ordered _Tp, typename _Ap>
    [[__gnu__::__always_inline__]]
    constexpr _Tp
    reduce_min(const basic_vec<_Tp, _Ap>& __x,
               const typename basic_vec<_Tp, _Ap>::mask_type& __mask) noexcept
    {
      return reduce(__select_impl(__mask, __x, numeric_limits<_Tp>::max()),
                    []<typename _UV>(const _UV& __a, const _UV& __b) {
                      return __select_impl(__a < __b, __a, __b);
                    });
    }

  template <totally_ordered _Tp, typename _Ap>
    [[__gnu__::__always_inline__]]
    constexpr _Tp
    reduce_max(const basic_vec<_Tp, _Ap>& __x) noexcept
    {
      return reduce(__x, []<typename _UV>(const _UV& __a, const _UV& __b) {
               return __select_impl(__a < __b, __b, __a);
             });
    }

  template <totally_ordered _Tp, typename _Ap>
    [[__gnu__::__always_inline__]]
    constexpr _Tp
    reduce_max(const basic_vec<_Tp, _Ap>& __x,
               const typename basic_vec<_Tp, _Ap>::mask_type& __mask) noexcept
    {
      return reduce(__select_impl(__mask, __x, numeric_limits<_Tp>::lowest()),
                    []<typename _UV>(const _UV& __a, const _UV& __b) {
                      return __select_impl(__a < __b, __b, __a);
                    });
    }
} // namespace simd
_GLIBCXX_END_NAMESPACE_VERSION
} // namespace std

#pragma GCC diagnostic pop
#endif // C++26
#endif // _GLIBCXX_SIMD_REDUCTIONS_H
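
Reviewer note: every masked reduction in this file follows one pattern: replace the masked-off elements with the operation's identity element, then run the unmasked reduction. The scalar sketch below illustrates that pattern in standard C++17. `model_masked_reduce`, `model_min`, and `model_masked_reduce_min` are hypothetical names, and the default identity `T()` in the sketch is only correct for addition.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>

// Hypothetical scalar model: select x[i] or the identity per element, then
// fold the selected values with `op`. Assumes N >= 1.
template <typename T, std::size_t N, typename Op = std::plus<>>
constexpr T
model_masked_reduce(const std::array<T, N>& x, const std::array<bool, N>& mask,
                    Op op = {}, T identity = T())
{
  std::array<T, N> selected{};
  for (std::size_t i = 0; i < N; ++i)
    selected[i] = mask[i] ? x[i] : identity;   // the "select" step
  T acc = selected[0];
  for (std::size_t i = 1; i < N; ++i)
    acc = op(acc, selected[i]);                // the unmasked reduction
  return acc;
}

// Binary minimum, standing in for the generic lambda used by reduce_min.
struct model_min
{
  template <typename T>
    constexpr T
    operator()(const T& a, const T& b) const
    { return a < b ? a : b; }
};

// Masked minimum: the identity is the largest representable value.
template <typename T, std::size_t N>
constexpr T
model_masked_reduce_min(const std::array<T, N>& x, const std::array<bool, N>& mask)
{ return model_masked_reduce(x, mask, model_min{}, std::numeric_limits<T>::max()); }

constexpr std::array<int, 4> model_x = {5, 1, 7, 3};
constexpr std::array<bool, 4> model_m = {true, false, true, true};
static_assert(model_masked_reduce(model_x, model_m) == 15);    // 5 + 7 + 3
static_assert(model_masked_reduce_min(model_x, model_m) == 3); // 1 is masked off
```

One consequence visible in the model: with an all-false mask, the masked minimum returns `numeric_limits<T>::max()`, which matches the identity passed to `__select_impl` in the masked `reduce_min` of this file.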
2297
libstdc++-v3/include/bits/simd_vec.h
Normal file
File diff suppressed because it is too large
1413
libstdc++-v3/include/bits/simd_x86.h
Normal file
File diff suppressed because it is too large
606
libstdc++-v3/include/bits/vec_ops.h
Normal file
@@ -0,0 +1,606 @@
// Implementation of <simd> -*- C++ -*-

// Copyright The GNU Toolchain Authors.
//
// This file is part of the GNU ISO C++ Library. This library is free
// software; you can redistribute it and/or modify it under the
// terms of the GNU General Public License as published by the
// Free Software Foundation; either version 3, or (at your option)
// any later version.

// This library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.

// Under Section 7 of GPL version 3, you are granted additional
// permissions described in the GCC Runtime Library Exception, version
// 3.1, as published by the Free Software Foundation.

// You should have received a copy of the GNU General Public License and
// a copy of the GCC Runtime Library Exception along with this program;
// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
// <http://www.gnu.org/licenses/>.

#ifndef _GLIBCXX_VEC_OPS_H
#define _GLIBCXX_VEC_OPS_H 1

#ifdef _GLIBCXX_SYSHDR
#pragma GCC system_header
#endif

#if __cplusplus >= 202400L

#include "simd_details.h"

#include <bit>
#include <bits/utility.h>

// psabi warnings are bogus because the ABI of the internal types never leaks into user code
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wpsabi"

namespace std _GLIBCXX_VISIBILITY(default)
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION
namespace simd
{
  template <std::signed_integral _Tp>
    constexpr bool
    __signed_has_single_bit(_Tp __x)
    { return __has_single_bit(make_unsigned_t<_Tp>(__x)); }

  /**
   * Alias for a vector builtin with given value type and total sizeof.
   */
  template <__vectorizable _Tp, size_t _Bytes>
    requires (__has_single_bit(_Bytes))
    using __vec_builtin_type_bytes [[__gnu__::__vector_size__(_Bytes)]] = _Tp;

  /**
   * Alias for a vector builtin with given value type @p _Tp and @p _Width.
   */
  template <__vectorizable _Tp, __simd_size_type _Width>
    requires (__signed_has_single_bit(_Width))
    using __vec_builtin_type = __vec_builtin_type_bytes<_Tp, sizeof(_Tp) * _Width>;

  /**
   * Constrain to any vector builtin with given value type and optional width.
   */
  template <typename _Tp, typename _ValueType,
            __simd_size_type _Width = sizeof(_Tp) / sizeof(_ValueType)>
    concept __vec_builtin_of
      = !is_class_v<_Tp> && !is_pointer_v<_Tp> && !is_arithmetic_v<_Tp>
          && __vectorizable<_ValueType>
          && _Width >= 1 && sizeof(_Tp) / sizeof(_ValueType) == _Width
          && same_as<__vec_builtin_type_bytes<_ValueType, sizeof(_Tp)>, _Tp>
          && requires(_Tp& __v, _ValueType __x) { __v[0] = __x; };

  /**
   * Constrain to any vector builtin.
   */
  template <typename _Tp>
    concept __vec_builtin
      = __vec_builtin_of<_Tp, remove_cvref_t<decltype(declval<const _Tp>()[0])>>;

  /**
   * Alias for the value type of the given __vec_builtin type @p _Tp.
   */
  template <__vec_builtin _Tp>
    using __vec_value_type = remove_cvref_t<decltype(declval<const _Tp>()[0])>;

  /**
   * The width (number of value_type elements) of the given vector builtin or arithmetic type.
   */
  template <typename _Tp>
    inline constexpr __simd_size_type __width_of = 1;

  template <typename _Tp>
    requires __vec_builtin<_Tp>
    inline constexpr __simd_size_type __width_of<_Tp> = sizeof(_Tp) / sizeof(__vec_value_type<_Tp>);

  /**
   * Alias for a vector builtin with equal value type and new width @p _Np.
   */
  template <__simd_size_type _Np, __vec_builtin _TV>
    using __resize_vec_builtin_t = __vec_builtin_type<__vec_value_type<_TV>, _Np>;

  template <__vec_builtin _TV>
    requires (__width_of<_TV> > 1)
    using __half_vec_builtin_t = __resize_vec_builtin_t<__width_of<_TV> / 2, _TV>;

  template <__vec_builtin _TV>
    using __double_vec_builtin_t = __resize_vec_builtin_t<__width_of<_TV> * 2, _TV>;

  template <typename _Up, __vec_builtin _TV>
    [[__gnu__::__always_inline__]]
    constexpr __vec_builtin_type_bytes<_Up, sizeof(_TV)>
    __vec_bit_cast(_TV __v)
    { return reinterpret_cast<__vec_builtin_type_bytes<_Up, sizeof(_TV)>>(__v); }

  template <int _Np, __vec_builtin _TV>
    requires signed_integral<__vec_value_type<_TV>>
    static constexpr _TV _S_vec_implicit_mask = []<int... _Is> (integer_sequence<int, _Is...>) {
      return _TV{ (_Is < _Np ? -1 : 0)... };
    } (make_integer_sequence<int, __width_of<_TV>>());

  /**
   * Helper function to work around Clang not allowing v[i] in constant expressions.
   */
  template <__vec_builtin _TV>
    [[__gnu__::__always_inline__]]
    constexpr __vec_value_type<_TV>
    __vec_get(_TV __v, int __i)
    {
#ifdef _GLIBCXX_CLANG
      if consteval
        {
          return __builtin_bit_cast(array<__vec_value_type<_TV>, __width_of<_TV>>, __v)[__i];
        }
      else
#endif
        {
          return __v[__i];
        }
    }

  /**
   * Helper function to work around Clang and GCC not allowing assignment to v[i] in constant
   * expressions.
   */
  template <__vec_builtin _TV>
    [[__gnu__::__always_inline__]]
    constexpr void
    __vec_set(_TV& __v, int __i, __vec_value_type<_TV> __x)
    {
      if consteval
        {
#ifdef _GLIBCXX_CLANG
          auto __arr = __builtin_bit_cast(array<__vec_value_type<_TV>, __width_of<_TV>>, __v);
          __arr[__i] = __x;
          __v = __builtin_bit_cast(_TV, __arr);
#else
          constexpr auto [...__j] = _IotaArray<__width_of<_TV>>;
          __v = _TV{(__i == __j ? __x : __v[__j])...};
#endif
        }
      else
        {
          __v[__i] = __x;
        }
    }

  /** @internal
   * Return vector builtin with all values from @p __a and @p __b.
   */
  template <__vec_builtin _TV>
    [[__gnu__::__always_inline__]]
    constexpr __vec_builtin_type<__vec_value_type<_TV>, __width_of<_TV> * 2>
    __vec_concat(_TV __a, _TV __b)
    {
      constexpr auto [...__is] = _IotaArray<__width_of<_TV> * 2>;
      return __builtin_shufflevector(__a, __b, __is...);
    }

  /** @internal
   * Concatenate the first @p _N0 elements from @p __a with the first @p _N1 elements from @p __b
   * with the elements from applying this function recursively to @p __rest.
   *
   * @pre _N0 <= __width_of<_TV0> && _N1 <= __width_of<_TV1> && _Ns <= __width_of<_TVs> && ...
   *
   * Strategy: Aim for a power-of-2 tree concat. E.g.
   * - cat(2, 2, 2, 2) -> cat(4, 2, 2) -> cat(4, 4)
   * - cat(2, 2, 2, 2, 8) -> cat(4, 2, 2, 8) -> cat(4, 4, 8) -> cat(8, 8)
   */
  template <int _N0, int _N1, int... _Ns, __vec_builtin _TV0, __vec_builtin _TV1,
            __vec_builtin... _TVs>
    [[__gnu__::__always_inline__]]
    constexpr __vec_builtin_type<__vec_value_type<_TV0>,
                                 __bit_ceil(unsigned(_N0 + (_N1 + ... + _Ns)))>
    __vec_concat_sized(const _TV0& __a, const _TV1& __b, const _TVs&... __rest);

  template <int _N0, int _N1, int _N2, int... _Ns, __vec_builtin _TV0, __vec_builtin _TV1,
            __vec_builtin _TV2, __vec_builtin... _TVs>
    requires (__has_single_bit(unsigned(_N0))) && (_N0 >= (_N1 + _N2))
    [[__gnu__::__always_inline__]]
    constexpr __vec_builtin_type<__vec_value_type<_TV0>,
                                 __bit_ceil(unsigned(_N0 + _N1 + (_N2 + ... + _Ns)))>
    __vec_concat_sized(const _TV0& __a, const _TV1& __b, const _TV2& __c, const _TVs&... __rest)
    {
      return __vec_concat_sized<_N0, _N1 + _N2, _Ns...>(
               __a, __vec_concat_sized<_N1, _N2>(__b, __c), __rest...);
    }

  template <int _N0, int _N1, int... _Ns, __vec_builtin _TV0, __vec_builtin _TV1,
            __vec_builtin... _TVs>
    [[__gnu__::__always_inline__]]
    constexpr __vec_builtin_type<__vec_value_type<_TV0>,
                                 __bit_ceil(unsigned(_N0 + (_N1 + ... + _Ns)))>
    __vec_concat_sized(const _TV0& __a, const _TV1& __b, const _TVs&... __rest)
    {
      // __is is rounded up because we need to generate a power-of-2 vector:
      constexpr auto [...__is] = _IotaArray<__bit_ceil(unsigned(_N0 + _N1)), int>;
      const auto __ab = __builtin_shufflevector(__a, __b, [](int __i) consteval {
                          if (__i < _N0) // copy from __a
                            return __i;
                          else if (__i < _N0 + _N1) // copy from __b
                            return __i - _N0 + __width_of<_TV0>; // _N0 <= __width_of<_TV0>
                          else // can't index into __rest
                            return -1; // don't care
                        }(__is)...);
      if constexpr (sizeof...(__rest) == 0)
        return __ab;
      else
        return __vec_concat_sized<_N0 + _N1, _Ns...>(__ab, __rest...);
    }
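
Reviewer note: the "power-of-2 tree concat" strategy in the comment on `__vec_concat_sized` can be traced on the element counts alone. The sketch below is a hypothetical model in standard C++17 that mirrors the two overloads as I read them: merge the second and third operands when the first count is a power of two and at least as large as their sum, otherwise merge the first two operands. `model_is_pow2` and `model_concat_steps` are illustration-only names, and the model tracks counts, not vector contents.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// True when n is a positive power of two.
inline bool
model_is_pow2(int n)
{ return n > 0 && (n & (n - 1)) == 0; }

// Hypothetical model: return the list of operand sizes after every merge
// step, starting with the input and ending with a single total.
inline std::vector<std::vector<int>>
model_concat_steps(std::vector<int> sizes)
{
  std::vector<std::vector<int>> steps{sizes};
  while (sizes.size() > 1)
    {
      if (sizes.size() >= 3 && model_is_pow2(sizes[0])
            && sizes[0] >= sizes[1] + sizes[2])
        {
          // constrained overload: concatenate the 2nd and 3rd operand first
          sizes[1] += sizes[2];
          sizes.erase(sizes.begin() + 2);
        }
      else
        {
          // general overload: concatenate the 1st and 2nd operand
          sizes[0] += sizes[1];
          sizes.erase(sizes.begin() + 1);
        }
      steps.push_back(sizes);
    }
  return steps;
}
```

For the input sizes `{2, 2, 2, 2, 8}` the model produces `{4, 2, 2, 8}`, `{4, 4, 8}`, `{8, 8}`, and finally `{16}`, which agrees with the second example in the comment. The actual return type additionally rounds the total up with `__bit_ceil`, which this sketch leaves out.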
|
||||
|
||||
template <__vec_builtin _TV>
|
||||
[[__gnu__::__always_inline__]]
|
||||
constexpr __half_vec_builtin_t<_TV>
|
||||
__vec_split_lo(_TV __v)
|
||||
{
|
||||
constexpr int __n = __width_of<_TV> / 2;
|
||||
constexpr auto [...__is] = _IotaArray<__n>;
|
||||
return __builtin_shufflevector(__v, __v, __is...);
|
||||
}
|
||||
|
||||
template <__vec_builtin _TV>
|
||||
[[__gnu__::__always_inline__]]
|
||||
constexpr __half_vec_builtin_t<_TV>
|
||||
__vec_split_hi(_TV __v)
|
||||
{
|
||||
constexpr int __n = __width_of<_TV> / 2;
|
||||
constexpr auto [...__is] = _IotaArray<__n>;
|
||||
return __builtin_shufflevector(__v, __v, (__n + __is)...);
|
||||
}
|
||||
  /** @internal
   * Return @p __x zero-padded to @p _Bytes bytes.
   *
   * Use this function when you need two objects of the same size (e.g. for __vec_concat).
   */
  template <size_t _Bytes, __vec_builtin _TV>
    [[__gnu__::__always_inline__]]
    constexpr auto
    __vec_zero_pad_to(_TV __x)
    {
      if constexpr (sizeof(_TV) == _Bytes)
        return __x;
      else if constexpr (sizeof(_TV) <= sizeof(0ull))
        {
          using _Up = _UInt<sizeof(_TV)>;
          __vec_builtin_type_bytes<_Up, _Bytes> __tmp = {__builtin_bit_cast(_Up, __x)};
          return __builtin_bit_cast(__vec_builtin_type_bytes<__vec_value_type<_TV>, _Bytes>, __tmp);
        }
      else if constexpr (sizeof(_TV) < _Bytes)
        return __vec_zero_pad_to<_Bytes>(__vec_concat(__x, _TV()));
      else
        static_assert(false);
    }

  /** @internal
   * Return @p __x zero-padded to a vector type with sizeof 16. The input must be smaller than
   * 16 bytes.
   *
   * Use this function instead of the above when you need to pad an argument for a SIMD builtin.
   */
  template <__vec_builtin _TV>
    [[__gnu__::__always_inline__]]
    constexpr auto
    __vec_zero_pad_to_16(_TV __x)
    {
      static_assert(sizeof(_TV) < 16);
      return __vec_zero_pad_to<16>(__x);
    }

  // work around __builtin_constant_p returning false unless passed a variable
  // (__builtin_constant_p(x[0]) is false while __is_const_known(x[0]) is true)
  template <typename _Tp>
    [[__gnu__::__always_inline__]]
    constexpr bool
    __is_const_known(const _Tp& __x)
    {
      return __builtin_constant_p(__x);
    }

  [[__gnu__::__always_inline__]]
  constexpr bool
  __is_const_known(const auto&... __xs) requires(sizeof...(__xs) >= 2)
  {
    if consteval
      {
        return true;
      }
    else
      {
        return (__is_const_known(__xs) && ...);
      }
  }

  [[__gnu__::__always_inline__]]
  constexpr bool
  __is_const_known_equal_to(const auto& __x, const auto& __expect)
  { return __is_const_known(__x == __expect) && __x == __expect; }

#if _GLIBCXX_X86
  template <__vec_builtin _UV, __vec_builtin _TV>
    inline _UV
    __x86_cvt_f16c(_TV __v);
#endif


  /** @internal
   * Simple wrapper around __builtin_convertvector to provide static_cast-like syntax.
   *
   * Works around GCC failing to use the F16C/AVX512F cvtps2ph/cvtph2ps instructions.
   */
  template <__vec_builtin _UV, __vec_builtin _TV, _ArchTraits _Traits = {}>
    [[__gnu__::__always_inline__]]
    constexpr _UV
    __vec_cast(_TV __v)
    {
      static_assert(__width_of<_UV> == __width_of<_TV>);
#if _GLIBCXX_X86
      using _Up = __vec_value_type<_UV>;
      using _Tp = __vec_value_type<_TV>;
      constexpr bool __to_f16 = is_same_v<_Up, _Float16>;
      constexpr bool __from_f16 = is_same_v<_Tp, _Float16>;
      constexpr bool __needs_f16c = _Traits._M_have_f16c() && !_Traits._M_have_avx512fp16()
                                      && (__to_f16 || __from_f16);
      if (__needs_f16c && !__is_const_known(__v))
        { // Work around PR121688
          if constexpr (__needs_f16c)
            return __x86_cvt_f16c<_UV>(__v);
        }
      if constexpr (is_floating_point_v<_Tp> && is_integral_v<_Up>
                      && sizeof(_UV) < sizeof(_TV) && sizeof(_Up) < sizeof(int))
        {
          using _Ip = __integer_from<std::min(sizeof(int), sizeof(_Tp))>;
          using _IV = __vec_builtin_type<_Ip, __width_of<_TV>>;
          return __vec_cast<_UV>(__vec_cast<_IV>(__v));
        }
#endif
      return __builtin_convertvector(__v, _UV);
    }
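Note on `__vec_cast` above: the core of the function is `__builtin_convertvector`, plus a detour through an intermediate integer vector when converting floating-point elements to integers narrower than `int`. The sketch below shows both steps with plain GNU vector types. It is an illustration only; the aliases and function names are hypothetical, and it assumes GCC 9 or later (or Clang).

```cpp
#include <cassert>

// GNU vector extension types; these aliases are illustrative, not from the patch.
typedef float       f32x4 __attribute__((vector_size(16)));
typedef int         i32x4 __attribute__((vector_size(16)));
typedef signed char i8x4  __attribute__((vector_size(4)));

// Element-wise float -> int conversion; truncates toward zero like static_cast<int>.
inline i32x4 to_int(f32x4 v)
{ return __builtin_convertvector(v, i32x4); }

// float -> signed char in two steps, going through an int vector of the same width.
inline i8x4 to_schar(f32x4 v)
{ return __builtin_convertvector(to_int(v), i8x4); }

// Both routes must agree for values that fit into signed char.
inline void check_conversions(f32x4 v)
{
  const i32x4 i = to_int(v);
  const i8x4 c = to_schar(v);
  for (int k = 0; k < 4; ++k)
    assert(int(c[k]) == i[k]);
}
```

Source and destination must have the same number of elements, which mirrors the `static_assert` on `__width_of` in the library function.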
  /** @internal
   * Overload of the above cast function that determines the destination vector type from a given
   * element type @p _Up and the `__width_of` the argument type.
   *
   * Calls the above overload.
   */
  template <__vectorizable _Up, __vec_builtin _TV>
    [[__gnu__::__always_inline__]]
    constexpr __vec_builtin_type<_Up, __width_of<_TV>>
    __vec_cast(_TV __v)
    { return __vec_cast<__vec_builtin_type<_Up, __width_of<_TV>>>(__v); }

  /** @internal
   * As above, but with additional precondition on possible values of the argument.
   *
   * Precondition: __k[i] is either 0 or -1 for all i.
   */
  template <__vec_builtin _UV, __vec_builtin _TV>
    [[__gnu__::__always_inline__]]
    constexpr _UV
    __vec_mask_cast(_TV __k)
    {
      static_assert(signed_integral<__vec_value_type<_UV>>);
      static_assert(signed_integral<__vec_value_type<_TV>>);
      // TODO: __builtin_convertvector cannot be optimal because it doesn't consider input and
      // output can only be 0 or -1.
      return __builtin_convertvector(__k, _UV);
    }

  template <__vec_builtin _TV>
    [[__gnu__::__always_inline__]]
    constexpr _TV
    __vec_xor(_TV __a, _TV __b)
    {
      using _Tp = __vec_value_type<_TV>;
      if constexpr (is_floating_point_v<_Tp>)
        {
          using _UV = __vec_builtin_type<__integer_from<sizeof(_Tp)>, __width_of<_TV>>;
          return __builtin_bit_cast(
                   _TV, __builtin_bit_cast(_UV, __a) ^ __builtin_bit_cast(_UV, __b));
        }
      else
        return __a ^ __b;
    }

  template <__vec_builtin _TV>
    [[__gnu__::__always_inline__]]
    constexpr _TV
    __vec_or(_TV __a, _TV __b)
    {
      using _Tp = __vec_value_type<_TV>;
      if constexpr (is_floating_point_v<_Tp>)
        {
          using _UV = __vec_builtin_type<__integer_from<sizeof(_Tp)>, __width_of<_TV>>;
          return __builtin_bit_cast(
                   _TV, __builtin_bit_cast(_UV, __a) | __builtin_bit_cast(_UV, __b));
        }
      else
        return __a | __b;
    }

  template <__vec_builtin _TV>
    [[__gnu__::__always_inline__]]
    constexpr _TV
    __vec_and(_TV __a, _TV __b)
    {
      using _Tp = __vec_value_type<_TV>;
      if constexpr (is_floating_point_v<_Tp>)
        {
          using _UV = __vec_builtin_type<__integer_from<sizeof(_Tp)>, __width_of<_TV>>;
          return __builtin_bit_cast(
                   _TV, __builtin_bit_cast(_UV, __a) & __builtin_bit_cast(_UV, __b));
        }
      else
        return __a & __b;
    }

  /** @internal
   * Returns the bit-wise and of not @p __a and @p __b.
   *
   * Use __vec_and(__vec_not(__a), __b) unless an andnot instruction is necessary for optimization.
   *
   * @see __vec_andnot in simd_x86.h
   */
  template <__vec_builtin _TV>
    [[__gnu__::__always_inline__]]
    constexpr _TV
    __vec_andnot(_TV __a, _TV __b)
    {
      using _Tp = __vec_value_type<_TV>;
      using _UV = __vec_builtin_type<__integer_from<sizeof(_Tp)>, __width_of<_TV>>;
      return __builtin_bit_cast(
               _TV, ~__builtin_bit_cast(_UV, __a) & __builtin_bit_cast(_UV, __b));
    }

  template <__vec_builtin _TV>
    [[__gnu__::__always_inline__]]
    constexpr _TV
    __vec_not(_TV __a)
    {
      using _Tp = __vec_value_type<_TV>;
      using _UV = __vec_builtin_type_bytes<__integer_from<sizeof(_Tp)>, sizeof(_TV)>;
      if constexpr (is_floating_point_v<__vec_value_type<_TV>>)
        return __builtin_bit_cast(_TV, ~__builtin_bit_cast(_UV, __a));
      else
        return ~__a;
    }

  /**
   * An object of given type where only the sign bits are 1.
   */
  template <__vec_builtin _V>
    requires std::floating_point<__vec_value_type<_V>>
    constexpr _V _S_signmask = __vec_xor(_V() + 1, _V() - 1);
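Note on the bitwise helpers and `_S_signmask` above: floating-point vectors have no bitwise operators, so the helpers reinterpret the bits as an integer vector, operate, and reinterpret back. XOR of the bit patterns of `1.0f` and `-1.0f` leaves only the sign bit set, which is how the sign mask is formed. The sketch below replays that idea with plain GNU vector types. Everything named here (`f32x4`, `u32x4`, `vec_xor`, `vec_andnot`, `signmask`, `vec_abs`) is a stand-in for illustration, and `vec_abs` is an example use that is not part of the code shown above.

```cpp
#include <cassert>

// GNU vector extension types; these aliases are illustrative, not from the patch.
typedef float    f32x4 __attribute__((vector_size(16)));
typedef unsigned u32x4 __attribute__((vector_size(16)));

// Bitwise XOR on float vectors: view as integers, XOR, view as floats again.
inline f32x4 vec_xor(f32x4 a, f32x4 b)
{ return __builtin_bit_cast(f32x4, __builtin_bit_cast(u32x4, a) ^ __builtin_bit_cast(u32x4, b)); }

// Bitwise (~a & b) on float vectors.
inline f32x4 vec_andnot(f32x4 a, f32x4 b)
{ return __builtin_bit_cast(f32x4, ~__builtin_bit_cast(u32x4, a) & __builtin_bit_cast(u32x4, b)); }

// bits(+1.0f) ^ bits(-1.0f) differ only in the sign bit.
inline f32x4 signmask()
{ return vec_xor(f32x4() + 1.0f, f32x4() - 1.0f); }

// Example use (an assumption, not from the patch): clear the sign bit of every element.
inline f32x4 vec_abs(f32x4 x)
{ return vec_andnot(signmask(), x); }

// Every element of the mask must be exactly the IEEE binary32 sign bit.
inline void check_signmask()
{
  const u32x4 bits = __builtin_bit_cast(u32x4, signmask());
  for (int i = 0; i < 4; ++i)
    assert(bits[i] == 0x80000000u);
}
```

The check assumes `float` is IEEE binary32, as it is on x86.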
  template <__vec_builtin _TV, int _Np = __width_of<_TV>,
            typename = make_integer_sequence<int, _Np>>
    struct _VecOps;

  template <__vec_builtin _TV, int _Np, int... _Is>
    struct _VecOps<_TV, _Np, integer_sequence<int, _Is...>>
    {
      static_assert(_Np <= __width_of<_TV>);

      using _Tp = __vec_value_type<_TV>;

      using _HV = __half_vec_builtin_t<__conditional_t<_Np >= 2, _TV, __double_vec_builtin_t<_TV>>>;

      [[__gnu__::__always_inline__]]
      static constexpr _TV
      _S_broadcast_to_even(_Tp __init)
      { return _TV {((_Is & 1) == 0 ? __init : _Tp())...}; }

      [[__gnu__::__always_inline__]]
      static constexpr _TV
      _S_broadcast_to_odd(_Tp __init)
      { return _TV {((_Is & 1) == 1 ? __init : _Tp())...}; }

      [[__gnu__::__always_inline__]]
      static constexpr bool
      _S_all_of(_TV __k) noexcept
      { return (... && (__k[_Is] != 0)); }

      [[__gnu__::__always_inline__]]
      static constexpr bool
      _S_any_of(_TV __k) noexcept
      { return (... || (__k[_Is] != 0)); }

      [[__gnu__::__always_inline__]]
      static constexpr bool
      _S_none_of(_TV __k) noexcept
      { return (... && (__k[_Is] == 0)); }

      template <typename _Offset = integral_constant<int, 0>>
        [[__gnu__::__always_inline__]]
        static constexpr _TV
        _S_extract(__vec_builtin auto __x, _Offset = {})
        {
          static_assert(is_same_v<__vec_value_type<_TV>, __vec_value_type<decltype(__x)>>);
          return __builtin_shufflevector(__x, decltype(__x)(), (_Is + _Offset::value)...);
        }

      // swap neighboring elements
      [[__gnu__::__always_inline__]]
      static constexpr _TV
      _S_swap_neighbors(_TV __x)
      { return __builtin_shufflevector(__x, __x, (_Is ^ 1)...); }

      // duplicate even indexed elements, dropping the odd ones
      [[__gnu__::__always_inline__]]
      static constexpr _TV
      _S_dup_even(_TV __x)
      { return __builtin_shufflevector(__x, __x, (_Is & ~1)...); }

      // duplicate odd indexed elements, dropping the even ones
      [[__gnu__::__always_inline__]]
      static constexpr _TV
      _S_dup_odd(_TV __x)
      { return __builtin_shufflevector(__x, __x, (_Is | 1)...); }

      [[__gnu__::__always_inline__]]
      static constexpr void
      _S_overwrite_even_elements(_TV& __x, _HV __y) requires (_Np > 1)
      {
        constexpr __simd_size_type __n = __width_of<_TV>;
        __x = __builtin_shufflevector(__x,
#ifdef _GLIBCXX_CLANG
                                      __vec_concat(__y, __y),
#else
                                      __y,
#endif
                                      ((_Is & 1) == 0 ? __n + _Is / 2 : _Is)...);
      }

      [[__gnu__::__always_inline__]]
      static constexpr void
      _S_overwrite_even_elements(_TV& __xl, _TV& __xh, _TV __y)
      {
        constexpr __simd_size_type __nl = __width_of<_TV>;
        constexpr __simd_size_type __nh = __nl * 3 / 2;
        __xl = __builtin_shufflevector(__xl, __y, ((_Is & 1) == 0 ? __nl + _Is / 2 : _Is)...);
        __xh = __builtin_shufflevector(__xh, __y, ((_Is & 1) == 0 ? __nh + _Is / 2 : _Is)...);
      }

      [[__gnu__::__always_inline__]]
      static constexpr void
      _S_overwrite_odd_elements(_TV& __x, _HV __y) requires (_Np > 1)
      {
        constexpr __simd_size_type __n = __width_of<_TV>;
        __x = __builtin_shufflevector(__x,
#ifdef _GLIBCXX_CLANG
                                      __vec_concat(__y, __y),
#else
                                      __y,
#endif
                                      ((_Is & 1) == 1 ? __n + _Is / 2 : _Is)...);
      }

      [[__gnu__::__always_inline__]]
      static constexpr void
      _S_overwrite_odd_elements(_TV& __xl, _TV& __xh, _TV __y)
      {
        constexpr __simd_size_type __nl = __width_of<_TV>;
        constexpr __simd_size_type __nh = __nl * 3 / 2;
        __xl = __builtin_shufflevector(__xl, __y, ((_Is & 1) == 1 ? __nl + _Is / 2 : _Is)...);
        __xh = __builtin_shufflevector(__xh, __y, ((_Is & 1) == 1 ? __nh + _Is / 2 : _Is)...);
      }

      // true if all elements are known to be equal to __ref at compile time
      [[__gnu__::__always_inline__]]
      static constexpr bool
      _S_is_const_known_equal_to(_TV __x, _Tp __ref)
      { return (__is_const_known_equal_to(__x[_Is], __ref) && ...); }

    };
} // namespace simd
_GLIBCXX_END_NAMESPACE_VERSION
} // namespace std

#pragma GCC diagnostic pop
#endif // C++26
#endif // _GLIBCXX_VEC_OPS_H
@@ -2333,6 +2333,19 @@ ftms = {
  };
};

ftms = {
  name = simd;
  values = {
    no_stdname = true; // TODO: change once complete
    v = 202506;
    cxxmin = 26;
    extra_cond = "__cpp_structured_bindings >= 202411L "
    "&& __cpp_expansion_statements >= 202411L "
    "&& (__x86_64__ || __i386__)"; // TODO: lift initial restriction to x86
    hosted = yes;
  };
};

// Standard test specifications.
stds[97] = ">= 199711L";
stds[03] = ">= 199711L";
@@ -2616,4 +2616,13 @@
#endif /* !defined(__cpp_lib_contracts) */
#undef __glibcxx_want_contracts

#if !defined(__cpp_lib_simd)
# if (__cplusplus > 202302L) && _GLIBCXX_HOSTED && (__cpp_structured_bindings >= 202411L && __cpp_expansion_statements >= 202411L && (__x86_64__ || __i386__))
# define __glibcxx_simd 202506L
# if defined(__glibcxx_want_all) || defined(__glibcxx_want_simd)
# endif
# endif
#endif /* !defined(__cpp_lib_simd) */
#undef __glibcxx_want_simd

#undef __glibcxx_want_all
48
libstdc++-v3/include/std/simd
Normal file
@@ -0,0 +1,48 @@
// <simd> -*- C++ -*-

// Copyright The GNU Toolchain Authors.
//
// This file is part of the GNU ISO C++ Library. This library is free
// software; you can redistribute it and/or modify it under the
// terms of the GNU General Public License as published by the
// Free Software Foundation; either version 3, or (at your option)
// any later version.

// This library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.

// Under Section 7 of GPL version 3, you are granted additional
// permissions described in the GCC Runtime Library Exception, version
// 3.1, as published by the Free Software Foundation.

// You should have received a copy of the GNU General Public License and
// a copy of the GCC Runtime Library Exception along with this program;
// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
// <http://www.gnu.org/licenses/>.

/** @file simd
 * This is a Standard C++ Library header.
 */

#ifndef _GLIBCXX_SIMD
#define _GLIBCXX_SIMD 1

#ifdef _GLIBCXX_SYSHDR
#pragma GCC system_header
#endif

#define __glibcxx_want_simd
#include <bits/version.h>

#ifdef __glibcxx_simd

#include "bits/simd_vec.h"
#include "bits/simd_loadstore.h"
#include "bits/simd_mask_reductions.h"
#include "bits/simd_reductions.h"
#include "bits/simd_alg.h"

#endif
#endif
329
libstdc++-v3/testsuite/std/simd/arithmetic.cc
Normal file
@@ -0,0 +1,329 @@
// { dg-do run { target c++26 } }
// { dg-require-effective-target x86 }

#include "test_setup.h"

static constexpr bool is_iec559 =
#ifdef __GCC_IEC_559
  __GCC_IEC_559 >= 2;
#elif defined __STDC_IEC_559__
  __STDC_IEC_559__ == 1;
#else
  false;
#endif

#if VIR_NEXT_PATCH
template <typename V>
  requires complex_like<typename V::value_type>
  struct Tests<V>
  {
    using T = typename V::value_type;
    using M = typename V::mask_type;
    using Real = typename T::value_type;
    using RealV = simd::rebind_t<Real, V>;

    static_assert(std::is_floating_point_v<Real>);

    static constexpr T min = std::numeric_limits<Real>::lowest();
    static constexpr T norm_min = std::numeric_limits<Real>::min();
    static constexpr T denorm_min = std::numeric_limits<Real>::denorm_min();
    static constexpr T max = std::numeric_limits<Real>::max();
    static constexpr T inf = std::numeric_limits<Real>::infinity();

    ADD_TEST(plus_minus) {
      std::tuple {V(), init_vec<V, C(1, 1), C(2, 2), C(3, 3)>},
      [](auto& t, V x, V y) {
        t.verify_equal(x + x, x);
        t.verify_equal(x - x, x);
        t.verify_equal(x + y, y);
        t.verify_equal(y + x, y);
        t.verify_equal(x - y, -y);
        t.verify_equal(y - x, y);
        t.verify_equal(x += T(1, -2), T(1, -2));
        t.verify_equal(x = x + x, T(2, -4));
        t.verify_equal(x = x - y, init_vec<V, C(1, -5), C(0, -6), C(-1, -7)>);
        t.verify_equal(x, init_vec<V, C(1, -5), C(0, -6), C(-1, -7)>);
      }
    };

    // complex multiplication & division has an edge case which is due to '-0. - -0.'. If we
    // interpret negative zero to represent a value between denorm_min and 0 (exclusive) then we
    // cannot know whether the resulting zero is negative or positive. ISO 60559 simply defines the
    // result to be positive zero, but that's throwing away half of the truth.
    //
    // Consider (https://compiler-explorer.com/z/61cYhrE48):
    // sqrt(x * complex{1.}) -> {0, +/-1}.
    // The sign of the imaginary part depends on whether x is double{-1} or complex{-1.}. This is
    // due to the type of the operand influencing the formula used for multiplication:
    //
    // 1. 'x * (u+iv)' is implemented as 'xu + i(xv)'
    //
    // 2. '(x+iy) * (u+iv)' is implemented as '(xu-yv) + i(xv+yu)'
    //
    // 'xv' is equal to -0 and 'yu' is equal to +0. Consequently the imaginary part in (1.) is -0
    // and in (2.) it is (-0 + 0) which is +0. The example above then uses that difference to hit
    // the branch cut on sqrt.

    // (x+iy)(u+iv) = (xu-yv)+i(xv+yu)
    // depending on FMA contraction or FLT_EVAL_METHOD 'inf - inf' can be 0, inf, -inf, or NaN (no
    // contraction).
    //
    // Because of all these issues, verify_equal is implemented to interpret "an infinity" as equal
    // to another infinity according to the interpretation of C23 Annex G.3.

    ADD_TEST(multiplication_corner_cases) {
      std::array {min, norm_min, denorm_min, max, inf},
      [](auto& t, V x) {
        t.verify_equal(x * x, x[0] * x[0]);
        const V y = x * T(1, 1);
        t.verify_equal(y * y, y[0] * y[0])(y);
        x *= T(0, 1);
        t.verify_equal(x * x, x[0] * x[0]);
        x *= T(1, 1);
        t.verify_equal(x * x, x[0] * x[0])(x);
        x *= T(1, Real(.5));
        t.verify_equal(x * x, x[0] * x[0])(x);
      }
    };

    ADD_TEST(multiplication) {
      std::tuple {V(), V(RealV(1), RealV()), V(RealV(), RealV(1)), init_vec<V, C(0, 2), C(2, 0), C(-1, 2)>},
      [](auto& t, V x, V one, V I, V z) {
        t.verify_equal(x * x, x);
        t.verify_equal(x * z, x);
        t.verify_equal(z * x, x);
        t.verify_equal(one * one, one);
        t.verify_equal(one * z, z);
        t.verify_equal(z * one, z);

        // Notes:
        // inf + -inf -> NaN
        // 0. + -0. -> 0. (this is arbitrary, why not NaN: indeterminable sign?)
        // complex(0.) * -complex(2., 2.) -> (0, -0)
        // 0. * -complex(2., 2.) -> (-0, -0)
        // => the *type* of the operand determines the sign of the zero, which is *impossible*
        // to implement with vec<complex>!
        // complex(DBL_MAX, DBL_MAX) * complex(2., 2.) -> (-nan, inf) => θ got lost
        // complex(1.) / complex(0., 0.) -> (inf, -nan) => θ got lost
        // complex(1.) / complex(-0., 0.) -> (inf, -nan) => θ got lost
        // complex(1.) / complex(0., -0.) -> (inf, -nan) => θ got lost
        // complex(1.) / complex(-DBL_INF, 0.) -> (-0, -0) => θ is wrong

        t.verify_bit_equal(one * I, I);

        // (0+i0) * (-0-i0) -> (-0 + 0) + i(-0 + -0) -> 0-i0
        t.verify_bit_equal(x * -x, T() * -T());
        t.verify_bit_equal(-x * x, -T() * T());

        t.verify_bit_equal(x * conj(x), T() * conj(T()));
        t.verify_bit_equal(x * -conj(x), T() * -conj(T()));

        // real * complex has extra overloads on complex but not on vec<complex>
        // for vec<complex> the result therefore needs to be "bit equal" only to
        // complex * complex
        t.verify_equal(x.real() * -x, T().real() * -T());
        t.verify_bit_equal(x.real() * -x, T() * -T());

        t.verify_bit_equal(I * one, I);
        t.verify_bit_equal(I * I, T(-1, 0));
        t.verify_bit_equal(z * I, init_vec<V, C(-2, 0), C(0., 2.), C(-2, -1)>);
        t.verify_bit_equal(std::complex{-0., 0.} * std::complex{0., 1.}, std::complex{-0., 0.});
        t.verify_bit_equal(std::complex{-0., -1.} * std::complex{0., 0.}, std::complex{0., -0.});
        t.verify_bit_equal(0. + -0., 0.);
      }
    };
  };
#endif

template <typename V>
  struct Tests
  {
    using T = typename V::value_type;
    using M = typename V::mask_type;

    static constexpr T min = std::numeric_limits<T>::lowest();
    static constexpr T norm_min = std::numeric_limits<T>::min();
    static constexpr T max = std::numeric_limits<T>::max();

    ADD_TEST(plus0, requires(T x) { x + x; }) {
      std::tuple{V(), init_vec<V, 1, 2, 3, 4, 5, 6, 7>},
      [](auto& t, V x, V y) {
        t.verify_equal(x + x, x);
        t.verify_equal(x = x + T(1), T(1));
        t.verify_equal(x + x, T(2));
        t.verify_equal(x = x + y, init_vec<V, 2, 3, 4, 5, 6, 7, 8>);
        t.verify_equal(x = x + -y, T(1));
        t.verify_equal(x += y, init_vec<V, 2, 3, 4, 5, 6, 7, 8>);
        t.verify_equal(x, init_vec<V, 2, 3, 4, 5, 6, 7, 8>);
        t.verify_equal(x += -y, T(1));
        t.verify_equal(x, T(1));
      }
    };

    ADD_TEST(plus1, requires(T x) { x + x; }) {
      std::tuple{test_iota<V>},
      [](auto& t, V x) {
        t.verify_equal(x + std::cw<0>, x);
        t.verify_equal(std::cw<0> + x, x);
        t.verify_equal(x + T(), x);
        t.verify_equal(T() + x, x);
        t.verify_equal(x + -x, V());
        t.verify_equal(-x + x, V());
      }
    };

    ADD_TEST(minus0, requires(T x) { x - x; }) {
      std::tuple{T(1), T(0), init_vec<V, 1, 2, 3, 4, 5, 6, 7>},
      [](auto& t, V x, V y, V z) {
        t.verify_equal(x - y, x);
        t.verify_equal(x - T(1), y);
        t.verify_equal(y, x - T(1));
        t.verify_equal(x - x, y);
        t.verify_equal(x = z - x, init_vec<V, 0, 1, 2, 3, 4, 5, 6>);
        t.verify_equal(x = z - x, V(1));
        t.verify_equal(z -= x, init_vec<V, 0, 1, 2, 3, 4, 5, 6>);
        t.verify_equal(z, init_vec<V, 0, 1, 2, 3, 4, 5, 6>);
        t.verify_equal(z -= z, V(0));
        t.verify_equal(z, V(0));
      }
    };

    ADD_TEST(minus1, requires(T x) { x - x; }) {
      std::tuple{test_iota<V>},
      [](auto& t, V x) {
        t.verify_equal(x - x, V());
        t.verify_equal(x - std::cw<0>, x);
        t.verify_equal(std::cw<0> - x, -x);
        t.verify_equal(x - T(), x);
        t.verify_equal(T() - x, -x);
      }
    };

    ADD_TEST(times0, requires(T x) { x * x; }) {
      std::tuple{T(0), T(1), T(2)},
      [](auto& t, T v0, T v1, T v2) {
        V x = v1;
        V y = v0;
        t.verify_equal(x * y, y);
        t.verify_equal(x = x * T(2), T(2));
        t.verify_equal(x * x, T(4));
        y = init_vec<V, 1, 2, 3, 4, 5, 6, 7>;
        t.verify_equal(x = x * y, init_vec<V, 2, 4, 6, 8, 10, 12, 14>);
        y = v2;
        // don't test norm_min/2*2 in the following. There's no guarantee, in
        // general, that the result isn't flushed to zero (e.g. NEON without
        // subnormals)
        for (T n : {T(max - T(1)), std::is_floating_point_v<T> ? T(norm_min * T(3)) : min})
          {
            x = T(n / 2);
            t.verify_equal(x * y, V(n));
          }
        if (std::is_integral<T>::value && std::is_unsigned<T>::value)
          {
            // test modulo arithmetic
            T n = max;
            x = n;
            for (T m : {T(2), T(7), T(max / 127), max})
              {
                y = m;
                // if T is of lower rank than int, `n * m` will promote to int
                // before executing the multiplication. In this case an overflow
                // will be UB (and ubsan will warn about it). The solution is to
                // cast to uint in that case.
                using U
                  = std::conditional_t<(sizeof(T) < sizeof(int)), unsigned, T>;
                t.verify_equal(x * y, V(T(U(n) * U(m))));
              }
          }
        x = v2;
        t.verify_equal(x *= init_vec<V, 1, 2, 3>, init_vec<V, 2, 4, 6>);
        t.verify_equal(x, init_vec<V, 2, 4, 6>);
      }
    };
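Note on the modulo-arithmetic reference value in `times0`: for unsigned types narrower than `int`, the operands of `n * m` are promoted to (signed) `int`, so a large product overflows and is undefined behavior. Multiplying in `unsigned` and truncating afterwards gives the intended wrap-around result. The sketch below isolates that step as a scalar helper; the name `wrapping_mul` is hypothetical and not part of the test suite.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Multiply two unsigned values with wrap-around (modulo 2^N) semantics and
// without relying on a signed-int multiplication that could overflow.
template <typename T>
  constexpr T wrapping_mul(T n, T m)
  {
    static_assert(std::is_unsigned_v<T>);
    // Types narrower than int would promote to int; force unsigned instead.
    using U = std::conditional_t<(sizeof(T) < sizeof(int)), unsigned, T>;
    return T(U(n) * U(m));
  }

// 65535 * 65535 = 0xFFFE0001; the low 16 bits are 0x0001.
static_assert(wrapping_mul<std::uint16_t>(65535, 65535) == 1);

// The same check at run time.
inline void check_wrapping_mul()
{ assert(wrapping_mul<std::uint16_t>(65535, 65535) == 1); }
```

Because the helper is `constexpr`, an accidental signed overflow would be rejected at compile time in the `static_assert`, which makes it a cheap way to confirm that the cast is doing its job.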
    ADD_TEST(times1, requires(T x) { x * x; }) {
      std::tuple{test_iota<V, 0, 11>},
      [](auto& t, V x) {
        t.verify_equal(x * x, V([](int i) { return T(T(i % 12) * T(i % 12)); }));
        t.verify_equal(x * std::cw<1>, x);
        t.verify_equal(std::cw<1> * x, x);
        t.verify_equal(x * T(1), x);
        t.verify_equal(T(1) * x, x);
        t.verify_equal(x * T(-1), -x);
        t.verify_equal(T(-1) * x, -x);
      }
    };

    // avoid testing subnormals and expect minor deltas for non-IEC559 float
    ADD_TEST(divide0, std::is_floating_point_v<T> && !is_iec559) {
      std::tuple{T(2), init_vec<V, 1, 2, 3, 4, 5, 6, 7>},
      [](auto& t, V x, V y) {
        t.verify_equal_to_ulp(x / x, V(T(1)), 1);
        t.verify_equal_to_ulp(T(3) / x, V(T(3) / T(2)), 1);
        t.verify_equal_to_ulp(x / T(3), V(T(2) / T(3)), 1);
        t.verify_equal_to_ulp(y / x, init_vec<V, .5, 1, 1.5, 2, 2.5, 3, 3.5>, 1);
      }
    };

    // avoid testing subnormals and expect minor deltas for non-IEC559 float
    ADD_TEST(divide1, std::is_floating_point_v<T> && !is_iec559) {
      std::array{T{norm_min * 1024}, T{1}, T{}, T{-1}, T{max / 1024}, T{max / T(4.1)}, max, min},
      [](auto& t, V a) {
        V b = std::cw<2>;
        V ref([&](int i) { return a[i] / 2; });
        t.verify_equal_to_ulp(a / b, ref, 1);
        a = select(a == std::cw<0>, T(1), a);
        // -freciprocal-math together with flush-to-zero makes
        // the following range restriction necessary (i.e.
        // 1/|a| must be >= min). Intel vrcpps and vrcp14ps
        // need some extra slack (use 1.1 instead of 1).
        a = select(fabs(a) >= T(1.1) / norm_min, T(1), a);
        t.verify_equal_to_ulp(a / a, V(1), 1)("\na = ", a);
        ref = V([&](int i) { return 2 / a[i]; });
        t.verify_equal_to_ulp(b / a, ref, 1)("\na = ", a);
        t.verify_equal_to_ulp(b /= a, ref, 1);
        t.verify_equal_to_ulp(b, ref, 1);
      }
    };

    ADD_TEST(divide2, (is_iec559 || !std::is_floating_point_v<T>) && requires(T x) { x / x; }) {
      std::tuple{T(2), init_vec<V, 1, 2, 3, 4, 5, 6, 7>, init_vec<V, T(max), T(norm_min)>,
                 init_vec<V, T(norm_min), T(max)>, init_vec<V, T(max), T(norm_min) + 1>},
      [](auto& t, V x, V y, V z, V a, V b) {
        t.verify_equal(x / x, V(1));
        t.verify_equal(T(3) / x, V(T(3) / T(2)));
        t.verify_equal(x / T(3), V(T(2) / T(3)));
        t.verify_equal(y / x, init_vec<V, .5, 1, 1.5, 2, 2.5, 3, 3.5>);
        V ref = init_vec<V, T(max / 2), T(norm_min / 2)>;
        t.verify_equal(z / x, ref);
        ref = init_vec<V, T(norm_min / 2), T(max / 2)>;
        t.verify_equal(a / x, ref);
        t.verify_equal(b / b, V(1));
        ref = init_vec<V, T(2 / max), T(2 / (norm_min + 1))>;
        t.verify_equal(x / b, ref);
        t.verify_equal(x /= b, ref);
        t.verify_equal(x, ref);
      }
    };

    static constexpr V from0 = test_iota<V, 0, 63>;
    static constexpr V from1 = test_iota<V, 1, 64>;
    static constexpr V from2 = test_iota<V, 2, 65>;

    ADD_TEST(incdec, requires(T x) { ++x; x++; --x; x--; }) {
      std::tuple{from0},
      [](auto& t, V x) {
        t.verify_equal(x++, from0);
        t.verify_equal(x, from1);
        t.verify_equal(++x, from2);
        t.verify_equal(x, from2);

        t.verify_equal(x--, from2);
        t.verify_equal(x, from1);
        t.verify_equal(--x, from0);
        t.verify_equal(x, from0);
      }
    };
  };

#include "create_tests.h"
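Note on the signed-zero comment in the complex `Tests` specialization of arithmetic.cc: the difference between formula (1.) and formula (2.) can be observed with scalar `std::complex<double>`. The sketch below is an illustration under stated assumptions: IEEE arithmetic with round-to-nearest, no `-ffast-math`, and a standard library whose `real * complex` overload scales both parts, as the comment describes. The function names are made up for this note.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Formula (1.): real * complex scales each part, so the imaginary part is
// x*v = -1 * 0 = -0.
inline double imag_of_real_times_complex()
{ return (-1.0 * std::complex<double>(1.0)).imag(); }

// Formula (2.): complex * complex computes xv + yu = (-1 * 0) + (0 * 1)
// = -0 + 0, which rounds to +0.
inline double imag_of_complex_times_complex()
{ return (std::complex<double>(-1.0) * std::complex<double>(1.0)).imag(); }

// Both imaginary parts compare equal to zero; only the sign bit differs.
inline void check_zero_signs()
{
  assert(imag_of_real_times_complex() == 0.0);
  assert(imag_of_complex_times_complex() == 0.0);
  assert(std::signbit(imag_of_real_times_complex()));
  assert(!std::signbit(imag_of_complex_times_complex()));
}
```

Since `==` treats -0 and +0 as equal, only a bit-level comparison can tell the two results apart, which matches the split between `verify_equal` and `verify_bit_equal` in the tests.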
7
libstdc++-v3/testsuite/std/simd/arithmetic_expensive.cc
Normal file
@@ -0,0 +1,7 @@
// { dg-do run { target c++26 } }
// { dg-require-effective-target x86 }
// { dg-timeout-factor 2 }
// { dg-require-effective-target run_expensive_tests }

#define EXPENSIVE_TESTS 1
#include "arithmetic.cc" // { dg-prune-output "Wpsabi" }
15
libstdc++-v3/testsuite/std/simd/create_tests.h
Normal file
@@ -0,0 +1,15 @@
#include <stdfloat>

void create_tests()
{
  template for (auto t : {char(), short(), unsigned(), 0l, 0ull, float(), double()})
    {
      using T = decltype(t);
#ifndef EXPENSIVE_TESTS
      [[maybe_unused]] Tests<simd::vec<T>> test;
#else
      [[maybe_unused]] Tests<simd::vec<T, simd::vec<T>::size() + 3>> test0;
      [[maybe_unused]] Tests<simd::vec<T, 1>> test1;
#endif
    }
}
69
libstdc++-v3/testsuite/std/simd/creation.cc
Normal file
@@ -0,0 +1,69 @@
// { dg-do run { target c++26 } }
// { dg-require-effective-target x86 }

#include "test_setup.h"

template <typename V>
  struct Tests
  {
    using T = typename V::value_type;
    using M = typename V::mask_type;

    ADD_TEST(VecCatChunk) {
      std::tuple{test_iota<V>, test_iota<V, 1>},
      [](auto& t, const V v0, const V v1) {
        auto c = cat(v0, v1);
        t.verify_equal(c.size(), V::size() * 2);
        for (int i = 0; i < V::size(); ++i)
          t.verify_equal(c[i], v0[i])(i);
        for (int i = 0; i < V::size(); ++i)
          t.verify_equal(c[i + V::size()], v1[i])(i);
        const auto [c0, c1] = simd::chunk<V>(c);
        t.verify_equal(c0, v0);
        t.verify_equal(c1, v1);
        if constexpr (V::size() <= 35)
          {
            auto d = cat(v1, c, v0);
            for (int i = 0; i < V::size(); ++i)
              {
                t.verify_equal(d[i], v1[i])(i);
                t.verify_equal(d[i + V::size()], v0[i])(i);
                t.verify_equal(d[i + 2 * V::size()], v1[i])(i);
                t.verify_equal(d[i + 3 * V::size()], v0[i])(i);
              }
            const auto [...chunked] = simd::chunk<3>(d);
            t.verify_equal(cat(chunked...), d);
          }
      }
    };

    ADD_TEST(MaskCatChunk) {
      std::tuple{M([](int i) { return 1 == (i & 1); }), M([](int i) { return 1 == (i % 3); })},
      [](auto& t, const M k0, const M k1) {
        auto c = cat(k0, k1);
        t.verify_equal(c.size(), V::size() * 2);
        for (int i = 0; i < V::size(); ++i)
          t.verify_equal(c[i], k0[i])(i);
        for (int i = 0; i < V::size(); ++i)
          t.verify_equal(c[i + V::size()], k1[i])(i);
        const auto [c0, c1] = simd::chunk<M>(c);
        t.verify_equal(c0, k0);
        t.verify_equal(c1, k1);
        if constexpr (V::size() <= 35)
          {
            auto d = cat(k1, c, k0);
            for (int i = 0; i < V::size(); ++i)
              {
                t.verify_equal(d[i], k1[i])(i);
                t.verify_equal(d[i + V::size()], k0[i])(i);
                t.verify_equal(d[i + 2 * V::size()], k1[i])(i);
                t.verify_equal(d[i + 3 * V::size()], k0[i])(i);
              }
            const auto [...chunked] = simd::chunk<3>(d);
            t.verify_equal(cat(chunked...), d);
          }
      }
    };
  };

#include "create_tests.h" // { dg-prune-output "Wpsabi" }
7
libstdc++-v3/testsuite/std/simd/creation_expensive.cc
Normal file
@@ -0,0 +1,7 @@
// { dg-do run { target c++26 } }
// { dg-require-effective-target x86 }
// { dg-timeout-factor 2 }
// { dg-require-effective-target run_expensive_tests }

#define EXPENSIVE_TESTS 1
#include "creation.cc" // { dg-prune-output "Wpsabi" }
121
libstdc++-v3/testsuite/std/simd/loads.cc
Normal file
@@ -0,0 +1,121 @@
|
||||
// { dg-do run { target c++26 } }
|
||||
// { dg-require-effective-target x86 }
|
||||
|
||||
#include "test_setup.h"
|
||||
#include <numeric>
|
||||
|
||||
template <typename T, std::size_t N, std::size_t Alignment>
|
||||
class alignas(Alignment) aligned_array
|
||||
: public std::array<T, N>
|
||||
{};
|
||||
|
||||
template <typename V>
|
||||
struct Tests
|
||||
{
|
||||
using T = typename V::value_type;
|
||||
using M = typename V::mask_type;
|
||||
|
||||
static_assert(simd::alignment_v<V> <= 256);
|
||||
|
||||
ADD_TEST(load_zeros) {
|
||||
std::tuple {aligned_array<T, V::size * 2, 256> {}, aligned_array<int, V::size * 2, 256> {}},
|
||||
[](auto& t, auto mem, auto ints) {
|
||||
t.verify_equal(simd::unchecked_load<V>(mem), V());
|
||||
t.verify_equal(simd::partial_load<V>(mem), V());
|
||||
|
||||
t.verify_equal(simd::unchecked_load<V>(mem, simd::flag_aligned), V());
|
||||
t.verify_equal(simd::partial_load<V>(mem, simd::flag_aligned), V());
|
||||
|
||||
t.verify_equal(simd::unchecked_load<V>(mem, simd::flag_overaligned<256>), V());
|
||||
t.verify_equal(simd::partial_load<V>(mem, simd::flag_overaligned<256>), V());
|
||||
|
||||
t.verify_equal(simd::unchecked_load<V>(mem.begin() + 1, mem.end()), V());
|
||||
t.verify_equal(simd::partial_load<V>(mem.begin() + 1, mem.end()), V());
|
||||
t.verify_equal(simd::partial_load<V>(mem.begin() + 1, mem.begin() + 1), V());
|
||||
t.verify_equal(simd::partial_load<V>(mem.begin() + 1, mem.begin() + 2), V());
|
||||
|
||||
t.verify_equal(simd::unchecked_load<V>(ints, simd::flag_convert), V());
|
||||
t.verify_equal(simd::partial_load<V>(ints, simd::flag_convert), V());
|
||||
|
||||
t.verify_equal(simd::unchecked_load<V>(mem, M(true)), V());
|
||||
t.verify_equal(simd::unchecked_load<V>(mem, M(false)), V());
|
||||
t.verify_equal(simd::partial_load<V>(mem, M(true)), V());
|
||||
t.verify_equal(simd::partial_load<V>(mem, M(false)), V());
|
||||
}
|
||||
};
|
||||
|
||||
static constexpr V ref = test_iota<V, 1, 0>;
|
||||
static constexpr V ref1 = V([](int i) { return i == 0 ? T(1): T(); });
|
||||
|
||||
template <typename U>
|
||||
static constexpr auto
|
||||
make_iota_array()
|
||||
{
|
||||
aligned_array<U, V::size * 2, simd::alignment_v<V, U>> arr = {};
|
||||
U init = 0;
|
||||
for (auto& x : arr) x = (init += U(1));
|
||||
return arr;
|
||||
}
|
||||
|
||||
ADD_TEST(load_iotas, requires {T() + T(1);}) {
|
||||
std::tuple {make_iota_array<T>(), make_iota_array<int>()},
|
||||
[](auto& t, auto mem, auto ints) {
|
||||
t.verify_equal(simd::unchecked_load<V>(mem), ref);
|
||||
t.verify_equal(simd::partial_load<V>(mem), ref);
|
||||
|
||||
t.verify_equal(simd::unchecked_load<V>(mem.begin() + 1, mem.end()), ref + T(1));
|
||||
t.verify_equal(simd::partial_load<V>(mem.begin() + 1, mem.end()), ref + T(1));
|
||||
t.verify_equal(simd::partial_load<V>(mem.begin(), mem.begin() + 1), ref1);
|
||||
|
||||
t.verify_equal(simd::unchecked_load<V>(mem, simd::flag_aligned), ref);
|
||||
t.verify_equal(simd::partial_load<V>(mem, simd::flag_aligned), ref);
|
||||
|
||||
t.verify_equal(simd::unchecked_load<V>(ints, simd::flag_convert), ref);
|
||||
t.verify_equal(simd::partial_load<V>(ints, simd::flag_convert), ref);
|
||||
t.verify_equal(simd::partial_load<V>(
|
||||
ints.begin(), ints.begin(), simd::flag_convert), V());
|
||||
t.verify_equal(simd::partial_load<V>(
|
||||
ints.begin(), ints.begin() + 1, simd::flag_convert), ref1);
|
||||
|
||||
t.verify_equal(simd::unchecked_load<V>(mem, M(true)), ref);
|
||||
t.verify_equal(simd::unchecked_load<V>(mem, M(false)), V());
|
||||
t.verify_equal(simd::partial_load<V>(mem, M(true)), ref);
|
||||
t.verify_equal(simd::partial_load<V>(mem, M(false)), V());
|
||||
}
|
||||
};
|
||||
|
||||
static constexpr M alternating = M([](int i) { return 1 == (i & 1); });
|
||||
static constexpr V ref_k = select(alternating, ref, T());
|
||||
static constexpr V ref_2 = select(M([](int i) { return i < 2; }), ref, T());
|
||||
static constexpr V ref_k_2 = select(M([](int i) { return i < 2; }), ref_k, T());
|
||||
|
||||
ADD_TEST(masked_loads) {
|
||||
std::tuple {make_iota_array<T>(), make_iota_array<int>(), alternating, M(true), M(false)},
|
||||
[](auto& t, auto mem, auto ints, M k, M tr, M fa) {
|
||||
t.verify_equal(simd::unchecked_load<V>(mem, tr), ref);
|
||||
t.verify_equal(simd::unchecked_load<V>(mem, fa), V());
|
||||
t.verify_equal(simd::unchecked_load<V>(mem, k), ref_k);
|
||||
|
||||
t.verify_equal(simd::unchecked_load<V>(ints, tr, simd::flag_convert), ref);
|
||||
t.verify_equal(simd::unchecked_load<V>(ints, fa, simd::flag_convert), V());
|
||||
t.verify_equal(simd::unchecked_load<V>(ints, k, simd::flag_convert), ref_k);
|
||||
|
||||
t.verify_equal(simd::partial_load<V>(mem, tr), ref);
|
||||
t.verify_equal(simd::partial_load<V>(mem, fa), V());
|
||||
t.verify_equal(simd::partial_load<V>(mem, k), ref_k);
|
||||
|
||||
t.verify_equal(simd::partial_load<V>(mem.begin(), mem.begin() + 2, tr), ref_2);
|
||||
t.verify_equal(simd::partial_load<V>(mem.begin(), mem.begin() + 2, fa), V());
|
||||
t.verify_equal(simd::partial_load<V>(mem.begin(), mem.begin() + 2, k), ref_k_2);
|
||||
|
||||
t.verify_equal(simd::partial_load<V>(ints.begin(), ints.begin() + 2, tr,
|
||||
simd::flag_convert), ref_2);
|
||||
t.verify_equal(simd::partial_load<V>(ints.begin(), ints.begin() + 2, fa,
|
||||
simd::flag_convert), V());
|
||||
t.verify_equal(simd::partial_load<V>(ints.begin(), ints.begin() + 2, k,
|
||||
simd::flag_convert), ref_k_2);
|
||||
}
|
||||
};
|
||||
};
|
||||
|
||||
#include "create_tests.h"
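The expectations in loads.cc (`ref1`, `ref_2`, `ref_k_2`) suggest two rules: elements past the end of the source range come back value-initialized, and so do elements whose mask bit is false. The following scalar sketch models that reading of the tests. It is an assumption drawn from the expected values above, not the library's implementation, and `partial_load_model` is a hypothetical name.

```cpp
#include <array>
#include <cstddef>

// Model of a masked partial load into an N-element vector:
// element i is copied only if i < count and mask[i] is true,
// every other element stays value-initialized (zero for arithmetic T).
template <std::size_t N, typename T, std::size_t M>
constexpr std::array<T, N>
partial_load_model(const std::array<T, M>& mem, std::size_t count,
                   const std::array<bool, N>& mask)
{
  std::array<T, N> r{};
  for (std::size_t i = 0; i < N && i < count && i < M; ++i)
    if (mask[i])
      r[i] = mem[i];
  return r;
}
```

For memory holding 1, 2, 3, ... and an alternating mask, loading from a two-element range yields {0, 2, 0, 0}, which matches the shape of `ref_k_2` in the masked_loads test.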
|
||||
7
libstdc++-v3/testsuite/std/simd/loads_expensive.cc
Normal file
@@ -0,0 +1,7 @@
// { dg-do run { target c++26 } }
// { dg-require-effective-target x86 }
// { dg-timeout-factor 2 }
// { dg-require-effective-target run_expensive_tests }

#define EXPENSIVE_TESTS 1
#include "loads.cc" // { dg-prune-output "Wpsabi" }
112
libstdc++-v3/testsuite/std/simd/mask.cc
Normal file
@@ -0,0 +1,112 @@
|
||||
// { dg-do run { target c++26 } }
|
||||
// { dg-require-effective-target x86 }
|
||||
|
||||
#include "test_setup.h"
|
||||
#include <utility>
|
||||
|
||||
namespace simd = std::simd;
|
||||
|
||||
template <std::size_t B, typename A>
|
||||
consteval std::size_t
|
||||
element_size(const simd::basic_mask<B, A>&)
|
||||
{ return B; }
|
||||
|
||||
template <typename V>
|
||||
struct Tests
|
||||
{
|
||||
using T = typename V::value_type;
|
||||
using M = typename V::mask_type;
|
||||
|
||||
ADD_TEST(Sanity) {
|
||||
std::tuple{M([](int i) { return 1 == (i & 1); })},
|
||||
[](auto& t, const M k) {
|
||||
t.verify_equal(element_size(k), sizeof(T));
|
||||
for (int i = 0; i < k.size(); i += 2)
|
||||
t.verify_equal(k[i], false)(k);
|
||||
for (int i = 1; i < k.size(); i += 2)
|
||||
t.verify_equal(k[i], true)(k);
|
||||
}
|
||||
};
|
||||
|
||||
ADD_TEST(Reductions) {
|
||||
std::tuple{M([](int i) { return 1 == (i & 1); }), M(true), M(false)},
|
||||
[](auto& t, const M k, const M tr, const M fa) {
|
||||
t.verify(!all_of(k))(k);
|
||||
if constexpr (V::size() > 1)
|
||||
{
|
||||
t.verify(any_of(k))(k);
|
||||
t.verify(!none_of(k))(k);
|
||||
}
|
||||
|
||||
t.verify(all_of(tr));
|
||||
t.verify(any_of(tr));
|
||||
t.verify(!none_of(tr));
|
||||
|
||||
t.verify(!all_of(fa));
|
||||
t.verify(!any_of(fa));
|
||||
t.verify(none_of(fa));
|
||||
}
|
||||
};
|
||||
|
||||
ADD_TEST(CvtToInt, (sizeof(T) <= sizeof(0ull))) {
|
||||
std::tuple{M([](int i) { return 1 == (i & 1); }), M(true), M(false), M([](int i) {
|
||||
return i % 13 == 0 || i % 7 == 0;
|
||||
})},
|
||||
[](auto& t, const M k, const M tr, const M fa, const M k2) {
|
||||
t.verify_equal(V(+tr), V(1));
|
||||
t.verify_equal(V(+fa), V());
|
||||
t.verify_equal(V(+k), init_vec<V, 0, 1>);
|
||||
|
||||
if constexpr (std::is_integral_v<T>)
|
||||
{
|
||||
t.verify_equal(V(~tr), ~V(1));
|
||||
t.verify_equal(V(~fa), ~V(0));
|
||||
t.verify_equal(V(~k), ~init_vec<V, 0, 1>);
|
||||
}
|
||||
|
||||
t.verify(all_of(simd::rebind_t<char, M>(tr)));
|
||||
t.verify(!all_of(simd::rebind_t<char, M>(fa)));
|
||||
t.verify(!all_of(simd::rebind_t<char, M>(k)));
|
||||
|
||||
t.verify_equal(fa.to_ullong(), 0ull);
|
||||
t.verify_equal(fa.to_bitset(), std::bitset<V::size()>());
|
||||
|
||||
// test whether 'M -> bitset -> M' is an identity transformation
|
||||
t.verify_equal(M(fa.to_bitset()), fa)(fa.to_bitset());
|
||||
t.verify_equal(M(tr.to_bitset()), tr)(tr.to_bitset());
|
||||
t.verify_equal(M(k.to_bitset()), k)(k.to_bitset());
|
||||
t.verify_equal(M(k2.to_bitset()), k2)(k2.to_bitset());
|
||||
|
||||
static_assert(sizeof(0ull) * CHAR_BIT == 64);
|
||||
if constexpr (V::size() <= 64)
|
||||
{
|
||||
constexpr unsigned long long full = -1ull >> (64 - V::size());
|
||||
t.verify_equal(tr.to_ullong(), full)(std::hex, tr.to_ullong(), '^', full, "->",
|
||||
tr.to_ullong() ^ full);
|
||||
t.verify_equal(tr.to_bitset(), full);
|
||||
|
||||
constexpr unsigned long long alternating = 0xaaaa'aaaa'aaaa'aaaaULL & full;
|
||||
t.verify_equal(k.to_ullong(), alternating)(std::hex, k.to_ullong(), '^', alternating,
|
||||
"->", k.to_ullong() ^ alternating);
|
||||
t.verify_equal(k.to_bitset(), alternating);
|
||||
|
||||
// 0, 7, 13, 14, 21, 26, 28, 35, 39, 42, 49, 52, 56, 63, 65, ...
|
||||
constexpr unsigned long long bits7_13 = 0x8112'0488'1420'6081ULL & full;
|
||||
t.verify_equal(k2.to_ullong(), bits7_13)(std::hex, k2.to_ullong());
|
||||
}
|
||||
else
|
||||
{
|
||||
constexpr unsigned long long full = -1ull;
|
||||
constexpr unsigned long long alternating = 0xaaaa'aaaa'aaaa'aaaaULL;
|
||||
int shift = M::size() - 64;
|
||||
t.verify_equal((tr.to_bitset() >> shift).to_ullong(), full);
|
||||
t.verify_equal((k.to_bitset() >> shift).to_ullong(), alternating);
|
||||
}
|
||||
|
||||
t.verify_equal(+tr, -(-tr));
|
||||
t.verify_equal(-+tr, -tr);
|
||||
}
|
||||
};
|
||||
};
|
||||
|
||||
#include "create_tests.h" // { dg-prune-output "Wpsabi" }
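The CvtToInt test compares `to_ullong()` against hard-coded bit patterns, with element i mapped to bit i. The constants can be re-derived with a few lines of scalar code. The sketch below is independent of `<simd>`; `pack_bits`, `full_bits` and the two predicates are hypothetical helpers that mirror the generator lambdas used in the test.

```cpp
#include <cstddef>

// Predicates mirroring the mask generators in the test.
constexpr bool is_odd(std::size_t i) { return (i & 1) == 1; }
constexpr bool is_multiple_of_7_or_13(std::size_t i)
{ return i % 13 == 0 || i % 7 == 0; }

// Pack the first `size` predicate results into an integer: element i -> bit i.
constexpr unsigned long long
pack_bits(std::size_t size, bool (*pred)(std::size_t))
{
  unsigned long long bits = 0;
  for (std::size_t i = 0; i < size && i < 64; ++i)
    if (pred(i))
      bits |= 1ull << i;
  return bits;
}

// All-ones pattern for `size` elements; only valid for 1 <= size <= 64.
constexpr unsigned long long
full_bits(std::size_t size)
{ return ~0ull >> (64 - size); }
```

For 64 elements, `pack_bits(64, is_odd)` reproduces `0xaaaa'aaaa'aaaa'aaaa` and `pack_bits(64, is_multiple_of_7_or_13)` reproduces `0x8112'0488'1420'6081`. For narrower masks the same constants are simply ANDed with `full_bits(size)`, as the test does.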
|
||||
108
libstdc++-v3/testsuite/std/simd/mask2.cc
Normal file
@@ -0,0 +1,108 @@
|
||||
// { dg-do run { target c++26 } }
|
||||
// { dg-require-effective-target x86 }
|
||||
|
||||
#include "test_setup.h"
|
||||
#include <utility>
|
||||
|
||||
template <typename V>
|
||||
struct Tests
|
||||
{
|
||||
using T = typename V::value_type;
|
||||
using M = typename V::mask_type;
|
||||
|
||||
static constexpr M alternating = M([](int i) { return 1 == (i & 1); });
|
||||
static constexpr M k010 = M([](int i) { return 1 == (i % 3); });
|
||||
static constexpr M k00111 = M([](int i) { return 2 < (i % 5); });
|
||||
|
||||
ADD_TEST(mask_conversion) {
|
||||
std::array {alternating, k010, k00111},
|
||||
[](auto& t, M k) {
|
||||
template for (auto tmp : {char(), short(), int(), double()})
|
||||
{
|
||||
using U = decltype(tmp);
|
||||
using M2 = simd::rebind_t<U, M>;
|
||||
using M3 = simd::mask<U, V::size()>;
|
||||
const M2 ref2 = M2([&](int i) { return k[i]; });
|
||||
t.verify_equal(M2(k), ref2);
|
||||
t.verify_equal(M(M2(k)), k);
|
||||
if constexpr (!std::is_same_v<M2, M3>)
|
||||
{
|
||||
const M3 ref3 = M3([&](int i) { return k[i]; });
|
||||
t.verify_equal(M3(k), ref3);
|
||||
t.verify_equal(M(M3(k)), k);
|
||||
t.verify_equal(M2(M3(k)), ref2);
|
||||
t.verify_equal(M3(M2(k)), ref3);
|
||||
}
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
ADD_TEST(mask_reductions_sanity) {
|
||||
std::tuple {M(true)},
|
||||
[](auto& t, M x) {
|
||||
t.verify_equal(std::simd::reduce_min_index(x), 0);
|
||||
t.verify_equal(std::simd::reduce_max_index(x), V::size - 1);
|
||||
t.verify_precondition_failure("An empty mask does not have a min_index.", [&] {
|
||||
std::simd::reduce_min_index(!x);
|
||||
});
|
||||
t.verify_precondition_failure("An empty mask does not have a max_index.", [&] {
|
||||
std::simd::reduce_max_index(!x);
|
||||
});
|
||||
}
|
||||
};
|
||||
|
||||
ADD_TEST(mask_reductions) {
|
||||
std::tuple{test_iota<V>, test_iota<V> == T(0)},
|
||||
[](auto& t, V v, M k0) {
|
||||
// Caveat:
|
||||
// k0[n0 * (test_iota_max<V> + 1)] is true if it exists
|
||||
// k[n * (test_iota_max<V> + 1) + i] is true if it exists
|
||||
// none_of(k) is true if i > test_iota_max<V>
|
||||
// by construction of test_iota_max:
|
||||
static_assert(test_iota_max<V> < V::size());
|
||||
for (int i = 0; i < int(test_iota_max<V>) + 1; ++i)
|
||||
{
|
||||
M k = v == T(i);
|
||||
|
||||
const int nk = 1 + (V::size() - i - 1) / (test_iota_max<V> + 1);
|
||||
const int maxk = (nk - 1) * (test_iota_max<V> + 1) + i;
|
||||
t.verify(maxk < V::size());
|
||||
|
||||
const int nk0 = 1 + (V::size() - 1) / (test_iota_max<V> + 1);
|
||||
const int maxk0 = (nk0 - 1) * (test_iota_max<V> + 1);
|
||||
t.verify(maxk0 < V::size());
|
||||
|
||||
const int maxkork0 = std::max(maxk, maxk0);
|
||||
|
||||
t.verify_equal(k[i], true);
|
||||
t.verify_equal(std::as_const(k)[i], true);
|
||||
t.verify_equal(std::simd::reduce_min_index(k), i)(k);
|
||||
t.verify_equal(std::simd::reduce_max_index(k), maxk)(k);
|
||||
t.verify_equal(std::simd::reduce_min_index(k || k0), 0);
|
||||
t.verify_equal(std::simd::reduce_max_index(k || k0), maxkork0);
|
||||
t.verify_equal(k, k);
|
||||
t.verify_not_equal(!k, k);
|
||||
t.verify_equal(k | k, k);
|
||||
t.verify_equal(k & k, k);
|
||||
t.verify(none_of(k ^ k));
|
||||
t.verify_equal(std::simd::reduce_count(k), nk);
|
||||
if constexpr (sizeof(T) <= sizeof(0ULL))
|
||||
t.verify_equal(-std::simd::reduce(-k), nk)(k)(-k);
|
||||
t.verify_equal(std::simd::reduce_count(!k), V::size - nk)(!k);
|
||||
if constexpr (V::size <= 128 && sizeof(T) <= sizeof(0ULL))
|
||||
t.verify_equal(-std::simd::reduce(-!k), V::size - nk)(-!k);
|
||||
t.verify(any_of(k));
|
||||
t.verify(bool(any_of(k & k0) ^ (i != 0)));
|
||||
k = M([&](int j) { return j == 0 ? true : k[j]; });
|
||||
t.verify_equal(k[i], true);
|
||||
t.verify_equal(std::as_const(k)[i], true);
|
||||
t.verify_equal(k[0], true);
|
||||
t.verify_equal(std::as_const(k)[0], true);
|
||||
t.verify_equal(std::simd::reduce_min_index(k), 0)(k);
|
||||
t.verify_equal(std::simd::reduce_max_index(k), maxk)(k);
|
||||
}
|
||||
}
|
||||
};
|
||||
};
|
||||
|
||||
#include "create_tests.h" // { dg-prune-output "Wpsabi" }
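The `nk` and `maxk` formulas in mask_reductions follow from the caveat comment: if the iota values wrap with period `test_iota_max<V> + 1`, then `v == T(i)` is true at indices i, i + period, i + 2 * period, and so on. The scalar sketch below checks the arithmetic under that assumption. All names are hypothetical, and an empty mask is reported as -1 here, whereas the library treats it as a precondition violation.

```cpp
#include <array>
#include <cstddef>

// k[j] is true where a wrapping iota (j % period) equals i.
template <std::size_t N>
constexpr std::array<bool, N>
wrapped_iota_equals(int period, int i)
{
  std::array<bool, N> k{};
  for (std::size_t j = 0; j < N; ++j)
    k[j] = static_cast<int>(j) % period == i;
  return k;
}

// Number of true elements.
template <std::size_t N>
constexpr int
count_true(const std::array<bool, N>& k)
{
  int n = 0;
  for (std::size_t j = 0; j < N; ++j)
    n += k[j] ? 1 : 0;
  return n;
}

// Index of the first true element, or -1 for an empty mask.
template <std::size_t N>
constexpr int
min_index(const std::array<bool, N>& k)
{
  for (std::size_t j = 0; j < N; ++j)
    if (k[j])
      return static_cast<int>(j);
  return -1;
}

// Index of the last true element, or -1 for an empty mask.
template <std::size_t N>
constexpr int
max_index(const std::array<bool, N>& k)
{
  for (std::size_t j = N; j > 0; --j)
    if (k[j - 1])
      return static_cast<int>(j - 1);
  return -1;
}
```

With 17 elements, a period of 8 and i = 3, the mask is true at indices 3 and 11, so the count is 1 + (17 - 3 - 1) / 8 = 2 and the largest index is (2 - 1) * 8 + 3 = 11.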
|
||||
7
libstdc++-v3/testsuite/std/simd/mask2_expensive.cc
Normal file
@@ -0,0 +1,7 @@
// { dg-do run { target c++26 } }
// { dg-require-effective-target x86 }
// { dg-require-effective-target run_expensive_tests }
// { dg-timeout-factor 2 }

#define EXPENSIVE_TESTS 1
#include "mask2.cc" // { dg-prune-output "Wpsabi" }
7
libstdc++-v3/testsuite/std/simd/mask_expensive.cc
Normal file
@@ -0,0 +1,7 @@
// { dg-do run { target c++26 } }
// { dg-require-effective-target x86 }
// { dg-timeout-factor 2 }
// { dg-require-effective-target run_expensive_tests }

#define EXPENSIVE_TESTS 1
#include "mask.cc" // { dg-prune-output "Wpsabi" }
90
libstdc++-v3/testsuite/std/simd/reductions.cc
Normal file
@@ -0,0 +1,90 @@
|
||||
// { dg-do run { target c++26 } }
|
||||
// { dg-require-effective-target x86 }
|
||||
|
||||
#include "test_setup.h"
|
||||
|
||||
template <typename T, std::size_t N, std::size_t Alignment>
|
||||
class alignas(Alignment) aligned_array
|
||||
: public std::array<T, N>
|
||||
{};
|
||||
|
||||
inline constexpr std::multiplies<> mul;
|
||||
inline constexpr std::bit_and<> bit_and;
|
||||
inline constexpr std::bit_or<> bit_or;
|
||||
inline constexpr std::bit_xor<> bit_xor;
|
||||
|
||||
inline constexpr auto my_add = [](auto a, auto b) { return a + b; };
|
||||
|
||||
template <typename V>
|
||||
struct Tests
|
||||
{
|
||||
using T = typename V::value_type;
|
||||
using M = typename V::mask_type;
|
||||
|
||||
static_assert(simd::alignment_v<V> <= 256);
|
||||
|
||||
static consteval V
|
||||
poisoned(T x)
|
||||
{
|
||||
if constexpr (sizeof(V) == sizeof(T) * V::size())
|
||||
return V(x);
|
||||
else
|
||||
{
|
||||
using P = simd::resize_t<sizeof(V) / sizeof(T), V>;
|
||||
static_assert(P::size() > V::size());
|
||||
constexpr auto [...is] = std::_IotaArray<P::size()>;
|
||||
const T arr[P::size()] = {(is < V::size() ? x : T(7))...};
|
||||
return std::bit_cast<V>(P(arr));
|
||||
}
|
||||
}
|
||||
|
||||
ADD_TEST(Sum) {
|
||||
std::tuple {poisoned(0), poisoned(1)},
|
||||
[](auto& t, V v0, V v1) {
|
||||
t.verify_equal(simd::reduce(v0), T(0));
|
||||
t.verify_equal(simd::reduce(v1), T(V::size()));
|
||||
}
|
||||
};
|
||||
|
||||
ADD_TEST(Product) {
|
||||
std::tuple {poisoned(0), poisoned(1)},
|
||||
[](auto& t, V v0, V v1) {
|
||||
t.verify_equal(simd::reduce(v0, mul), T(0));
|
||||
t.verify_equal(simd::reduce(v1, mul), T(1));
|
||||
}
|
||||
};
|
||||
|
||||
ADD_TEST(UnknownSum) {
|
||||
std::tuple {poisoned(0), poisoned(1)},
|
||||
[](auto& t, V v0, V v1) {
|
||||
t.verify_equal(simd::reduce(v0, my_add), T(0));
|
||||
t.verify_equal(simd::reduce(v1, my_add), T(V::size()));
|
||||
}
|
||||
};
|
||||
|
||||
ADD_TEST(And, std::is_integral_v<T>) {
|
||||
std::tuple {poisoned(0), poisoned(1)},
|
||||
[](auto& t, V v0, V v1) {
|
||||
t.verify_equal(simd::reduce(v0, bit_and), T(0));
|
||||
t.verify_equal(simd::reduce(v1, bit_and), T(1));
|
||||
}
|
||||
};
|
||||
|
||||
ADD_TEST(Or, std::is_integral_v<T>) {
|
||||
std::tuple {poisoned(0), poisoned(1)},
|
||||
[](auto& t, V v0, V v1) {
|
||||
t.verify_equal(simd::reduce(v0, bit_or), T(0));
|
||||
t.verify_equal(simd::reduce(v1, bit_or), T(1));
|
||||
}
|
||||
};
|
||||
|
||||
ADD_TEST(Xor, std::is_integral_v<T>) {
|
||||
std::tuple {poisoned(0), poisoned(1)},
|
||||
[](auto& t, V v0, V v1) {
|
||||
t.verify_equal(simd::reduce(v0, bit_xor), T(0));
|
||||
t.verify_equal(simd::reduce(v1, bit_xor), T(V::size() & 1));
|
||||
}
|
||||
};
|
||||
};
|
||||
|
||||
#include "create_tests.h"
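The `poisoned` helper in reductions.cc fills the padding elements with `T(7)` so that any reduction that accidentally includes padding produces a wrong result. The scalar sketch below shows why those expectations catch such a bug. It assumes the padded storage is a plain array whose first `Size` elements are the real ones; `reduce_model` is a hypothetical helper, not the library's reduce.

```cpp
#include <array>
#include <cstddef>
#include <functional>

// Fold only the first Size elements of a padded storage array.
// Elements at index >= Size are padding and must not affect the result.
template <std::size_t Size, typename T, std::size_t Storage, typename Op>
constexpr T
reduce_model(const std::array<T, Storage>& padded, T identity, Op op)
{
  static_assert(Size <= Storage, "real elements must fit into the storage");
  T acc = identity;
  for (std::size_t i = 0; i < Size; ++i)
    acc = op(acc, padded[i]);
  return acc;
}
```

For three real elements equal to 1 and one padding element equal to 7, the sum is 3, the product is 1 and the xor is 3 & 1 = 1. Folding over all four elements would instead give 10, 7 and 6.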
|
||||
7
libstdc++-v3/testsuite/std/simd/reductions_expensive.cc
Normal file
@@ -0,0 +1,7 @@
// { dg-do run { target c++26 } }
// { dg-require-effective-target x86 }
// { dg-timeout-factor 2 }
// { dg-require-effective-target run_expensive_tests }

#define EXPENSIVE_TESTS 1
#include "reductions.cc"
67
libstdc++-v3/testsuite/std/simd/shift_left.cc
Normal file
@@ -0,0 +1,67 @@
|
||||
// { dg-do run { target c++26 } }
|
||||
// { dg-require-effective-target x86 }
|
||||
|
||||
#include "test_setup.h"
|
||||
|
||||
template <typename V>
|
||||
requires (V::size() * sizeof(typename V::value_type) <= 70 * 4) // avoid exploding RAM usage
|
||||
struct Tests<V>
|
||||
{
|
||||
using T = typename V::value_type;
|
||||
using M = typename V::mask_type;
|
||||
|
||||
static constexpr int max = sizeof(T) == 8 ? 64 : 32;
|
||||
|
||||
ADD_TEST_N(known_shift, 4, std::is_integral_v<T>) {
|
||||
std::tuple {test_iota<V, 0, 0>},
|
||||
[]<int N>(auto& t, const V x) {
|
||||
constexpr int shift = max * (N + 1) / 4 - 1;
|
||||
constexpr V vshift = T(shift);
|
||||
const V vshiftx = vshift ^ (x & std::cw<1>);
|
||||
V ref([](T i) -> T { return i << shift; });
|
||||
V refx([](T i) -> T { return i << (shift ^ (i & 1)); });
|
||||
t.verify_equal(x << shift, ref)("{:d} << {:d}", x, shift);
|
||||
t.verify_equal(x << vshift, ref)("{:d} << {:d}", x, vshift);
|
||||
t.verify_equal(x << vshiftx, refx)("{:d} << {:d}", x, vshiftx);
|
||||
const auto y = ~x;
|
||||
ref = V([](T i) -> T { return T(~i) << shift; });
|
||||
refx = V([](T i) -> T { return T(~i) << (shift ^ (i & 1)); });
|
||||
t.verify_equal(y << shift, ref)("{:d} << {:d}", y, shift);
|
||||
t.verify_equal(y << vshift, ref)("{:d} << {:d}", y, vshift);
|
||||
t.verify_equal(y << vshiftx, refx)("{:d} << {:d}", y, vshiftx);
|
||||
}
|
||||
};
|
||||
|
||||
ADD_TEST(unknown_shift, std::is_integral_v<T>) {
|
||||
std::tuple {test_iota<V, 0, 0>},
|
||||
[](auto& t, const V x) {
|
||||
if !consteval
|
||||
{
|
||||
for (int shift = 0; shift < max; ++shift)
|
||||
{
|
||||
const auto y = ~x;
|
||||
shift = make_value_unknown(shift);
|
||||
const V vshift = T(shift);
|
||||
V ref([=](T i) -> T { return i << shift; });
|
||||
t.verify_equal(x << shift, ref)("{:d} << {:d}", x, shift);
|
||||
t.verify_equal(x << vshift, ref)("{:d} << {:d}", x, vshift);
|
||||
ref = V([=](T i) -> T { return T(~i) << shift; });
|
||||
t.verify_equal(y << shift, ref)("{:d} << {:d}", y, shift);
|
||||
t.verify_equal(y << vshift, ref)("{:d} << {:d}", y, vshift);
|
||||
}
|
||||
}
|
||||
}
|
||||
};
|
||||
};
|
||||
|
||||
template <typename V>
|
||||
struct Tests
|
||||
{};
|
||||
|
||||
void create_tests()
|
||||
{
|
||||
template for (auto t : {char(), short(), unsigned(), 0l, 0ull})
|
||||
[[maybe_unused]] Tests<simd::vec<decltype(t)>> test;
|
||||
template for (constexpr int n : {1, 3, 17})
|
||||
[[maybe_unused]] Tests<simd::vec<int, n>> test;
|
||||
}
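The known_shift test derives its shift counts from `max * (N + 1) / 4 - 1` and builds a second, per-element count by XOR-ing with the low bit of each element. The arithmetic can be checked in isolation with the scalar sketch below. The function names are hypothetical, and `unsigned` stands in for the element type.

```cpp
// Shift count used by instantiation n (0..3) for an element width of `max` bits.
constexpr int
known_shift(int max, int n)
{ return max * (n + 1) / 4 - 1; }

// Per-element shift count: flip the lowest bit of `shift` when `value` is odd.
constexpr int
per_element_shift(int shift, unsigned value)
{ return shift ^ static_cast<int>(value & 1u); }

// Scalar reference result for a single element.
constexpr unsigned
shift_left_ref(unsigned value, int shift)
{ return value << shift; }
```

For `max` = 32 the counts are 7, 15, 23 and 31, and for `max` = 64 they are 15, 31, 47 and 63. All of them are odd, so XOR-ing with an element's low bit either keeps the count or lowers it by one and never pushes it to `max` or beyond.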
|
||||
7
libstdc++-v3/testsuite/std/simd/shift_left_expensive.cc
Normal file
@@ -0,0 +1,7 @@
// { dg-do run { target c++26 } }
// { dg-require-effective-target x86 }
// { dg-timeout-factor 2 }
// { dg-require-effective-target run_expensive_tests }

#define EXPENSIVE_TESTS 1
#include "shift_left.cc"
91
libstdc++-v3/testsuite/std/simd/shift_right.cc
Normal file
@@ -0,0 +1,91 @@
|
||||
// { dg-do run { target c++26 } }
|
||||
// { dg-require-effective-target x86 }
|
||||
|
||||
#include "test_setup.h"
|
||||
|
||||
template <typename V>
|
||||
requires (V::size() * sizeof(typename V::value_type) <= 70 * 4) // avoid exploding RAM usage
|
||||
struct Tests<V>
|
||||
{
|
||||
using T = typename V::value_type;
|
||||
using M = typename V::mask_type;
|
||||
|
||||
static constexpr int max = sizeof(T) == 8 ? 64 : 32;
|
||||
|
||||
ADD_TEST_N(known_shift, 4, std::is_integral_v<T>) {
|
||||
std::tuple {test_iota<V>},
|
||||
[]<int N>(auto& t, const V x) {
|
||||
constexpr int shift = max * (N + 1) / 4 - 1;
|
||||
constexpr T tmax = std::numeric_limits<T>::max();
|
||||
constexpr V vshift = T(shift);
|
||||
const V vshiftx = vshift ^ (x & std::cw<1>);
|
||||
t.verify(__is_const_known(vshift));
|
||||
|
||||
V ref([&](int i) -> T { return x[i] >> shift; });
|
||||
V refx([&](int i) -> T { return x[i] >> (shift ^ (i & 1)); });
|
||||
t.verify_equal(x >> shift, ref)("{:d} >> {:d}", x, shift);
|
||||
t.verify_equal(x >> vshift, ref)("{:d} >> {:d}", x, vshift);
|
||||
t.verify_equal(x >> vshiftx, refx)("{:d} >> {:d}", x, vshiftx);
|
||||
|
||||
const V y = ~x;
|
||||
ref = V([&](int i) -> T { return T(~x[i]) >> shift; });
|
||||
refx = V([&](int i) -> T { return T(~x[i]) >> (shift ^ (i & 1)); });
|
||||
t.verify_equal(y >> shift, ref)("{:d} >> {:d}", y, shift);
|
||||
t.verify_equal(y >> vshift, ref)("{:d} >> {:d}", y, vshift);
|
||||
t.verify_equal(y >> vshiftx, refx)("{:d} >> {:d}", y, vshiftx);
|
||||
|
||||
const V z = tmax - x;
|
||||
ref = V([&](int i) -> T { return T(tmax - x[i]) >> shift; });
|
||||
refx = V([&](int i) -> T { return T(tmax - x[i]) >> (shift ^ (i & 1)); });
|
||||
t.verify_equal(z >> shift, ref)("{:d} >> {:d}", z, shift);
|
||||
t.verify_equal(z >> vshift, ref)("{:d} >> {:d}", z, vshift);
|
||||
t.verify_equal(z >> vshiftx, refx)("{:d} >> {:d}", z, vshiftx);
|
||||
}
|
||||
};
|
||||
|
||||
ADD_TEST(unknown_shift, std::is_integral_v<T>) {
|
||||
std::tuple {test_iota<V>},
|
||||
[](auto& t, const V x) {
|
||||
for (int shift = 0; shift < max; ++shift)
|
||||
{
|
||||
constexpr T tmax = std::numeric_limits<T>::max();
|
||||
const V vshift = T(shift);
|
||||
const V vshiftx = vshift ^ (x & std::cw<1>);
|
||||
t.verify(std::is_constant_evaluated()
|
||||
|| (!is_const_known(vshift) && !is_const_known(shift)));
|
||||
|
||||
V ref([&](int i) -> T { return x[i] >> shift; });
|
||||
V refx([&](int i) -> T { return x[i] >> (shift ^ (i & 1)); });
|
||||
t.verify_equal(x >> shift, ref)("{:d} >> {:d}", x, shift);
|
||||
t.verify_equal(x >> vshift, ref)("{:d} >> {:d}", x, vshift);
|
||||
t.verify_equal(x >> vshiftx, refx)("{:d} >> {:d}", x, vshiftx);
|
||||
|
||||
const V y = ~x;
|
||||
ref = V([&](int i) -> T { return T(~x[i]) >> shift; });
|
||||
refx = V([&](int i) -> T { return T(~x[i]) >> (shift ^ (i & 1)); });
|
||||
t.verify_equal(y >> shift, ref)("{:d} >> {:d}", y, shift);
|
||||
t.verify_equal(y >> vshift, ref)("{:d} >> {:d}", y, vshift);
|
||||
t.verify_equal(y >> vshiftx, refx)("{:d} >> {:d}", y, vshiftx);
|
||||
|
||||
const V z = tmax - x;
|
||||
ref = V([&](int i) -> T { return T(tmax - x[i]) >> shift; });
|
||||
refx = V([&](int i) -> T { return T(tmax - x[i]) >> (shift ^ (i & 1)); });
|
||||
t.verify_equal(z >> shift, ref)("{:d} >> {:d}", z, shift);
|
||||
t.verify_equal(z >> vshift, ref)("{:d} >> {:d}", z, vshift);
|
||||
t.verify_equal(z >> vshiftx, refx)("{:d} >> {:d}", z, vshiftx);
|
||||
}
|
||||
}
|
||||
};
|
||||
};
|
||||
|
||||
template <typename V>
|
||||
struct Tests
|
||||
{};
|
||||
|
||||
void create_tests()
|
||||
{
|
||||
template for (auto t : {char(), short(), unsigned(), 0l, 0ull})
|
||||
[[maybe_unused]] Tests<simd::vec<decltype(t)>> test;
|
||||
template for (constexpr int n : {1, 3, 17})
|
||||
[[maybe_unused]] Tests<simd::vec<int, n>> test;
|
||||
}
|
||||
7
libstdc++-v3/testsuite/std/simd/shift_right_expensive.cc
Normal file
@@ -0,0 +1,7 @@
// { dg-do run { target c++26 } }
// { dg-require-effective-target x86 }
// { dg-timeout-factor 2 }
// { dg-require-effective-target run_expensive_tests }

#define EXPENSIVE_TESTS 1
#include "shift_right.cc"
137
libstdc++-v3/testsuite/std/simd/simd_alg.cc
Normal file
@@ -0,0 +1,137 @@
|
||||
// { dg-do run { target c++26 } }
|
||||
// { dg-require-effective-target x86 }
|
||||
|
||||
#include "test_setup.h"
|
||||
#include <utility>
|
||||
|
||||
template <typename V>
|
||||
struct Tests
|
||||
{
|
||||
using T = typename V::value_type;
|
||||
|
||||
using M = typename V::mask_type;
|
||||
|
||||
using pair = std::pair<V, V>;
|
||||
static constexpr std::conditional_t<std::is_floating_point_v<T>, short, T> x_max
|
||||
= test_iota_max<V, 1>;
|
||||
static constexpr int x_max_int = static_cast<int>(x_max);
|
||||
|
||||
static constexpr V
|
||||
reverse_iota(const V x)
|
||||
{
|
||||
if constexpr (std::is_enum_v<T>)
|
||||
{
|
||||
using Vu = simd::rebind_t<std::underlying_type_t<T>, V>;
|
||||
return static_cast<V>(std::to_underlying(x_max) - static_cast<Vu>(x));
|
||||
}
|
||||
else
|
||||
return x_max - x;
|
||||
}
|
||||
|
||||
ADD_TEST(Select) {
|
||||
std::tuple{test_iota<V, 0, 63>, test_iota<V, 1, 64>, T(2),
|
||||
M([](int i) { return 1 == (i & 1); }),
|
||||
M([](int i) { return 1 == (i % 3); })},
|
||||
[](auto& t, const V x, const V y, const T z, const M k, const M k3) {
|
||||
t.verify_equal(select(M(true), x, y), x);
|
||||
t.verify_equal(select(M(false), x, y), y);
|
||||
t.verify_equal(select(M(true), y, x), y);
|
||||
t.verify_equal(select(M(false), y, x), x);
|
||||
t.verify_equal(select(k, x, T()),
|
||||
V([](int i) { return (1 == (i & 1)) ? T(i & 63) : T(); }));
|
||||
|
||||
t.verify_equal(select(M(true), z, T()), z);
|
||||
t.verify_equal(select(M(true), T(), z), V());
|
||||
t.verify_equal(select(k, z, T()), V([](int i) { return (1 == (i & 1)) ? T(2) : T(); }));
|
||||
t.verify_equal(select(k3, z, T()), V([](int i) { return (1 == (i % 3)) ? T(2) : T(); }));
|
||||
}
|
||||
};
|
||||
|
||||
ADD_TEST(Min, std::totally_ordered<T>) {
|
||||
std::tuple{test_iota<V, 0, -1>, reverse_iota(test_iota<V, 0, -1>), test_iota<V, 1>},
|
||||
[](auto& t, const V x, const V y, const V x1) {
|
||||
t.verify_equal(min(x, x), x);
|
||||
t.verify_equal(min(V(), x), V());
|
||||
t.verify_equal(min(x, V()), V());
|
||||
if constexpr (std::is_signed_v<T>)
|
||||
{
|
||||
t.verify_equal(min(-x, x), -x);
|
||||
t.verify_equal(min(x, -x), -x);
|
||||
}
|
||||
t.verify_equal(min(x1, x), x);
|
||||
t.verify_equal(min(x, x1), x);
|
||||
t.verify_equal(min(x, y), min(y, x));
|
||||
t.verify_equal(min(x, y), V([](int i) {
|
||||
i %= x_max_int;
|
||||
return std::min(T(x_max_int - i), T(i));
|
||||
}));
|
||||
}
|
||||
};
|
||||
|
||||
ADD_TEST(Max, std::totally_ordered<T>) {
|
||||
std::tuple{test_iota<V, 0, -1>, reverse_iota(test_iota<V, 0, -1>), test_iota<V, 1>},
|
||||
[](auto& t, const V x, const V y, const V x1) {
|
||||
t.verify_equal(max(x, x), x);
|
||||
t.verify_equal(max(V(), x), x);
|
||||
t.verify_equal(max(x, V()), x);
|
||||
if constexpr (std::is_signed_v<T>)
|
||||
{
|
||||
t.verify_equal(max(-x, x), x);
|
||||
t.verify_equal(max(x, -x), x);
|
||||
}
|
||||
t.verify_equal(max(x1, x), x1);
|
||||
t.verify_equal(max(x, x1), x1);
|
||||
t.verify_equal(max(x, y), max(y, x));
|
||||
t.verify_equal(max(x, y), V([](int i) {
|
||||
i %= x_max_int;
|
||||
return std::max(T(x_max_int - i), T(i));
|
||||
}));
|
||||
}
|
||||
};
|
||||
|
||||
ADD_TEST(Minmax, std::totally_ordered<T>) {
|
||||
std::tuple{test_iota<V, 0, -1>, reverse_iota(test_iota<V, 0, -1>), test_iota<V, 1>},
|
||||
[](auto& t, const V x, const V y, const V x1) {
|
||||
t.verify_equal(minmax(x, x), pair{x, x});
|
||||
t.verify_equal(minmax(V(), x), pair{V(), x});
|
||||
t.verify_equal(minmax(x, V()), pair{V(), x});
|
||||
if constexpr (std::is_signed_v<T>)
|
||||
{
|
||||
t.verify_equal(minmax(-x, x), pair{-x, x});
|
||||
t.verify_equal(minmax(x, -x), pair{-x, x});
|
||||
}
|
||||
t.verify_equal(minmax(x1, x), pair{x, x1});
|
||||
t.verify_equal(minmax(x, x1), pair{x, x1});
|
||||
t.verify_equal(minmax(x, y), minmax(y, x));
|
||||
t.verify_equal(minmax(x, y),
|
||||
pair{V([](int i) {
|
||||
i %= x_max_int;
|
||||
return std::min(T(x_max_int - i), T(i));
|
||||
}),
|
||||
V([](int i) {
|
||||
i %= x_max_int;
|
||||
return std::max(T(x_max_int - i), T(i));
|
||||
})});
|
||||
}
|
||||
};
|
||||
|
||||
ADD_TEST(Clamp, std::totally_ordered<T>) {
|
||||
std::tuple{test_iota<V>, reverse_iota(test_iota<V>)},
|
||||
[](auto& t, const V x, const V y) {
|
||||
t.verify_equal(clamp(x, V(), x), x);
|
||||
t.verify_equal(clamp(x, x, x), x);
|
||||
t.verify_equal(clamp(V(), x, x), x);
|
||||
t.verify_equal(clamp(V(), V(), x), V());
|
||||
t.verify_equal(clamp(x, V(), V()), V());
|
||||
t.verify_equal(clamp(x, V(), y), min(x, y));
|
||||
t.verify_equal(clamp(y, V(), x), min(x, y));
|
||||
if constexpr (std::is_signed_v<T>)
|
||||
{
|
||||
t.verify_equal(clamp(V(T(-test_iota_max<V>)), -x, x), -x);
|
||||
t.verify_equal(clamp(V(T(test_iota_max<V>)), -x, x), x);
|
||||
}
|
||||
}
|
||||
};
|
||||
};
|
||||
|
||||
#include "create_tests.h"
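The Select and Clamp tests above rely on two element-wise identities: `select(k, a, b)` picks `a[i]` where `k[i]` is true and `b[i]` otherwise, and `clamp(x, V(), y)` equals `min(x, y)` as long as both inputs are non-negative. A scalar model on `std::array` illustrates both. The helper names are hypothetical and the code does not use `<simd>`.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Element-wise select: a[i] where k[i] is true, b[i] otherwise.
template <typename T, std::size_t N>
constexpr std::array<T, N>
select_model(const std::array<bool, N>& k, const std::array<T, N>& a,
             const std::array<T, N>& b)
{
  std::array<T, N> r{};
  for (std::size_t i = 0; i < N; ++i)
    r[i] = k[i] ? a[i] : b[i];
  return r;
}

// Element-wise clamp; like std::clamp it requires lo[i] <= hi[i].
template <typename T, std::size_t N>
constexpr std::array<T, N>
clamp_model(const std::array<T, N>& x, const std::array<T, N>& lo,
            const std::array<T, N>& hi)
{
  std::array<T, N> r{};
  for (std::size_t i = 0; i < N; ++i)
    r[i] = std::clamp(x[i], lo[i], hi[i]);
  return r;
}

// Element-wise minimum.
template <typename T, std::size_t N>
constexpr std::array<T, N>
min_model(const std::array<T, N>& a, const std::array<T, N>& b)
{
  std::array<T, N> r{};
  for (std::size_t i = 0; i < N; ++i)
    r[i] = std::min(a[i], b[i]);
  return r;
}

// Element-wise comparison usable in constant expressions before C++20.
template <typename T, std::size_t N>
constexpr bool
equal_elements(const std::array<T, N>& a, const std::array<T, N>& b)
{
  for (std::size_t i = 0; i < N; ++i)
    if (a[i] != b[i])
      return false;
  return true;
}
```

With x = {0, 1, 2, 3} and its reverse y = {3, 2, 1, 0}, clamping x to the range [0, y] gives {0, 1, 1, 0}, the same as the element-wise minimum of x and y.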
|
||||
7
libstdc++-v3/testsuite/std/simd/simd_alg_expensive.cc
Normal file
@@ -0,0 +1,7 @@
// { dg-do run { target c++26 } }
// { dg-require-effective-target x86 }
// { dg-timeout-factor 2 }
// { dg-require-effective-target run_expensive_tests }

#define EXPENSIVE_TESTS 1
#include "simd_alg.cc" // { dg-prune-output "Wpsabi" }
42
libstdc++-v3/testsuite/std/simd/sse_intrin.cc
Normal file
@@ -0,0 +1,42 @@
// { dg-do run { target c++26 } }
// { dg-require-effective-target x86 }

#include "test_setup.h"

#ifdef __SSE__
#include <x86intrin.h>
#endif

template <typename V>
struct Tests
{
  using T = typename V::value_type;
  using M = typename V::mask_type;

  ADD_TEST(misc, !simd::__scalar_abi_tag<typename V::abi_type>) {
    std::tuple{init_vec<V, 0, 100, 2, 54, 3>},
    [](auto& t, V x) {
      t.verify_equal(x, x);
      if !consteval
        {
#ifdef __SSE__
          V r = x;
          if constexpr (sizeof(x) == 16 && std::is_same_v<T, float>)
            t.verify_equal(r = _mm_and_ps(x, x), x);
#endif
#ifdef __SSE2__
          if constexpr (sizeof(x) == 16 && std::is_integral_v<T>)
            t.verify_equal(r = _mm_and_si128(x, x), x);
          if constexpr (sizeof(x) == 16 && std::is_same_v<T, double>)
            t.verify_equal(r = _mm_and_pd(x, x), x);
#endif
        }
    }
  };
};

void create_tests()
{
  template for (auto t : {char(), short(), unsigned(), 0l, 0ull, float(), double()})
    [[maybe_unused]] Tests<simd::vec<decltype(t), 16 / sizeof(t)>> test;
}
67
libstdc++-v3/testsuite/std/simd/stores.cc
Normal file
@@ -0,0 +1,67 @@
// { dg-do run { target c++26 } }
// { dg-require-effective-target x86 }

#include "test_setup.h"

template <typename V>
struct Tests
{
  using T = typename V::value_type;
  using M = typename V::mask_type;

  static_assert(simd::alignment_v<V> <= 256);

  ADD_TEST(stores, requires {T() + T(1);}) {
    std::tuple {test_iota<V, 1, 0>, std::array<T, V::size * 2> {}, std::array<int, V::size * 2> {}},
    [](auto& t, const V v, const auto& mem_init, const auto& ints_init) {
      alignas(256) std::array<T, V::size * 2> mem = mem_init;
      alignas(256) std::array<int, V::size * 2> ints = ints_init;

      simd::unchecked_store(v, mem, simd::flag_aligned);
      simd::unchecked_store(v, mem.begin() + V::size(), mem.end());
      for (int i = 0; i < V::size; ++i)
        {
          t.verify_equal(mem[i], T(i + 1));
          t.verify_equal(mem[V::size + i], T(i + 1));
        }
#if VIR_NEXT_PATCH
      if constexpr (complex_like<T>)
        {
        }
      else
#endif
        {
          simd::unchecked_store(v, ints, simd::flag_convert);
          simd::partial_store(v, ints.begin() + V::size() + 1, ints.end(),
                              simd::flag_convert | simd::flag_overaligned<alignof(int)>);
          for (int i = 0; i < V::size; ++i)
            {
              t.verify_equal(ints[i], int(T(i + 1)));
              t.verify_equal(ints[V::size + i], int(T(i)));
            }

          simd::unchecked_store(V(), ints.begin(), V::size(), simd::flag_convert);
          simd::unchecked_store(V(), ints.begin() + V::size(), V::size(), simd::flag_convert);
          for (int i = 0; i < 2 * V::size; ++i)
            t.verify_equal(ints[i], 0)("i =", i);

          if constexpr (V::size() > 1)
            {
              simd::partial_store(v, ints.begin() + 1, V::size() - 2, simd::flag_convert);
              for (int i = 0; i < V::size - 2; ++i)
                t.verify_equal(ints[i], int(T(i)));
              t.verify_equal(ints[V::size - 1], 0);
              t.verify_equal(ints[V::size], 0);
            }
          else
            {
              simd::partial_store(v, ints.begin() + 1, 0, simd::flag_convert);
              t.verify_equal(ints[0], 0);
              t.verify_equal(ints[1], 0);
            }
        }
    }
  };
};

#include "create_tests.h"
7
libstdc++-v3/testsuite/std/simd/stores_expensive.cc
Normal file
@@ -0,0 +1,7 @@
// { dg-do run { target c++26 } }
// { dg-require-effective-target x86 }
// { dg-timeout-factor 2 }
// { dg-require-effective-target run_expensive_tests }

#define EXPENSIVE_TESTS 1
#include "stores.cc"
809
libstdc++-v3/testsuite/std/simd/test_setup.h
Normal file
@@ -0,0 +1,809 @@
// Test framework for <simd> -*- C++ -*-

// Copyright The GNU Toolchain Authors.
//
// This file is part of the GNU ISO C++ Library. This library is free
// software; you can redistribute it and/or modify it under the
// terms of the GNU General Public License as published by the
// Free Software Foundation; either version 3, or (at your option)
// any later version.

// This library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.

// Under Section 7 of GPL version 3, you are granted additional
// permissions described in the GCC Runtime Library Exception, version
// 3.1, as published by the Free Software Foundation.

// You should have received a copy of the GNU General Public License and
// a copy of the GCC Runtime Library Exception along with this program;
// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
// <http://www.gnu.org/licenses/>.

#ifndef SIMD_TEST_SETUP_H
#define SIMD_TEST_SETUP_H

#include <bits/simd_details.h>
#include <string_view>

namespace test
{
  struct precondition_failure
  {
    std::string_view file;
    int line;
    std::string_view expr;
    std::string_view msg;
  };

#undef __glibcxx_simd_precondition

#define __glibcxx_simd_precondition(expr, msg, ...) \
  do { \
    if (__builtin_expect(!bool(expr), false)) \
      throw test::precondition_failure{__FILE__, __LINE__, #expr, msg}; \
  } while(false)
}

#undef _GLIBCXX_SIMD_NOEXCEPT
#define _GLIBCXX_SIMD_NOEXCEPT noexcept(false)

#include <simd>

#include <source_location>
#include <iostream>
#include <concepts>
#include <cfenv>
#include <vector>
#include <cstdint>
#include <climits>

// global objects
static std::vector<void(*)()> test_functions = {};

static std::int64_t passed_tests = 0;

static std::int64_t failed_tests = 0;

static std::string_view test_name = "unknown";

// ------------------------------------------------

namespace simd = std::simd;

template <typename T>
struct is_character_type
: std::bool_constant<false>
{};

template <typename T>
inline constexpr bool is_character_type_v = is_character_type<T>::value;

template <typename T>
struct is_character_type<const T>
: is_character_type<T>
{};

template <typename T>
struct is_character_type<T&>
: is_character_type<T>
{};

template <> struct is_character_type<char> : std::bool_constant<true> {};
template <> struct is_character_type<wchar_t> : std::bool_constant<true> {};
template <> struct is_character_type<char8_t> : std::bool_constant<true> {};
template <> struct is_character_type<char16_t> : std::bool_constant<true> {};
template <> struct is_character_type<char32_t> : std::bool_constant<true> {};

std::ostream& operator<<(std::ostream& s, std::byte b)
{ return s << std::hex << static_cast<unsigned>(b) << std::dec; }

template <typename T, typename Abi>
std::ostream& operator<<(std::ostream& s, std::simd::basic_vec<T, Abi> const& v)
{
  if constexpr (std::is_arithmetic_v<T>)
    {
      using U = std::conditional_t<
                  sizeof(T) == 1, int, std::conditional_t<
                                         is_character_type_v<T>,
                                         std::simd::_UInt<sizeof(T)>, T>>;
      s << '[' << U(v[0]);
      for (int i = 1; i < v.size(); ++i)
        s << ", " << U(v[i]);
    }
  else
    {
      s << '[' << v[0];
      for (int i = 1; i < v.size(); ++i)
        s << ", " << v[i];
    }
  return s << ']';
}

template <std::size_t B, typename Abi>
std::ostream& operator<<(std::ostream& s, std::simd::basic_mask<B, Abi> const& v)
{
  s << '<';
  for (int i = 0; i < v.size(); ++i)
    s << int(v[i]);
  return s << '>';
}

template <std::simd::__vec_builtin V>
std::ostream& operator<<(std::ostream& s, V v)
{ return s << std::simd::vec<std::simd::__vec_value_type<V>, std::simd::__width_of<V>>(v); }

template <typename T, typename U>
std::ostream& operator<<(std::ostream& s, const std::pair<T, U>& x)
{ return s << '{' << x.first << ", " << x.second << '}'; }

template <typename T>
concept is_string_type
  = is_character_type_v<std::ranges::range_value_t<T>>
      && std::is_convertible_v<T, std::basic_string_view<std::ranges::range_value_t<T>>>;

template <std::ranges::range R>
requires (!is_string_type<R>)
std::ostream& operator<<(std::ostream& s, R&& x)
{
  s << '[';
  auto it = std::ranges::begin(x);
  if (it != std::ranges::end(x))
    {
      s << *it;
      while (++it != std::ranges::end(x))
        s << ',' << *it;
    }
  return s << ']';
}

struct additional_info
{
  const bool failed = false;

  additional_info
  operator()(auto const& value0, auto const&... more)
  {
    if (failed)
      [&] {
        std::cout << " " << value0;
        ((std::cout << ' ' << more), ...);
        std::cout << std::endl;
      }();
    return *this;
  }
};

struct log_novalue {};

template <typename T>
struct unwrap_value_types
{ using type = T; };

template <typename T>
requires requires { typename T::value_type; }
struct unwrap_value_types<T>
{ using type = typename unwrap_value_types<typename T::value_type>::type; };

template <typename T>
using value_type_t = typename unwrap_value_types<std::remove_cvref_t<T>>::type;

template <typename T>
struct as_unsigned;

template <typename T>
using as_unsigned_t = typename as_unsigned<T>::type;

template <typename T>
requires (sizeof(T) == sizeof(unsigned char))
struct as_unsigned<T>
{ using type = unsigned char; };

template <typename T>
requires (sizeof(T) == sizeof(unsigned short))
struct as_unsigned<T>
{ using type = unsigned short; };

template <typename T>
requires (sizeof(T) == sizeof(unsigned int))
struct as_unsigned<T>
{ using type = unsigned int; };

template <typename T>
requires (sizeof(T) == sizeof(unsigned long long))
struct as_unsigned<T>
{ using type = unsigned long long; };

template <typename T, typename Abi>
struct as_unsigned<std::simd::basic_vec<T, Abi>>
{ using type = std::simd::rebind_t<as_unsigned_t<T>, std::simd::basic_vec<T, Abi>>; };

template <typename T0, typename T1>
constexpr T0
ulp_distance_signed(T0 val0, const T1& ref1)
{
  if constexpr (std::is_floating_point_v<T1>)
    return ulp_distance_signed(val0, std::simd::rebind_t<T1, T0>(ref1));
  else if constexpr (std::is_floating_point_v<value_type_t<T0>>)
    {
      int fp_exceptions = 0;
      if !consteval
        {
          fp_exceptions = std::fetestexcept(FE_ALL_EXCEPT);
        }
      using std::isnan;
      using std::abs;
      using T = value_type_t<T0>;
      using L = std::numeric_limits<T>;
      constexpr T0 signexp_mask = -L::infinity();
      T0 ref0(ref1);
      T1 val1(val0);
      const auto subnormal = fabs(ref1) < L::min();
      using I = as_unsigned_t<T1>;
      const T1 eps1 = select(subnormal, L::denorm_min(),
                             L::epsilon() * std::bit_cast<T0>(
                                              std::bit_cast<I>(ref1)
                                                & std::bit_cast<I>(signexp_mask)));
      const T0 ulp = select(val0 == ref0 || (isnan(val0) && isnan(ref0)),
                            T0(), T0((ref1 - val1) / eps1));
      if !consteval
        {
          std::feclearexcept(FE_ALL_EXCEPT ^ fp_exceptions);
        }
      return ulp;
    }
  else
    return ref1 - val0;
}

template <typename T0, typename T1>
constexpr T0
ulp_distance(const T0& val, const T1& ref)
{
  auto ulp = ulp_distance_signed(val, ref);
  using T = value_type_t<decltype(ulp)>;
  if constexpr (std::is_unsigned_v<T>)
    return ulp;
  else
    {
      using std::abs;
      return fabs(ulp);
    }
}

template <typename T>
constexpr bool
bit_equal(const T& a, const T& b)
{
  using std::simd::_UInt;
  if constexpr (sizeof(T) <= sizeof(0ull))
    return std::bit_cast<_UInt<sizeof(T)>>(a) == std::bit_cast<_UInt<sizeof(T)>>(b);
  else if constexpr (std::simd::__simd_vec_or_mask_type<T>)
    {
      using TT = typename T::value_type;
      if constexpr (std::is_integral_v<TT>)
        return all_of(a == b);
      else
        {
          constexpr size_t uint_size = std::min(size_t(8), sizeof(TT));
          struct B
          {
            alignas(T) simd::rebind_t<_UInt<uint_size>,
                                      simd::resize_t<T::size() * sizeof(TT) / uint_size, T>> data;
          };
          if constexpr (sizeof(B) == sizeof(a))
            return all_of(std::bit_cast<B>(a).data == std::bit_cast<B>(b).data);
          else
            {
              auto [a0, a1] = chunk<std::bit_ceil(unsigned(T::size())) / 2>(a);
              auto [b0, b1] = chunk<std::bit_ceil(unsigned(T::size())) / 2>(b);
              return bit_equal(a0, b0) && bit_equal(a1, b1);
            }
        }
    }
  else
    static_assert(false);
}

// treat as equal if either:
// - operator== yields true
// - or for floats, a and b are NaNs
template <typename V>
constexpr bool
equal_with_nan_and_inf_fixup(const V& a, const V& b)
{
  auto eq = a == b;
  if (std::simd::all_of(eq))
    return true;
  else if constexpr (std::simd::__simd_vec_type<V>)
    {
      using M = typename V::mask_type;
      using T = typename V::value_type;
      if constexpr (std::is_floating_point_v<T>)
        { // fix up nan == nan results
          eq |= a._M_isnan() && b._M_isnan();
        }
      else
        return false;
      return std::simd::all_of(eq);
    }
  else if constexpr (std::is_floating_point_v<V>)
    return std::isnan(a) && std::isnan(b);
  else
    return false;
}

struct constexpr_verifier
{
  struct ignore_the_rest
  {
    constexpr ignore_the_rest
    operator()(auto const&, auto const&...)
    { return *this; }
  };

  bool okay = true;

  constexpr ignore_the_rest
  verify_precondition_failure(std::string_view expected_msg, auto&& f) &
  {
    try
      {
        f();
        okay = false;
      }
    catch (const test::precondition_failure& failure)
      {
        okay = okay && failure.msg == expected_msg;
      }
    catch (...)
      {
        okay = false;
      }
    return {};
  }

  constexpr ignore_the_rest
  verify(const auto& k) &
  {
    okay = okay && std::simd::all_of(k);
    return {};
  }

  constexpr ignore_the_rest
  verify_equal(const auto& v, const auto& ref) &
  {
    using V = decltype(std::simd::select(v == ref, v, ref));
    okay = okay && equal_with_nan_and_inf_fixup<V>(v, ref);
    return {};
  }

  constexpr ignore_the_rest
  verify_bit_equal(const auto& v, const auto& ref) &
  {
    using V = decltype(std::simd::select(v == ref, v, ref));
    okay = okay && bit_equal<V>(v, ref);
    return {};
  }

  template <typename T, typename U>
  constexpr ignore_the_rest
  verify_equal(const std::pair<T, U>& x, const std::pair<T, U>& y) &
  {
    verify_equal(x.first, y.first);
    verify_equal(x.second, y.second);
    return {};
  }

  constexpr ignore_the_rest
  verify_not_equal(const auto& v, const auto& ref) &
  {
    okay = okay && std::simd::all_of(v != ref);
    return {};
  }

  constexpr ignore_the_rest
  verify_equal_to_ulp(const auto& x, const auto& y, float allowed_distance) &
  {
    okay = okay && std::simd::all_of(ulp_distance(x, y) <= allowed_distance);
    return {};
  }

  constexpr_verifier() = default;

  constexpr_verifier(const constexpr_verifier&) = delete;

  constexpr_verifier(constexpr_verifier&&) = delete;
};

template <int... is>
[[nodiscard]]
consteval bool
constexpr_test(auto&& fun, auto&&... args)
{
  constexpr_verifier t;
  try
    {
      fun.template operator()<is...>(t, args...);
    }
  catch(const test::precondition_failure& fail)
    {
      return false;
    }
  return t.okay;
}

template <typename T>
T
make_value_unknown(const T& x)
{ return *std::start_lifetime_as<T>(&x); }

template <typename T>
concept pair_specialization
  = std::same_as<std::remove_cvref_t<T>, std::pair<typename std::remove_cvref_t<T>::first_type,
                                                   typename std::remove_cvref_t<T>::second_type>>;

struct runtime_verifier
{
  const std::string_view test_kind;

  template <typename X, typename Y>
  additional_info
  log_failure(const X& x, const Y& y, std::source_location loc, std::string_view s)
  {
    ++failed_tests;
    std::cout << loc.file_name() << ':' << loc.line() << ':' << loc.column() << ": in "
              << test_kind << " test of '" << test_name
              << "' " << s << " failed";
    if constexpr (!std::is_same_v<X, log_novalue>)
      {
        std::cout << ":\n result: " << std::boolalpha;
        if constexpr (is_character_type_v<X>)
          std::cout << int(x);
        else
          std::cout << x;
        if constexpr (!std::is_same_v<decltype(y), const log_novalue&>)
          {
            std::cout << "\n expected: ";
            if constexpr (is_character_type_v<Y>)
              std::cout << int(y);
            else
              std::cout << y;
          }
      }
    std::cout << std::endl;
    return additional_info {true};
  }

  [[gnu::always_inline]]
  additional_info
  verify_precondition_failure(std::string_view expected_msg, auto&& f,
                              std::source_location loc = std::source_location::current()) &
  {
    try
      {
        f();
        ++failed_tests;
        return log_failure(log_novalue(), log_novalue(), loc, "precondition failure not detected");
      }
    catch (const test::precondition_failure& failure)
      {
        if (failure.msg != expected_msg)
          {
            ++failed_tests;
            return log_failure(failure.msg, expected_msg, loc, "unexpected exception");
          }
        else
          {
            ++passed_tests;
            return {};
          }
      }
    catch (...)
      {
        ++failed_tests;
        return log_failure(log_novalue(), log_novalue(), loc, "unexpected exception");
      }
  }

  [[gnu::always_inline]]
  additional_info
  verify(auto&& k, std::source_location loc = std::source_location::current())
  {
    if (std::simd::all_of(k))
      {
        ++passed_tests;
        return {};
      }
    else
      return log_failure(log_novalue(), log_novalue(), loc, "verify");
  }

  [[gnu::always_inline]]
  additional_info
  verify_equal(auto&& x, auto&& y,
               std::source_location loc = std::source_location::current())
  {
    bool ok;
    if constexpr (pair_specialization<decltype(x)> && pair_specialization<decltype(y)>)
      ok = std::simd::all_of(x.first == y.first) && std::simd::all_of(x.second == y.second);
    else
      ok = equal_with_nan_and_inf_fixup<decltype(std::simd::select(x == y, x, y))>(x, y);
    if (ok)
      {
        ++passed_tests;
        return {};
      }
    else
      return log_failure(x, y, loc, "verify_equal");
  }

  [[gnu::always_inline]]
  additional_info
  verify_bit_equal(auto&& x, auto&& y,
                   std::source_location loc = std::source_location::current())
  {
    using V = decltype(std::simd::select(x == y, x, y));
    if (bit_equal<V>(x, y))
      {
        ++passed_tests;
        return {};
      }
    else
      return log_failure(x, y, loc, "verify_bit_equal");
  }

  [[gnu::always_inline]]
  additional_info
  verify_not_equal(auto&& x, auto&& y,
                   std::source_location loc = std::source_location::current())
  {
    if (std::simd::all_of(x != y))
      {
        ++passed_tests;
        return {};
      }
    else
      return log_failure(x, y, loc, "verify_not_equal");
  }

  // ulp_distance_signed can raise FP exceptions and thus must be conditionally executed
  [[gnu::always_inline]]
  additional_info
  verify_equal_to_ulp(auto&& x, auto&& y, float allowed_distance,
                      std::source_location loc = std::source_location::current())
  {
    const bool success = std::simd::all_of(ulp_distance(x, y) <= allowed_distance);
    if (success)
      {
        ++passed_tests;
        return {};
      }
    else
      return log_failure(x, y, loc, "verify_equal_to_ulp")
               ("distance:", ulp_distance_signed(x, y),
                "\n allowed:", allowed_distance);
  }
};

template <int... is>
[[gnu::noinline, gnu::noipa]]
void
runtime_test(auto&& fun, auto&&... args)
{
  runtime_verifier t {"runtime"};
  fun.template operator()<is...>(t, make_value_unknown(args)...);
}

template <typename T>
concept constant_value = requires {
  typename std::integral_constant<std::remove_cvref_t<decltype(T::value)>, T::value>;
};

template <typename T>
[[gnu::always_inline]] inline bool
is_const_known(const T& x)
{ return constant_value<T> || __builtin_constant_p(x); }

template <typename T, typename Abi>
[[gnu::always_inline]] inline bool
is_const_known(const std::simd::basic_vec<T, Abi>& x)
{ return __is_const_known(x); }

template <std::size_t B, typename Abi>
[[gnu::always_inline]] inline bool
is_const_known(const std::simd::basic_mask<B, Abi>& x)
{ return __is_const_known(x); }

template <std::ranges::sized_range R>
[[gnu::always_inline]] inline bool
is_const_known(const R& arr)
{
  constexpr std::size_t N = std::ranges::size(arr);
  constexpr auto [...is] = std::_IotaArray<N>;
  return (is_const_known(arr[is]) && ...);
}

template <int... is>
[[gnu::always_inline, gnu::flatten]]
inline void
constprop_test(auto&& fun, auto... args)
{
  runtime_verifier t{"constprop"};
#ifndef __clang__
  t.verify((is_const_known(args) && ...))("=> Some argument(s) failed to constant-propagate.");
#endif
  fun.template operator()<is...>(t, args...);
}

/**
 * The value of the largest element in test_iota<V, Init>.
 */
template <typename V, int Init = 0, int Max = V::size() + Init - 1>
constexpr value_type_t<V> test_iota_max
  = sizeof(value_type_t<V>) < sizeof(int)
      ? std::min(int(std::numeric_limits<value_type_t<V>>::max()),
                 Max < 0 ? std::min(V::size() + Init - 1,
                                    int(std::numeric_limits<value_type_t<V>>::max()) + Max)
                         : Max)
      : V::size() + Init - 1;

template <typename T, typename Abi, int Init, int Max>
requires std::is_enum_v<T>
constexpr T test_iota_max<simd::basic_vec<T, Abi>, Init, Max>
  = static_cast<T>(test_iota_max<simd::basic_vec<std::underlying_type_t<T>, Abi>, Init, Max>);

/**
 * Starts iota sequence at Init.
 *
 * With `Max == 0`: Wrap-around on overflow
 * With `Max < 0`: Subtract from numeric_limits::max (to leave room for arithmetic ops)
 * Otherwise: [Init..Max, Init..Max, ...] (inclusive)
 *
 * Use simd::__iota if a non-monotonic sequence is a bug.
 */
template <typename V, int Init = 0, int MaxArg = int(test_iota_max<V, Init>)>
constexpr V test_iota = V([](int i) {
  constexpr int Max = MaxArg < 0 ? int(test_iota_max<V, Init, MaxArg>) : MaxArg;
  static_assert(Max == 0 || Max > Init || V::size() == 1);
  i += Init;
  if constexpr (Max > Init)
    {
      while (i > Max)
        i -= Max - Init + 1;
    }
  using T = value_type_t<V>;
  return static_cast<T>(i);
});

/**
 * A data-parallel object initialized with {values..., values..., ...}
 */
template <typename V, auto... values>
constexpr V init_vec = [] {
  using T = typename V::value_type;
  constexpr std::array<T, sizeof...(values)> arr = {T(values)...};
  return V([&](size_t i) { return arr[i % arr.size()]; });
}();

template <typename V>
struct Tests;

template <typename T>
concept array_specialization
  = std::same_as<T, std::array<typename T::value_type, std::tuple_size_v<T>>>;

template <typename Args = void, typename Fun = void>
struct add_test
{
  alignas(std::bit_floor(sizeof(Args))) Args args;
  Fun fun;
};

struct dummy_test
{
  static constexpr std::array<int, 0> args = {};
  static constexpr auto fun = [](auto&, auto...) {};
};

template <auto test_ref, int... is, std::size_t... arg_idx>
void
invoke_test_impl(std::index_sequence<arg_idx...>)
{
  constexpr auto fun = test_ref->fun;
  [[maybe_unused]] constexpr auto args = test_ref->args;
#ifdef EXPENSIVE_TESTS
  constprop_test<is...>(fun, std::get<arg_idx>(args)...);
  constexpr bool passed = constexpr_test<is...>(fun, std::get<arg_idx>(args)...);
  if (passed)
    ++passed_tests;
  else
    {
      ++failed_tests;
      std::cout << "=> constexpr test of '" << test_name << "' failed.\n";
    }
#endif
  runtime_test<is...>(fun, std::get<arg_idx>(args)...);
}

template <auto test_ref, int... is>
void
invoke_test(std::string_view name)
{
  test_name = name;
  constexpr auto args = test_ref->args;
  using A = std::remove_const_t<decltype(args)>;
  if constexpr (array_specialization<A>)
    { // call for each element
      template for (constexpr std::size_t I : std::_IotaArray<args.size()>)
        {
          std::string tmp_name = std::string(name) + '|' + std::to_string(I);
          test_name = tmp_name;
          ((std::cout << "Testing '" << test_name) << ... << (' ' + std::to_string(is)))
            << ' ' << args[I] << "'\n";
          invoke_test_impl<test_ref, is...>(std::index_sequence<I>());
        }
    }
  else
    {
      ((std::cout << "Testing '" << test_name) << ... << (' ' + std::to_string(is))) << "'\n";
      invoke_test_impl<test_ref, is...>(std::make_index_sequence<std::tuple_size_v<A>>());
    }
}

#define ADD_TEST(name, ...) \
  template <int> \
    static constexpr auto name##_tmpl = dummy_test {}; \
  \
  const int init_##name = [] { \
    test_functions.push_back([] { invoke_test<&name##_tmpl<0>>(#name); }); \
    return 0; \
  }(); \
  \
  template <int Tmp> \
    requires (Tmp == 0) __VA_OPT__(&& (__VA_ARGS__)) \
    static constexpr auto name##_tmpl<Tmp> = add_test

#define ADD_TEST_N(name, N, ...) \
  template <int> \
    static constexpr auto name##_tmpl = dummy_test {}; \
  \
  static void \
  name() \
  { \
    template for (constexpr int i : std::_IotaArray<N, int>) \
      invoke_test<&name##_tmpl<0>, i>(#name); \
  } \
  \
  const int init_##name = [] { \
    test_functions.push_back(name); \
    return 0; \
  }(); \
  \
  template <int Tmp> \
    requires (Tmp == 0) __VA_OPT__(&& (__VA_ARGS__)) \
    static constexpr auto name##_tmpl<Tmp> = add_test

void create_tests();

int main()
{
  create_tests();
  try
    {
      for (auto f : test_functions)
        f();
    }
  catch(const test::precondition_failure& fail)
    {
      std::cout << fail.file << ':' << fail.line << ": Error: precondition '" << fail.expr
                << "' does not hold: " << fail.msg << '\n';
      return EXIT_FAILURE;
    }
  std::cout << "Passed tests: " << passed_tests << "\nFailed tests: " << failed_tests << '\n';
  return failed_tests != 0 ? EXIT_FAILURE : EXIT_SUCCESS;
}

#endif // SIMD_TEST_SETUP_H
710
libstdc++-v3/testsuite/std/simd/traits_common.cc
Normal file
@@ -0,0 +1,710 @@
|
||||
// { dg-do compile { target c++26 } }
|
||||
// { dg-require-effective-target x86 }
|
||||
// { dg-timeout-factor 2 }
|
||||
|
||||
#include <simd>
|
||||
#include <stdfloat>
|
||||
|
||||
namespace simd = std::simd;
|
||||
|
||||
// test that instantiation of the complete class is well-formed
|
||||
template class simd::basic_vec<int, typename simd::vec<int, 1>::abi_type>;
|
||||
template class simd::basic_vec<int, typename simd::vec<int, 5>::abi_type>;
|
||||
template class simd::basic_vec<int, typename simd::vec<int, 8>::abi_type>;
|
||||
template class simd::basic_vec<int, typename simd::vec<int, 13>::abi_type>;
|
||||
template class simd::basic_vec<float, typename simd::vec<float, 8>::abi_type>;
|
||||
template class simd::basic_vec<float, typename simd::vec<float, 13>::abi_type>;
|
||||
|
||||
constexpr auto default_mask_abi_variant =
|
||||
#ifdef __AVX512F__
|
||||
simd::_AbiVariant::_BitMask;
|
||||
#else
|
||||
simd::_AbiVariant();
|
||||
#endif

namespace test01
{
  using std::same_as;

  using Abi1 = simd::_Abi_t<1, 1, default_mask_abi_variant>;
  static_assert(same_as<simd::vec<int, 1>::abi_type, Abi1>);
  static_assert(same_as<simd::vec<float, 1>::abi_type, Abi1>);

#if defined __SSE__ && !defined __AVX__
  static_assert(same_as<simd::vec<float>::abi_type, simd::_Abi_t<4, 1>>);
  static_assert(same_as<simd::vec<float, 3>::abi_type, simd::_Abi_t<3, 1>>);
  static_assert(same_as<simd::vec<float, 7>::abi_type, simd::_Abi_t<7, 2>>);

  static_assert(simd::vec<float>::size > 1);
  static_assert(alignof(simd::vec<float>) > alignof(float));
  static_assert(alignof(simd::vec<float, 4>) > alignof(float));
  static_assert(alignof(simd::vec<float, 3>) > alignof(float));
  static_assert(sizeof(simd::vec<float, 7>) == 2 * sizeof(simd::vec<float>));
  static_assert(alignof(simd::vec<float, 7>) == alignof(simd::vec<float>));
#endif
}

namespace test02
{
  using namespace std;
  using namespace std::simd;

  static_assert(!destructible<simd::basic_mask<7>>);

  static_assert(same_as<simd::vec<int>::mask_type, simd::mask<int>>);
  static_assert(same_as<simd::vec<float>::mask_type, simd::mask<float>>);
  static_assert(same_as<simd::vec<float, 1>::mask_type, simd::mask<float, 1>>);

  // ensure 'true ? int : vec<float>' doesn't work
  template <typename T>
    concept has_type_member = requires { typename T::type; };
  static_assert(has_type_member<common_type<int, simd::vec<float>>>);
}

#if defined __AVX__ && !defined __AVX2__
static_assert(alignof(simd::mask<int, 8>) == 16);
static_assert(alignof(simd::mask<float, 8>) == 32);
static_assert(alignof(simd::mask<int, 16>) == 16);
static_assert(alignof(simd::mask<float, 16>) == 32);
static_assert(alignof(simd::mask<long long, 4>) == 16);
static_assert(alignof(simd::mask<double, 4>) == 32);
static_assert(alignof(simd::mask<long long, 8>) == 16);
static_assert(alignof(simd::mask<double, 8>) == 32);
static_assert(std::same_as<decltype(+simd::mask<float, 8>()), simd::vec<int, 8>>);
#endif

#if defined __SSE__ && !defined __F16C__ && defined __STDCPP_FLOAT16_T__
static_assert(simd::vec<std::float16_t>::size() == 1);
static_assert(simd::mask<std::float16_t>::size() == 1);
static_assert(alignof(simd::vec<std::float16_t, 8>) == alignof(std::float16_t));
static_assert(alignof(simd::rebind_t<std::float16_t, simd::vec<float>>) == alignof(std::float16_t));
static_assert(simd::rebind_t<std::float16_t, simd::mask<float>>::abi_type::_S_nreg
                == simd::vec<float>::size());
#endif

template <auto X>
  using Ic = std::integral_constant<std::remove_const_t<decltype(X)>, X>;

static_assert( std::convertible_to<Ic<1>, simd::vec<float>>);
static_assert(!std::convertible_to<Ic<1.1>, simd::vec<float>>);
static_assert(!std::convertible_to<simd::vec<int, 4>, simd::vec<float, 4>>);
static_assert(!std::convertible_to<simd::vec<float, 4>, simd::vec<int, 4>>);
static_assert( std::convertible_to<int, simd::vec<float>>);
static_assert( std::convertible_to<simd::vec<int, 4>, simd::vec<double, 4>>);

template <typename V>
  concept has_static_size = requires {
    { V::size } -> std::convertible_to<int>;
    { V::size() } -> std::signed_integral;
    { auto(V::size.value) } -> std::signed_integral;
  };

template <typename V, typename T = typename V::value_type>
  concept usable_vec_or_mask
    = std::destructible<V>
        && std::is_nothrow_move_constructible_v<V>
        && std::is_nothrow_move_assignable_v<V>
        && std::is_nothrow_default_constructible_v<V>
        && std::is_trivially_copyable_v<V>
        && std::is_standard_layout_v<V>
        && std::ranges::random_access_range<V&>
        && !std::ranges::output_range<V&, T>
        && std::constructible_from<V, T> // broadcast
        && has_static_size<V>
        && simd::__simd_vec_or_mask_type<V>
    ;

template <typename V, typename T = typename V::value_type>
  concept usable_vec
    = usable_vec_or_mask<V, T>
        && !std::convertible_to<V, std::array<T, V::size()>>
        && std::convertible_to<std::array<T, V::size()>, V>
        && std::constructible_from<V, simd::rebind_t<int, V>>
        && std::constructible_from<V, simd::rebind_t<float, V>>
        && !std::constructible_from<V, simd::resize_t<V::size() + 1, V>>
        && !std::constructible_from<V, simd::resize_t<V::size() + 1, typename V::mask_type>>
        && !std::constructible_from<typename V::mask_type, V>
    ;

template <typename M, typename T = typename M::value_type>
  concept usable_mask
    = std::is_same_v<T, bool>
        && usable_vec_or_mask<M, T>
        && std::convertible_to<std::bitset<M::size()>, M>
        && std::constructible_from<M, unsigned long long>
        && std::constructible_from<M, unsigned char>
        && std::constructible_from<M, simd::rebind_t<int, M>>
        && std::constructible_from<M, simd::rebind_t<float, M>>
        && !std::constructible_from<M, simd::resize_t<M::size() + 1, M>>
        && !std::convertible_to<unsigned long long, M>
        && !std::convertible_to<unsigned char, M>
        && !std::convertible_to<bool, M>
        && !std::constructible_from<M, std::bitset<M::size() + 1>>
        && !std::constructible_from<M, std::bitset<M::size() - 1>>
        && !std::constructible_from<M, int>
        && !std::constructible_from<M, float>
    ;

template <typename T>
  struct test_usable_simd
  {
    static_assert(!usable_vec<simd::vec<T, 0>>);
    static_assert(!has_static_size<simd::vec<T, 0>>);
    static_assert(usable_vec<simd::vec<T, 1>>);
    static_assert(usable_vec<simd::vec<T, 2>>);
    static_assert(usable_vec<simd::vec<T, 3>>);
    static_assert(usable_vec<simd::vec<T, 4>>);
    static_assert(usable_vec<simd::vec<T, 7>>);
    static_assert(usable_vec<simd::vec<T, 8>>);
    static_assert(usable_vec<simd::vec<T, 16>>);
    static_assert(usable_vec<simd::vec<T, 32>>);
    static_assert(usable_vec<simd::vec<T, 63>>);
    static_assert(usable_vec<simd::vec<T, 64>>);

    static_assert(!usable_mask<simd::mask<T, 0>>);
    static_assert(!has_static_size<simd::mask<T, 0>>);
    static_assert(usable_mask<simd::mask<T, 1>>);
    static_assert(usable_mask<simd::mask<T, 2>>);
    static_assert(usable_mask<simd::mask<T, 3>>);
    static_assert(usable_mask<simd::mask<T, 4>>);
    static_assert(usable_mask<simd::mask<T, 7>>);
    static_assert(usable_mask<simd::mask<T, 8>>);
    static_assert(usable_mask<simd::mask<T, 16>>);
    static_assert(usable_mask<simd::mask<T, 32>>);
    static_assert(usable_mask<simd::mask<T, 63>>);
    static_assert(usable_mask<simd::mask<T, 64>>);
  };

template <template <typename> class Tpl>
  struct instantiate_all_vectorizable
  {
    Tpl<float> a;
    Tpl<double> b;
    Tpl<char> c;
    Tpl<char8_t> c8;
    Tpl<char16_t> d;
    Tpl<char32_t> e;
    Tpl<wchar_t> f;
    Tpl<signed char> g;
    Tpl<unsigned char> h;
    Tpl<short> i;
    Tpl<unsigned short> j;
    Tpl<int> k;
    Tpl<unsigned int> l;
    Tpl<long> m;
    Tpl<unsigned long> n;
    Tpl<long long> o;
    Tpl<unsigned long long> p;
#ifdef __STDCPP_FLOAT16_T__
    Tpl<std::float16_t> q;
#endif
#ifdef __STDCPP_FLOAT32_T__
    Tpl<std::float32_t> r;
#endif
#ifdef __STDCPP_FLOAT64_T__
    Tpl<std::float64_t> s;
#endif
  };

template struct instantiate_all_vectorizable<test_usable_simd>;

// vec generator ctor ///////////////

namespace test_generator
{
  struct udt_convertible_to_float
  { operator float() const; };

  static_assert( std::constructible_from<simd::vec<float>, float (&)(int)>);
  static_assert(!std::convertible_to<float (&)(int), simd::vec<float>>);
  static_assert(!std::constructible_from<simd::vec<float>, int (&)(int)>);
  static_assert(!std::constructible_from<simd::vec<float>, double (&)(int)>);
  static_assert( std::constructible_from<simd::vec<float>, short (&)(int)>);
  static_assert(!std::constructible_from<simd::vec<float>, long double (&)(int)>);
  static_assert( std::constructible_from<simd::vec<float>, udt_convertible_to_float (&)(int)>);
}

// mask generator ctor ///////////////

static_assert(
  all_of(simd::mask<float, 4>([](int) { return true; }) == simd::mask<float, 4>(true)));
static_assert(
  all_of(simd::mask<float, 4>([](int) { return false; }) == simd::mask<float, 4>(false)));
static_assert(
  all_of(simd::mask<float, 4>([](int i) { return i < 2; })
           == simd::mask<float, 4>([](int i) {
                return std::array{true, true, false, false}[i];
              })));

static_assert(all_of((simd::vec<int, 4>([](int i) { return i << 10; }) >> 10)
                       == simd::__iota<simd::vec<int, 4>>));

// vec iterators /////////////////////

#if SIMD_IS_A_RANGE
static_assert([] { simd::vec<float> x = {}; return x.begin() == x.begin(); }());
static_assert([] { simd::vec<float> x = {}; return x.begin() == x.cbegin(); }());
static_assert([] { simd::vec<float> x = {}; return x.cbegin() == x.begin(); }());
static_assert([] { simd::vec<float> x = {}; return x.cbegin() == x.cbegin(); }());
static_assert([] { simd::vec<float> x = {}; return x.begin() + x.size() == x.end(); }());
static_assert([] { simd::vec<float> x = {}; return x.end() == x.begin() + x.size(); }());
static_assert([] { simd::vec<float> x = {}; return x.begin() < x.end(); }());
static_assert([] { simd::vec<float> x = {}; return x.begin() <= x.end(); }());
static_assert(![] { simd::vec<float> x = {}; return x.begin() > x.end(); }());
static_assert(![] { simd::vec<float> x = {}; return x.begin() >= x.end(); }());
static_assert(![] { simd::vec<float> x = {}; return x.end() < x.begin(); }());
static_assert(![] { simd::vec<float> x = {}; return x.end() <= x.begin(); }());
static_assert([] { simd::vec<float> x = {}; return x.end() > x.begin(); }());
static_assert([] { simd::vec<float> x = {}; return x.end() >= x.begin(); }());
static_assert([] { simd::vec<float> x = {}; return x.end() - x.begin(); }() == simd::vec<float>::size());
static_assert([] { simd::vec<float> x = {}; return x.begin() - x.end(); }() == -simd::vec<float>::size());
static_assert([] { simd::vec<float> x = {}; return x.begin() - x.begin(); }() == 0);
static_assert([] { simd::vec<float> x = {}; return x.begin() + 1 - x.begin(); }() == 1);
static_assert([] { simd::vec<float> x = {}; return x.begin() + 1 - x.cbegin(); }() == 1);
#endif

// mask to vec ///////////////////////

// Clang says all kinds of expressions are not constant expressions. Why? Come on … explain! 🤷
#ifdef __clang__
#define AVOID_BROKEN_CLANG_FAILURES 1
#endif

#ifndef AVOID_BROKEN_CLANG_FAILURES

static_assert([] constexpr {
  constexpr simd::mask<float, 7> a([](int i) -> bool { return i < 3; });
  constexpr simd::basic_vec b = -a;
  static_assert(b[0] == -(0 < 3));
  static_assert(b[1] == -(1 < 3));
  static_assert(b[2] == -(2 < 3));
  static_assert(b[3] == -(3 < 3));
  return all_of(b == simd::vec<int, 7>([](int i) { return -int(i < 3); }));
}());

static_assert([] constexpr {
  constexpr simd::mask<float, 7> a([](int i) -> bool { return i < 3; });
  constexpr simd::basic_vec b = ~a;
  static_assert(b[0] == ~int(0 < 3));
  static_assert(b[1] == ~int(1 < 3));
  static_assert(b[2] == ~int(2 < 3));
  static_assert(b[3] == ~int(3 < 3));
  return all_of(b == simd::vec<int, 7>([](int i) { return ~int(i < 3); }));
}());

static_assert([] constexpr {
  constexpr simd::mask<float, 4> a([](int i) -> bool { return i < 2; });
  constexpr simd::basic_vec b = a;
  static_assert(b[0] == 1);
  static_assert(b[1] == 1);
  static_assert(b[2] == 0);
  return b[3] == 0;
}());

static_assert([] constexpr {
  // Corner case on AVX w/o AVX2 systems. <float, 5> is an AVX register;
  // <int, 5> is deduced as SSE + scalar.
  constexpr simd::mask<float, 5> a([](int i) -> bool { return i >= 2; });
  constexpr simd::basic_vec b = a;
  static_assert(b[0] == 0);
  static_assert(b[1] == 0);
  static_assert(b[2] == 1);
  static_assert(b[3] == 1);
  static_assert(b[4] == 1);
#if defined __AVX2__ || !defined __AVX__
  static_assert(all_of((b == 1) == a));
#endif
  constexpr simd::mask<float, 8> a8([](int i) -> bool { return i <= 4; });
  constexpr simd::basic_vec b8 = a8;
  static_assert(b8[0] == 1);
  static_assert(b8[1] == 1);
  static_assert(b8[2] == 1);
  static_assert(b8[3] == 1);
  static_assert(b8[4] == 1);
  static_assert(b8[5] == 0);
  static_assert(b8[6] == 0);
  static_assert(b8[7] == 0);
#if SIMD_MASK_IMPLICIT_CONVERSIONS || defined __AVX2__ || !defined __AVX__
  static_assert(all_of((b8 == 1) == a8));
#endif
  constexpr simd::mask<float, 15> a15([](int i) -> bool { return i <= 4; });
  constexpr simd::basic_vec b15 = a15;
  static_assert(b15[0] == 1);
  static_assert(b15[4] == 1);
  static_assert(b15[5] == 0);
  static_assert(b15[8] == 0);
  static_assert(b15[14] == 0);
  static_assert(all_of((b15 == 1) == a15));
  return true;
}());

static_assert([] constexpr {
  constexpr simd::mask<float, 4> a([](int i) -> bool { return i < 2; });
  constexpr simd::basic_vec b = ~a;
  constexpr simd::basic_vec c = a;
  static_assert(c[0] == int(a[0]));
  static_assert(c[1] == int(a[1]));
  static_assert(c[2] == int(a[2]));
  static_assert(c[3] == int(a[3]));
  static_assert(b[0] == ~int(0 < 2));
  static_assert(b[1] == ~int(1 < 2));
  static_assert(b[2] == ~int(2 < 2));
  static_assert(b[3] == ~int(3 < 2));
  return all_of(b == simd::vec<int, 4>([](int i) { return ~int(i < 2); }));
}());
#endif

// mask conversions //////////////////
namespace mask_conversion_tests
{
  using simd::mask;

  struct TestResult
  {
    int state;
    unsigned long long a, b;
  };

  template <auto Res>
    consteval void
    check()
    {
      if constexpr (Res.state != 0 && Res.a != Res.b)
        static_assert(Res.a == Res.b);
      else
        static_assert(Res.state == 0);
    }

  template <typename U>
    consteval TestResult
    do_test(const auto& k)
    {
      using M = simd::mask<U, k.size()>;
      if constexpr (std::is_destructible_v<M>)
        {
          if (!std::ranges::equal(M(k), k))
            {
              if constexpr (k.size() <= 64)
                return {1, M(k).to_ullong(), k.to_ullong()};
              else
                return {1, 0, 0};
            }
          else
            return {0, 0, 0};
        }
      else
        return {0, 0, 0};
    }

  template <typename T, int N, int P = 0>
    consteval void
    do_test()
    {
      if constexpr (std::is_destructible_v<simd::mask<T, N>>)
        {
          constexpr simd::mask<T, N> k([](int i) {
                      if constexpr (P == 2)
                        return std::has_single_bit(unsigned(i));
                      else if constexpr (P == 3)
                        return !std::has_single_bit(unsigned(i));
                      else
                        return (i & 1) == P;
                    });
          check<do_test<char>( k)>();
          check<do_test<char>(!k)>();
          check<do_test<short>( k)>();
          check<do_test<short>(!k)>();
          check<do_test<int>( k)>();
          check<do_test<int>(!k)>();
          check<do_test<double>( k)>();
          check<do_test<double>(!k)>();
#ifdef __STDCPP_FLOAT16_T__
          check<do_test<std::float16_t>( k)>();
          check<do_test<std::float16_t>(!k)>();
#endif
          if constexpr (P <= 2)
            do_test<T, N, P + 1>();
        }
    }

  template <typename T>
    consteval bool
    test()
    {
      using V = simd::mask<T>;
      do_test<T, 1>();
      do_test<T, V::size()>();
      do_test<T, 2 * V::size()>();
      do_test<T, 4 * V::size()>();
      do_test<T, 5 * V::size()>();
      do_test<T, 2 * V::size() + 1>();
      do_test<T, 2 * V::size() - 1>();
      do_test<T, V::size() / 2>();
      do_test<T, V::size() / 3>();
      do_test<T, V::size() / 5>();
      return true;
    }

  static_assert(test<char>());
  static_assert(test<short>());
  static_assert(test<float>());
  static_assert(test<double>());
#ifdef __STDCPP_FLOAT16_T__
  static_assert(test<std::float16_t>());
#endif
}

// vec reductions ///////////////////

namespace simd_reduction_tests
{
  static_assert(reduce(simd::vec<int, 7>(1)) == 7);
  static_assert(reduce(simd::vec<int, 7>(2), std::multiplies<>()) == 128);
  static_assert(reduce(simd::vec<int, 8>(2), std::bit_and<>()) == 2);
  static_assert(reduce(simd::vec<int, 8>(2), std::bit_or<>()) == 2);
  static_assert(reduce(simd::vec<int, 8>(2), std::bit_xor<>()) == 0);
  static_assert(reduce(simd::vec<int, 3>(2), std::bit_and<>()) == 2);
  static_assert(reduce(simd::vec<int, 6>(2), std::bit_and<>()) == 2);
  static_assert(reduce(simd::vec<int, 7>(2), std::bit_and<>()) == 2);
  static_assert(reduce(simd::vec<int, 7>(2), std::bit_or<>()) == 2);
  static_assert(reduce(simd::vec<int, 7>(2), std::bit_xor<>()) == 2);
#ifndef AVOID_BROKEN_CLANG_FAILURES
  static_assert(reduce(simd::vec<int, 4>(2), simd::mask<int, 4>(false)) == 0);
  static_assert(reduce(simd::vec<int, 4>(2), simd::mask<int, 4>(false), std::multiplies<>()) == 1);
  static_assert(reduce(simd::vec<int, 4>(2), simd::mask<int, 4>(false), std::bit_and<>()) == ~0);
  static_assert(reduce(simd::vec<int, 4>(2), simd::mask<int, 4>(false), [](auto a, auto b) {
                  return select(a < b, a, b);
                }, __INT_MAX__) == __INT_MAX__);
#endif

  template <typename BinaryOperation>
    concept masked_reduce_works = requires(simd::vec<int, 4> a, simd::vec<int, 4> b) {
      reduce(a, a < b, BinaryOperation());
    };

  static_assert(!masked_reduce_works<std::minus<>>);
}
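
The masked `reduce` assertions above depend on each operation's identity element: with no element selected, the result is 0 for addition, 1 for multiplication, and all bits set for bitwise AND. The following scalar sketch illustrates that behaviour on a `std::array`. `masked_reduce` is a hypothetical helper introduced only for illustration; it is not the libstdc++ implementation and takes the identity element as an explicit argument.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical scalar model (not libstdc++ code): fold only the selected
// elements, starting from the caller-supplied identity element of `op`.
template <typename T, std::size_t N, typename BinaryOp>
  constexpr T
  masked_reduce(const std::array<T, N>& values, const std::array<bool, N>& mask,
                T identity, BinaryOp op)
  {
    T result = identity;
    for (std::size_t i = 0; i < N; ++i)
      if (mask[i])
        result = op(result, values[i]);
    return result;
  }
```

With an all-false mask the loop body never runs, so the identity element is returned unchanged. This is presumably also why `std::minus<>` is rejected for masked reductions: it has no identity element that could fill the masked-off positions.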

// mask reductions ///////////////////

static_assert(all_of(simd::vec<float>() == simd::vec<float>()));
static_assert(any_of(simd::vec<float>() == simd::vec<float>()));
static_assert(!none_of(simd::vec<float>() == simd::vec<float>()));
static_assert(reduce_count(simd::vec<float>() == simd::vec<float>()) == simd::vec<float>::size);
static_assert(reduce_min_index(simd::vec<float>() == simd::vec<float>()) == 0);
static_assert(reduce_max_index(simd::vec<float>() == simd::vec<float>()) == simd::vec<float>::size - 1);

// chunk ////////////////////////

static_assert([] {
  constexpr auto a = simd::vec<int, 8>([] (int i) { return i; });
  auto a4 = chunk<simd::vec<int, 4>>(a);
  auto a3 = chunk<simd::vec<int, 3>>(a);
  auto a3_ = chunk<3>(a);
  return a4.size() == 2 && std::same_as<decltype(a4), std::array<simd::vec<int, 4>, 2>>
           && std::tuple_size_v<decltype(a3)> == 3
           && all_of(std::get<0>(a3) == simd::vec<int, 3>([] (int i) { return i; }))
           && all_of(std::get<1>(a3) == simd::vec<int, 3>([] (int i) { return i + 3; }))
           && all_of(std::get<2>(a3) == simd::vec<int, 2>([] (int i) { return i + 6; }))
           && std::same_as<decltype(a3), decltype(a3_)>
           && all_of(std::get<0>(a3) == std::get<0>(a3_));
}());

static_assert([] {
  constexpr simd::mask<int, 8> a([] (int i) -> bool { return i & 1; });
  auto a4 = chunk<simd::mask<int, 4>>(a);
  auto a3 = chunk<simd::mask<int, 3>>(a);
  auto a3_ = chunk<3>(a);
  return a4.size() == 2 && std::same_as<decltype(a4), std::array<simd::mask<int, 4>, 2>>
           && std::tuple_size_v<decltype(a3)> == 3
           && all_of(std::get<0>(a3) == simd::mask<int, 3>(
                                          [] (int i) -> bool { return i & 1; }))
           && all_of(std::get<1>(a3) == simd::mask<int, 3>(
                                          [] (int i) -> bool { return (i + 3) & 1; }))
           && all_of(std::get<2>(a3) == simd::mask<int, 2>(
                                          [] (int i) -> bool { return (i + 6) & 1; }))
           && std::same_as<decltype(a3), decltype(a3_)>
           && all_of(std::get<0>(a3) == std::get<0>(a3_));
}());
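
The two `chunk` tests above check that eight elements split evenly into two chunks of four (returned as a `std::array`), while a chunk size of three leaves a shorter tail, giving pieces of 3, 3, and 2 (returned as a tuple). The size arithmetic can be sketched on its own; `chunk_sizes` is a hypothetical helper used only for illustration and is not part of `std::simd`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of std::simd): sizes of the pieces obtained
// when `total` elements are split into chunks of `chunk` elements, with any
// remainder forming one shorter trailing piece.
inline std::vector<int>
chunk_sizes(int total, int chunk)
{
  std::vector<int> sizes(total / chunk, chunk);
  if (total % chunk != 0)
    sizes.push_back(total % chunk);
  return sizes;
}
```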

// cat ///////////////////////////

static_assert(all_of(simd::cat(simd::__iota<simd::vec<int, 3>>, simd::vec<int, 1>(3))
                       == simd::__iota<simd::vec<int, 4>>));

static_assert(all_of(simd::cat(simd::__iota<simd::vec<int, 4>>, simd::__iota<simd::vec<int, 4>> + 4)
                       == simd::__iota<simd::vec<int, 8>>));

static_assert(all_of(simd::cat(simd::__iota<simd::vec<double, 4>>, simd::__iota<simd::vec<double, 2>> + 4)
                       == simd::__iota<simd::vec<double, 6>>));

static_assert(all_of(simd::cat(simd::__iota<simd::vec<double, 4>>, simd::__iota<simd::vec<double, 4>> + 4)
                       == simd::__iota<simd::vec<double, 8>>));

// select ////////////////////////

#ifndef AVOID_BROKEN_CLANG_FAILURES
static_assert(all_of(simd::vec<long long, 8>(std::array{0, 0, 0, 0, 4, 4, 4, 4})
                       == select(simd::__iota<simd::vec<double, 8>> < 4, 0ll, 4ll)));

static_assert(all_of(simd::vec<int, 8>(std::array{0, 0, 0, 0, 4, 4, 4, 4})
                       == select(simd::__iota<simd::vec<float, 8>> < 4.f, 0, 4)));
#endif

// permute ////////////////////////

namespace permutations
{
  struct _DuplicateEven
  {
    consteval unsigned
    operator()(unsigned __i) const
    { return __i & ~1u; }
  };

  inline constexpr _DuplicateEven duplicate_even {};

  struct _DuplicateOdd
  {
    consteval unsigned
    operator()(unsigned __i) const
    { return __i | 1u; }
  };

  inline constexpr _DuplicateOdd duplicate_odd {};

  template <unsigned _Np>
    struct _SwapNeighbors
    {
      consteval unsigned
      operator()(unsigned __i, unsigned __size) const
      {
        if (__size % (2 * _Np) != 0)
          abort(); // swap_neighbors<N> permutation requires a multiple of 2N elements
        else if (std::has_single_bit(_Np))
          return __i ^ _Np;
        else if (__i % (2 * _Np) >= _Np)
          return __i - _Np;
        else
          return __i + _Np;
      }
    };

  template <unsigned _Np = 1u>
    inline constexpr _SwapNeighbors<_Np> swap_neighbors {};

  template <int _Position>
    struct _Broadcast
    {
      consteval int
      operator()(int, int __size) const
      { return _Position < 0 ? __size + _Position : _Position; }
    };

  template <int _Position>
    inline constexpr _Broadcast<_Position> broadcast {};

  inline constexpr _Broadcast<0> broadcast_first {};

  inline constexpr _Broadcast<-1> broadcast_last {};

  struct _Reverse
  {
    consteval int
    operator()(int __i, int __size) const
    { return __size - 1 - __i; }
  };

  inline constexpr _Reverse reverse {};

  template <int _Offset>
    struct _Rotate
    {
      consteval int
      operator()(int __i, int __size) const
      {
        __i += _Offset;
        __i %= __size;
        if (__i < 0)
          __i += __size;
        return __i;
      }
    };

  template <int _Offset>
    inline constexpr _Rotate<_Offset> rotate {};

  template <int _Offset>
    struct _Shift
    {
      consteval int
      operator()(int __i, int __size) const
      {
        const int __j = __i + _Offset;
        if (__j >= __size || -__j > __size)
          return simd::zero_element;
        else if (__j < 0)
          return __size + __j;
        else
          return __j;
      }
    };

  template <int _Offset>
    inline constexpr _Shift<_Offset> shift {};
}
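
Each functor in `namespace permutations` maps an output index (and, optionally, the vector size) to the input index to read from. As a rough scalar illustration that does not depend on `<simd>`, the sketch below applies the same index arithmetic to a `std::array`. The names `permute_array`, `rotate_index`, and `swap_neighbors_index` are hypothetical and introduced here only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical scalar stand-in for simd::permute: out[i] = in[idx(i, N)].
template <typename T, std::size_t N, typename IdxMap>
  std::array<T, N>
  permute_array(const std::array<T, N>& in, IdxMap idx)
  {
    std::array<T, N> out{};
    for (std::size_t i = 0; i < N; ++i)
      out[i] = in[static_cast<std::size_t>(idx(int(i), int(N)))];
    return out;
  }

// Same arithmetic as _Rotate<_Offset>: wrap i + offset into [0, size).
constexpr int
rotate_index(int i, int size, int offset)
{
  i = (i + offset) % size;
  return i < 0 ? i + size : i;
}

// Same arithmetic as the general (non-power-of-2) branch of _SwapNeighbors<_Np>.
constexpr int
swap_neighbors_index(int i, int n)
{ return i % (2 * n) >= n ? i - n : i + n; }

static_assert(rotate_index(0, 7, -2) == 5, "negative offsets wrap around");
static_assert(swap_neighbors_index(4, 3) == 1, "second block maps back to the first");
```

The `%` operator in C++ truncates toward zero, so a negative `i + offset` yields a negative remainder; the extra `+ size` step corrects this, mirroring the `if (__i < 0)` branch in `_Rotate`.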

static_assert(
  all_of(simd::permute(simd::__iota<simd::vec<int>>, permutations::duplicate_even)
           == simd::__iota<simd::vec<int>> / 2 * 2));

static_assert(
  all_of(simd::permute(simd::__iota<simd::vec<int>>, permutations::duplicate_odd)
           == simd::__iota<simd::vec<int>> / 2 * 2 + 1));

static_assert(
  all_of(simd::permute(simd::__iota<simd::vec<int>>, permutations::swap_neighbors<1>)
           == simd::vec<int>([](int i) { return i ^ 1; })));

static_assert(
  all_of(simd::permute(simd::__iota<simd::vec<int, 8>>,
                       permutations::swap_neighbors<2>)
           == simd::vec<int, 8>(std::array{2, 3, 0, 1, 6, 7, 4, 5})));

static_assert(
  all_of(simd::permute(simd::__iota<simd::vec<int, 12>>,
                       permutations::swap_neighbors<3>)
           == simd::vec<int, 12>(
                std::array{3, 4, 5, 0, 1, 2, 9, 10, 11, 6, 7, 8})));

static_assert(
  all_of(simd::permute(simd::__iota<simd::vec<int>>, permutations::broadcast<1>)
           == simd::vec<int>(1)));

static_assert(
  all_of(simd::permute(simd::__iota<simd::vec<int>>, permutations::broadcast_first)
           == simd::vec<int>(0)));

static_assert(
  all_of(simd::permute(simd::__iota<simd::vec<int>>, permutations::broadcast_last)
           == simd::vec<int>(int(simd::vec<int>::size() - 1))));

static_assert(
  all_of(simd::permute(simd::__iota<simd::vec<int>>, permutations::reverse)
           == simd::vec<int>([](int i) { return int(simd::vec<int>::size()) - 1 - i; })));

static_assert(
  all_of(simd::permute(simd::__iota<simd::vec<int>>, permutations::rotate<1>)
           == (simd::__iota<simd::vec<int>> + 1) % int(simd::vec<int>::size())));

static_assert(
  all_of(simd::permute(simd::__iota<simd::vec<int>>, permutations::rotate<2>)
           == (simd::__iota<simd::vec<int>> + 2) % int(simd::vec<int>::size())));

static_assert(
  all_of(simd::permute(simd::__iota<simd::vec<int, 7>>, permutations::rotate<2>)
           == simd::vec<int, 7>(std::array {2, 3, 4, 5, 6, 0, 1})));

static_assert(
  all_of(simd::permute(simd::__iota<simd::vec<int, 7>>, permutations::rotate<-2>)
           == simd::vec<int, 7>(std::array {5, 6, 0, 1, 2, 3, 4}))); // { dg-prune-output "Wpsabi" }
185
libstdc++-v3/testsuite/std/simd/traits_impl.cc
Normal file
@@ -0,0 +1,185 @@
// { dg-do compile { target c++26 } }
// { dg-require-effective-target x86 }

#define _GLIBCXX_SIMD_THROW_ON_BAD_VALUE 1

#include <bits/simd_details.h>
#include <bits/simd_flags.h>
#include <stdfloat>

namespace simd = std::simd;

using std::float16_t;
using std::float32_t;
using std::float64_t;

using namespace std::simd;

void test()
{
  template for (auto t : {float(), double(), float16_t(), float32_t(), float64_t()})
    {
      using T = decltype(t);
      static_assert(__vectorizable<T>);
    }

  static_assert(!__vectorizable<const float>);
  static_assert(!__vectorizable<float&>);
  static_assert(!__vectorizable<std::bfloat16_t>);

  template for (constexpr int N : {1, 2, 4, 8})
    {
      static_assert(std::signed_integral<__integer_from<N>>);
      static_assert(sizeof(__integer_from<N>) == N);
      static_assert(__vectorizable<__integer_from<N>>);
    }

  static_assert(__div_ceil(5, 3) == 2);

  static_assert(sizeof(_Bitmask<3>) == 1);
  static_assert(sizeof(_Bitmask<30>) == 4);

  static_assert(__scalar_abi_tag<_ScalarAbi<1>>);
  static_assert(__scalar_abi_tag<_ScalarAbi<2>>);
  static_assert(!__scalar_abi_tag<_Abi_t<1, 1>>);

  static_assert(__abi_tag<_ScalarAbi<1>>);
  static_assert(__abi_tag<_ScalarAbi<2>>);

  using AN = decltype(__native_abi<float>());
  using A1 = decltype(__native_abi<float>()._S_resize<1>());
  static_assert(A1::_S_size == 1);
  static_assert(A1::_S_nreg == 1);
  static_assert(A1::_S_variant == AN::_S_variant);
  static_assert(__scalar_abi_tag<A1> == __scalar_abi_tag<AN>);
  static_assert(std::is_same_v<decltype(__abi_rebind<float, AN::_S_size, A1>()), AN>);
  if constexpr (AN::_S_size >= 2) // the target has SIMD support for float
    {
      {
        using A2 = decltype(__abi_rebind<float, 2, AN>());
        static_assert(A2::_S_size == 2);
        static_assert(A2::_S_nreg == 1);
        static_assert(A2::_S_variant == AN::_S_variant);
        using A2x = decltype(__abi_rebind<float, 2, decltype(__abi_rebind<float, 1, A2>())>());
        static_assert(std::is_same_v<A2, A2x>);
      }
      using A4 = decltype(__abi_rebind<float, 4, AN>());
      static_assert(A4::_S_size == 4);
    }

  static_assert(__streq_to_1("1"));
  static_assert(!__streq_to_1(""));
  static_assert(!__streq_to_1(nullptr));
  static_assert(!__streq_to_1("0"));
  static_assert(!__streq_to_1("1 "));

  static_assert(__static_sized_range<int[4]>);
  static_assert(__static_sized_range<int[4], 4>);
  static_assert(__static_sized_range<std::array<int, 4>, 4>);

  static_assert( __value_preserving_convertible_to<int, double>);
  static_assert(!__value_preserving_convertible_to<int, float>);
  static_assert( __value_preserving_convertible_to<float, double>);
  static_assert(!__value_preserving_convertible_to<double, float>);

  static_assert(__explicitly_convertible_to<float, float16_t>);
  static_assert(__explicitly_convertible_to<long, float16_t>);

  static_assert(__constexpr_wrapper_like<std::constant_wrapper<2>>);
  static_assert(__constexpr_wrapper_like<std::integral_constant<int, 1>>);

  static_assert(!__broadcast_constructible<int, float>);
  static_assert(!__broadcast_constructible<int&, float>);
  static_assert(!__broadcast_constructible<int&&, float>);
  static_assert(!__broadcast_constructible<const int&, float>);
  static_assert(!__broadcast_constructible<const int, float>);

  static_assert(__broadcast_constructible<decltype(std::cw<2>), float>);
  static_assert(__broadcast_constructible<decltype(std::cw<0.f>), std::float16_t>);


  static_assert(__higher_rank_than<long, int>);
  static_assert(__higher_rank_than<long long, long>);
  static_assert(__higher_rank_than<int, short>);
  static_assert(__higher_rank_than<short, char>);

  static_assert(!__higher_rank_than<char, signed char>);
  static_assert(!__higher_rank_than<signed char, char>);
  static_assert(!__higher_rank_than<char, unsigned char>);
  static_assert(!__higher_rank_than<unsigned char, char>);

  static_assert(__higher_rank_than<unsigned int, short>);
  static_assert(__higher_rank_than<unsigned long, int>);
  static_assert(__higher_rank_than<unsigned long long, long>);

  static_assert(__higher_rank_than<float, float16_t>);
  static_assert(__higher_rank_than<float32_t, float>);
  static_assert(__higher_rank_than<double, float32_t>);
  static_assert(__higher_rank_than<double, float>);
  static_assert(__higher_rank_than<float64_t, float32_t>);
  static_assert(__higher_rank_than<float64_t, float>);
  static_assert(__higher_rank_than<float64_t, double>);

  static_assert(__loadstore_convertible_to<float, double>);
  static_assert(__loadstore_convertible_to<int, double>);
  static_assert(!__loadstore_convertible_to<int, float>);
  static_assert(!__loadstore_convertible_to<int, float, __aligned_flag>);
  static_assert(__loadstore_convertible_to<int, float, __convert_flag>);
  static_assert(__loadstore_convertible_to<int, float, __aligned_flag, __convert_flag>);

  static_assert(__mask_element_size<basic_mask<4>> == 4);

  static_assert(__highest_bit(0b1000u) == 3);
  static_assert(__highest_bit(0b10000001000ull) == 10);
}

consteval bool
throws(auto f)
{
  try { f(); }
  catch (...) { return true; }
  return false;
}

static_assert(!throws([] { __value_preserving_cast<float>(1); }));
static_assert(!throws([] { __value_preserving_cast<float>(1.5); }));
static_assert(throws([] { __value_preserving_cast<float>(0x5EAF00D); }));
static_assert(throws([] { __value_preserving_cast<unsigned>(-1); }));
static_assert(!throws([] { __value_preserving_cast<unsigned short>(0xffff); }));
static_assert(throws([] { __value_preserving_cast<unsigned short>(0x10000); }));

static_assert(__converts_trivially<int, unsigned>);
#if __SIZEOF_LONG__ == __SIZEOF_LONG_LONG__
static_assert(__converts_trivially<long long, long>);
#elif __SIZEOF_INT__ == __SIZEOF_LONG__
static_assert(__converts_trivially<int, long>);
#endif
static_assert(__converts_trivially<float, float32_t>);

static_assert([] {
  bool to_find[10] = {0, 1, 1, 1, 0, 1, 0, 0, 1};
  __bit_foreach(0b100101110u, [&](int i) {
    if (!to_find[i]) throw false;
    to_find[i] = false;
  });
  for (bool b : to_find)
    if (b)
      return false;
  return true;
}());

// flags ////////////////////////
static_assert(std::is_same_v<decltype(flag_default | flag_default), flags<>>);
static_assert(std::is_same_v<decltype(flag_convert | flag_default), flags<__convert_flag>>);
static_assert(std::is_same_v<decltype(flag_convert | flag_convert), flags<__convert_flag>>);
static_assert(std::is_same_v<decltype(flag_aligned | flag_convert),
                             flags<__aligned_flag, __convert_flag>>);
static_assert(std::is_same_v<decltype(flag_aligned | flag_convert | flag_aligned),
                             flags<__aligned_flag, __convert_flag>>);
static_assert(std::is_same_v<decltype(flag_aligned | (flag_convert | flag_aligned)),
                             flags<__aligned_flag, __convert_flag>>);

static_assert(!flag_default._S_test(flag_convert));
static_assert(flag_convert._S_test(flag_convert));
static_assert(!flag_convert._S_test(flag_aligned));
static_assert((flag_overaligned<32> | flag_convert | flag_aligned)._S_test(flag_convert));
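The flags assertions above require that `operator|` merges flag types without duplicates and that the result does not depend on how the operands are grouped. One way to get this behavior is a type list that appends a flag only if it is not already present. The sketch below is an illustration under that assumption; `flag_set`, `aligned_tag`, `convert_tag` and `combine` are hypothetical names, and the real `flags` template may order or store its arguments differently.

```cpp
#include <type_traits>

struct aligned_tag {};
struct convert_tag {};

// Hypothetical flag container: the flags live only in the template arguments.
template <typename... Fs>
  struct flag_set
  {
    // True if the single flag F is part of this set.
    template <typename F>
      static constexpr bool
      test(flag_set<F>)
      { return (std::is_same_v<F, Fs> || ...); }
  };

// Nothing left to merge: the accumulated set is the result.
template <typename... As>
  constexpr flag_set<As...>
  combine(flag_set<As...>, flag_set<>)
  { return {}; }

// Append B to the accumulated set unless it is already there, then recurse.
template <typename... As, typename B, typename... Bs>
  constexpr auto
  combine(flag_set<As...> a, flag_set<B, Bs...>)
  {
    if constexpr ((std::is_same_v<B, As> || ...))
      return combine(a, flag_set<Bs...>{});
    else
      return combine(flag_set<As..., B>{}, flag_set<Bs...>{});
  }

template <typename... As, typename... Bs>
  constexpr auto
  operator|(flag_set<As...> a, flag_set<Bs...> b)
  { return combine(a, b); }

inline constexpr flag_set<> none{};
inline constexpr flag_set<aligned_tag> aligned{};
inline constexpr flag_set<convert_tag> convert{};

static_assert(std::is_same_v<decltype(aligned | convert | aligned),
                             flag_set<aligned_tag, convert_tag>>);
static_assert(std::is_same_v<decltype(aligned | (convert | aligned)),
                             flag_set<aligned_tag, convert_tag>>);
```

Because the first operand's order is kept and later duplicates are dropped, both groupings of `aligned | convert | aligned` yield the same type in this sketch.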
62
libstdc++-v3/testsuite/std/simd/traits_math.cc
Normal file
@@ -0,0 +1,62 @@
// { dg-do compile { target c++26 } }
// { dg-require-effective-target x86 }

#include <simd>
#include <stdfloat>

namespace simd = std::simd;

// vec.math ///////////////////////////////////////

namespace math_tests
{
  using simd::__deduced_vec_t;
  using simd::__math_floating_point;
  using std::is_same_v;

  using vf2 = simd::vec<float, 2>;
  using vf4 = simd::vec<float, 4>;

  template <typename T0, typename T1>
    concept has_common_type = requires { typename std::common_type<T0, T1>::type; };

  template <typename T>
    concept has_deduced_vec = requires { typename simd::__deduced_vec_t<T>; };

  static_assert(!has_common_type<vf2, vf4>);
  static_assert( has_common_type<int, vf2>);

  template <typename T, bool Strict = false>
    struct holder
    {
      T value;

      constexpr
      operator const T&() const
      { return value; }

      template <typename U>
        requires (!std::same_as<T, U>) && Strict
        operator U() const = delete;
    };

  // The next always has a common_type because the UDT is convertible_to<float> and is not an
  // arithmetic type:
  static_assert( has_common_type<holder<int>, vf2>);

  // It's up to the UDT to constrain itself better:
  static_assert(!has_common_type<holder<int, true>, vf2>);

  // However, a strict UDT can still work
  static_assert( has_common_type<holder<float, true>, vf2>);

  // Except if it needs any kind of conversion, even if it's value-preserving. Again the semantics
  // are what the UDT defined.
  static_assert(!has_common_type<holder<short, true>, vf2>);

  static_assert(!has_deduced_vec<int>);
  static_assert(!__math_floating_point<int>);
  static_assert(!__math_floating_point<float>);
  static_assert(!__math_floating_point<simd::vec<int>>);
  static_assert( __math_floating_point<simd::vec<float>>);
}