mirror of
https://github.com/gcc-mirror/gcc.git
synced 2026-05-06 06:49:09 +02:00
This implementation differs significantly from the
std::experimental::simd implementation. One goal was a reduction in
template instantiations wrt. what std::experimental::simd did.
Design notes:
- bits/vec_ops.h contains concepts, traits, and functions for working
  with GNU vector builtins that are mostly independent from std::simd.
  These could move from std::simd:: to std::__vec (or similar). However,
  we would then need to revisit naming. For now we kept everything in
  the std::simd namespace with __vec_ prefix in the names. The __vec_*
  functions can be called unqualified because they can never be called
  on user-defined types (no ADL). If we ever get simd<UDT> support this
  will be implemented via bit_cast to/from integral vector
  builtins/intrinsics.
- bits/simd_x86.h extends vec_ops.h with calls to __builtin_ia32_* that
  can only be used after uttering the right GCC target pragma.
- basic_vec and basic_mask are built on top of register-size GNU vector
  builtins (for now / x86). Any larger vec/mask is a tree of power-of-2
  #elements on the "first" branch. Anything non-power-of-2 that is
  smaller than register size uses padding elements that participate in
  element-wise operations. The library ensures that padding elements
  lead to no side effects. The implementation makes no assumption on the
  values of these padding elements since the user can bit_cast to
  basic_vec/basic_mask.
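The tree layout from the last design note can be pictured by peeling power-of-2 chunks off the front of an element count. The sketch below is only an illustration under stated assumptions: `decompose` is a hypothetical helper that does not exist in the patch, and the rule that the "first" branch takes the largest power of 2 strictly smaller than the element count is a guess at one plausible reading, not the library's actual recursion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the patch): list the leaf sizes for a
// vec of n elements when one register holds reg elements.
// Assumption: the "first" branch takes the largest power of 2 that is
// strictly smaller than n; the remainder goes to the other branch.
std::vector<std::size_t>
decompose(std::size_t n, std::size_t reg)
{
  assert(reg != 0 && (reg & (reg - 1)) == 0); // reg must be a power of 2
  std::vector<std::size_t> leaves;
  while (n > reg)
    {
      std::size_t first = 1;
      while (first * 2 < n)
        first *= 2;
      // first is a power of 2 and >= reg, so it splits evenly into registers
      for (std::size_t i = 0; i < first; i += reg)
        leaves.push_back(reg);
      n -= first;
    }
  if (n > 0)
    leaves.push_back(n); // may be non-power-of-2, i.e. stored with padding
  return leaves;
}
```

With a register of 4 elements, 7 elements would split into leaves of 4 and 3 under this assumption; the leaf of 3 is the case where a padding element would participate in element-wise operations.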
Implementation status:
- The implementation is prepared for more than x86 but is x86-only for
  now.
- Parts of [simd] *not* implemented in this patch:
  - std::complex<floating-point> as vectorizable types
  - [simd.permute.dynamic]
  - [simd.permute.mask]
  - [simd.permute.memory]
  - [simd.bit]
  - [simd.math]
  - mixed operations with vec-mask and bit-mask types
  - some conversion optimizations (open questions wrt. missed
    optimizations in the compiler)
- This patch implements P3844R3 "Restore simd::vec broadcast from int",
  which is not part of the C++26 working draft yet. If the paper does
  not get accepted, the feature will be reverted.
- This patch implements D4042R0 "incorrect cast between simd::vec and
  simd::mask via conversion to and from impl-defined vector types" (to be
  published once the reported LWG issue gets a number).
- The standard feature test macro __cpp_lib_simd is not defined yet.
Tests:
- Full coverage requires testing
  1. constexpr,
  2. constant-propagating inputs, and
  3. unknown (to the optimizer) inputs
  - for all vectorizable types
    * for every supported width (1–64 and higher)
      + for all possible ISA extensions (combinations)
        = with different fast-math flags
  ... leading to a test matrix that's far out of reach for regular
  testsuite builds.
- The tests in testsuite/std/simd/ try to cover all of the API. The
  tests can be built in every combination listed above. By default only
  a small subset is built and tested.
- Use GCC_TEST_RUN_EXPENSIVE=something to compile the more expensive
  tests (constexpr and const-prop testing) and to enable more /
  different widths for the test type.
- Tests can still emit bogus -Wpsabi warnings (see PR98734) which are
  filtered out via dg-prune-output.
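The three kinds of inputs listed under "Full coverage" can be illustrated with a scalar stand-in. This is a minimal sketch, not the testsuite's mechanism: `clamp_to_byte`, `make_unknown`, and the `check_*` functions are hypothetical names, and the volatile round trip is just one common way to hide a value from the optimizer.

```cpp
#include <cassert>

// Function under test (a scalar stand-in for a std::simd operation).
constexpr int
clamp_to_byte(int x)
{ return x < 0 ? 0 : (x > 255 ? 255 : x); }

// Hypothetical helper: reading back through a volatile object prevents the
// optimizer from knowing the value at compile time.
inline int
make_unknown(int x)
{
  volatile int hidden = x;
  return hidden;
}

// 1. constexpr: evaluated by the compiler front end.
static_assert(clamp_to_byte(300) == 255);

// 2. constant-propagating input: a runtime call whose argument the
//    optimizer can see and may fold.
inline bool
check_const_prop()
{ return clamp_to_byte(300) == 255; }

// 3. unknown input: the optimizer cannot fold the call.
inline bool
check_unknown()
{ return clamp_to_byte(make_unknown(300)) == 255; }

inline void
run_checks()
{
  assert(check_const_prop());
  assert(check_unknown());
}
```

Each variant exercises a different code path in the compiler, which is why the same API check multiplies across types, widths, ISA extensions, and fast-math flags.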
Benchmarks:
- Some aspects of the current implementation have been benchmarked on
  x86_64 hardware. There is more optimization potential. However, it is
  not always clear whether optimizations should be part of the library
  if they can be implemented in the compiler.
- No benchmark code is included in this patch.
libstdc++-v3/ChangeLog:
* include/Makefile.am: Add simd headers.
* include/Makefile.in: Regenerate.
* include/bits/version.def (simd): New.
* include/bits/version.h: Regenerate.
* include/bits/simd_alg.h: New file.
* include/bits/simd_details.h: New file.
* include/bits/simd_flags.h: New file.
* include/bits/simd_iterator.h: New file.
* include/bits/simd_loadstore.h: New file.
* include/bits/simd_mask.h: New file.
* include/bits/simd_mask_reductions.h: New file.
* include/bits/simd_reductions.h: New file.
* include/bits/simd_vec.h: New file.
* include/bits/simd_x86.h: New file.
* include/bits/vec_ops.h: New file.
* include/std/simd: New file.
* testsuite/std/simd/arithmetic.cc: New test.
* testsuite/std/simd/arithmetic_expensive.cc: New test.
* testsuite/std/simd/create_tests.h: New file.
* testsuite/std/simd/creation.cc: New test.
* testsuite/std/simd/creation_expensive.cc: New test.
* testsuite/std/simd/loads.cc: New test.
* testsuite/std/simd/loads_expensive.cc: New test.
* testsuite/std/simd/mask2.cc: New test.
* testsuite/std/simd/mask2_expensive.cc: New test.
* testsuite/std/simd/mask.cc: New test.
* testsuite/std/simd/mask_expensive.cc: New test.
* testsuite/std/simd/reductions.cc: New test.
* testsuite/std/simd/reductions_expensive.cc: New test.
* testsuite/std/simd/shift_left.cc: New test.
* testsuite/std/simd/shift_left_expensive.cc: New test.
* testsuite/std/simd/shift_right.cc: New test.
* testsuite/std/simd/shift_right_expensive.cc: New test.
* testsuite/std/simd/simd_alg.cc: New test.
* testsuite/std/simd/simd_alg_expensive.cc: New test.
* testsuite/std/simd/sse_intrin.cc: New test.
* testsuite/std/simd/stores.cc: New test.
* testsuite/std/simd/stores_expensive.cc: New test.
* testsuite/std/simd/test_setup.h: New file.
* testsuite/std/simd/traits_common.cc: New test.
* testsuite/std/simd/traits_impl.cc: New test.
* testsuite/std/simd/traits_math.cc: New test.
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
138 lines
4.3 KiB
C++
// { dg-do run { target c++26 } }
// { dg-require-effective-target x86 }

#include "test_setup.h"
#include <utility>

template <typename V>
  struct Tests
  {
    using T = typename V::value_type;

    using M = typename V::mask_type;

    using pair = std::pair<V, V>;
    static constexpr std::conditional_t<std::is_floating_point_v<T>, short, T> x_max
      = test_iota_max<V, 1>;
    static constexpr int x_max_int = static_cast<int>(x_max);

    static constexpr V
    reverse_iota(const V x)
    {
      if constexpr (std::is_enum_v<T>)
        {
          using Vu = simd::rebind_t<std::underlying_type_t<T>, V>;
          return static_cast<V>(std::to_underlying(x_max) - static_cast<Vu>(x));
        }
      else
        return x_max - x;
    }

    ADD_TEST(Select) {
      std::tuple{test_iota<V, 0, 63>, test_iota<V, 1, 64>, T(2),
                 M([](int i) { return 1 == (i & 1); }),
                 M([](int i) { return 1 == (i % 3); })},
      [](auto& t, const V x, const V y, const T z, const M k, const M k3) {
        t.verify_equal(select(M(true), x, y), x);
        t.verify_equal(select(M(false), x, y), y);
        t.verify_equal(select(M(true), y, x), y);
        t.verify_equal(select(M(false), y, x), x);
        t.verify_equal(select(k, x, T()),
                       V([](int i) { return (1 == (i & 1)) ? T(i & 63) : T(); }));

        t.verify_equal(select(M(true), z, T()), z);
        t.verify_equal(select(M(true), T(), z), V());
        t.verify_equal(select(k, z, T()), V([](int i) { return (1 == (i & 1)) ? T(2) : T(); }));
        t.verify_equal(select(k3, z, T()), V([](int i) { return (1 == (i % 3)) ? T(2) : T(); }));
      }
    };

    ADD_TEST(Min, std::totally_ordered<T>) {
      std::tuple{test_iota<V, 0, -1>, reverse_iota(test_iota<V, 0, -1>), test_iota<V, 1>},
      [](auto& t, const V x, const V y, const V x1) {
        t.verify_equal(min(x, x), x);
        t.verify_equal(min(V(), x), V());
        t.verify_equal(min(x, V()), V());
        if constexpr (std::is_signed_v<T>)
          {
            t.verify_equal(min(-x, x), -x);
            t.verify_equal(min(x, -x), -x);
          }
        t.verify_equal(min(x1, x), x);
        t.verify_equal(min(x, x1), x);
        t.verify_equal(min(x, y), min(y, x));
        t.verify_equal(min(x, y), V([](int i) {
                         i %= x_max_int;
                         return std::min(T(x_max_int - i), T(i));
                       }));
      }
    };

    ADD_TEST(Max, std::totally_ordered<T>) {
      std::tuple{test_iota<V, 0, -1>, reverse_iota(test_iota<V, 0, -1>), test_iota<V, 1>},
      [](auto& t, const V x, const V y, const V x1) {
        t.verify_equal(max(x, x), x);
        t.verify_equal(max(V(), x), x);
        t.verify_equal(max(x, V()), x);
        if constexpr (std::is_signed_v<T>)
          {
            t.verify_equal(max(-x, x), x);
            t.verify_equal(max(x, -x), x);
          }
        t.verify_equal(max(x1, x), x1);
        t.verify_equal(max(x, x1), x1);
        t.verify_equal(max(x, y), max(y, x));
        t.verify_equal(max(x, y), V([](int i) {
                         i %= x_max_int;
                         return std::max(T(x_max_int - i), T(i));
                       }));
      }
    };

    ADD_TEST(Minmax, std::totally_ordered<T>) {
      std::tuple{test_iota<V, 0, -1>, reverse_iota(test_iota<V, 0, -1>), test_iota<V, 1>},
      [](auto& t, const V x, const V y, const V x1) {
        t.verify_equal(minmax(x, x), pair{x, x});
        t.verify_equal(minmax(V(), x), pair{V(), x});
        t.verify_equal(minmax(x, V()), pair{V(), x});
        if constexpr (std::is_signed_v<T>)
          {
            t.verify_equal(minmax(-x, x), pair{-x, x});
            t.verify_equal(minmax(x, -x), pair{-x, x});
          }
        t.verify_equal(minmax(x1, x), pair{x, x1});
        t.verify_equal(minmax(x, x1), pair{x, x1});
        t.verify_equal(minmax(x, y), minmax(y, x));
        t.verify_equal(minmax(x, y),
                       pair{V([](int i) {
                              i %= x_max_int;
                              return std::min(T(x_max_int - i), T(i));
                            }),
                            V([](int i) {
                              i %= x_max_int;
                              return std::max(T(x_max_int - i), T(i));
                            })});
      }
    };

    ADD_TEST(Clamp, std::totally_ordered<T>) {
      std::tuple{test_iota<V>, reverse_iota(test_iota<V>)},
      [](auto& t, const V x, const V y) {
        t.verify_equal(clamp(x, V(), x), x);
        t.verify_equal(clamp(x, x, x), x);
        t.verify_equal(clamp(V(), x, x), x);
        t.verify_equal(clamp(V(), V(), x), V());
        t.verify_equal(clamp(x, V(), V()), V());
        t.verify_equal(clamp(x, V(), y), min(x, y));
        t.verify_equal(clamp(y, V(), x), min(x, y));
        if constexpr (std::is_signed_v<T>)
          {
            t.verify_equal(clamp(V(T(-test_iota_max<V>)), -x, x), -x);
            t.verify_equal(clamp(V(T(test_iota_max<V>)), -x, x), x);
          }
      }
    };
  };

#include "create_tests.h"
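The element-wise expectations checked by the Select and Min tests above can be modelled with plain arrays. This is a scalar sketch only: `Vec`, `Mask`, `select_model`, `min_model`, `iota_model`, and the width of 8 are hypothetical stand-ins chosen for illustration, and the code does not use the std::simd API or the testsuite's `test_iota` helpers.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t N = 8;       // illustrative width, not taken from the tests
using Vec = std::array<int, N>;    // scalar stand-in for V
using Mask = std::array<bool, N>;  // scalar stand-in for M

// Element-wise select: a[i] where k[i] is true, otherwise b[i].
Vec
select_model(const Mask& k, const Vec& a, const Vec& b)
{
  Vec r{};
  for (std::size_t i = 0; i < N; ++i)
    r[i] = k[i] ? a[i] : b[i];
  return r;
}

// Element-wise minimum.
Vec
min_model(const Vec& a, const Vec& b)
{
  Vec r{};
  for (std::size_t i = 0; i < N; ++i)
    r[i] = std::min(a[i], b[i]);
  return r;
}

// 0, 1, 2, ... as a rough analogue of an iota input.
Vec
iota_model()
{
  Vec r{};
  for (std::size_t i = 0; i < N; ++i)
    r[i] = static_cast<int>(i);
  return r;
}

void
run_model_checks()
{
  const Vec x = iota_model();
  Vec y{};   // reversed iota: (N - 1) - x[i]
  Mask k{};  // true at odd indices, like the mask k in the Select test
  for (std::size_t i = 0; i < N; ++i)
    {
      y[i] = static_cast<int>(N - 1) - x[i];
      k[i] = (i & 1) == 1;
    }
  assert((select_model(k, x, Vec{}) == Vec{0, 1, 0, 3, 0, 5, 0, 7}));
  assert((min_model(x, y) == Vec{0, 1, 2, 3, 3, 2, 1, 0}));
  assert((min_model(x, y) == min_model(y, x)));
}
```

The last assertion mirrors the symmetry check `min(x, y) == min(y, x)` in the Min test; the real tests build the expected value with a generator lambda rather than a literal.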