TeamHeptaMirrors/gcc

mirror of https://github.com/gcc-mirror/gcc.git synced 2026-05-06 14:59:39 +02:00

Author	SHA1	Message	Date
Ian Lance Taylor	143cec1646	libbacktrace: recognize PE bigobj objects at configure time Patch from Christopher Wellons. * filetype.awk: Recognize PE bigobj objects at configure time.	2025-09-28 14:48:03 -07:00
liuhongt	dd645f6b9e	Deprecate -mstore-max= and related tuning. For memset, the size of used vector is decided by MIN(MOVE_MAX_PIECES, STORE_MAX_PIECES). Unless there's u-arch prefer big size vector for memcpy and small size vector for memset, there's no need to have a separate option or tune for it. In general, x86 backend always prefer big size vector for memset due to STLF issue. gcc/ChangeLog: PR target/121970 * config/i386/i386-options.cc (ix86_target_string): Remove store_max. (ix86_debug_options): Ditto. (ix86_function_specific_print): Ditto. (ix86_valid_target_attribute_tree): Ditto. (ix86_option_override_internal): Ditto. * config/i386/i386-expand.cc (ix86_expand_builtin): Ditto. * config/i386/i386-options.h (ix86_target_string): Ditto. * config/i386/i386.h (MOVE_MAX): Ditto. (STORE_MAX_PIECES): Set by move_max. * config/i386/i386.opt: Deprecate mmove-max=. * config/i386/x86-tune.def (X86_TUNE_AVX256_STORE_BY_PIECES): Removed. (X86_TUNE_AVX512_STORE_BY_PIECES): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pieces-memcmp-2.c: Remove mstore-max. * gcc.target/i386/pieces-memcpy-19.c: Ditto. * gcc.target/i386/pieces-memcpy-20.c: Ditto. * gcc.target/i386/pr104610.c: Ditto. * gcc.target/i386/pieces-memset-47.c: Scan warning for mstore-max deprecation. * gcc.target/i386/pieces-memset-48.c: Change mstore-max to mmove-max. * gcc.target/i386/pr121410.c: Ditto. * gcc.target/i386/pieces-memset-11.c: Change avx256_store_by_pieces to avx256_move_by_pieces. * gcc.target/i386/pieces-memset-14.c: Ditto. * gcc.target/i386/pieces-memset-2.c: Ditto. * gcc.target/i386/pieces-memset-20.c: Ditto. * gcc.target/i386/pieces-memset-23.c: Ditto. * gcc.target/i386/pieces-memset-29.c: Ditto. * gcc.target/i386/pieces-memset-30.c: Ditto. * gcc.target/i386/pieces-memset-33.c: Ditto. * gcc.target/i386/pieces-memset-34.c: Ditto. * gcc.target/i386/pieces-memset-37.c: Ditto. * gcc.target/i386/pieces-memset-44.c: Ditto. * gcc.target/i386/pieces-memset-5.c: Ditto. 
* gcc.target/i386/pr100865-10a.c: Ditto. * gcc.target/i386/pr100865-4a.c: Ditto. * gcc.target/i386/pr90773-20.c: Ditto. * gcc.target/i386/pr90773-21.c: Ditto. * gcc.target/i386/pr90773-22.c: Ditto. * gcc.target/i386/pr90773-23.c: Ditto. * g++.target/i386/pr80566-1.C: Ditto. * gcc.target/i386/pieces-memset-45.c: Change avx512_store_by_pieces to avx512_move_by_pieces.	2025-09-27 22:14:05 -07:00
Peter Damianov	bd352bd592	diagnostics: Fix mojibake from displaying UTF-8 on Windows consoles UTF-8 characters in diagnostic output (such as the warning emoji ⚠️ used by fanalyzer) display as mojibake on Windows unless the utf8 code page is being used. This patch adds UTF-8 to UTF-16 conversion when outputting to a console on Windows. gcc/ChangeLog: * pretty-print.cc (decode_utf8_char): Move forward declaration. (mingw_utf8_str_to_utf16_str): New function to convert UTF-8 to UTF-16. (is_console_handle): New function to detect Windows console handles. (write_all): Add UTF-8 to UTF-16 conversion for console output, falling back to WriteFile for ASCII strings and regular files. Signed-off-by: Peter Damianov <peter0x44@disroot.org> Signed-off-by: Jonathan Yong <10walls@gmail.com>	2025-09-28 01:32:43 +00:00
GCC Administrator	214372031a	Daily bump.	2025-09-28 00:19:51 +00:00
Jonathan Wakely	e1b9ccaa10	libstdc++: Fix -Wmaybe-uninitialized warnings in testsuite These are false positives, but we might as well just value-init the variables to avoid the warnings. libstdc++-v3/ChangeLog: * testsuite/20_util/allocator_traits/members/allocate_hint.cc: Value-initialize variables to avoid -Wmaybe-uninitialized warning. * testsuite/20_util/allocator_traits/members/allocate_hint_nonpod.cc: Likewise. * testsuite/20_util/duration/114244.cc: Likewise. * testsuite/20_util/duration/io.cc: Likewise.	2025-09-27 21:18:43 +01:00
Jonathan Wakely	c2ccc43dda	libstdc++: Fix some -Wsign-compare warnings in headers In all these cases we know the value with signed type is not negative so the cast is safe. libstdc++-v3/ChangeLog: * include/bits/deque.tcc (deque::_M_shrink_to_fit): Cast difference_type to size_type to avoid -Wsign-compare warning. * include/std/spanstream (basic_spanbuf::seekoff): Cast streamoff to size_t to avoid -Wsign-compare warning.	2025-09-27 21:18:42 +01:00
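The -Wsign-compare fix above relies on a general pattern: once a signed value is known to be non-negative, casting it to the unsigned type before comparing is safe. The following is a minimal sketch of that pattern only; `offset_in_range` and `demo_offset_in_range` are hypothetical names and this is not the libstdc++ code touched by the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (illustration only, not libstdc++ code):
// returns true if `offset` is a valid index into `v`.
bool offset_in_range(const std::vector<int>& v, std::ptrdiff_t offset)
{
  if (offset < 0)
    return false;
  // offset is known to be non-negative here, so the cast preserves its
  // value and the comparison no longer mixes signed and unsigned types.
  return static_cast<std::size_t>(offset) < v.size();
}

// Small usage check.
void demo_offset_in_range()
{
  std::vector<int> v(3, 0);
  assert(offset_in_range(v, 2));
  assert(!offset_in_range(v, 3));
  assert(!offset_in_range(v, -1));
}
```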
Jonathan Wakely	f6c71c2079	libstdc++: Fix VERIFY(idx = 1) bugs in tests These should be checking for equality, not performing assignments. The tests for from_range on associative containers were actually checking the wrong thing, but the bug in the is_equal function was making the incorrect checks pass anyway, because all the values being used were non-zero, so the result of lhs.id = rhs.id was true, but would have been false if lhs.id == rhs.id had been used as intended. libstdc++-v3/ChangeLog: * testsuite/21_strings/basic_string/numeric_conversions/char/stoi.cc: Fix assignment used instead of equality comparison. * testsuite/21_strings/basic_string/numeric_conversions/char/stol.cc: Likewise. * testsuite/21_strings/basic_string/numeric_conversions/char/stoll.cc: Likewise. * testsuite/21_strings/basic_string/numeric_conversions/char/stoul.cc: Likewise. * testsuite/21_strings/basic_string/numeric_conversions/char/stoull.cc: Likewise. * testsuite/21_strings/basic_string/numeric_conversions/wchar_t/stoi.cc: Likewise. * testsuite/21_strings/basic_string/numeric_conversions/wchar_t/stol.cc: Likewise. * testsuite/21_strings/basic_string/numeric_conversions/wchar_t/stoll.cc: Likewise. * testsuite/21_strings/basic_string/numeric_conversions/wchar_t/stoul.cc: Likewise. * testsuite/21_strings/basic_string/numeric_conversions/wchar_t/stoull.cc: Likewise. * testsuite/23_containers/map/cons/from_range.cc: Fix is_equal function and expected value of comparison functions after construction. * testsuite/23_containers/multimap/cons/from_range.cc: Likewise. * testsuite/23_containers/multiset/cons/from_range.cc: Likewise. * testsuite/23_containers/set/cons/from_range.cc: Likewise. * testsuite/23_containers/unordered_map/cons/from_range.cc: Fix is_equal functions. * testsuite/23_containers/unordered_multimap/cons/from_range.cc: Likewise. * testsuite/23_containers/unordered_multiset/cons/from_range.cc: Likewise. * testsuite/23_containers/unordered_set/cons/from_range.cc: Likewise. 
* testsuite/25_algorithms/minmax/constrained.cc: Fix assignment used instead of equality comparison. * testsuite/27_io/manipulators/extended/get_time/wchar_t/1.cc: Likewise.	2025-09-27 21:17:40 +01:00
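To illustrate why the `is_equal` bug described above went unnoticed, here is a small sketch with hypothetical names (`Item`, `is_equal_buggy`, `is_equal_fixed`); it is not the testsuite code. An assignment used where a comparison was intended evaluates to the assigned value, so it is "true" whenever that value is non-zero, regardless of whether the two operands were equal.

```cpp
#include <cassert>

struct Item { int id; };

// Buggy variant: assigns rhs.id to lhs.id and tests the assigned value.
// The result only depends on whether rhs.id is non-zero.
bool is_equal_buggy(Item lhs, Item rhs) { return (lhs.id = rhs.id) != 0; }

// Intended variant: compares the two ids.
bool is_equal_fixed(Item lhs, Item rhs) { return lhs.id == rhs.id; }

void demo_is_equal()
{
  Item a{1}, b{2}, zero{0};
  assert(is_equal_buggy(a, b));      // "equal" although 1 != 2
  assert(!is_equal_fixed(a, b));     // the real comparison says unequal
  assert(!is_equal_buggy(a, zero));  // only a zero id makes the buggy check fail
  assert(is_equal_fixed(a, a));
}
```

With all test values non-zero, a check that expected "equal" passes under the buggy variant even when the intended comparison would have failed.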
YunQiang Su	10bb371eee	MIPS/testsuite: Use isa_rev=2 instead of >=2 So that they won't fail for r6 targets. gcc/testsuite/ChangeLog: * gcc.target/mips/mips16e2.c: Use isa_rev=2 instead of >=2. * gcc.target/mips/mips16e2-cache.c: Ditto. * gcc.target/mips/mips16e2-cmov.c: Ditto. * gcc.target/mips/mips16e2-gp.c: Ditto.	2025-09-27 22:28:19 +08:00
ChengLulu	b07bab1953	MIPS: Fix the issue with the '-fpatchable-function-entry=' feature. PR target/99217 gcc/ChangeLog: * config/mips/mips.cc (mips_start_function_definition): Implements the functionality of '-fpatchable-function-entry='. (mips_print_patchable_function_entry): Define empty function. (TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY): Define macro. gcc/testsuite/ChangeLog: * gcc.target/mips/pr99217.c: New test.	2025-09-27 20:55:07 +08:00
Jason Merrill	a0536f80ff	c++: concepts and conversions [PR112632] One case missed in my fix for this PR: Here we were omitting the IMPLICIT_CONV_EXPR that expresses the conversion from int to char because the target type was non-dependent and the argument was not type-dependent. But we still need it if the argument is value-dependent. PR c++/112632 gcc/cp/ChangeLog: * pt.cc (convert_template_argument): Also force IMPLICIT_CONV_EXPR if the argument is value-dependent. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-conv4.C: New test.	2025-09-27 09:06:34 +01:00
Jason Merrill	6fda31f7b3	c++: add testcase [PR121854] Add the testcase for this GCC 15 PR, already fixed on trunk by r16-970. PR c++/121854 gcc/testsuite/ChangeLog: * g++.dg/cpp23/explicit-obj-lambda19.C: New test.	2025-09-27 09:06:21 +01:00
Jason Merrill	90ad957406	c++: implicit 'this' in generic lambda [PR122048] Here template substitution was replacing a reference to the 'this' of X::f with the implicit closure parameter of the operator(), which is wrong. The closure parameter is never a suitable replacement for a 'this' parameter. PR c++/122048 gcc/cp/ChangeLog: * pt.cc (tsubst_expr): Don't use a lambda current_class_ptr. gcc/testsuite/ChangeLog: * g++.dg/cpp1y/lambda-generic-this6.C: New test.	2025-09-27 09:04:58 +01:00
Jie Mei	f731fa5801	MIPS: Add conditions for use of the -mmips16e2 and -mips16 option. Changes from V1: * Raise the minimal revision to r2. MIPS16e2 ASE is a superset of MIPS16e ASE, which is again a superset of MIPS16 ASE. Later, all of them are forbidden in Release 6. Make -mmips16e2 imply -mips16 as the ASE requires, so users won't be surprised even if they expect it to. Meanwhile, check if mips_isa_rev <= 5 when -mips16 is effective and >= 2 when -mmips16e2 is effective. Co-developed-by: Rong Zhang <rongrong@oss.cipunited.com> Signed-off-by: Rong Zhang <rongrong@oss.cipunited.com> gcc/ChangeLog: * config/mips/mips.cc (mips_option_override): Add conditions for use of the -mmips16e2 and -mips16 option. gcc/testsuite/ChangeLog: * gcc.target/mips/mips16e2-cache.c: Use isa_rev>=2 instead of -mips32r2 and remove -mips16 option. * gcc.target/mips/mips16e2-cmov.c: Add isa_rev>=2 and remove -mips16 option. * gcc.target/mips/mips16e2-gp.c: Same as above. * gcc.target/mips/mips16e2.c: Same as above.	2025-09-27 15:27:49 +08:00
Paul Thomas	25f7f04e44	Fortran: Revert r16-4069 causing memory leaks in f951 [PR87908] 2025-09-27 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/87908 * interface.cc (check_interface0): Revert changes. gcc/testsuite/ PR fortran/87908 * gfortran.dg/pr87908.f90: Delete.	2025-09-27 08:01:30 +01:00
Jie Mei	51a3669ab6	MIPS: Add MSUBF.fmt instruction for MIPSr6 GCC currently uses two instructions (NEG.fmt and MADDF.fmt) for operations like `x - (y * z)' for MIPSr6. We can further tune this by using only MSUBF.fmt instead of those two. This patch adds MSUBF.fmt instructions with corresponding tests. gcc/ChangeLog: * config/mips/mips.md (fms<mode>4): Generates MSUBF.fmt instructions. (fms<mode>4_msubf): Same as above. (fnma<mode>4): Same as above. (fnma<mode>4_msubf): Same as above. gcc/testsuite/ChangeLog: * gcc.target/mips/mips-msubf.c: New tests for MIPSr6.	2025-09-27 13:48:40 +08:00
GCC Administrator	4ab8a985b8	Daily bump.	2025-09-27 00:18:25 +00:00
Alejandro Colomar	33c35b7f4c	c, objc: Add -Wmultiple-parameter-fwd-decl-lists Warn about this: void f(int x; int x; int x); Add a new diagnostic, -Wmultiple-parameter-fwd-decl-lists, which diagnoses uses of this obsolescent syntax. Add this diagnostic in -Wextra. Forward declarations of parameters are very rarely used. And functions that need two forward declarations of parameters are also quite rare. This combination results in this code almost not existing in any code base, which makes adding this to -Wextra okay. FWIW, I've tried finding such code using a code search engine, and didn't find any cases (but the regex for that isn't easy to write, so I wouldn't trust it). gcc/c-family/ChangeLog: * c.opt: Add -Wmultiple-parameter-fwd-decl-lists gcc/c/ChangeLog: * c-decl.cc (c_scope): Rename {warned > had}_forward_parm_decls. (mark_forward_parm_decls): Add -Wmultiple-parameter-fwd-decl-lists. gcc/ChangeLog: * doc/extend.texi: Clarify documentation about lists of parameter forward declarations, and mention that more than one of them are unnecessary. * doc/invoke.texi: Document the new -Wmultiple-parameter-fwd-decl-lists. gcc/testsuite/ChangeLog: * gcc.dg/Wmultiple-parameter-fwd-decl-lists.c: New test. Signed-off-by: Alejandro Colomar <alx@kernel.org>	2025-09-26 19:28:24 +00:00
Harald Anlauf	e6b4908c04	Fortran: fix uninitialized read in testcase gfortran.dg/pdt_48.f03 Running the testcase using valgrind --leak-check=full --track-origins=yes: ==28585== Conditional jump or move depends on uninitialised value(s) ==28585== at 0x400E19: MAIN__ (pdt_48.f03:48) ==28585== by 0x400EDB: main (pdt_48.f03:34) ==28585== Uninitialised value was created by a heap allocation ==28585== at 0x4841984: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==28585== by 0x400975: __pdt_m_MOD_add (pdt_48.f03:30) ==28585== by 0x400D84: MAIN__ (pdt_48.f03:44) ==28585== by 0x400EDB: main (pdt_48.f03:34) The cause was a partial initialization of a vector used in a subsequent addition. Initialize the remaining elements of the first vector by zero. gcc/testsuite/ChangeLog: * gfortran.dg/pdt_48.f03:	2025-09-26 19:20:39 +02:00
Jan Hubicka	40d9e9601a	Fix precise 0 handling in afdo_propagate_edge Currently afdo_propagate_edge will turn precise 0 to autofdo 0 because it thinks auto-profile claims some samples have been executed in the given basic block, while this is only a consequence of < being defined by 0 (precise) < 0 (autofdo). gcc/ChangeLog: * auto-profile.cc (afdo_propagate_edge): Fix handling of precise 0 counts.	2025-09-26 15:57:31 +02:00
Andrew Stubbs	fdc8037a59	amdgcn: Remove vector alignment restrictions The supported misalignment logic seems to be a bit arbitrary. Some of it looks like it was copied from the Arm implementation, although testing shows that the packed accesses do not work (weird subregs happen). AMD GCN does have some alignment restrictions on Buffer instructions, but as we don't use those that's irrelevant. The Flat and Global instructions (that we do use) have no such restrictions. LDS memory -- which can be accessed via Flat instructions -- does have alignment restrictions, but the compiler is not using LDS for arbitrary vectors. If the user deliberately chooses to place unaligned data in low-latency memory then a runtime exception should occur (no silent bad behaviour), so there's no reason to pessimise the normal case. gcc/ChangeLog: * config/gcn/gcn.cc (gcn_vectorize_support_vector_misalignment): Allow any alignment, as long as it's not packed.	2025-09-26 11:23:06 +00:00
Joseph Myers	1b876bdffd	c: Give permerror for excess braces in scalar initializers [PR88642] As noted in bug 88642, the C front end fails to give errors or pedwarns for scalar initializers with too many levels of surrounding braces. There is a warning for redundant braces around a scalar initializer within a larger braced initializer (valid for a single such level within a structure, union or array initializer; not valid for more than one such level, or where the outer layer of braces is itself for a scalar, either redundant braces themselves or part of a compound literal), but this never becomes an error even for invalid cases. Check for this case and turn the warning into a permerror when there are more levels of braces than permitted. The existing warning is unchanged for a single (permitted) level of redundant braces around a scalar initializer inside a structure, union or array initializer, and it's also unchanged that no such warning is given for a single (permitted) level of redundant braces around a top-level scalar initializer. Technically this is a C2y issue (these rules on valid initializers moved into Constraints as a result of N3346, accepted in Minneapolis; previously, as a "shall" outside constraints, violating these rules resulted in compile-time undefined behavior without requiring a diagnostic). Hopefully little code is actually relying on not getting an error here. In view of gcc.dg/tree-ssa/ssa-dse-10.c showing that at least some code may be using such over-braced initializers (initializer of pubKeys at line 1167 in that test; I'm not at all sure how that initializer ends up getting interpreted to translate it to something equivalent but properly structured), this is made a permerror rather than a hard error, so -fpermissive (as already used by that test) can be used to disable the error (the default -fpermissive for old standards modes is not a problem given that before C2y this is undefined behavior not a constraint violation). 
Bootstrapped with no regressions for x86_64-pc-linux-gnu. PR c/88642 gcc/c/ * c-typeck.cc (constructor_braced_scalar): New variable. (struct constructor_stack): Add braced_scalar field. (really_start_incremental_init): Handle constructor_braced_scalar and braced_scalar field. (push_init_level): Handle constructor_braced_scalar and braced_scalar field. Give permerror rather than warning for nested braces around scalar initializer. (pop_init_level): Handle constructor_braced_scalar and braced_scalar field. gcc/testsuite/ * gcc.dg/c2y-init-1.c: New test.	2025-09-26 11:12:12 +00:00
Jan Hubicka	9f9c8d63a5	Fix integer overflow in profile_count::probability_in This patch fixes integer overflow in profile_count::probability_in which happens for very large counts. This was probably not that common in practice until scaled AutoFDO profiles were introduced. This was introduced as cut&paste from profile_probability implementation. I reviewed multiplications in the file for safety and noticed that in some cases the code is over-protective. In profile_probability::operator/ we already scale so that m_val <= other.m_val and thus we know result will be in the range 0...max_probability. In profile_probability::apply_scale we deal with 30bit value from profile_probability so no overflow can happen. gcc/ChangeLog: * profile-count.h (profile_probability::operator/): Do not cap twice. (profile_probability::operator/=): Likewise. (profile_probability::apply_scale): Do not watch for overflow. (profile_count::probability_in): Watch overflow.	2025-09-26 12:39:44 +02:00
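The overflow described above is the generic hazard of computing `part * scale / whole` in 64 bits when `part` is very large. The sketch below shows one simple way to guard such a computation; it is an assumption-laden illustration, not the code in profile-count.h. `scaled_ratio` and `demo_scaled_ratio` are hypothetical names, and the fallback (shifting both operands right) trades a little precision for safety.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: computes part * scale / whole without letting the
// 64-bit product wrap. Preconditions: 0 < whole, part <= whole, 0 < scale.
std::uint64_t scaled_ratio(std::uint64_t part, std::uint64_t whole,
                           std::uint64_t scale)
{
  // While the product would not fit in 64 bits, drop low bits from both
  // operands. part <= whole is preserved, and whole stays non-zero because
  // the loop only runs while part is still large.
  while (part > UINT64_MAX / scale)
    {
      part >>= 1;
      whole >>= 1;
    }
  return part * scale / whole;
}

void demo_scaled_ratio()
{
  const std::uint64_t scale = std::uint64_t(1) << 29;
  // 2^62 * 2^29 does not fit in 64 bits; the guarded version still yields
  // exactly half of the scale.
  assert(scaled_ratio(std::uint64_t(1) << 62, std::uint64_t(1) << 63, scale)
         == (std::uint64_t(1) << 28));
  assert(scaled_ratio(1, 4, 100) == 25);
}
```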
Jonathan Wakely	1cf6cda055	libstdc++: Reuse predicates in std::search and std::is_permutation Hoist construction of the call wrappers out of the loop when we're repeatedly creating a call wrapper with the same bound arguments. We need to be careful about iterators that return proxy references, because bind1st(pred, *first) could bind a reference to a prvalue proxy reference returned by *first. That would then be an invalid reference by the time we invoked the call wrapper. If we dereference the iterator first and store the result of that on the stack, then we don't have a prvalue proxy reference, and can bind it (or the value it refers to) into the call wrapper: auto&& val = *first; // lifetime extension auto wrapper = bind1st(pred, val); for (;;) /* use wrapper */; This ensures that the reference returned from *first outlives the call wrapper, whether it's a proxy reference or not. For C++98 compatibility in __search we can use __decltype(expr) instead of auto&&. libstdc++-v3/ChangeLog: * include/bits/stl_algobase.h (__search, __is_permutation): Reuse predicate instead of creating a new one each time. * include/bits/stl_algo.h (__is_permutation): Likewise. Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>	2025-09-26 11:05:59 +01:00
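The lifetime-extension trick described above can be sketched outside libstdc++ as follows. All names here (`BoundFirst`, `Equal`, `count_equal_to_first`, `demo_proxy_binding`) are hypothetical stand-ins, not the internal `bind1st` helper; the sketch uses `std::vector<bool>`, whose iterators return prvalue proxy references, to show that the dereferenced result is kept alive for as long as the hoisted wrapper is used.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for an internal call wrapper: it stores a
// reference to the bound first argument rather than a copy.
template<typename Pred, typename Val>
struct BoundFirst
{
  Pred pred;
  Val& val;

  template<typename T>
  bool operator()(const T& x) { return pred(val, x); }
};

// Generic equality predicate used for the demo.
struct Equal
{
  template<typename A, typename B>
  bool operator()(const A& a, const B& b) const { return a == b; }
};

// Counts the elements of [first, last) that compare equal to *first,
// building the wrapper once outside the loop.
template<typename Iter, typename Pred>
int count_equal_to_first(Iter first, Iter last, Pred pred)
{
  if (first == last)
    return 0;
  auto&& val = *first; // lifetime extension if *first is a prvalue proxy
  BoundFirst<Pred, std::remove_reference_t<decltype(val)>> wrapper{pred, val};
  int n = 0;
  for (Iter it = first; it != last; ++it)
    if (wrapper(*it))
      ++n;
  return n;
}

void demo_proxy_binding()
{
  std::vector<bool> bits{true, false, true, true}; // proxy references
  assert(count_equal_to_first(bits.begin(), bits.end(), Equal{}) == 3);

  std::vector<int> ints{2, 1, 2};                  // ordinary references
  assert(count_equal_to_first(ints.begin(), ints.end(), Equal{}) == 2);
}
```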
Jonathan Wakely	b83c2e52a2	libstdc++: Simplify std::erase functions for sequence containers This removes the use of std::ref that meant that __remove_if used an indirection through the reference, which might be a pessimization. Users can always use std::ref to pass expensive predicates into erase_if, but we shouldn't do it unconditionally. We can std::move the predicate so that if it's not cheap to copy and the user didn't use std::ref, then we try to use a cheaper move instead of a copy. There's no reason that std::erase shouldn't just be implemented by forwarding to std::erase_if. I probably should have done that in r12-4083-gacf3a21cbc26b3 when std::erase started to call __remove_if directly. libstdc++-v3/ChangeLog: * include/std/deque (erase_if): Move predicate instead of wrapping with std::ref. (erase): Forward to erase_if. * include/std/inplace_vector (erase_if, erase): Likewise. * include/std/string (erase_if, erase): Likewise. * include/std/vector (erase_if, erase): Likewise. Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>	2025-09-26 11:05:54 +01:00
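The shape of the simplification described above (move the predicate instead of wrapping it with std::ref, and implement erase by forwarding to erase_if) can be sketched for std::vector as below. `my_erase_if`, `my_erase`, and `demo_erase` are hypothetical names, and the sketch uses the public std::remove_if rather than the internal __remove_if; it is not the libstdc++ implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical erase_if: the predicate is taken by value and moved into
// the algorithm, so callers who want reference semantics can pass
// std::ref themselves.
template<typename T, typename Pred>
std::size_t my_erase_if(std::vector<T>& c, Pred pred)
{
  auto it = std::remove_if(c.begin(), c.end(), std::move(pred));
  const std::size_t removed = static_cast<std::size_t>(c.end() - it);
  c.erase(it, c.end());
  return removed;
}

// Hypothetical erase: simply forwards to my_erase_if with an
// equality predicate bound to the value.
template<typename T, typename U>
std::size_t my_erase(std::vector<T>& c, const U& value)
{
  return my_erase_if(c, [&value](const T& elem) { return elem == value; });
}

void demo_erase()
{
  std::vector<int> v{1, 2, 1, 3};
  assert(my_erase(v, 1) == 2);
  assert(v.size() == 2);
  assert(v[0] == 2 && v[1] == 3);
}
```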
Jonathan Wakely	aaeca77a79	libstdc++: Eliminate __gnu_cxx::__ops function objects This removes the indirect functors from <bits/predefined_ops.h> that are used by our STL algorithms. Currently we wrap all predicates and values into callables which accept iterator arguments, and automatically dereference the iterators. With this change we no longer do that dereferencing and so all predicates are passed values not iterators, and the algorithms that invoke those predicates must dereference the iterators. This avoids wrapping user-provided predicates into another predicate that does the dereferencing. User-provided predicates are now passed unchanged to our internal algos like __search_n. For the overloads that take a value instead of a predicate, we still need to create a predicate that does comparison to the value, but we can now use std::less<void> and std::equal_to<void> as the base predicate and bind the value to those base predicates. Because the "transparent operators" std::less<void> and std::equal_to<void> were not added until C++14, this change defines those explicit specializations unconditionally for C++98 and C++11 too (but the default template arguments that make std::less<> and std::equal_to<> refer to those specializations are still only present for C++14 and later, because we don't need to rely on those default template arguments for our internal uses). When binding a predicate and a value into a new call wrapper, we now decide whether to store the predicate by value when it's an empty type or a scalar (such as a function pointer). This avoids a double-indirection through function pointers, and avoids storing and invoking stateless empty functors through a reference. For C++11 and later we also use [[no_unique_address]] to avoid wasted storage for empty predicates (which includes all standard relational ops, such as std::less). 
The call wrappers in bits/predefined_ops.h all have non-const operator() because we can't be sure that the predicates they wrap are const-invocable. The requirements in [algorithms.requirements] for Predicate and BinaryPredicate template arguments require pred(i) to be valid, but do not require that std::to_const(pred)(i) has to be valid, and similarly for binary_pred. libstdc++-v3/ChangeLog: * include/bits/predefined_ops.h (equal_to, less): Define aliases for std::equal_to<void> and std::less<void>. (bind1st, bind2nd, not1, __equal_to): New object generator functions for adapting predicates. (__iter_less_iter, __iter_less_val, __iter_comp_val) (__val_less_iter, __val_comp_iter, __iter_equal_to_iter) (__iter_equal_to_val, __iter_comp_iter, __negate): Remove all object generator functions and the class templates they return. * include/bits/stl_algo.h (__move_median_to_first, __find_if_not) (__find_if_not_n, __search_n_aux, find_end, find_if_not) (__remove_copy_if, remove_copy, remove_copy_if, remove) (remove_if, __adjacent_find, __unique, unique, __unique_copy) (__unique_copy_1, __stable_partition_adaptive, stable_partition) (__heap_select, __partial_sort_copy, partial_sort_copy) (__unguarded_linear_insert, __insertion_sort) (__unguarded_insertion_sort, __unguarded_partition) (lower_bound, __upper_bound, upper_bound, __equal_range) (equal_range, binary_search, __move_merge_adaptive) (__move_merge_adaptive_backward, __merge_adaptive_resize) (__merge_without_buffer, inplace_merge, __move_merge) (__includes, includes, __next_permutation, next_permutation) (__prev_permutation, prev_permutation, __replace_copy_if) (replace_copy, replace_copy_if, __is_sorted_until) (is_sorted_until, __minmax_element, minmax_element, minmax) (is_permutation, __is_permutation, find, find_if, adjacent_find) (count, count_if, search, search_n, unique_copy, partial_sort) (nth_element, sort, __merge, merge, stable_sort, __set_union) (set_union, __set_intersection, set_intersection) 
(__set_difference, set_difference, __set_symmetric_difference) (set_symmetric_difference, __min_element, min_element) (__max_element, max_element, min, max): Use direct predicates instead of __iter_equal_to_iter, __iter_comp_iter, and __iter_less_iter, __negate etc. Dereference iterators when invoking predicates. * include/bits/stl_algobase.h (__lexicographical_compare_impl) (__lexicographical_compare::__lc, __lower_bound, lower_bound) (lexicographical_compare, __mismatch, mismatch, __find_if) (__count_if, __remove_if, __search, __is_permutation) (is_permutation, search): Likewise. * include/bits/stl_function.h (equal_to<void>, less<void>): Define transparent comparison functions for C++98 and C++11. * include/bits/stl_heap.h (__is_heap_until, __is_heap) (__push_heap, push_heap, __adjust_heap, pop_heap, make_heap) (sort_heap, is_heap_until, is_heap): Likewise. * include/std/deque (erase_if): Remove call to __pred_iter. (erase): Replace __iter_equals_val with __equal_to. * include/std/inplace_vector (erase_if, erase): Likewise. * include/std/string (erase_if, erase): Likewise. * include/std/vector (erase_if, erase): Likewise. Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com> Reviewed-by: François Dumont <frs.dumont@gmail.com>	2025-09-26 11:05:49 +01:00
Jonathan Wakely	11ce485bcf	libstdc++: Fix unsafe comma operators in <random> [PR122062] This fixes a 'for' loop in std::piecewise_linear_distribution that increments two iterators with a comma operator between them, making it vulnerable to evil overloads of the comma operator. It also changes a 'for' loop used by some other distributions, even though those are only used with std::vector<double>::iterator and so won't find any overloaded commas. libstdc++-v3/ChangeLog: PR libstdc++/122062 * include/bits/random.tcc (__detail::__normalize): Use void cast for operands of comma operator. (piecewise_linear_distribution): Likewise. * testsuite/26_numerics/random/piecewise_linear_distribution/cons/122062.cc: New test. Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com> Reviewed-by: Hewill Kang <hewillk@gmail.com>	2025-09-26 11:01:08 +01:00
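The hazard fixed above can be reproduced with a small, self-contained sketch. `EvilIter`, `comma_calls`, `step_unsafe`, `step_safe`, and `demo_comma` are hypothetical names for illustration, not code from <random>. When both operands of a comma expression have class type, a user-provided `operator,` can be selected; casting the operands to void forces the built-in comma operator.

```cpp
#include <cassert>

// Hypothetical iterator-like type with an "evil" comma overload.
struct EvilIter
{
  int pos = 0;
  EvilIter& operator++() { ++pos; return *this; }
};

int comma_calls = 0;

// User-provided comma operator; counts how often it is invoked.
void operator,(const EvilIter&, const EvilIter&) { ++comma_calls; }

// Vulnerable: `++a, ++b` picks up the overloaded operator,.
void step_unsafe(EvilIter& a, EvilIter& b) { ++a, ++b; }

// Robust: void operands can only use the built-in comma operator.
void step_safe(EvilIter& a, EvilIter& b) { (void)++a, (void)++b; }

void demo_comma()
{
  EvilIter a, b;
  comma_calls = 0;
  step_unsafe(a, b);
  assert(comma_calls == 1);          // the overload was called
  step_safe(a, b);
  assert(comma_calls == 1);          // the overload was bypassed
  assert(a.pos == 2 && b.pos == 2);  // both iterators advanced each time
}
```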
Paul Thomas	947b22d9d0	Fortran: Fix uninitialized reads for pdt_13.f03 etc. [PR122002] 2025-09-26 Harald Anlauf <anlauf@gcc.gnu.org> gcc/fortran PR fortran/122002 * decl.cc (gfc_get_pdt_instance): Initialize 'instance' to NULL and set 'kind_value' to zero before calling gfc_extract_int. * primary.cc (gfc_match_rvalue): Initialize 'ctr_arglist' to NULL and test for default values if gfc_get_pdt_instance returns NULL.	2025-09-26 07:30:07 +01:00
Lulu Cheng	d6ee89a65b	LoongArch: Implement TARGET_CAN_INLINE_P[PR121875]. Because LoongArch does not implement TARGET_CAN_INLINE_P, functions with the target attribute set and those without it cannot be inlined. At the same time, setting the always_inline attribute will cause compilation failure. To solve this problem, I implemented this hook. During the implementation process, it checks the status of the target special options of the caller and callee, such as the ISA extension. PR target/121875 gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_can_inline_p): New function. (TARGET_CAN_INLINE_P): Define. gcc/testsuite/ChangeLog: * gcc.target/loongarch/can_inline_1.c: New test. * gcc.target/loongarch/can_inline_2.c: New test. * gcc.target/loongarch/can_inline_3.c: New test. * gcc.target/loongarch/can_inline_4.c: New test. * gcc.target/loongarch/can_inline_5.c: New test. * gcc.target/loongarch/can_inline_6.c: New test. * gcc.target/loongarch/pr121875.c: New test.	2025-09-26 09:13:18 +08:00
GCC Administrator	11a662dd8b	Daily bump.	2025-09-26 00:20:03 +00:00
Gerald Pfeifer	505c139c58	doc: Standardize on "bitwise" and "elementwise" ...over "bit-wise" and "element-wise". gcc: * doc/invoke.texi (Warning Options): Use "bitwise" over "bit-wise". * doc/extend.texi (Vector Extensions): Use "elementwise" over "element-wise". * doc/md.texi (Standard Names): Ditto.	2025-09-26 00:09:32 +02:00
Gerald Pfeifer	29c28bb912	doc: Fix grammar around Vector Extensions gcc: * doc/extend.texi (Vector Extensions): Fix grammar.	2025-09-25 23:59:06 +02:00
Harald Anlauf	43508d358b	Fortran: ICE in character(kind=4) deferred-length array reference [PR121939] PR fortran/121939 gcc/fortran/ChangeLog: * trans-types.cc (gfc_init_types): Set string flag for all character types. gcc/testsuite/ChangeLog: * gfortran.dg/deferred_character_39.f90: Disable temporary workaround for character(kind=4) deferred-length bug.	2025-09-25 18:51:48 +02:00
John David Anglin	80d729c4b1	hppa: Fix asm in atomic_store_8 in sync-libfuncs.c Fix typo in the asm in atomic_store_8. Also correct floating point store. Reported by Nick Hudson for netbsd. 2025-09-25 John David Anglin <danglin@gcc.gnu.org> libgcc/ChangeLog: * config/pa/sync-libfuncs.c (atomic_store_8): Fix asm.	2025-09-25 10:49:39 -04:00
Luc Grosheintz	5756d0b613	libstdc++: Refactor __mdspan::__static_quotient. For padded layouts we want to check that the product of the padded stride with the remaining extents is representable. Creating a second overload, allows passing in subspans of the static extents and retains the ergonomics for the common case of passing in all static extents. libstdc++-v3/ChangeLog: * include/std/mdspan (__static_quotient): New overload. Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com> Reviewed-by: Jonathan Wakely <jwakely@redhat.com> Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>	2025-09-25 15:55:43 +02:00
Jonathan Wakely	08530be259	libstdc++: Check feature test macro for robust_nonmodifying_seq_ops We should check the relevant feature test macro instead of just the value of __cplusplus. Also add a comment explaining why the __cplusplus check guarding __sample can't be changed to check __glibcxx_sample (because __sample is also used in C++14 by std::experimental::sample, not only by C++17 std::sample). libstdc++-v3/ChangeLog: * include/bits/stl_algo.h: Check robust_nonmodifying_seq_ops feature test macro instead of checking __cplusplus value. Add comment to another __cplusplus check. * include/bits/stl_algobase.h: Add comment to #endif.	2025-09-25 14:52:25 +01:00
Jonathan Wakely	0959f0e0ce	libstdc++: Remove unwanted STDC_HEADERS macro from c++config.h [PR79147] Similar to r16-4034-g1953939243e1ab, this comments out another macro that Autoconf adds to the generated config.h but which is not wanted in the c++config.h file that we install. There's no benefit to defining _GLIBCXX_STDC_HEADERS in user code, so we should just prevent it from being defined. libstdc++-v3/ChangeLog: PR libstdc++/79147 PR libstdc++/103650 * include/Makefile.am (c++config.h): Adjust sed command to comment out STDC_HEADERS macro. * include/Makefile.in: Regenerate.	2025-09-25 14:51:50 +01:00
Luc Grosheintz	181e7bea46	libstdc++: Prepare mapping layout tests for padded layouts. Using the existing tests for padded layouts requires the following changes: * The padded layouts are template classes. In order to be able to use partially specialized templates, functions need to be converted to structs. * The layout mapping tests include a check that only applies if is_exhaustive is static. This commit introduces a concept to check if is_exhaustive is a static member function. * Fix a test to not use a hard-coded layout_left. The test empty.cc contains indentation mistakes that are fixed. libstdc++-v3/ChangeLog: * testsuite/23_containers/mdspan/layouts/empty.cc: Fix indent. * testsuite/23_containers/mdspan/layouts/mapping.cc (test_stride_1d): Fix test. (test_stride_2d): Rewrite using a struct. (test_stride_3d): Ditto. (has_static_is_exhaustive): New concept. (test_mapping_properties): Update test. Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com> Reviewed-by: Jonathan Wakely <jwakely@redhat.com> Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com>	2025-09-25 15:40:19 +02:00
Nathaniel Shead	4f9f1269f4	c++/modules: Remove incorrect assertion [PR122015,PR122019] This assertion, despite what I said in r16-4070, is not valid: we can reach here when deduping a VAR_DECL that didn't get a LANG_SPECIFIC in the current TU. It's still correct to always use lang_cplusplus however as for anything else the decl would have been created with an appropriate LANG_SPECIFIC to start with. PR c++/122015 PR c++/122019 gcc/cp/ChangeLog: * module.cc (trees_in::install_entity): Remove incorrect assertion. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>	2025-09-25 21:16:10 +10:00
Xi Ruoyao	19167808a6	doc: Reword the description of -f[no-]fp-int-builtin-inexact default Now -std=gnu23 is the default, so -fno-fp-int-builtin-inexact is effectively the default value. gcc/ * doc/invoke.texi (-ffp-int-builtin-inexact): Reword to match the default value with the default C standard.	2025-09-25 10:06:59 +08:00
GCC Administrator	e4b9750478	Daily bump.	2025-09-25 00:21:38 +00:00
Dusan Stojkovic	d53f7ad85e	[PATCH][PR target/121778] RISC-V: Improve rotation detection for RISC-V This patch splits the canonical sign-bit checking idiom into a 2-insn sequence when Zbb is available. Combine often normalizes (xor (lshr A, (W - 1)) 1) to (ge A, 0). For width W = bitsize (mode), the identity: (a << 1) \| (a >= 0) == (a << 1) \| ((a >> (W - 1)) ^ 1) == ROL1 (a) ^ 1 lets us split: (ior:X (ashift:X A 1) (ge:X A 0)) into: → rotatert:X A, (W-1) → xor:X A, 1 PR target/121778 gcc/ChangeLog: * config/riscv/riscv.md: Add define_split pattern. gcc/testsuite/ChangeLog: * gcc.target/riscv/pr121778-1.c: New test. * gcc.target/riscv/pr121778-2.c: New test.	2025-09-24 15:13:01 -06:00
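The identity quoted in d53f7ad85e can be spot-checked outside the compiler. The following is a minimal sketch, assuming W = 32 and working on the unsigned bit pattern of `a`; the function names are hypothetical and are not taken from the patch or from riscv.md.

```cpp
#include <cassert>
#include <cstdint>

// All names below are hypothetical and exist only for this illustration.
// W is fixed at 32; unsigned arithmetic sidesteps signed-shift pitfalls.

// The idiom before the split: (a << 1) | (a >= 0), on a's bit pattern.
constexpr std::uint32_t shift_or_nonneg(std::uint32_t bits) {
    const std::uint32_t nonneg = (bits >> 31) ^ 1u;  // 1 when the sign bit is clear
    return static_cast<std::uint32_t>(bits << 1) | nonneg;
}

// Rotate right by n bits; only valid for 0 < n < 32.
constexpr std::uint32_t rotate_right(std::uint32_t bits, unsigned n) {
    return (bits >> n) | static_cast<std::uint32_t>(bits << (32u - n));
}

// The two-step form: rotate right by W - 1 (a rotate left by 1), then XOR with 1.
constexpr std::uint32_t rotate_then_xor(std::uint32_t bits) {
    return rotate_right(bits, 31u) ^ 1u;
}

// Compile-time checks on the boundary values.
static_assert(shift_or_nonneg(0x00000000u) == rotate_then_xor(0x00000000u));
static_assert(shift_or_nonneg(0x00000001u) == rotate_then_xor(0x00000001u));
static_assert(shift_or_nonneg(0x7FFFFFFFu) == rotate_then_xor(0x7FFFFFFFu));
static_assert(shift_or_nonneg(0x80000000u) == rotate_then_xor(0x80000000u));
static_assert(shift_or_nonneg(0xFFFFFFFFu) == rotate_then_xor(0xFFFFFFFFu));

// Runtime spot check over a strided sample of the 32-bit range.
inline void check_identity_on_samples() {
    for (std::uint64_t v = 0; v <= 0xFFFFFFFFull; v += 0x10001ull) {
        const auto bits = static_cast<std::uint32_t>(v);
        assert(shift_or_nonneg(bits) == rotate_then_xor(bits));
    }
}
```

The identity holds because `a << 1` always leaves bit 0 clear, so OR-ing in the inverted sign bit gives the same result as rotating the sign bit into bit 0 and flipping it. Compiling the file with a C++17 compiler already evaluates the `static_assert` lines.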
Joseph Myers	dfbce1feae	c: Fix handling of register atomic compound literals The logic for loads and stores of _Atomic objects in the C front end involves taking the address of such objects, with really_atomic_lvalue detecting cases where this cannot be done (and also no special handling is needed for atomicity), in particular register _Atomic objects. This logic failed to deal with the case of register _Atomic compound literals, so resulting in spurious errors "error: address of register compound literal requested" followed by "error: argument 1 of '__atomic_load' must be a non-void pointer type". (This is a C23 bug that I found while changing really_atomic_lvalue as part of previous C2y changes.) Add a use of COMPOUND_LITERAL_EXPR_DECL in that case. Bootstrapped with no regressions for x86_64-pc-linux-gnu. gcc/c/ * c-typeck.cc (really_atomic_lvalue): For a COMPOUND_LITERAL_EXPR, check C_DECL_REGISTER on the COMPOUND_LITERAL_EXPR_DECL. gcc/testsuite/ * gcc.dg/c23-complit-9.c: New test.	2025-09-24 19:53:01 +00:00
Mikael Morin	3386216618	fortran: Favor parser-generated module procedure namespaces [PR122046] In the testcase from the PR, an assertion triggers because the compiler tries to access the parent namespace of a contained procedure. But the namespace is the formal namespace of a module procedure symbol in a submodule, which does not have its parent set. To add a bit of context, in submodules, module procedures inherited from their parent module have two different namespaces holding their dummy arguments. The first one is generated by the host association of the module from the .mod file, and is made accessible in the procedure symbol's formal_ns field. Its parent field is not set. The second one is generated by the parser and contains the procedure implementation. It's accessible from the list of contained procedures in the submodule namespace. Its parent field is set. This change modifies gfc_get_procedure_ns to favor the parser-generated namespace in the submodule case where there are two namespaces to choose from. PR fortran/122046 gcc/fortran/ChangeLog: * symbol.cc (gfc_get_procedure_ns): Try to find the namespace among the list of contained namespaces before returning the value from the formal_ns field. gcc/testsuite/ChangeLog: * gfortran.dg/submodule_34.f90: New test.	2025-09-24 21:02:48 +02:00
Andrew Pinski	966cdec2b2	gimple-fold/fab: Move ASSUME_ALIGNED handling to gimple-fold [PR121762] This is the next patch in the series of removing fab. This one is simpler than builtin_constant_p because the only time we want to simplify this builtin is at the final folding step. Note align-5.c needs to change slightly as __builtin_assume_aligned is no longer taken into account for the same reason as why PR 111875 is closed as invalid and why the testcase is failing at -Og. I added a new testcase align-5a.c where the pointer is explicitly aligned so that the check is gone there. Note __builtin_assume_aligned should really be instrumented for UBSAN; I filed PR 122038 for that. Bootstrapped and tested on x86_64-linux-gnu. PR tree-optimization/121762 gcc/ChangeLog: * gimple-fold.cc (gimple_fold_builtin_assume_aligned): New function. (gimple_fold_builtin): Call gimple_fold_builtin_assume_aligned for BUILT_IN_ASSUME_ALIGNED. * tree-ssa-ccp.cc (pass_fold_builtins::execute): Remove handling of BUILT_IN_ASSUME_ALIGNED. gcc/testsuite/ChangeLog: * c-c++-common/ubsan/align-5.c: Update as __builtin_assume_aligned is no longer taken into account. * c-c++-common/ubsan/align-5a.c: New test. Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>	2025-09-24 10:45:31 -07:00
Jennifer Schmitz	0088e4a419	AArch64: Enable dispatch scheduling for Neoverse V2. This patch adds dispatch constraints for Neoverse V2 and illustrates the steps necessary to enable dispatch scheduling for an AArch64 core. The dispatch constraints are based on section 4.1 of the Neoverse V2 SWOG. Please note that the values used here deviate slightly from the current SWOG version but are based on correct numbers. Arm will do an official Neoverse V2 SWOG release with the updated values in due time. Here are the steps how we implemented the dispatch constraints for Neoverse V2: 1. We used instruction attributes to group instructions into dispatch groups, corresponding to operations that utilize a certain pipeline type. For that, we added a new attribute (neoversev2_dispatch) with values for the different dispatch groups. The values of neoversev2_dispatch are determined using expressions of other instruction attributes. For example, the SWOG describes a constraint of "Up to 4 uOPs utilizing the M pipelines". Thus, one of the values of neoversev2_dispatch is "m" and it groups instructions that use the M pipelines such as integer multiplication. Note that we made some minor simplifications compared to the information in the SWOG, because the instruction annotation does not allow for a fully accurate mapping of instructions to utilized pipelines. To give one example, the instructions IRG and LDG are both tagged with "memtag", but IRG uses the M pipelines, while LDG uses the L pipelines. 2. In the Neoverse V2 tuning model, we added an array of available slots per dispatch constraint and a callback function that takes an insn as input and returns a vector of pairs (a, b) where a is an index in the array of slots and b is the number of occupied slots. The callback function calls get_attr_neoversev2_dispatch(insn) and switches over the result values to create a vector of occupied slots. 
Thus, the new attribute neoversev2_dispatch provides a compact way to define the dispatch constraints. The array of available slots, its length, and a pointer to the callback function are collected in a struct dispatch_constraint_info which is referenced in the tune_params. 3. We enabled dispatch scheduling for Neoverse V2 by adding the AARCH64_EXTRA_TUNE_DISPATCH_SCHED tune flag. Performance evaluation showed no regression in several different workloads including SPEC2017 and GROMACS2024. Thank you, Tamar, for helping with performance evaluation. The patch was bootstrapped and tested on aarch64-linux-gnu, no regression. Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com> gcc/ChangeLog: * config/aarch64/aarch64.md: Include neoversev2.md. * config/aarch64/tuning_models/neoversev2.h: Enable dispatch scheduling and add dispatch constraints. * config/aarch64/neoversev2.md: New file and new instruction attribute neoversev2_dispatch.	2025-09-24 16:18:35 +02:00
Jennifer Schmitz	c8bd7b2d55	AArch64: Implement target hooks for dispatch scheduling. This patch adds dispatch scheduling for AArch64 by implementing the two target hooks TARGET_SCHED_DISPATCH and TARGET_SCHED_DISPATCH_DO. The motivation for this is that cores with out-of-order processing do most of the reordering to avoid pipeline hazards on the hardware side using large reorder buffers. For such cores, rather than scheduling around instruction latencies and throughputs, the compiler should aim to maximize the utilized dispatch bandwidth by inserting a certain instruction mix into the frontend dispatch window. In the following, we will describe the overall implementation: Recall that the Haifa scheduler makes the following 6 types of queries to a dispatch scheduling model: 1) targetm.sched.dispatch (NULL, IS_DISPATCH_ON) 2) targetm.sched.dispatch_do (NULL, DISPATCH_INIT) 3) targetm.sched.dispatch (insn, FITS_DISPATCH_WINDOW) 4) targetm.sched.dispatch_do (insn, ADD_TO_DISPATCH_WINDOW) 5) targetm.sched.dispatch (NULL, DISPATCH_VIOLATION) 6) targetm.sched.dispatch (insn, IS_CMP) For 1), we created the new tune flag AARCH64_EXTRA_TUNE_DISPATCH_SCHED. For 2-5), we modeled dispatch scheduling using the class dispatch_window. A dispatch_window object represents the window of operations that is dispatched per cycle. It contains the two arrays max_slots and free_slots (the length of the arrays is the number of dispatch constraints specified for a core) to keep track of the available slots. The dispatch_window class exposes functions to ask whether a given instruction would fit into the dispatch_window or to add an instruction to the window. The model operates using only one dispatch_window object that is constructed when 2) is called. Upon construction, it copies the number of available slots given in the tuning model (more details on the changes to tune_params below). During scheduling, instructions are added according to the dispatch constraints. 
For that, the dispatch_window queries the tuning model using a callback function that takes an insn as input and returns a vector of pairs (a, b), where a is the index of the constraint and b is the number of slots occupied. The dispatch_window checks if the instruction fits into the current window. If not, i.e. the current window is full, the free_slots array is reset to max_slots. Then the dispatch_window deducts b slots from free_slots[a] for each pair (a, b) in the vector returned by the callback. A dispatch violation occurs when the number of free slots becomes negative for any dispatch_constraint. For 6), return false (see comment in aarch64-sched-dispatch.cc). Dispatch information for a core can be added in its tuning model. We added the new field dispatch_constraint to the struct tune_params that holds a pointer to a struct dispatch_constraints_info. All current tuning models were initialized with nullptr. (In the next patch, dispatch information will be added for Neoverse V2.) The patch was bootstrapped and tested on aarch64-linux-gnu, no regression. Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com> gcc/ChangeLog: config.gcc: Add aarch64-sched-dispatch.o to extra_objs. * config/aarch64/aarch64-protos.h (struct tune_params): New field for dispatch scheduling. (struct dispatch_constraint_info): New struct for dispatch scheduling. * config/aarch64/aarch64-tuning-flags.def (AARCH64_EXTRA_TUNING_OPTION): New flag to enable dispatch scheduling. * config/aarch64/aarch64.cc (TARGET_SCHED_DISPATCH): Implement target hook. (TARGET_SCHED_DISPATCH_DO): Likewise. (aarch64_override_options_internal): Add check for definition of dispatch constraints if dispatch-scheduling tune flag is set. * config/aarch64/t-aarch64: Add aarch64-sched-dispatch.o. * config/aarch64/tuning_models/a64fx.h: Initialize fields for dispatch scheduling in tune_params. * config/aarch64/tuning_models/ampere1.h: Likewise. * config/aarch64/tuning_models/ampere1a.h: Likewise. 
* config/aarch64/tuning_models/ampere1b.h: Likewise. * config/aarch64/tuning_models/cortexa35.h: Likewise. * config/aarch64/tuning_models/cortexa53.h: Likewise. * config/aarch64/tuning_models/cortexa57.h: Likewise. * config/aarch64/tuning_models/cortexa72.h: Likewise. * config/aarch64/tuning_models/cortexa73.h: Likewise. * config/aarch64/tuning_models/cortexx925.h: Likewise. * config/aarch64/tuning_models/emag.h: Likewise. * config/aarch64/tuning_models/exynosm1.h: Likewise. * config/aarch64/tuning_models/fujitsu_monaka.h: Likewise. * config/aarch64/tuning_models/generic.h: Likewise. * config/aarch64/tuning_models/generic_armv8_a.h: Likewise. * config/aarch64/tuning_models/generic_armv9_a.h: Likewise. * config/aarch64/tuning_models/neoverse512tvb.h: Likewise. * config/aarch64/tuning_models/neoversen1.h: Likewise. * config/aarch64/tuning_models/neoversen2.h: Likewise. * config/aarch64/tuning_models/neoversen3.h: Likewise. * config/aarch64/tuning_models/neoversev1.h: Likewise. * config/aarch64/tuning_models/neoversev2.h: Likewise. * config/aarch64/tuning_models/neoversev3.h: Likewise. * config/aarch64/tuning_models/neoversev3ae.h: Likewise. * config/aarch64/tuning_models/olympus.h: Likewise. * config/aarch64/tuning_models/qdf24xx.h: Likewise. * config/aarch64/tuning_models/saphira.h: Likewise. * config/aarch64/tuning_models/thunderx.h: Likewise. * config/aarch64/tuning_models/thunderx2t99.h: Likewise. * config/aarch64/tuning_models/thunderx3t110.h: Likewise. * config/aarch64/tuning_models/thunderxt88.h: Likewise. * config/aarch64/tuning_models/tsv110.h: Likewise. * config/aarch64/tuning_models/xgene1.h: Likewise. * config/aarch64/aarch64-sched-dispatch.cc: New file for dispatch scheduling for aarch64. * config/aarch64/aarch64-sched-dispatch.h: New header file.	2025-09-24 16:18:28 +02:00
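The slot bookkeeping described in c8bd7b2d55 can be modeled in a few lines. This is only a sketch of the behaviour as the commit message states it, not the code in aarch64-sched-dispatch.cc: the class name, the method names, and the `slot_usage` alias are hypothetical, and the sketch assumes each constraint index appears at most once per instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the vector of pairs (a, b) returned by the
// tuning-model callback: a = constraint index, b = slots occupied.
using slot_usage = std::vector<std::pair<std::size_t, int>>;

// Hypothetical model of one dispatch window; not the GCC class itself.
class dispatch_window_sketch {
public:
    // Copy the per-constraint slot counts, as done when the window is built.
    explicit dispatch_window_sketch(std::vector<int> max_slots)
        : max_slots_(std::move(max_slots)), free_slots_(max_slots_) {}

    // Would the instruction fit into the current window?
    bool fits(const slot_usage &usage) const {
        for (const auto &[index, count] : usage) {
            assert(index < free_slots_.size());
            if (free_slots_[index] < count) return false;
        }
        return true;
    }

    // Add an instruction: open a fresh window if it does not fit,
    // then deduct b slots from free_slots[a] for each pair (a, b).
    void add(const slot_usage &usage) {
        if (!fits(usage)) free_slots_ = max_slots_;
        for (const auto &[index, count] : usage) free_slots_[index] -= count;
    }

    // A violation occurs when any constraint has a negative free-slot count.
    bool has_violation() const {
        for (int free : free_slots_)
            if (free < 0) return true;
        return false;
    }

    int free_slots(std::size_t index) const { return free_slots_.at(index); }

private:
    std::vector<int> max_slots_;   // declared first so free_slots_ can copy it
    std::vector<int> free_slots_;
};
```

With two constraints of 4 and 2 slots, adding an instruction that uses 3 slots of constraint 0 leaves 1 free. A following instruction needing 2 slots of constraint 0 does not fit, so the window is reset before the deduction. An instruction needing 3 slots of constraint 1 can never fit and leaves that constraint negative, which the sketch reports as a violation.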
Jennifer Schmitz	cb80cdbef4	AArch64: Annotate SVE instructions with new instruction attribute. In this patch, we add the new instruction attribute "sve_type" and use it to annotate the SVE instructions in aarch64-sve.md and aarch64-sve2.md. This allows us to use instruction attributes to group instructions into dispatch groups for dispatch scheduling. While there had already been fine-grained annotation of scalar and neon instructions (mostly using the "type"-attribute), annotation was missing for SVE instructions. The values of the attribute "sve_type" are comparatively coarse-grained, but fulfill the two criteria we aimed for with regard to dispatch scheduling: - the annotation allows the definition of CPU-specific high-level attributes mapping instructions to dispatch constraints - the annotation is by itself CPU-independent and consistent, i.e. all instructions fulfilling certain criteria are tagged with the corresponding value The patch was bootstrapped and tested on aarch64-linux-gnu, no regression. Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com> gcc/ChangeLog: * config/aarch64/aarch64-sve.md: Annotate instructions with attribute sve_type. * config/aarch64/aarch64-sve2.md: Likewise. * config/aarch64/aarch64.md (sve_type): New attribute sve_type. * config/aarch64/iterators.md (sve_type_unspec): New int attribute. (sve_type_int): New code attribute. (sve_type_fp): New int attribute.	2025-09-24 16:18:17 +02:00
Luc Grosheintz	41c95d5e53	libstdc++: Move test for __cpp_lib_not_fn to version.cc When running the tests without pre-compiled headers (--disable-libstdcxx-pch), the test fails, because the feature testing macro (FTM) isn't defined yet. This commit moves checking the FTM to a dedicated file (version.cc) that's run without PCH. libstdc++-v3/ChangeLog: * testsuite/20_util/function_objects/not_fn/nttp.cc: Move test of feature testing macro to version.cc * testsuite/20_util/function_objects/not_fn/version.cc: New test. Reviewed-by: Jonathan Wakely <jwakely@redhat.com> Co-authored-by: Tomasz Kamiński <tkaminsk@redhat.com> Signed-off-by: Luc Grosheintz <luc.grosheintz@gmail.com> Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>	2025-09-24 14:35:19 +02:00
Richard Biener	1f6b1ed047	tree-optimization/116816 - improve VMAT_ELEMENTWISE with SLP The following implements VMAT_ELEMENTWISE for grouped loads, in particular for being able to serve as fallback for unhandled load permutations since it's trivial to load elements in the correct order. PR tree-optimization/116816 * tree-vect-stmts.cc (get_load_store_type): Allow multi-lane single-element interleaving to fall back to VMAT_ELEMENTWISE. Fall back to VMAT_ELEMENTWISE when we cannot handle a load permutation. (vectorizable_load): Do not check a load permutation for VMAT_ELEMENTWISE. Handle grouped loads with VMAT_ELEMENTWISE and directly apply a load permutation.	2025-09-24 13:33:02 +02:00
Richard Biener	191d8b846c	Fix get_load_store_type wrt VMAT_ELEMENTWISE classification We may not classify a BB vectorization load as VMAT_ELEMENTWISE as that will ICE. Instead we build vectors from existing scalar loads. Make that explicit. * tree-vect-stmts.cc (get_load_store_type): Explicitly fail when we end up with VMAT_ELEMENTWISE for BB vectorization.	2025-09-24 13:33:02 +02:00

223745 Commits