Commit Graph

227575 Commits

Author SHA1 Message Date
Richard Sandiford
783e7e1fd4 Revert two changes in r16-7265-ga9e48eca3a6eef [PR118608]
Sorry to be awkward, but I'd like to revert the rtlanal.cc and
config/mips/mips.md parts of r16-7265-ga9e48eca3a6eef.  I think
the expr.cc part of that patch is enough to fix the bug.  The other
parts seem unnecessary and are likely to regress code quality on MIPS
compared to previous releases.  (See the testing below for examples.)

The rtlanal.cc part added the following code to truncated_to_mode:

  /* This explicit TRUNCATE may be needed on targets that require
     MODE to be suitably extended when stored in X.  Targets such as
     mips64 use (sign_extend:DI (truncate:SI (reg:DI x))) to perform
     an explicit extension, avoiding use of (subreg:SI (reg:DI x))
     which is assumed to already be extended.  */
  scalar_int_mode imode, omode;
  if (is_a <scalar_int_mode> (mode, &imode)
      && is_a <scalar_int_mode> (GET_MODE (x), &omode)
      && targetm.mode_rep_extended (imode, omode) != UNKNOWN)
    return false;

I think this has two problems.  The first is that mode_rep_extended
describes a canonical form that is obtained by correctly honouring
TARGET_TRULY_NOOP_TRUNCATION.  It is not an independent restriction
on what RTL optimisers can do.  If we need to disable an optimisation
on MIPS-like targets, the restrictions should be based on
TARGET_TRULY_NOOP_TRUNCATION instead.

The second problem is that, although the comment treats MIPS-like
DI->SI truncation as a special case, truncated_to_mode is specifically
written for such cases.  The comment above the function says:

/* Suppose that truncation from the machine mode of X to MODE is not a
   no-op.  See if there is anything special about X so that we can
   assume it already contains a truncated value of MODE.  */

Thus we're already in the realm of MIPS-like truncations that need
TRUNCATE rather than SUBREG (and that in turn guarantee sign-extension
in some cases).  It's the caller that checks for that condition:

	  && (TRULY_NOOP_TRUNCATION_MODES_P (mode, GET_MODE (op))
	      || truncated_to_mode (mode, op)))

So I think the patch has the effect of disabling exactly the kind of
optimisation that truncated_to_mode is supposed to provide.

truncated_to_mode makes an implicit assumption that sign-extension is
enough to allow a SUBREG to be used in place of a TRUNCATE.  This is
true for MIPS and was true for the old SH64 port.  I don't know whether
it's true for gcn and nvptx, although I assume that it must be, since
no-one seems to have complained.  However, it would not be true for a
port that required zero rather than sign extension (which AFAIK we've
never had).

It's probably worth noting that this assumption is in the opposite
direction from what mode_rep_extended describes.  mode_rep_extended
says that "proper" truncation leads to a guarantee of sign extension.
truncated_to_mode assumes that sign extension avoids the need for
"proper" truncation.  On MIPS, the former is only true for truncation
from 64 bits to 32 bits, whereas the latter is true for all cases (such
as 64 bits to 16 bits).

And that feeds into the mips.md change in r16-7265-ga9e48eca3a6eef.
The change was:

 (define_insn_and_split "*extenddi_truncate<mode>"
   [(set (match_operand:DI 0 "register_operand" "=d")
        (sign_extend:DI
-           (truncate:SHORT (match_operand:DI 1 "register_operand" "d"))))]
+           (truncate:SUBDI (match_operand:DI 1 "register_operand" "d"))))]
   "TARGET_64BIT && !TARGET_MIPS16 && !ISA_HAS_EXTS"

The old :SHORT pattern existed because QI and HI values are only
guaranteed to be sign-extensions of bit 31 of the register, not bits
7 or 15 (respectively).  Thus we have the worst of both worlds:

(1) truncation from DI is not a nop.  It requires a left shift by
    at least 32 bits and a right shift by the same amount.

(2) sign extension to DI is not a nop.  It requires a left shift and
    a right shift in the normal way (by 56 bits for QI and 48 bits
    for HI).

So a separate truncation and extension would yield four shifts.
The pattern above exists to reduce this to two shifts, since (2)
subsumes (1).
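
As a rough illustration of why (2) subsumes (1), the shift pairs can be
modelled in plain C++.  This is only a sketch: the function names are
made up for the example, it is not code from the MIPS backend, and it
assumes that >> on a negative signed value is an arithmetic shift (as
GCC provides, and as C++20 requires).

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low `bits` bits of `x` to 64 bits with a left shift
// followed by an arithmetic right shift of the same amount.
inline std::int64_t sign_extend_low(std::uint64_t x, unsigned bits) {
  assert(bits >= 1 && bits <= 64);
  const unsigned shift = 64 - bits;
  return static_cast<std::int64_t>(x << shift) >> shift;
}

// Hypothetical model of (1): after a DI->SI truncation the register
// holds a sign extension of bit 31 (shift left by 32, right by 32).
inline std::int64_t truncate_di_to_si(std::uint64_t x) {
  return sign_extend_low(x, 32);
}

// Hypothetical model of (1) followed by (2) for QI: four shifts.
inline std::int64_t truncate_then_extend_qi(std::uint64_t x) {
  return sign_extend_low(static_cast<std::uint64_t>(truncate_di_to_si(x)), 8);
}
```

In this model the four-shift sequence gives the same result as the
single 56-bit shift pair sign_extend_low (x, 8), which is the saving
that the combined pattern is after.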

But the :SI case is different:

(1) truncation from DI is not a nop.  It requires a left shift by 32
    and a right shift by 32, as above.

(2) sign extension from SI to DI is a nop.

(2) is implemented by:

;; When TARGET_64BIT, all SImode integer and accumulator registers
;; should already be in sign-extended form (see TARGET_TRULY_NOOP_TRUNCATION
;; and truncdisi2).  We can therefore get rid of register->register
;; instructions if we constrain the source to be in the same register as
;; the destination.
;;
;; Only the pre-reload scheduler sees the type of the register alternatives;
;; we split them into nothing before the post-reload scheduler runs.
;; These alternatives therefore have type "move" in order to reflect
;; what happens if the two pre-reload operands cannot be tied, and are
;; instead allocated two separate GPRs.  We don't distinguish between
;; the GPR and LO cases because we don't usually know during pre-reload
;; scheduling whether an operand will be LO or not.
(define_insn_and_split "extendsidi2"
  [(set (match_operand:DI 0 "register_operand" "=d,l,d")
        (sign_extend:DI (match_operand:SI 1 "nonimmediate_operand" "0,0,m")))]
  "TARGET_64BIT"
  "@
   #
   #
   lw\t%0,%1"
  "&& reload_completed && register_operand (operands[1], VOIDmode)"
  [(const_int 0)]
{
  emit_note (NOTE_INSN_DELETED);
  DONE;
}
  [(set_attr "move_type" "move,move,load")
   (set_attr "mode" "DI")])

So extending the first pattern above from :SHORT to :SUBDI is not really
an optimisation, in the sense that it doesn't add new information.
Not providing the combination allows the truncation or sign-extension
to be optimised with surrounding code.

I suppose the argument in favour of going from :SHORT to :SUBDI is
that it might avoid a move in some cases.  But (a) I think that would
need to be measured further, (b) it might instead mean that the
extendsidi2 pattern needs to be tweaked for modern RA choices,
and (c) it doesn't really feel like stage 4 material.

I can understand where the changes came from.  The output of combine
was clearly wrong before r16-7265-ga9e48eca3a6eef.  And what combine
did looked bad.  But I don't think combine itself did anything wrong.
IMO, all it did was expose the problems in the existing RTL.  Expand
dropped a necessary sign-extension and the rest flowed from there.

In particular, the old decisions based on truncated_to_mode seemed
correct.  The thing that the truncated_to_mode patch changed was the
assumption that a 64-bit register containing a "u16 lower" parameter
could be truncated with a SUBREG.  And that's true, since it's
guaranteed by the ABI.  The parameter is zero-extended from bit 16
and so the register contains a sign extension of bit 16 (i.e. 0).
And that was the information that truncated_to_mode was using.
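
The property being relied on can be checked with a small C++ model.
This is a sketch under stated assumptions: is_sign_extended_from is a
name invented for the example, the register is modelled as a plain
64-bit integer, and >> on a negative signed value is assumed to be an
arithmetic shift.

```cpp
#include <cassert>
#include <cstdint>

// True if `reg` equals the sign extension of its own low `bits` bits,
// i.e. every bit above bit (bits - 1) is a copy of bit (bits - 1).
inline bool is_sign_extended_from(std::uint64_t reg, unsigned bits) {
  assert(bits >= 1 && bits <= 64);
  const unsigned shift = 64 - bits;
  const std::int64_t extended =
      static_cast<std::int64_t>(reg << shift) >> shift;
  return static_cast<std::uint64_t>(extended) == reg;
}
```

For a zero-extended u16 such as 0xffff, the check holds for 17 bits
(bit 16 is 0 and so is everything above it) and therefore for 32 bits,
but not for 16 bits, since bit 15 is set while the upper bits are clear.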

I tested the patch on mips64-linux-gnu (all 3 ABIs).  The patch fixes
regressions in:

- gcc.target/mips/octeon-exts-7.c (n32 & 64)
- gcc.target/mips/truncate-1.c (n32 & 64)
- gcc.target/mips/truncate-2.c (n32)
- gcc.target/mips/truncate-6.c (64)

gcc/
	PR middle-end/118608
	* rtlanal.cc (truncated_to_mode): Revert a change made on 2026-02-03.
	* config/mips/mips.md (*extenddi_truncate<mode>): Likewise.
2026-03-09 08:38:31 +00:00
Takayuki 'January June' Suwa
1ad01d1aa2 xtensa: Make use of compact insn definition syntax more
The remaining MD templates with multiple alternatives will also be re-
written using compact syntax.

gcc/ChangeLog:

	* config/xtensa/xtensa.md (movdi_internal, movdf_internal, *btrue,
	*ubtrue, movsicc_internal0, movsicc_internal1, movsfcc_internal0,
	movsfcc_internal1):
	Rewrite in compact syntax.
2026-03-09 00:39:14 -07:00
GCC Administrator
debb3d46e4 Daily bump. 2026-03-09 00:16:25 +00:00
Andrew Pinski
8ee479c278 riscv: Fix formatting of diagnostic [PR124403]
Just some small formatting of the diagnostic is required here.
A missing space after the semicolon. And move "must" out of the quotes.

Pushed as obvious after a quick build and test for riscv64-linux-gnu.

	PR target/124403

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_get_vls_cc_attr): Fix formatting
	of the diagnostic.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2026-03-08 14:29:12 -07:00
Jose E. Marchesi
891d1d0fcd a68: fix calls to strtol and strtoll [PR algol68/124372]
This commit fixes the following problems related to parsing integer
and bits denotations:

1. strtou?l should be used only if long is 64 bits wide.  Otherwise,
   use strtou?ll.

2. Use unsigned conversions for bits denotations radix, for
consistency.
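
The selection in item 1 can be sketched in standalone C++ as below.
This is an illustration only, not the a68 front-end code: the helper
names are hypothetical, and it tests the width of long with sizeof
rather than the INT64_T_IS_LONG macro mentioned in the ChangeLog.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <cstdlib>

// Parse a signed 64-bit value, using strtol only when long is wide
// enough to hold it and strtoll otherwise.
inline std::int64_t parse_int64(const char *text, int base) {
  assert(text != nullptr);
  if (sizeof(long) * CHAR_BIT >= 64)
    return std::strtol(text, nullptr, base);
  return std::strtoll(text, nullptr, base);
}

// Unsigned counterpart, as would suit bits denotations.
inline std::uint64_t parse_bits64(const char *text, int base) {
  assert(text != nullptr);
  if (sizeof(long) * CHAR_BIT >= 64)
    return std::strtoul(text, nullptr, base);
  return std::strtoull(text, nullptr, base);
}
```

On a target with a 32-bit long (such as i686-linux-gnu) the second
branch is taken, so 64-bit denotations are not clamped to LONG_MAX.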

Tested in i686-linux-gnu and x86_64-linux-gnu.

Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>

gcc/algol68/ChangeLog

	PR algol68/124372
	* a68-low-units.cc (a68_lower_denotation): Call to strtoull if
	INT64_T_IS_LONG is not defined, strtol otherwise.
	* a68-parser-scanner.cc (get_next_token): Use strtoul for radix
	instead of strtol.
2026-03-08 19:47:37 +01:00
GCC Administrator
6c5de6335f Daily bump. 2026-03-08 00:16:27 +00:00
Jason Merrill
fcb78dea6b c++/modules: fix -MG for header units [PR123622]
With -MG we should allow a nonexistent header unit, as we do with a
nonexistent #include.  But still import it if available.

	PR c++/123622

gcc/cp/ChangeLog:

	* module.cc (preprocess_module): Check deps.missing_files.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/dep-6.C: New test.

Co-authored-by: <mtxn@duck.com>
2026-03-07 17:38:41 -05:00
Sandra Loosemore
c600bb7aef doc: Move specs documentation to GCC internals manual [PR69367] [PR69849]
The description of specs should have ended up in the GCC internals
manual instead of the user-facing documentation when the two manuals
were split many years ago.

gcc/ChangeLog
	PR driver/69367
	PR driver/69849
	* Makefile.in (TEXI_GCCINT_FILES): Add specs.texi.
	* doc/gccint.texi: Include it.
	* doc/install.texi: Fix cross-references.
	* doc/invoke.texi: Likewise.
	(Option Summary): Reclassify -specs/--specs as a developer option.
	(Overall Options): Move -specs= documentation to...
	(Developer Options): ...here.
	(Spec Files): Move entire section to....
	* doc/specs.texi: ....new file.
	* common.opt.urls: Regenerated.
2026-03-07 22:23:05 +00:00
Eric Botcazou
a17b22cfa9 Ada: adjust pattern matching to new stack probes on x86/Linux
gcc/ada/
	PR target/124336
	* init.c (__gnat_adjust_context_for_raise) [x86/Linux]: Fix typo.
2026-03-07 22:47:15 +01:00
François Dumont
698a6af5dc libstdc++: [_GLIBCXX_DEBUG][__cplusplus >= 201103L] Remove useless workaround
Starting with C++11 we rely on template parameter requirements to prevent
instantiation of methods taking iterators with invalid types.
So the _GLIBCXX_DEBUG mode does not need to check for potential ambiguity between
integer type and iterator type anymore.

libstdc++-v3/ChangeLog:

	* include/debug/functions.h [__cplusplus >= 201103L]
	(__foreign_iterator_aux): Remove.
	(__foreign_iterator): Adapt to use __foreign_iterator_aux2.
	* include/debug/helper_functions.h [__cplusplus >= 201103L]:
	Remove include bits/cpp_type_traits.h.
	(_Distance_traits<_Integral, std::__true_type>): Remove.
	(__valid_range_aux(_Integral, _Integral, std::__true_type)):
	Remove.
	(__valid_range_aux(_Iterator, _Iterator, std::__false_type)): Remove.
	(__valid_range_aux(_Integral, _Integral, _Distance_traits<_Integral>::__type&,
	std::__true_type)): Remove.
	(__valid_range_aux(_Iterator, _Iterator, _Distance_traits<_Iterator>::__type&,
	std::__false_type)): Remove.
	(__valid_range(_Iterator, _Iterator)): Adapt.
	(__valid_range(_Iterator, _Iterator, _Distance_traits<_Iterator>::__type&)): Adapt.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
2026-03-07 22:04:19 +01:00
Marek Polacek
9f99ee9d10 c++: add fixed test [PR39057]
This was fixed by r16-6725 and we no longer crash.  The error is
expected.

	PR c++/39057

gcc/testsuite/ChangeLog:

	* g++.dg/template/friend89.C: New test.
2026-03-07 12:30:18 -05:00
Eric Botcazou
0fdaf0eb61 Ada: adjust pattern matching to new stack probes on x86/Linux
This fixes the couple of ACATS regressions introduced by the change:

                === acats tests ===
FAIL:  c52103x
FAIL:  c52104x

gcc/ada/
	PR target/124336
	* init.c (__gnat_adjust_context_for_raise) [x86/Linux]: Adjust
	pattern matching to new stack probes.
2026-03-07 15:11:18 +01:00
Jørgen Kvalsvik
b1859c2a3d Improve speed of masking table algorithm for MC/DC
The masking table was computed by considering the cartesian product of
incoming edges, ordering the pairs, and doing upwards BFS searches
from the successors of the lower topologically index'd ones (higher in
the graph). The problem with this approach is that all the nodes we
find from the higher candidates would also be found from the lower
candidates, and since we want to collect the set intersection, any
higher candidate would be dominated by lower candidates.

We need only consider adjacent elements in the sorted set of
candidates.  This has a dramatic performance impact for large
functions.  The worst case is expressions of the form (x && y && ...)
and (x || y || ...) with up to 64 elements. I did a wallclock
comparison of the full analysis phase (including emitting the GIMPLE):

test.c:
    int fn1 (int a[])
    {
      (a[0] && a[1] && ...) // 64 times
      (a[0] && a[1] && ...) // 64 times
      ...                   // 500 times
    }

    int main ()
    {
      int a[64];
      for (int i = 0; i != 10000; ++i)
        {
          for (int k = 0; k != 64; ++k)
            a[k] = i % k;
          fn1 (a);
        }
    }

Without this patch:
    fn1 instrumented in 20822.303 ms (41.645 ms per expression)

With this patch:
    fn1 instrumented in 1288.548  ms (2.577  ms per expression)

I also tried considering terms left-to-right and, whenever the search
found an already-processed expression it would stop the search and
just insert its complete table entry, but this had no measurable
impact on compile time, and the result was a slightly more complicated
function.

This inefficiency went unnoticed for a while, because these
expressions aren't very common.  The most I've seen in the wild is 27
conditions, and that involved a lot of nested expressions which aren't
impacted as much.

gcc/ChangeLog:

	* tree-profile.cc (struct conds_ctx): Add edges.
	(topological_src_cmp): New function.
	(masking_vectors): New search strategy.
2026-03-07 12:45:17 +01:00
Robin Dapp
2a155ceffe cse: Only use non-reg vec_select simplifications. [PR121649]
When merging classes, cse computes new equivalences for constants.
In the PR we have

  (insn 1173 1172 1174 2 (set (reg:V8QI 33 v1)
         (const_vector:V8QI [
                 (const_int 3 [0x3])
                 (const_int -4 [0xfffffffffffffffc])
                 (const_int 0 [0]) repeated x6
             ])) "pr121649.c":63:3 1325 {*aarch64_simd_movv8qi}
      (nil))

of which the second element is selected:

  (insn 1178 1177 1179 2 (set (reg:QI 4 x4)
          (vec_select:QI (reg:V8QI 33 v1)
              (parallel [
                      (const_int 1 [0x1])
                  ]))) "pr121649.c":63:3 2968 {aarch64_get_lanev8qi}
       (expr_list:REG_EQUAL (const_int -4 [0xfffffffffffffffc])
          (nil)))

We find (const_int 3 [0x3]) and a few others to be equivalent, among
them (reg:QI v1).  This is a "fake set" that we create to help CSE extract
const_vector elements and reuse them.  Element 0 is special, though.
We lowpart-subreg simplify it to (reg:QI v1) directly and, as the register
stays the same, consider it equivalent to (reg:V8QI v1).

Because both equivs refer to the same hard reg, in merge_equiv_classes, the
old (reg:V8QI) equiv is deleted and replaced by the new (reg:QI) one,
forgetting that the old equiv had 7 more elements.
Subsequently, extracting element 1 of a zero-extended QImode register results
in "0" instead of the correct "-4".
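
The effect can be reproduced with a toy model of the register contents.
This is a sketch only: pack and lane are names invented for the example,
the V8QI register is modelled as a little-endian packed 64-bit integer,
and none of this is cse.cc code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Pack eight QI lanes into one 64-bit image, lane 0 in the low byte.
inline std::uint64_t pack(const std::array<std::int8_t, 8> &v) {
  std::uint64_t reg = 0;
  for (unsigned i = 0; i < 8; ++i)
    reg |= static_cast<std::uint64_t>(static_cast<std::uint8_t>(v[i]))
           << (8 * i);
  return reg;
}

// Extract lane `i` of the packed image as a signed QI value.
inline std::int8_t lane(std::uint64_t reg, unsigned i) {
  assert(i < 8);
  return static_cast<std::int8_t>((reg >> (8 * i)) & 0xff);
}
```

With the vector from insn 1173, lane (pack (v), 1) is -4.  If the
register is instead believed to hold only the zero-extended QI value 3,
the same lane extraction yields 0, which matches the wrong result seen
in the PR.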

Therefore, this patch only uses those vec_select simplifications that do
not directly result in a register.

	PR rtl-optimization/121649

gcc/ChangeLog:

	* cse.cc (find_sets_in_insn):  Only use non-reg vec_select
	simplifications.

gcc/testsuite/ChangeLog:

	* gcc.dg/torture/pr121649.c: New test.
2026-03-07 11:56:32 +01:00
Martin Uecker
4edd2957ad c: Fix ICE related to tags and hardbool attribute [PR123856]
The hardbool attribute creates special enumeration types,
but the tag is not set correctly, which causes broken diagnostics
and an ICE with the new helper function to get the tag.

	PR c/123856

gcc/c-family/ChangeLog:
	* c-attribs.cc (handle_hardbool_attribute): Fix TYPE_NAME.

gcc/testsuite/ChangeLog:
	* gcc.dg/pr123856.c: New test.
2026-03-07 10:44:39 +01:00
GCC Administrator
381af4e29b Daily bump. 2026-03-07 00:16:29 +00:00
David Malcolm
c2c64cfcd0 testsuite: fix ICEs in analyzer plugin with CPython >= 3.11 [PR107646,PR112520]
In GCC 14 the testsuite gained a plugin that "teaches" the analyzer
about the CPython API, trying to find common mistakes:
  https://gcc.gnu.org/wiki/StaticAnalyzer/CPython

Unfortunately, this has been crashing for more recent versions of
CPython.

Specifically, in Python 3.11, PyObject's ob_refcnt was moved to an
anonymous union (as part of PEP 683 "Immortal Objects, Using a Fixed
Refcount").  The plugin attempts to find the field but fails, but has
no error-handling, leading to a null pointer dereference.

Also, https://github.com/python/cpython/pull/101292 moved the "ob_digit"
from struct _longobject to a new field long_value of a new
struct _PyLongValue, leading to similar analyzer crashes when not
finding the field.

The following patch fixes this by
* looking within the anonymous union for the ob_refcnt field if it can't
find it directly
* gracefully handling the case of not finding "ob_digit" in PyLongObject
* doing more lookups once at plugin startup, rather than continuously on
analyzing API calls
* adding diagnostics and more error-handling to the plugin startup, so that
if it can't find something in the Python headers it emits a useful note
when disabling itself, e.g.
  cc1: note: could not find field 'ob_digit' of CPython type 'PyLongObject' {aka 'struct _longobject'}
* replacing some copy-and-pasted code with member functions of a new
"class api" (though various other cleanups could be done)

Tested with:
* CPython 3.8: all tests continue to PASS
* CPython 3.13: fixes the ICEs, 2 FAILs remain (reference counting false
negatives)

Given that this is already a large patch, I'm opting to only fix the
crashes and defer the 2 remaining FAILs and other cleanups to followup
work.

gcc/analyzer/ChangeLog:
	PR testsuite/112520
	* region-model-manager.cc
	(region_model_manager::get_field_region): Assert that the args are non-null.

gcc/testsuite/ChangeLog:
	PR analyzer/107646
	PR testsuite/112520
	* gcc.dg/plugin/analyzer_cpython_plugin.cc: Move everything from
	namespace ana:: into ana::cpython_plugin.  Move global tree values
	into a new "class api".
	(pyobj_record): Replace with api.m_type_PyObject.
	(pyobj_ptr_tree): Replace with api.m_type_PyObject_ptr.
	(pyobj_ptr_ptr): Replace with api.m_type_PyObject_ptr_ptr.
	(varobj_record): Replace with api.m_type_PyVarObject.
	(pylistobj_record): Replace with api.m_type_PyListObject.
	(pylongobj_record): Replace with api.m_type_PyLongObject.
	(pylongtype_vardecl): Replace with api.m_vardecl_PyLong_Type.
	(pylisttype_vardecl): Replace with api.m_vardecl_PyList_Type.
	(get_field_by_name): Add "complain" param and use it to issue a
	note on failure.  Assert that type and name are non-null.  Don't
	crash on fields that are anonymous unions, and special-case
	looking within them for "ob_refcnt" to work around the
	Python 3.11 change for PEP 683 (immortal objects).
	(get_sizeof_pyobjptr): Convert to...
	(api::get_sval_sizeof_PyObject_ptr): ...this.
	(init_ob_refcnt_field): Convert to...
	(api::init_ob_refcnt_field): ...this.
	(set_ob_type_field): Convert to...
	(api::set_ob_type_field): ...this.
	(api::init_PyObject_HEAD): New.
	(api::get_region_PyObject_ob_refcnt): New.
	(api::do_Py_INCREF): New.
	(api::get_region_PyVarObject_ob_size): New.
	(api::get_region_PyLongObject_ob_digit): New.
	(inc_field_val): Convert to...
	(api::inc_field_val): ...this.
	(refcnt_mismatch::refcnt_mismatch): Add tree params for refcounts
	and initialize corresponding fields.  Fix whitespace.
	(refcnt_mismatch::emit): Use stored tree values, rather than
	assuming we have constants, and crashing on non-constants.  Delete
	commented-out dead code.
	(refcnt_mismatch::foo): Delete.
	(refcnt_mismatch::m_expected_refcnt_tree): New field.
	(refcnt_mismatch::m_actual_refcnt_tree): New field.
	(retrieve_ob_refcnt_sval): Simplify using class api.
	(count_pyobj_references): Likewise.
	(check_refcnt): Likewise.  Don't warn on UNKNOWN values.  Use
	get_representative_tree for the expected and actual values and
	skip the warning if it fails, rather than assuming we have
	constants and crashing on non-constants.
	(count_all_references): Update comment.
	(kf_PyList_Append::impl_call_pre): Simplify using class api.
	(kf_PyList_Append::impl_call_post): Likewise.
	(kf_PyList_New::impl_call_post): Likewise.
	(kf_PyLong_FromLong::impl_call_post): Likewise.
	(get_stashed_type_by_name): Emit note if the type couldn't be
	found.
	(get_stashed_global_var_by_name): Likewise for globals.
	(init_py_structs): Convert to...
	(api::init_from_stashed_types): ...this.  Bail out with an error
	code if anything fails.  Look up more things at startup, rather
	than during analysis of calls.
	(ana::cpython_analyzer_events_subscriber): Rename to...
	(ana::cpython_plugin::analyzer_events_subscriber): ...this.
	(analyzer_events_subscriber::analyzer_events_subscriber):
	Initialize m_init_failed.
	(analyzer_events_subscriber::on_message<on_tu_finished>):
	Update for conversion of init_py_structs to
	api::init_from_stashed_types and bail if it fails.
	(analyzer_events_subscriber::on_message<on_frame_popped>): Don't
	run if plugin initialization failed.
	(analyzer_events_subscriber::m_init_failed): New field.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2026-03-06 18:47:05 -05:00
Patrick Palka
d0d3d4dde0 c++: ICE mangling C auto... tparm [PR124297]
After r16-7491, the constraint on a C auto... tparm is represented as a
fold-expression (in TEMPLATE_PARM_CONSTRAINTS) instead of a concept-id (in
PLACEHOLDER_TYPE_CONSTRAINTS).  So we now need to strip this fold-expression
before calling write_type_constraint, like we do in the type template
parameter case a few lines below.

	PR c++/124297

gcc/cp/ChangeLog:

	* mangle.cc (write_template_param_decl) <case PARM_DECL>:
	Strip fold-expression before calling write_type_constraint.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/concepts-variadic4.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
2026-03-06 17:59:11 -05:00
Andrew Pinski
5eecb51ad7 aarch64: Fix uint64_t[8] usage after including "arm_neon.h" [PR124126]
aarch64_init_ls64_builtins_types currently creates an array with type uint64_t[8]
and then sets the mode to V8DI. The problem here is if you used that array
type before, you would get a mode of BLK.
This causes an ICE in some cases: with the C++ front-end and -g, you would
get "type variant differs by TYPE_MODE", and in some cases, even without -g,
"canonical types differ for identical types".

The fix is to do build_distinct_type_copy of the array in aarch64_init_ls64_builtins_types
before assigning the mode to that copy. This keeps the ls64 structures correct, and
user-provided arrays are not affected when "arm_neon.h" is included.

Built and tested on aarch64-linux-gnu.

	PR target/124126

gcc/ChangeLog:

	* config/aarch64/aarch64-builtins.cc (aarch64_init_ls64_builtins_types): Copy
	the array type before setting the mode.

gcc/testsuite/ChangeLog:

	* g++.target/aarch64/pr124126-1.C: New test.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2026-03-06 14:34:36 -08:00
Joseph Myers
0c3b30c71c Update gcc fr.po
* fr.po: Update.
2026-03-06 21:40:49 +00:00
Qing Zhao
9b3c24577e Fix [PR124230]
For a pointer array reference that is annotated with counted_by attribute,
such as:

  struct annotated {
    int *c __attribute__ ((counted_by (b)));
    int b;
  };

  struct annotated *p = setup (10);
  p->c[12] = 2; //out of bound access

the IR for p->c[12] is:
  (.ACCESS_WITH_SIZE (p->c, &p->b, 0B, 4) + 48) = 2;

The current routine get_index_from_offset in c-family/c-ubsan.cc cannot
handle the integer constant offset "48" correctly.

The fix is to enhance "get_index_from_offset" to correctly handle the constant
offset.
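
The arithmetic involved is small enough to sketch in standalone C++.
This is an illustration, not the c-ubsan.cc implementation: the helper
names are hypothetical, and a 4-byte int is assumed so that the byte
offset 48 corresponds to index 12.

```cpp
#include <cassert>
#include <cstddef>

// Recover the element index from a constant byte offset.
inline std::size_t index_from_byte_offset(std::size_t byte_offset,
                                          std::size_t element_size) {
  assert(element_size != 0);
  return byte_offset / element_size;
}

// A counted_by bound check: valid indices are 0 .. count - 1.
inline bool index_in_bounds(std::size_t index, std::size_t count) {
  return index < count;
}
```

For the example above, index_from_byte_offset (48, 4) is 12, and
index_in_bounds (12, 10) is false, so p->c[12] should be diagnosed.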

	PR c/124230

gcc/c-family/ChangeLog:

	* c-ubsan.cc (get_index_from_offset): Handle the special case when
	the offset is an integer constant.

gcc/testsuite/ChangeLog:

	* gcc.dg/ubsan/pointer-counted-by-bounds-124230-char.c: New test.
	* gcc.dg/ubsan/pointer-counted-by-bounds-124230-float.c: New test.
	* gcc.dg/ubsan/pointer-counted-by-bounds-124230-struct.c: New test.
	* gcc.dg/ubsan/pointer-counted-by-bounds-124230-union.c: New test.
	* gcc.dg/ubsan/pointer-counted-by-bounds-124230.c: New test.
2026-03-06 20:39:01 +00:00
Andrew Pinski
4665987e91 c: Fix pragma inside a pragma [PR97991]
After r0-72806-gbc4071dd66fd4d, c_parser_consume_token will
assert if we get a pragma inside c_parser_consume_token but
pragma processing will call pragma_lex which then calls
c_parser_consume_token. In the case of pragma with expansion
(redefine_extname, message and sometimes pack [and some target
specific pragmas]) we get the expanded tokens that includes
CPP_PRAGMA. We should just allow it instead of doing an assert.
This follows what the C++ front-end already does, and we no longer
have an ICE.

Bootstrapped and tested on x86_64-linux-gnu.

	PR c/97991

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_consume_token): Allow
	CPP_PRAGMA if inside a pragma.

gcc/testsuite/ChangeLog:

	* c-c++-common/cpp/pr97991-1.c: New test.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2026-03-06 11:38:14 -08:00
Saurabh Jha
bef3d98610 aarch64: mingw: Fix regression in C++ support
Fixes regression in C++ support without exception handling by:
1. Moving Makefile fragment config/i386/t-seh-eh to
   config/mingw/t-seh-eh that handles C++ exception handling. This is
   sufficient to fix the regression even if the exception handling
   itself is not implemented yet.
2. Changing existing references to t-seh-eh in libgcc/config.host and
   adding it for aarch64-*-mingw*.

With these changes, the compiler can now be built with C and C++.

This doesn't add support for Structured Exception Handling (SEH)
which will be done separately.

libgcc/ChangeLog:

	* config.host: Set tmake_eh_file for aarch64-*-mingw* and update
	it for x86_64-*-mingw* and x86_64-*-cygwin*.
	* config/i386/t-seh-eh: Move to...
	* config/mingw/t-seh-eh: ...here.
	* config/aarch64/t-no-eh: Removed.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/mingw/mingw.exp: Add support for C++ files.
	* gcc.target/aarch64/mingw/minimal_new_del.C: New test.

Co-Authored-By: Evgeny Karpov <evgeny.karpov@arm.com>
2026-03-06 17:42:33 +00:00
Jakub Jelinek
2365f48836 testsuite: Add testcase for already fixed PR [PR122000]
This testcase started to be miscompiled with my r15-9131 change
on arm with -march=armv7-a -mfpu=vfpv4 -mfloat-abi=hard -O and got
fixed with r16-6548 PR121773 change.

2026-03-06  Jakub Jelinek  <jakub@redhat.com>

	PR target/122000
	* gcc.c-torture/execute/pr122000.c: New test.
2026-03-06 14:33:19 +01:00
Nathan Myers
2bfaa218b0 libstdc++: bitset _GLIBCXX_ASSERTIONS op[] fixes
C++11 forbids a compound statement, as seen in the definition
of __glibcxx_assert(), in a constexpr function. This patch
open-codes the assertion in `bitset<>::operator[] const` for
C++11 to fix a failure in `g++.old-deja/g++.martin/bitset1.C`.
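
A minimal sketch of the general technique, assuming nothing about the
actual libstdc++ code: checked_get is a made-up function, and std::abort
stands in for whatever the real assertion handler does.  The check is
folded into a conditional expression so that the body stays a single
return statement, as C++11 requires.

```cpp
#include <cstddef>
#include <cstdlib>

// C++11-compatible bounds-checked read: no compound statement, just
// one return.  The abort branch is only reached for a bad index, so
// in-range calls remain usable in constant expressions.
constexpr bool checked_get(const bool *bits, std::size_t size,
                           std::size_t pos) {
  return pos < size ? bits[pos] : (std::abort(), false);
}
```

An out-of-range index aborts at run time and is rejected at compile time
when the call appears in a constant expression.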

Also, it adds `{ dg-do compile }` in another test to suppress
a spurious UNRESOLVED complaint.

libstdc++-v3/ChangeLog:
	* include/std/bitset (operator[]() const): Customize bounds
	check for C++11 case.
	* testsuite/20_util/bitset/access/subscript_const_neg.cc:
	Suppress UNRESOLVED complaint.
2026-03-06 07:12:23 -05:00
Richard Earnshaw
e1077ad575 arm: testsuite: remove some flaky code-size tests
Code size tests on Arm are notoriously flaky because there are
numerous ISA variants (Arm, Thumb-1 and Thumb-2) to consider in
addition to a number of other variants from multiple sub-architecture
and micro-architectural tuning options.  In combination this means
that we have continuous testsuite churn if the constraints are tight
enough to detect real regressions.

So this patch eliminates most of these checks, except where the code
size test is the only test that is done (other than the compilation
itself).  Where that is the case I've tightened the compiler options
to limit the test to one set of architecture flags, thereby
eliminating most of the sources of variation.

In some cases I've replaced a code-size check with some other test of
the output, based on the intent of the original patch that motivated
the test.  For example, the max-insns-skipped test now checks that an
IT instruction is not generated rather than checking the size of the
binary (which was a side-effect of not generating IT).

gcc/testsuite/ChangeLog:

	* lib/target-supports.exp: Add arm_arch_v7a_thumb.
	* gcc.target/arm/ifcvt-size-check.c: Add options to force thumb1.
	* gcc.target/arm/ivopts-2.c: Remove object size check.
	* gcc.target/arm/ivopts-3.c: Likewise.
	* gcc.target/arm/ivopts-4.c: Likewise.
	* gcc.target/arm/ivopts-5.c: Likewise.
	* gcc.target/arm/ivopts.c: Likewise.
	* gcc.target/arm/max-insns-skipped.c: Scan for absence of an IT
	instruction.  Remove object size check.  Use arm_arch_v7a_thumb.
	* gcc.target/arm/pr43597.c: Remove object size check and use
	arm_arch_v7a_thumb.
	* gcc.target/arm/pr63210.c: Use arm_arch_v5t_thumb options.
	* gcc.target/arm/split-live-ranges-for-shrink-wrap.c: Remove
	object size check and use arm_arch_v5t_thumb options.
2026-03-06 11:24:47 +00:00
Richard Earnshaw
41aba0b725 arm: testsuite: Fix typo on target arm_cpu_cortex_a53
When testing the effective target these tests were using the wrong
name since they omitted the trailing _ok.  This was causing some tests
to fail to execute correctly.

gcc/testsuite/ChangeLog:

	* gcc.target/arm/aes-fuse-1.c: Add _ok to the effective_target.
	* gcc.target/arm/aes-fuse-2.c: Likewise.
2026-03-06 11:24:34 +00:00
Tomasz Kamiński
468124a1aa libstdc++: Remove unnecessary string in filesystem::path formatter
libstdc++-v3/ChangeLog:

	* include/bits/fs_path.h (std::formatter<filesystem::path, _CharT>):
	Format _Utf_view directly via __formatter_str::_M_format_range.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
2026-03-06 11:35:24 +01:00
Andre Vehreschild
9282a60d5b Fortran: Caf_shmem - Fix compile issue on cygwin [PR124371]
libgfortran/ChangeLog:

	PR libfortran/124371
	* caf/shmem/supervisor.c (startWorker): Use defined(HAVE_FORK)
	instead of !defined(WIN32) for preprocessor conditional.
2026-03-06 11:18:55 +01:00
Jonathan Wakely
e159c78851 libstdc++: Use aligned new for filesystem::path internals [PR122300]
As Bug 122300 shows, we have at least one target where the
static_assert added by r16-4422-g1b18a9e53960f3 fails. This patch
resurrects the original proposal for using aligned new that I posted in
https://gcc.gnu.org/pipermail/libstdc++/2025-October/063904.html

Instead of just asserting that the memory from operator new will be
sufficiently aligned, check whether it will be and use aligned new if
needed. We don't just use aligned new unconditionally, because that can
add overhead on targets where malloc already meets the requirements.
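
The idea can be sketched as follows.  This is not the fs_path.cc code:
allocate_raw, deallocate_raw and needs_aligned_new are names invented
for the example, and comparing against __STDCPP_DEFAULT_NEW_ALIGNMENT__
is an assumption about how such a check could be written in C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// True if plain operator new is not guaranteed to be aligned enough.
template <std::size_t Align>
constexpr bool needs_aligned_new = Align > __STDCPP_DEFAULT_NEW_ALIGNMENT__;

template <std::size_t Align>
void *allocate_raw(std::size_t bytes) {
  static_assert(Align != 0 && (Align & (Align - 1)) == 0,
                "alignment must be a power of two");
  // Only pay for aligned new when the default guarantee is too weak.
  void *p = needs_aligned_new<Align>
                ? ::operator new(bytes, std::align_val_t(Align))
                : ::operator new(bytes);
  assert(reinterpret_cast<std::uintptr_t>(p) % Align == 0);
  return p;
}

template <std::size_t Align>
void deallocate_raw(void *p) noexcept {
  // The delete form has to match the new form used above.
  if (needs_aligned_new<Align>)
    ::operator delete(p, std::align_val_t(Align));
  else
    ::operator delete(p);
}
```

The deleter needs the same condition as the allocator, which is why the
ChangeLog below touches both create_unchecked and _Impl_deleter.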

libstdc++-v3/ChangeLog:

	PR libstdc++/122300
	* src/c++17/fs_path.cc (path::_List::_Impl): Remove
	static_asserts.
	(path::_List::_Impl::required_alignment)
	(path::_List::_Impl::use_aligned_new): New static data members.
	(path::_List::_Impl::create_unchecked): Check use_aligned_new
	and use aligned new if needed.
	(path::_List::_Impl::alloc_size): New static member function.
	(path::_List::_Impl_deleter::operator): Check use_aligned_new
	and use aligned delete if needed.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
2026-03-06 09:58:33 +00:00
Jakub Jelinek
46520c0d38 tree-inline: Fix up ICE on !is_gimple_reg is_gimple_reg_type copying [PR124135]
The first testcase below ICEs e.g. with -O2 on s390x-linux, the
second with -O2 -m32 on x86_64-linux.  We have
  <bb 2> [local count: 1073741824]:
  if (x_4(D) != 0)
    goto <bb 3>; [33.00%]
  else
    goto <bb 4>; [67.00%]

  <bb 3> [local count: 354334800]:
  _7 = qux (42);
  foo (0, &<retval>, _7);

  <bb 4> [local count: 1073741824]:
  return <retval>;
on a target where <retval> has gimple reg type but is
aggregate_value_p and TREE_ADDRESSABLE too.
fnsplit splits this into
  <bb 2> [local count: 354334800]:
  _1 = qux (42);
  foo (0, &<retval>, _1);

  <bb 3> [local count: 354334800]:
  return <retval>;
in the *.part.0 function and
  if (x_4(D) != 0)
    goto <bb 3>; [33.00%]
  else
    goto <bb 4>; [67.00%]

  <bb 3> [local count: 354334800]:
  <retval> = _Z3bari.part.0 ();

  <bb 4> [local count: 1073741824]:
  return <retval>;
in the original function.  Now, dunno if already that isn't
invalid because <retval> has TREE_ADDRESSABLE set in the latter, but
at least it is accepted by tree-cfg.cc verification.
  tree lhs = gimple_call_lhs (stmt);
  if (lhs
      && (!is_gimple_reg (lhs)
          && (!is_gimple_lvalue (lhs)
              || verify_types_in_gimple_reference
                   (TREE_CODE (lhs) == WITH_SIZE_EXPR
                    ? TREE_OPERAND (lhs, 0) : lhs, true))))
    {
      error ("invalid LHS in gimple call");
      return true;
    }
While lhs is not is_gimple_reg, it is is_gimple_lvalue here.
Now, inlining of the *.part.0 fn back into the original results
in
  <retval> = a;
statement which already is diagnosed by verify_gimple_assign_single:
    case VAR_DECL:
    case PARM_DECL:
      if (!is_gimple_reg (lhs)
          && !is_gimple_reg (rhs1)
          && is_gimple_reg_type (TREE_TYPE (lhs)))
        {
          error ("invalid RHS for gimple memory store: %qs", code_name);
          debug_generic_stmt (lhs);
          debug_generic_stmt (rhs1);
          return true;
        }
__float128/long double are is_gimple_reg_type, but both operands
aren't is_gimple_reg.

The following patch fixes it by doing separate load and store, i.e.
  _42 = a;
  <retval> = _42;
in this case.  If we want to change verify_gimple_assign to disallow
!is_gimple_reg (lhs) for is_gimple_reg_type (TREE_TYPE (lhs)), we'd
need to change fnsplit instead, but I'd be afraid such a change would
be more stage1 material (and certainly nothing that should be
even backported to release branches).

2026-03-05  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/124135
	* tree-inline.cc (expand_call_inline): If both gimple_call_lhs (stmt)
	and use_retvar aren't gimple regs but have gimple reg type, use
	separate load of use_retvar into SSA_NAME and then store of it
	into gimple_call_lhs (stmt).

	* g++.dg/torture/pr124135-1.C: New test.
	* g++.dg/torture/pr124135-2.C: New test.
2026-03-06 10:33:09 +01:00
Jakub Jelinek
acf7028b79 match.pd: Move cast into p+ operand for (ptr) (x p+ y) p+ z -> (ptr) (x p+ (y + z)) [PR124358]
The following testcase is miscompiled since my r12-6382 change, because
it doesn't play well with the gimple_fold_indirect_ref function which uses
STRIP_NOPS and then has
  /* *(foo *)fooarrptr => (*fooarrptr)[0] */
  if (TREE_CODE (TREE_TYPE (subtype)) == ARRAY_TYPE
      && TREE_CODE (TYPE_SIZE (TREE_TYPE (TREE_TYPE (subtype)))) == INTEGER_CST
      && useless_type_conversion_p (type, TREE_TYPE (TREE_TYPE (subtype))))
    {
      tree type_domain;
      tree min_val = size_zero_node;
      tree osub = sub;
      sub = gimple_fold_indirect_ref (sub);
      if (! sub)
        sub = build1 (INDIRECT_REF, TREE_TYPE (subtype), osub);
      type_domain = TYPE_DOMAIN (TREE_TYPE (sub));
      if (type_domain && TYPE_MIN_VALUE (type_domain))
        min_val = TYPE_MIN_VALUE (type_domain);
      if (TREE_CODE (min_val) == INTEGER_CST)
        return build4 (ARRAY_REF, type, sub, min_val, NULL_TREE, NULL_TREE);
    }
Without the GENERIC
 #if GENERIC
 (simplify
   (pointer_plus (convert:s (pointer_plus:s @0 @1)) @3)
   (convert:type (pointer_plus @0 (plus @1 @3))))
 #endif
we have INDIRECT_REF of POINTER_PLUS_EXPR with int * type of NOP_EXPR
to that type of POINTER_PLUS_EXPR with pointer to int[4] ARRAY_TYPE, so
gimple_fold_indirect_ref doesn't create the ARRAY_REF.
But with it, it is simplified to NOP_EXPR to int * type from
POINTER_PLUS_EXPR with pointer to int[4] ARRAY_TYPE, the NOP_EXPR is
skipped over by STRIP_NOPS and the above code triggers.

The following patch fixes it by swapping the order, do NOP_EXPR
inside of POINTER_PLUS_EXPR first argument instead of NOP_EXPR with
POINTER_PLUS_EXPR.

2026-03-06  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/124358
	* match.pd ((ptr) (x p+ y) p+ z -> (ptr) (x p+ (y + z))): Simplify
	into (ptr) x p+ (y + z) instead.

	* gcc.c-torture/execute/pr124358.c: New test.
2026-03-06 08:14:09 +01:00
Andrew Pinski
d2881c26c2 testsuite/aarch64: Add testcase for already fixed bug [PR124078]
This big-endian testcase started to ICE with r16-7464-g560766f6e239a8
and then started to work again with r16-7506-g498983d9619351.
So it seems like a good idea to add the testcase for this
so it does not break again.

Pushed as obvious after a quick test to make sure it ICEd
before and it is passing now on aarch64-linux-gnu.

	PR rtl-optimization/124078

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/pr124078-1.c: New test.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2026-03-05 21:59:41 -08:00
GCC Administrator
e0d9c5a23f Daily bump. 2026-03-06 00:16:27 +00:00
Jakub Jelinek
0970bb8565 c++: Avoid caching TARGET_EXPR slot value if exception is thrown from TARGET_EXPR_INITIAL [PR124145]
The following testcase is miscompiled: we throw an exception only during
the first bar () call and not during the second, and in that case reach
the inline asm.
The problem is that the TARGET_EXPR handling calls
            ctx->global->put_value (new_ctx.object, new_ctx.ctor);
first for aggregate/vectors, then
        if (is_complex)
          /* In case no initialization actually happens, clear out any
             void_node from a previous evaluation.  */
          ctx->global->put_value (slot, NULL_TREE);
and then recurses on TARGET_EXPR_INITIAL.
Even for is_complex it can actually store partially the result in the
slot before throwing.

When TARGET_EXPR_INITIAL doesn't throw, we do
  if (ctx->save_expr)
    ctx->save_expr->safe_push (slot);
and that arranges for the value in slot to be invalidated at the end of
the surrounding CLEANUP_POINT_EXPR.
But when it does throw, this isn't done.

The following patch fixes it by moving that push to save_expr
before the if (*jump_target) return NULL_TREE; check.

2026-03-05  Jakub Jelinek  <jakub@redhat.com>

	PR c++/124145
	* constexpr.cc (cxx_eval_constant_expression) <case TARGET_EXPR>: Move
	ctx->save_expr->safe_push (slot) call before if (*jump_target) test.
	Use TARGET_EXPR_INITIAL instead of TREE_OPERAND.

	* g++.dg/cpp26/constexpr-eh18.C: New test.
2026-03-05 21:43:55 +01:00
Nathan Myers
1b404c5744 libstdc++: bitset subscript check when _GLIBCXX_ASSERTIONS [PR118341]
Changes in v3:
 - Delete redundant "dg" annotations.

Changes in v2:
 - Rejigger testing.
 - Add tests for regular bitset<>::op[].

Perform __glibcxx_assert bounds check on indices to bitset<>::op[]
for const and non-const overloads.

Also, add previously neglected regular tests for bitset<>::op[].

libstdc++-v3/ChangeLog
	PR libstdc++/118341
	* include/std/bitset (operator[] (2x)): Add assertion.
	* testsuite/20_util/bitset/access/118341_neg1.cc: New test.
	* testsuite/20_util/bitset/access/118341_neg2.cc: Same.
	* testsuite/20_util/bitset/access/118341_smoke.cc: Same.
	* testsuite/20_util/bitset/access/subscript.cc: Same.
	* testsuite/20_util/bitset/access/subscript_const_neg.cc: Same.
2026-03-05 13:23:36 -05:00
François Dumont
dae387d2c8 libstdc++: [_GLIBCXX_DEBUG] Hide _Safe_unordered_container methods
In _Safe_unordered_container the _M_invalidate_all and _M_invalidate_all_if
are made public to be used in nested struct _UContMergeGuard.

Thanks to a friend declaration we can prevent those methods from being
accessible from user code.

libstdc++-v3/ChangeLog:

	* include/debug/safe_unordered_container.h
	(_Safe_unordered_container::_UContInvalidatePred): Move outside class, at
	namespace scope. Declare friend.
	(_Safe_unordered_container::_UMContInvalidatePred): Likewise.
	(_Safe_unordered_container::_UContMergeGuard): Likewise.
	(_Safe_unordered_container::_M_invalidate_all): Make protected.
	(_Safe_unordered_container::_M_invalidate_all_if): Likewise.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
2026-03-05 19:11:53 +01:00
Jose E. Marchesi
ee8ca6c927 a68: fix wrapping C functions returning void [PR algol68/124322]
This patch fixes a68_wrap_formal_proc_hole so it doesn't assume that
wrapped C functions returning void return Algol 68 void values, which
are empty records.

Tested in i686-linux-gnu and x86_64-linux-gnu.

Signed-off-by: Jose E. Marchesi <jemarch@gnu.org>

gcc/algol68/ChangeLog

	PR algol68/124322
	* a68-low-holes.cc (a68_wrap_formal_proc_hole): Wrap functions
	returning void properly.
2026-03-05 16:28:37 +01:00
Alice Carlotti
e3d2277d51 aarch64 libgcc: Fix mingw build [PR124333]
Make __aarch64_cpu_features unconditionally available.  This permits the
unconditional use of this global inside __arm_get_current_vg, which was
introduced in r16-7637-g41b4a73f370116.

For now this global is only initialised when <sys/auxv.h> is available,
but we can extend this in future to support other ways of initialising
the bits used for SME support, and use this to remove __aarch64_have_sme.
This approach was recently adopted by LLVM.

This patch does introduce an inconsistency with __aarch64_have_sme when
<sys/auxv.h> is unavailable.  However, this doesn't introduce any
regressions, because one of the following conditions will hold:

1. SVE is enabled at compile time whenever we use a streaming or
streaming compatible function.  In this case the compiler won't need to
use __arm_get_current_vg, so it doesn't matter if it gives the wrong
answer.

2. There is a use of a streaming or streaming compatible function when
we don't know whether SVE is enabled.  In order to get correct DWARF
unwind information, we then have to be able to test for SVE availability
at runtime.  This isn't possible until a working __arm_get_current_vg
implementation is available, so the configuration has never (yet) been
supported.

libgcc/ChangeLog:

	PR target/124333
	* config/aarch64/cpuinfo.c: Define __aarch64_cpu_features
	unconditionally.
2026-03-05 15:12:56 +00:00
Victor Do Nascimento
4a30b45ffe vect: fix vectorization of non-gather elementwise loads [PR124037]
For the vectorization of non-contiguous memory accesses such as the
vectorization of loads from a particular struct member, specifically
when vectorizing with unknown bounds (thus using a pointer and not an
array) it is observed that inadequate alignment checking allows for
the crossing of a page boundary within a single vectorized loop
iteration. This leads to potential segmentation faults in the
resulting binaries.

For example, for the given datatype:

    typedef struct {
      uint64_t a;
      uint64_t b;
      uint32_t flag;
      uint32_t pad;
    } Data;

and a loop such as:

int
foo (Data *ptr) {
  if (ptr == NULL)
    return -1;

  int cnt;
  for (cnt = 0; cnt < MAX; cnt++) {
    if (ptr->flag == 0)
      break;
    ptr++;
  }
  return cnt;
}

the vectorizer yields the following cfg on armhf:

<bb 1>:
_41 = ptr_4(D) + 16;
<bb 2>:
_44 = MEM[(unsigned int *)ivtmp_42];
ivtmp_45 = ivtmp_42 + 24;
_46 = MEM[(unsigned int *)ivtmp_45];
ivtmp_47 = ivtmp_45 + 24;
_48 = MEM[(unsigned int *)ivtmp_47];
ivtmp_49 = ivtmp_47 + 24;
_50 = MEM[(unsigned int *)ivtmp_49];
vect_cst__51 = {_44, _46, _48, _50};
mask_patt_6.17_52 = vect_cst__51 == { 0, 0, 0, 0 };
if (mask_patt_6.17_52 != { 0, 0, 0, 0 })
  goto <bb 4>;
else
  ivtmp_43 = ivtmp_42 + 96;
  goto <bb 2>;
<bb4>
...

without any proper address alignment checks on the starting address
or on whether alignment is preserved across iterations. We therefore
fix the handling of such cases.

To correct this, we modify the logic in `get_load_store_type',
particularly the logic responsible for ensuring we don't read more
than the scalar code would in the context of early breaks, extending
it from handling not only gather-scatter and strided SLP accesses but
also allowing it to properly handle element-wise accesses, wherein we
specify that these need correct block alignment, thus promoting their
`alignment_support_scheme' from `dr_unaligned_supported' to
`dr_aligned'.
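
To make the page-crossing hazard concrete, here is a small arithmetic
sketch (not part of the patch).  It uses the 24-byte stride of the Data
struct above and assumes a 4096-byte page; page_size and
iteration_spans_pages are names invented for the illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed page size; real targets may differ.
constexpr std::size_t page_size = 4096;

// Layout taken from the Data struct above: consecutive flag members are
// 24 bytes apart and each one is 4 bytes wide.
constexpr std::size_t stride = 24;
constexpr std::size_t flag_size = 4;

// True if the lanes loaded by one vector iteration, starting with the
// flag at address first_flag, touch more than one page.
constexpr bool
iteration_spans_pages(std::uintptr_t first_flag, std::size_t lanes)
{
  const std::uintptr_t last_byte
    = first_flag + (lanes - 1) * stride + flag_size - 1;
  return first_flag / page_size != last_byte / page_size;
}
```

With four lanes and the first flag at address 4080, the second lane
already lies at 4104, on the following page, even though the scalar
loop might have stopped after the first element.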

gcc/ChangeLog:

	PR tree-optimization/124037
	* tree-vect-stmts.cc (get_load_store_type): Fix
	alignment_support_scheme categorization for early
	break VMAT_ELEMENTWISE accesses.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/vect-pr124037.c: New.
	* g++.dg/vect/vect-pr124037.cc: New.
2026-03-05 14:02:24 +00:00
Richard Biener
e49ff17cb1 Fix typo
s/replacemend/replacement/

	* tree-vect-loop.cc (vectorizable_live_operation): Fix typo.
2026-03-05 14:50:58 +01:00
Richard Biener
c1926449ca Fix overly restrictive live-lane extraction replacement
The following fixes a regression introduced by r11-5542 which
restricts replacing uses of live original defs of now vectorized
stmts to when that does not require new loop-closed PHIs to be
inserted.  That restriction keeps the original scalar definition
live which is sub-optimal and also not reflected in costing.

The particular case the following fixes, which can be seen in
gcc.dg/vect/bb-slp-57.c, is the case where we are replacing an
existing loop-closed PHI argument.

	PR tree-optimization/98064
	* tree-vect-loop.cc (vectorizable_live_operation): Do
	not restrict replacing uses in a LC PHI.

	* gcc.dg/vect/bb-slp-57.c: Verify we do not keep original
	stmts live.
2026-03-05 14:34:15 +01:00
Jakub Jelinek
8b39ec7074 libiberty: Copy over .ARM.attributes section into *.debug.temp.o files [PR124365]
If gcc is configured on aarch64-linux against new binutils, such as
2.46, it doesn't emit into assembly markings like
        .section        .note.gnu.property,"a"
        .align  3
        .word   4
        .word   16
        .word   5
        .string "GNU"
        .word   0xc0000000
        .word   4
        .word   0x7
        .align  3
but instead emits
        .aeabi_subsection aeabi_feature_and_bits, optional, ULEB128
        .aeabi_attribute Tag_Feature_BTI, 1
        .aeabi_attribute Tag_Feature_PAC, 1
        .aeabi_attribute Tag_Feature_GCS, 1
The former goes into .note.gnu.property section, the latter goes into
.ARM.attributes section.

Now, when linking without LTO or with LTO but without -g, everything
behaves the same for the linked binaries, say for test.c
int main () {}
$ gcc -g -mbranch-protection=standard test.c -o test; readelf -j .note.gnu.property test

Displaying notes found in: .note.gnu.property
  Owner                Data size        Description
  GNU                  0x00000010       NT_GNU_PROPERTY_TYPE_0
      Properties: AArch64 feature: BTI, PAC, GCS
$ gcc -flto -mbranch-protection=standard test.c -o test; readelf -j .note.gnu.property test

Displaying notes found in: .note.gnu.property
  Owner                Data size        Description
  GNU                  0x00000010       NT_GNU_PROPERTY_TYPE_0
      Properties: AArch64 feature: BTI, PAC, GCS
$ gcc -flto -g -mbranch-protection=standard test.c -o test; readelf -j .note.gnu.property test
readelf: Warning: Section '.note.gnu.property' was not dumped because it does not exist

The problem is that the *.debug.temp.o object files created by lto-wrapper
don't have these markings.  The function copies over .note.GNU-stack section
(so that it doesn't similarly on most arches break PT_GNU_STACK segment
flags), and .note.gnu.property (which used to hold this stuff e.g. on
aarch64 or x86, added in PR93966).  But it doesn't copy the new
.ARM.attributes section.

The following patch fixes it by copying that section too.  The function
unfortunately only works on names, doesn't know if it is copying ELF or some
other format (PE, Mach-O) or if it is copying ELF, whether it is EM_AARCH64
or some other arch.  The following patch just copies the section always,
I think it is very unlikely people would use .ARM.attributes section for
some random unrelated stuff.  If we'd want to limit it to just EM_AARCH64,
guess it would need to be done in
libiberty/simple-object-elf.c (simple_object_elf_copy_lto_debug_sections)
instead as an exception for the (*pfn) callback results (and there it could
e.g. verify SHT_AARCH64_ATTRIBUTES type but even there dunno if it has
access to the Ehdr stuff).
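
The name-only decision described above can be pictured roughly like
this.  It is a simplified sketch, not the libiberty code:
copied_verbatim is a hypothetical helper, and the real function also
deals with the LTO debug sections themselves:

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper; decides purely on the section name, without
// knowing the object format or the ELF machine type.
constexpr bool
copied_verbatim(std::string_view name)
{
  return name == ".note.GNU-stack"
         || name == ".note.gnu.property"
         || name == ".ARM.attributes";
}
```

Because only the name is inspected, a section called .ARM.attributes
would be copied for any architecture, which is the trade-off mentioned
above.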

No testcase from me, dunno if e.g. the linker can flag the lack of those
during linking with some option rather than using readelf after link and
what kind of effective targets we'd need for such a test.

2026-03-05  Jakub Jelinek  <jakub@redhat.com>

	PR target/124365
	* simple-object.c (handle_lto_debug_sections): Also copy over
	.ARM.attributes section.
2026-03-05 13:11:39 +01:00
Tomasz Kamiński
3a41229f92 libstdc++: Fix atomic/cons/zero_padding.cc test for arm-none-eabi [PR124124]
The test uses dg-require-atomic-cmpxchg-word that checks if atomic compare
exchange is available for pointer sized integers, and then tests types that
are eight bytes in size. This causes issues for targets on which pointers
are four bytes and libatomic is not present, like arm-none-eabi.

This patch addresses this by using a short member in TailPadding and MidPadding,
instead of int. This reduces the size of types to four bytes, while keeping
padding bytes present.

	PR libstdc++/124124

libstdc++-v3/ChangeLog:

	* testsuite/29_atomics/atomic/cons/zero_padding.cc: Limit size of
	test types to four bytes.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
2026-03-05 11:59:36 +01:00
Tomasz Kamiński
7793e34adf libstdc++: Remove UB in _Arg_value union alternative assignment
The _Arg_value::_M_set method initialized the union member by
assigning to the reference to that member produced by _M_get(*this).
However, per the language rules, such an assignment has undefined
behavior if the alternative was not already active, the same as for any
object not within its lifetime.

To address this, we modify _M_set to use placement new for the class
types, and invoke _S_access with two arguments for all other types.
The _S_access (rename of _S_get) is modified to assign the value of
the second parameter (if provided) to the union member. Such direct
assignments are treated specially in the language (see N5032
[class.union.general] p5), and will start lifetime of trivially default
constructible alternative.
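
The two ways of activating a union alternative can be sketched as below.
This is an illustration only, not the libstdc++ code: Value, set_double
and set_view are made-up names standing in for _Arg_value and its
setters:

```cpp
#include <cassert>
#include <new>
#include <string_view>

union Value
{
  int i;
  double d;
  std::string_view sv;

  Value() : i(0) {}   // i is the initially active alternative
};

inline void
set_double(Value& v, double x)
{
  // The pattern the commit message describes as undefined would be:
  //   double& r = v.d;  r = x;
  // A direct member-access assignment instead starts the lifetime of d.
  v.d = x;
}

inline void
set_view(Value& v, std::string_view s)
{
  // Class types are activated with placement new.
  ::new (static_cast<void*>(&v.sv)) std::string_view(s);
}
```

The difference is only in how the member is named at the point of
assignment: through the union object itself, or through a reference
obtained earlier.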

libstdc++-v3/ChangeLog:

	* include/std/format (_Arg_value::_M_get): Rename to...
	(_Arg_value::_M_access): Modified to accept optional
	second parameter that is assigned to value.
	(_Arg_value::_M_get): Handle rename.
	(_Arg_value::_M_set): Use construct_at for basic_string_view,
	handle, and two-argument _S_access for other types.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Signed-off-by: Ivan Lazaric <ivan.lazaric1@gmail.com>
Co-authored-by: Ivan Lazaric <ivan.lazaric1@gmail.com>
2026-03-05 11:55:59 +01:00
Jakub Jelinek
446835a07d i386: Make -masm={att,intel} xchg operand order consistent
While in this case it is not an assemble failure nor wrong-code,
because say xchgl %eax, %edx and xchg eax, edx do the same thing,
they are encoded differently, so if we want consistency between
-masm=att and -masm=intel emitted code (my understanding is that
is what Zdenek is testing right now, fuzzing code, compiling
with both -masm=att and -masm=intel and making sure if the former
assembles, the latter does too and they result in identical
*.o files), we should use different order of the operands
even here (and it doesn't matter which order we pick).

I've grepped the *.md files with
grep '\\t%[0-9], %[0-9]' *.md | grep -v '%0, %0'
i386.md:  "xchg{<imodesuffix>}\t%1, %0"
i386.md:   xchg{<imodesuffix>}\t%1, %0
i386.md:  "wrss<mskmodesuffix>\t%0, %1"
i386.md:  "wruss<mskmodesuffix>\t%0, %1"
(before this and PR124366 fix) and later on also with
grep '\\t%[a-z0-9_<>]*[0-9], %[a-z0-9_<>]*[0-9]' *.md | grep -v '%0, %0'
and checked all the output and haven't found anything else problematic.

2026-03-05  Jakub Jelinek  <jakub@redhat.com>

	* config/i386/i386.md (swap<mode>): Swap operand order for
	-masm=intel.
2026-03-05 11:24:50 +01:00
Tomasz Kamiński
afa58609ba libstdc++: Store basic_format_arg::handle in __format::_Arg_value
This patch changes the type of _M_handle member of __format::_Arg_value
from __format::_HandleBase union member to basic_format_arg<_Context>::handle.
This allows handle to be stored (using placement new) inside _Arg_value at
compile time, as the type of the _M_handle member now matches the stored object.

In addition, to make handle usable at compile time, we adjust
the _M_func signature to match the stored function, avoiding the need
for reinterpret cast.

To avoid a cyclic dependency, where basic_format_arg<_Context> requires
instantiating _Arg_value<_Context> for its _M_val member, that in turn
requires basic_format_arg<_Context>::handle, we define handle as nested
class inside _Arg_value and change basic_format_arg<_Context>::handle
to alias for it.

Finally, the handle(_Tp&) constructor is now constrained to not accept
handle itself, as otherwise it would be used instead of copy-constructor
when constructing from handle&.
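
The overload-resolution problem behind that constraint can be shown
with a stand-alone sketch.  Handle below is a made-up class, not the
libstdc++ one, and the from_template flag exists only to make the
selected constructor observable:

```cpp
#include <cassert>
#include <type_traits>

struct Handle
{
  bool from_template = false;

  Handle() = default;
  Handle(const Handle&) = default;

  // Without the constraint, Handle(T&) with T = Handle would be a better
  // match than the copy constructor for a non-const lvalue argument.
  template<typename T,
           typename = std::enable_if_t<
             !std::is_same_v<std::remove_cv_t<T>, Handle>>>
    explicit
    Handle(T&) : from_template(true) { }
};
```

With the constraint in place, constructing a Handle from a non-const
Handle lvalue selects the copy constructor, while constructing from any
other lvalue still uses the template.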

As _Arg_value is already templated on _Context, this change should not lead
to additional template instantiations.

libstdc++-v3/ChangeLog:

	* include/std/format (_Arg_value::handle): Define, extracted
	with modification from basic_format_arg::handle.
	(_Arg_value::_Handle_base): Remove.
	(_Arg_value::_M_handle): Change type to handle.
	(_Arg_value::_M_get, _Arg_value::_M_set): Check for handle
	type directly, and return result unmodified.
	(basic_format_arg::__formattable): Remove.
	(basic_format_arg::handle): Replace with alias to
	_Arg_value::handle.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
2026-03-05 11:11:02 +01:00
Xi Ruoyao
4898147482 Partially revert "LoongArch: Fix bug123807."
This reverts the loongarch.cc change of the commit
4df77a2542.

PR 123807 turns out to be a special case of the middle-end PR 124250.
The previous ad-hoc fix is unneeded now since the underlying middle-end
issue is fixed, so revert it but keep the test case.

gcc/

	PR target/123807
	PR middle-end/124250
	* config/loongarch/loongarch.cc
	(loongarch_expand_vector_init_same): Revert r16-7163 change.
2026-03-05 18:04:36 +08:00
Jakub Jelinek
ed29af4100 i386: Fix up last -masm=intel operand of vcvthf82ph [PR124349]
gas expects for this instruction
vcvthf82ph      xmm30, QWORD PTR [r9]
vcvthf82ph      ymm30, XMMWORD PTR [r9]
vcvthf82ph      zmm30, YMMWORD PTR [r9]
i.e. the memory size is half of the dest register size.
We currently emit it for the last 2 forms but emit XMMWORD PTR
for the first one too.  So, we need %q1 for V8HF and for V16HF/V32HF
can either use just %1 or %x1/%t1.  There is no define_mode_attr
that would provide those, so I've added one just for this insn.
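
The size relation gas expects can be summarised by a small lookup.
This is only an illustration of the table above, written in C++ rather
than as a define_mode_attr; intel_mem_prefix is a hypothetical name:

```cpp
#include <cassert>
#include <string_view>

// Maps the destination register width in bits to the Intel-syntax
// memory operand prefix listed above (memory size is half the register
// size).  Returns nullptr for widths the instruction forms above do
// not cover.
constexpr const char*
intel_mem_prefix(unsigned dest_bits)
{
  switch (dest_bits)
    {
    case 128: return "QWORD PTR";    // xmm destination
    case 256: return "XMMWORD PTR";  // ymm destination
    case 512: return "YMMWORD PTR";  // zmm destination
    default:  return nullptr;
    }
}
```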

2026-03-05  Jakub Jelinek  <jakub@redhat.com>

	PR target/124349
	* config/i386/sse.md (iptrssebvec_2): New define_mode_attr.
	(cvthf82ph<mode><mask_name>): Use it for -masm=intel input
	operand.

	* gcc.target/i386/avx10_2-pr124349-2.c: New test.
2026-03-05 10:05:44 +01:00
Jakub Jelinek
d828a370db i386: Fix up vpternlogq last operand of *andnot<mode>3 for -masm=intel [PR124367]
The immediate operand 0x44 in this insn was incorrectly emitted as
$0x44 even in -masm=intel syntax.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
approved by Uros in the PR, committed to trunk.

2026-03-05  Jakub Jelinek  <jakub@redhat.com>

	PR target/124367
	* config/i386/sse.md (*andnot<mode>3): Use 0x44 rather than $0x44
	for -masm=intel.

	* gcc.target/i386/avx512vl-pr124367.c: New test.
2026-03-05 09:39:36 +01:00