Some versions of VxWorks define NULL to __nullptr in C++, assuming
C++11, which breaks at least a number of analyzer tests that get
exercised in C++98 mode.
Wrap the header that defines NULL so that, after including it, we
override the NULL definition with the one provided by stddef.h.
That required some infrastructure to enable subdirectories in extra
headers. Since USER_H filenames appear as dependencies, that limits
the possibilities for markup, so I went for a filesystem-transparent
sequence that doesn't appear in any extra_headers whatsoever, namely
/././, to mark the beginning of the desired install name.
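The effect of the marker can be sketched as follows. This is only an illustration in C++; the real logic lives in the stmp-int-hdrs rule of Makefile.in and is written in shell. The helper name install_name and the example paths are hypothetical, and the fallback to the last path component for unmarked entries is an assumption about the existing behavior.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: compute the name a USER_H entry is installed under.
// If the path contains the "/././" marker, everything after the marker is
// kept, so subdirectories survive.  Otherwise only the last path component
// is used (assumed to match the pre-existing behavior).
std::string install_name(const std::string &path)
{
  const std::string marker = "/././";
  std::string::size_type pos = path.find(marker);
  if (pos != std::string::npos)
    return path.substr(pos + marker.size());
  pos = path.rfind('/');
  return pos == std::string::npos ? path : path.substr(pos + 1);
}
```

Because /././ resolves to the same file as a plain /, the marked path can still be used directly as a make dependency.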
Co-Authored-By: Olivier Hainque <hainque@adacore.com>
for gcc/ChangeLog
* config/vxworks/base/b_NULL.h: New.
* config.gcc (extra_headers) <*-*-vxworks*>: Add it.
* Makefile.in (stmp-int-hdrs): Support /././ markers in USER_H
to mark the beginning of the install name. Document.
* doc/sourcebuild.texi (Headers): Document /././ marker.
r16-5173-g52a24bcec9388a fixed this testcase, but I think it's
worthwhile still adding this reduced test for it to the modules.exp set
of tests so we don't need to rely on libstdc++ tests for it yet.
PR c++/122646
gcc/testsuite/ChangeLog:
* g++.dg/modules/friend-10_a.C: New test.
* g++.dg/modules/friend-10_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
This is the last cleanup in this area. Merges the splitting functionality
of remove_forwarder_block_with_phi into remove_forwarder_block.
Now mergephi still has the ability to split the edges when merging the forwarder
block with a phi. But this reduces the non-shared code a lot.
gcc/ChangeLog:
* tree-cfgcleanup.cc (tree_forwarder_block_p): Remove must argument.
(remove_forwarder_block): Add can_split
argument. Handle the splitting case (iff phis in bb).
(cleanup_tree_cfg_bb): Update argument to tree_forwarder_block_p.
(remove_forwarder_block_with_phi): Remove.
(pass_merge_phi::execute): Update argument to tree_forwarder_block_p
and call remove_forwarder_block instead of remove_forwarder_block_with_phi.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
This adds support for merging forwarder blocks with phis in cleanupcfg.
This patch might seem small but that is because the previous patches were
done to build up to make it easier to add this support.
There is still one more patch to merge remove_forwarder_block
and remove_forwarder_block_with_phi since remove_forwarder_block_with_phi
supports splitting an edge which is not supported as an option in remove_forwarder_block.
The splitting edge option should not be enabled for cfgcleanup but only for mergephi.
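For readers unfamiliar with the term, here is a toy model of forwarder block removal. It is a sketch under simplifying assumptions, not the code in tree-cfgcleanup.cc: the Block type, is_forwarder, bypass_forwarder and the allow_phis flag are all hypothetical, and the real pass must also merge phi arguments, update dominators and respect loop structure.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy basic block: successor indices plus counts of real statements and phis.
struct Block {
  std::vector<int> succs;
  int num_stmts = 0;
  int num_phis = 0;
};

// A block forwards if it has no real statements and exactly one successor
// other than itself.  Phis are tolerated only when allow_phis is set.
bool is_forwarder(const std::vector<Block> &cfg, int b, bool allow_phis)
{
  const Block &blk = cfg[b];
  if (blk.num_stmts != 0 || blk.succs.size() != 1 || blk.succs[0] == b)
    return false;
  return allow_phis || blk.num_phis == 0;
}

// Redirect every edge that enters block b to b's single successor.
// Returns the number of edges redirected.
int bypass_forwarder(std::vector<Block> &cfg, int b)
{
  const int dest = cfg[b].succs[0];
  int redirected = 0;
  for (std::size_t i = 0; i < cfg.size(); ++i) {
    if (static_cast<int>(i) == b)
      continue;
    for (int &s : cfg[i].succs)
      if (s == b) {
        s = dest;
        ++redirected;
      }
  }
  return redirected;
}
```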
Note r8-338-ge7d70c6c3bccb2 made loops always get a preheader, so we should
protect preheaders that have a phi node, since otherwise it goes back and forth here. Both the gimple
and RTL loop code also like to have this preheader when the same constant
value is the starting value of the loop.
Explanation of the testcase changes:
gcc.target/i386/pr121062-1.c needed a small change because there is a basic block
which is not duplicated so only one `movq reg, -1` is there instead of 2.
uninit-pred-7_a.c is xfailed and filed as PR122660; the PR already has some analysis of
the difference.
uninit-pred-5.C was actually a false positive because when
m_best_candidate is non-NULL, m_best_candidate_len is always initialized.
The log message on the testcase is wrong; if you manually follow the path
you can notice that. With an extra jump threading after the merging of
some bbs, the false positive is now no longer happening. So change the
dg-warning to dg-bogus.
ssa-dom-thread-7.c now jump threads 12 times in thread2 instead of 8.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/122493
gcc/ChangeLog:
* tree-cfgcleanup.cc (tree_forwarder_block_p): Change bool argument
to a must have phi and allow phis if it is false.
(remove_forwarder_block): Add support for merging of forwarder blocks
with phis.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr121062-1.c: Update count.
* gcc.dg/uninit-pred-7_a.c: xfail line 23.
* g++.dg/uninit-pred-5.C: Change dg-warning to dg-bogus.
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Update count of jump thread.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
So when we use the newly mapped location, we should check whether it is an
unknown location and, if so, just use the original location.
Note this is a latent bug in remove_forwarder_block_with_phi code too.
This fixes gcc.dg/uninit-pr40635.c when doing more mergephi.
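The fallback amounts to the following. This is a minimal sketch with stand-in definitions: location_t and UNKNOWN_LOCATION mirror the GCC names but are redefined locally, and pick_phi_arg_location is a hypothetical helper, not a function in tree-cfg.cc.

```cpp
#include <cassert>

// Local stand-ins for GCC's location type and its "no location" value.
typedef unsigned int location_t;
const location_t UNKNOWN_LOCATION = 0;

// Prefer the location found through the map, but fall back to the
// original location when the mapped one carries no information.
location_t pick_phi_arg_location(location_t mapped, location_t original)
{
  return mapped != UNKNOWN_LOCATION ? mapped : original;
}
```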
gcc/ChangeLog:
* tree-cfg.cc (copy_phi_arg_into_existing_phi): Use the original location
if the mapped location is unknown.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
copy_phi_arg_into_existing_phi was added in r14-477-g78b0eea7802698
and used in remove_forwarder_block but since
remove_forwarder_block_with_phi needed to use the redirect edge var
map, it was not moved over. This extends copy_phi_arg_into_existing_phi
to have the ability to optionally use the mapper.
This also makes remove_forwarder_block_with_phi and remove_forwarder_block closer to
one another. There are a few other changes needed to be able to do both
from the same function.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* tree-cfg.cc (copy_phi_arg_into_existing_phi): New use_map argument.
* tree-cfg.h (copy_phi_arg_into_existing_phi): Update declaration.
* tree-cfgcleanup.cc (remove_forwarder_block_with_phi): Use
copy_phi_arg_into_existing_phi instead of inlining it.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
This moves the ei definition directly into the for loop,
like was done for remove_forwarder_block_with_phi.
gcc/ChangeLog:
* tree-cfgcleanup.cc (remove_forwarder_block): Move
variable declaration ei into for loop.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
It was always kind of odd that while remove_forwarder_block used
an edge iterator, remove_forwarder_block_with_phi used a while loop.
remove_forwarder_block_with_phi was added after remove_forwarder_block too.
Anyway, this changes remove_forwarder_block_with_phi to use the same
form of loop so it is easier to merge the 2.
gcc/ChangeLog:
* tree-cfgcleanup.cc (remove_forwarder_block_with_phi): Use
edge iterator instead of while loop.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Since at least r9-1005-gb401e50fed4def, dominator information is
available in remove_forwarder_block so there is no reason to check
whether we should update the dominator information; always do it.
This is one more step into commoning remove_forwarder_block and remove_forwarder_block_with_phi.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* tree-cfgcleanup.cc (remove_forwarder_block): Remove check
on the available dominator information.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
While moving mergephi's forwarder block removal over to cfgcleanup,
I noticed a few regressions where a forwarder block was (correctly) removed
but the counts were not updated. Instead, let these blocks be handled by the merge_blocks
cleanup code.
gcc/ChangeLog:
* tree-cfgcleanup.cc (tree_forwarder_block_p): Reject bb which has a single
predecessor which has a single successor.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
This moves the checks that were in pass_merge_phi::execute into remove_forwarder_block_with_phi
or tree_forwarder_block_p to make it easier to merge remove_forwarder_block_with_phi with remove_forwarder_block.
This also simplifies the code slightly because we can do `return false` rather than break
in one location.
gcc/ChangeLog:
* tree-cfgcleanup.cc (pass_merge_phi::execute): Move
check for abnormal or no phis to remove_forwarder_block_with_phi
and the check on dominated to tree_forwarder_block_p.
(remove_forwarder_block_with_phi): here.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
I noticed this check was in both remove_forwarder_block and remove_forwarder_block_with_phi but
was slightly different in that the eh landing pad was not being checked for in remove_forwarder_block_with_phi
when it definitely should be.
This folds the check into tree_forwarder_block_p instead as it is called right beforehand anyway.
The eh landing pad check was added to the non-phi one by r0-98233-g28e5ca15b76773 but missed the phi variant;
I am not sure if it could show up there but it is better to have one common code than having two copies of
slightly different checks.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* tree-cfgcleanup.cc (remove_forwarder_block_with_phi): Remove check on non-local label.
(remove_forwarder_block): Remove check on non-label/eh landing pad.
(tree_forwarder_block_p): Add check on label for an eh landing pad.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Since removing the worklist for both mergephi and cfgcleanup (r0-80545-g672987e82f472b), these
two functions are now called right after tree_forwarder_block_p so there is no reason for the
extra check for an infinite loop nor the check on loop headers, as both are already
handled in tree_forwarder_block_p.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* tree-cfgcleanup.cc (remove_forwarder_block): Remove check for infinite loop.
(remove_forwarder_block_with_phi): Likewise. Also remove check for loop header.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
Since the worklist was never added to and the analysis part can benefit
from the work part, we can combine the analysis part with the work part.
This should get a small speedup for this pass.
Looking into the history here, remove_forwarder_block used to add to the worklist
but remove_forwarder_block_with_phi never did.
This is the first step in moving part of the functionality of mergephi into
cfgcleanup.
gcc/ChangeLog:
* tree-cfgcleanup.cc (pass_merge_phi::execute): Remove worklist.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
This is more prep work for revamping the zero/sign extension patterns on RISC-V
to avoid the need for define_insn_and_splits.
The core issue at hand is for the base ISA we don't have the full set of
sign/zero extensions. So what's been done so far is to pretend we do via a
define_insn_and_split, then split the extensions into shift pairs post-reload
(for the base ISA).
That has multiple undesirable properties, including inhibiting optimization in
some cases and making it harder to add new optimizations in the most natural
way in the future.
The basic approach we've been taking to these problems has been to generate the
desired code at expansion time. When we do that for RISC-V, ext-dce will no
longer see the zero/sign extension nodes when compiling for the base ISA --
instead it'll see shift pairs. And that in turn causes ext-dce to miss
elimination opportunities which is a regression relative to the trunk right
now.
This patch improves ext-dce to recognize the second shift (right) in such a
sequence, then try to match it up with a prior left shift (which has to be the
immediately prior real instruction). When it can pair them up it'll treat the
pair like an extension. The right shift turns into a simple copy of the source
of the left shift.
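As an illustration of why a shift pair is equivalent to an extension, the sketch below extends the low 8 bits of a 64-bit value using only shifts. It is plain C++, not RTL, and the function names are hypothetical. The signed variant assumes two's complement conversion and an arithmetic right shift, which is what GCC and Clang provide and what C++20 guarantees.

```cpp
#include <cassert>
#include <cstdint>

// Zero extension of the low 8 bits: shift left by 56, then logical
// right shift by 56 clears the upper 56 bits.
std::uint64_t zext8_by_shifts(std::uint64_t x)
{
  return (x << 56) >> 56;
}

// Sign extension of the low 8 bits: same left shift, then an arithmetic
// right shift replicates bit 7 into the upper 56 bits.
std::int64_t sext8_by_shifts(std::uint64_t x)
{
  return static_cast<std::int64_t>(x << 56) >> 56;
}
```

If no consumer of the result needs the upper bits, the pair is redundant and the result can simply be the original source value, which is the case the patch lets ext-dce detect.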
That prevents optimization regressions with the in flight code to revamp the
zero extension (and then sign extension) code. No new tests since it's
preventing existing tests from failing to optimize after some in flight stuff
lands.
Bootstrapped and regression tested on x86_64 and tested on all the crosses in
my tester. The Pioneer and BPI will pick it up tonight for bootstrap testing
on RISC-V.
* ext-dce.cc (ext_dce_try_optimize_rshift): New function to optimize a
shift pair implementing a zero/sign extension.
(ext_dce_try_optimize_extension): Renamed from
ext_dce_try_optimize_insn.
(ext_dce_process_uses): Handle shift pairs implementing extensions.
This is the 3rd (and hopefully last) time to fix the order here.
The previous times were r16-5093-g77e10b47f25d05 and r16-4905-g7b9d32aa2ffcb5.
The order before these patches was:
* removal of phi
* propagate constants
* gimplification of expr
* create assignment
* rewrite to undefined
* add stmts to bb
The current order before this patch (and after the other 2):
* gimplification of expr
* removal of phi
* create assignment
* propagate constants
* rewrite to undefined
* add stmts to bb
The correct, new order with this patch is:
* gimplification of expr
* propagate constants
* removal of phi
* create the assignment
* rewrite to undefined
* add stmts to bb
This is because the propagation of the constant will cause a fold_stmt, which requires
the statement to still be in the IR. The gimplification of expr also calls fold_stmt.
Now with the new order the phi is not removed until right before the creation of the
new assignment so the IR in the basic block is well defined while calling fold_stmt.
Pushed as obvious after bootstrap/test on x86_64-linux-gnu.
PR tree-optimization/122637
gcc/ChangeLog:
* tree-scalar-evolution.cc (final_value_replacement_loop): Fix order
of gimplification and constant prop.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr122637-1.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
I missed a null check for asm_node when introducing toplevel_node.
PR lto/122603
gcc/lto/ChangeLog:
* lto-partition.cc (split_partition_into_nodes): Null check for
possible asm_node.
gcc/testsuite/ChangeLog:
* gcc.dg/lto/pr122603_0.c: New test.
Introduced in r16-5042-g470411f44f51d9, this testcase fails on
AdvSIMD-less AArch32 configurations, likely as well as on other targets
without vector support; thus, require it via dg-require-effective-target.
Since this testcase includes stdint.h, require that as well.
Regtested on arm-gnueabihf with
RUNTESTFLAGS=--target_board=unix/-mfpu=vfpv3-d16/-march=armv7-a.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/forwprop-43.c: Adjust.
Yup, yet another out of bounds access into the equivalence array.
In this case we had an out of bounds write, which corrupted the heap leading to
the fault.
Given this is the 3rd such issue in this space in recent history and the second
in this loop within LRA within a week or so, I looked for a solution that would
cover the whole loop rather than another spot fix.
The good news is this loop runs after elimination, so we can just expand the
equivalence array after elimination and all the right things should happen.
This also allows removal of the spot fix I did last week (which I did
backtest). I didn't have a testcase for the bug in this space I fixed a couple
months ago (and the artifacts from that build are certainly gone from my tester
by now).
Bootstrapped and regression tested on x86. Also verified the RISC-V failures
in this bz and bz122321 are fixed.
Given this is a refinement & simplification of a prior fix, I'm going to take
some slight leeway to push the fix forward now.
PR rtl-optimization/122627
gcc/
* lra-constraints.cc (update_equiv): Remove patch from last week
related to pr122321.
(lra_constraints): Expand the equivalence array after eliminations
are complete.
gcc/testsuite/
* gcc.target/riscv/rvv/autovec/pr122627.c: New test.
This is a freezing issue introduced by the new support for deferred extra
formals. The freezing of local types created during the expansion of the
entry construct happens in the wrong scope when the expansion is deferred,
causing reference-before-freezing in the expanded code.
gcc/ada/ChangeLog:
* exp_ch9.adb (Expand_N_Entry_Declaration): In the deferred case,
freeze immediately all the newly created entities.
Adjust the register restoration on aarch64 to not use register 96
on llvm. Avoids the "reg too big" warning on aarch64 when sigtramp
is called. For llvm and aarch64, the correct choice seems to be 32.
Remove parens on REGNO_PC_OFFSET when compiling;
they cause a silent failure due to alphanumeric register names.
Define a macro for __attribute ((optimize (2))) which is
empty if not available. (Despite being documented, it generates an
"unknown attribute" warning with clang.)
Define ATTRIBUTE_PRINTF_2 if not defined.
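The two macro guards can be sketched like this. It is an assumption-laden illustration rather than the code in sigtramp-vxworks.c or raise-gcc.c: SKETCH_OPTIMIZE_2, sketch_buf and sketch_db are hypothetical names, and the use of __has_attribute to detect the optimize attribute is my choice, not necessarily what the patch does.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <cstring>

// Expand to the optimize attribute only when the compiler reports it;
// otherwise expand to nothing so no "unknown attribute" warning appears.
#if defined(__has_attribute)
# if __has_attribute(optimize)
#  define SKETCH_OPTIMIZE_2 __attribute__((optimize(2)))
# endif
#endif
#ifndef SKETCH_OPTIMIZE_2
# define SKETCH_OPTIMIZE_2
#endif

// Provide a fallback definition when no header has defined the macro.
#ifndef ATTRIBUTE_PRINTF_2
# if defined(__GNUC__)
#  define ATTRIBUTE_PRINTF_2 __attribute__((format(printf, 2, 3)))
# else
#  define ATTRIBUTE_PRINTF_2
# endif
#endif

static char sketch_buf[64];

// The format string is argument 2, matching ATTRIBUTE_PRINTF_2.
SKETCH_OPTIMIZE_2 int sketch_db(int level, const char *fmt, ...)
  ATTRIBUTE_PRINTF_2;

// Format into sketch_buf when level is 1 or lower; return the length.
int sketch_db(int level, const char *fmt, ...)
{
  if (level > 1)
    return 0;
  va_list ap;
  va_start(ap, fmt);
  int n = std::vsnprintf(sketch_buf, sizeof sketch_buf, fmt, ap);
  va_end(ap);
  return n;
}
```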
gcc/ada/ChangeLog:
* sigtramp-vxworks-target.h (REGNO_PC_OFFSET): Use 32 vice
96 with llvm/clang. (REGNO_G_REG_OFFSET): Remove parens on
operand. (REGNO_GR): Likewise.
* sigtramp-vxworks.c (__gnat_sigtramp): Define a macro for
__attribute__ optimize, which is empty if not available.
* raise-gcc.c (db): Define ATTRIBUTE_PRINTF_2 if not defined.
Duplicate streaming and Put_Image subprograms were being generated in some
cases where this was not intended. In most cases this only resulted in unwanted
code duplication (which, of course, is not good), but in some cases it resulted
in compilation failures with spurious "duplicate body" error messages.
gcc/ada/ChangeLog:
* exp_attr.adb: Rewrite the spec and implementation of package
Cached_Attribute_Ops so that the saved value associated with a
type in a given map is not a single subprogram but instead a
set of subprograms. Thus, the correct generation of a second subprogram
for given type for use in some other context no longer causes the
first subprogram to be forgotten. This allows more reuse and,
in particular, allows reuse in the case where generating another
copy of the subprogram would result in a compilation failure.
Update Cached_Attribute_Ops clients correspondingly.
One more case where compile-time evaluation can't trace the original location
of an object reference.
gcc/ada/ChangeLog:
* exp_util.adb (Find_In_Enclosing_Context): Give up on declarations of
internal types.
The pretty-printed output (emitted by the debugger "pp" command and by command
line switch -gnatdt) was broken for an end span, e.g.
81 (Uint = 81) p.adb:8:11 Then_Statements = List (List_Id=-99999975)
and now this is printed as:
End_Span = 81 (Uint = 81) p.adb:8:11
Then_Statements = List (List_Id=-99999975)
No impact on the compilation.
gcc/ada/ChangeLog:
* treepr.adb (Print_End_Span): Print prefix, field name and line break.
The compiler fails to resolve expressions involving a target name (@ symbol)
in assignment statements where the target object is an indexed container
object, complaining that the target name is of the reference type associated
with the container type. The target object is initially viewed as having
the reference type, which is what the compiler was also setting as the
type of the N_Target_Name node in the assignment's expression tree (leading
to type errors), and it's only later expansion that changes the target object
to a dereference whose type is the reference type's designated type, which
is too late.
This is addressed by implementing AI22-0082 and AI22-0112. The first AI is
about changing the reference types declared in the predefined containers
generics to be limited types. The second AI revises the resolution rules for
assignment statements to exclude interpretations that are of limited types.
Combining the two AIs, the case described above will resolve to the dereference
of an indexed container component rather than the interpretation of the indexing
as returning an object of a reference type. The AI22-0112 changes also avoid
ambiguities for assignments involving indexed names (such as "C1(I) := C2(J);"),
at least for cases involving the predefined containers (user-defined containers
that declare nonlimited reference types can still run into such ambiguities).
But apart from those AIs, GNAT was already doing things wrong in
the case of overloaded variable names in assignment statements with
container indexing, in determining the type of target names (@ symbols)
as being of the reference type, which could result in wrong-type errors.
GNAT wasn't following the requirement that the variable name in an
assignment statement must be resolved as a "complete context". This is
now corrected by separate resolution code that's done in the case where
the expression of the assignment contains target names.
Also, the existing code in Analyze_Assignment that's used in the
non-target-name case is revised by removing incorrect code for ignoring
the reference interpretations of generalized indexing and replacing it
with code to remove interpretations of limited types (which, per AI22-0112,
needs to be done whether or not there are target names involved).
It should be noted that the changes to make reference types limited in the
predefined container packages can affect existing code that happens to depend
on the reference types being nonlimited, and code changes may be required to
remove or work around such dependence.
gcc/ada/ChangeLog:
* libgnat/a-cbdlli.ads: Add "limited" to partial view of reference types.
* libgnat/a-cbhama.ads: Likewise.
* libgnat/a-cbhase.ads: Likewise.
* libgnat/a-cbmutr.ads: Likewise.
* libgnat/a-cborma.ads: Likewise.
* libgnat/a-cborse.ads: Likewise.
* libgnat/a-cdlili.ads: Likewise.
* libgnat/a-cidlli.ads: Likewise.
* libgnat/a-cihama.ads: Likewise.
* libgnat/a-cihase.ads: Likewise.
* libgnat/a-cimutr.ads: Likewise.
* libgnat/a-ciorma.ads: Likewise.
* libgnat/a-ciormu.ads: Likewise.
* libgnat/a-ciorse.ads: Likewise.
* libgnat/a-cobove.ads: Likewise.
* libgnat/a-cohama.ads: Likewise.
* libgnat/a-cohase.ads: Likewise.
* libgnat/a-coinho.ads: Likewise.
* libgnat/a-coinho__shared.ads: Likewise.
* libgnat/a-coinve.ads: Likewise.
* libgnat/a-comutr.ads: Likewise.
* libgnat/a-convec.ads: Likewise.
* libgnat/a-coorma.ads: Likewise.
* libgnat/a-coormu.ads: Likewise.
* libgnat/a-coorse.ads: Likewise.
* sem_ch5.adb (Analyze_Assignment): Added code to resolve the target
object (LHS) as a complete context when there are target names ("@")
present in the expression of the assignment. Loop over interpretations,
removing any that have a limited type, and set the type (T1) to be the
type of the first nonlimited interpretation. Test for ambiguity by
calling Is_Ambiguous_Operand. Delay analysis of Rhs in the target-name
case. Replace existing test for generalized indexing with implicit
dereference in existing analysis code with test of Is_Limited_Type
along with calling Remove_Interp in the limited case.
* sem_res.adb (Is_Ambiguous_Operand): Condition the calls to
Report_Interpretation on Report_Errors being True.
The RM 4.9(36/2) subclause says that, if a static expression is of type
universal_real and its expected type is a decimal fixed point type, then
its value shall be a multiple of the small of the decimal type. This was
enforced for real literals, but not for real named numbers.
Fixing the problem involves tweaking Fold_Ureal and the same tweak is also
applied to Fold_Uint for the sake of consistency in the implementation.
gcc/ada/ChangeLog:
PR ada/29463
* sem_eval.adb (Fold_Uint): Use Universal_Integer as actual type
for a named number.
(Fold_Ureal): Likewise with Universal_Real.
* sem_res.adb (Resolve_Real_Literal): Test whether the literal is
a static expression instead of coming from source to give the error
prescribed by the RM 4.9(36/2) subclause.
This patch adds documentation that stresses some of the consequences of
RM D.10 (10.2/5) that enable a lightweight implementation of suspension
objects.
gcc/ada/ChangeLog:
* libgnarl/s-taspri__posix.ads (Suspension_Object): Add some
documentation.
The recent change that streamlined the implementation of alignment checks
has uncovered an ancient bug in the implementation of pragma Suppress on
a specific object:
pragma Suppress (Alignment_Check, A);
The pragma would work only if placed before the address clause:
A : Integer;
pragma Suppress (Alignment_Check, A);
for A'Address use ...
but not if placed (just) after it:
A : Integer;
for A'Address use ...
pragma Suppress (Alignment_Check, A);
which seems unfriendly at best.
gcc/ada/ChangeLog:
* sem_prag.adb (Analyze_Pragma) <Process_Suppress_Unsuppress>: For
Alignment_Check on a specific object with an address clause and no
alignment clause, toggle the Check_Address_Alignment flag present
on the address clause.
After the last change on the list of junk names, the documentation was
not modified.
gcc/ada/ChangeLog:
* doc/gnat_rm/implementation_defined_pragmas.rst: Update
the list of junk names.
* doc/gnat_ugn/building_executable_programs_with_gnat.rst:
Likewise.
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.
Packages that implement the CRC32 algorithm are pure and always terminate.
This avoids spurious warnings when using them from GNATprove.
gcc/ada/ChangeLog:
* libgnat/g-crc32.ads (CRC32): Annotate as pure and always terminating.
* libgnat/s-crc32.ads (CRC32): Annotate as pure and always terminating.
A statically dead ELSIF branch can be rewritten either to a NULL statement or
entirely detached from its parent.
gcc/ada/ChangeLog:
* exp_util.adb (Get_Current_Value_Condition): Relax assertion about
rewritten ELSIF branch.
This documents the meaning of Standard'Maximum_Alignment as it has been
implemented in the GCC-based compiler for more than a decade.
gcc/ada/ChangeLog:
* doc/gnat_rm/implementation_defined_attributes.rst
(Maximum_Alignment): Fix description.
* doc/gnat_rm/representation_clauses_and_pragmas.rst
(Alignment Clauses): Adjust accordingly.
* checks.adb (Apply_Address_Clause_Check): Remove incorrect test on
Maximum_Alignment.
* sem_ch13.adb (Analyze_Attribute_Definition_Clause): Minor tweak
in comment.
* ttypes.ads (Maximum_Alignment): Fix description.
* gnat_rm.texi: Regenerate.
Alignment checks are now fully decoupled from range checks.
gcc/ada/ChangeLog:
* doc/gnat_rm/implementation_defined_pragmas.rst (Pragma Suppress):
Remove mention of range checks in the entry for alignment checks.
* gnat_rm.texi: Regenerate.
They are present in comments, except for one present in an error message.
gcc/ada/ChangeLog:
* layout.adb (Set_Composite_Alignment): Fix typos and comments.
As [1] says, we cannot mix up lock-free and locking atomics for one
object. For example assume atom = (0, 0) initially, if we have a
locking "atomic" xor running on T0 and a lock-free store running on T1
concurrently:
T0 | T1
-----------------------------+-------------------------------------
acquire_lock |
t0 = atom[0] |
/* some CPU cycles */ | atom = (1, 1) /* lock-free atomic */
t1 = atom[1] |
t0 ^= 1 |
t1 ^= 1 |
atom[0] = t0 |
atom[1] = t1 |
release_lock |
we get atom = (1, 0), but the atomicity of xor and store should
guarantee that atom is either (0, 0) or (1, 1).
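The interleaving in the table can be replayed deterministically. The sketch below is a single-threaded simulation, not real concurrent code: replay_mixed_atomics is a hypothetical name, the lock is not modelled because T1 never takes it, and the two halves of the 16-byte object are represented as two ints.

```cpp
#include <cassert>
#include <utility>

// Replays the steps of T0 (locking xor) with T1's lock-free store landing
// between T0's two loads, in the same order as the table.
std::pair<int, int> replay_mixed_atomics()
{
  int atom[2] = {0, 0};

  int t0 = atom[0];   // T0 loads the first half before the store: 0
  atom[0] = 1;        // T1 stores (1, 1) as one lock-free operation,
  atom[1] = 1;        // without looking at T0's lock
  int t1 = atom[1];   // T0 loads the second half after the store: 1
  t0 ^= 1;            // 0 ^ 1 = 1
  t1 ^= 1;            // 1 ^ 1 = 0
  atom[0] = t0;
  atom[1] = t1;

  return {atom[0], atom[1]};
}
```

The two halves of the result come from different states of the object, which neither ordering of an atomic xor and an atomic store could produce.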
So, if we want to use a lock-free 16B atomic operation, we need both LSX
and SCQ even if that specific operation only needs one of them. To make
things worse, one may link a TU compiled with -mlsx -mscq and another
without them together, then if we want to use the lock-free 16B atomic
operations in the former, we must make libatomic also use the lock-free
16B atomic operation for the latter so we need to add ifuncs for
libatomic, similar to the discussion about i386 vs. i486 in [1].
Implementing and building the ifuncs currently requires:
- Glibc, because the ifunc resolver interface is libc-specific
- Linux, because the HWCAP bit for LSX is kernel-specific
- A recent enough assembler at build time to recognize sc.q
So the approach here is: only allow 16B lock-free atomic operations in
the compiler if the criteria above are satisfied, and ensure libatomic
uses those lock-free operations on capable hardware (via ifunc unless
both LSX and SCQ are already enabled by the builder) if the compiler
allows 16B lock-free atomic.
[1]: https://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary
gcc/
* configure.ac (HAVE_AS_16B_ATOMIC): Define if the assembler
supports LSX and sc.q.
* configure: Regenerate.
* config.in: Regenerate.
* config/loongarch/loongarch-opts.h (HAVE_AS_16B_ATOMIC):
Defined to 0 if undefined yet.
* config/loongarch/linux.h (HAVE_IFUNC_FOR_LIBATOMIC_16B):
Define as HAVE_AS_16B_ATOMIC && OPTION_GLIBC.
* config/loongarch/loongarch-protos.h
(loongarch_16b_atomic_lock_free_p): New prototype.
* config/loongarch/loongarch.cc
(loongarch_16b_atomic_lock_free_p): Implement.
* config/loongarch/sync.md (atomic_storeti_lsx): Require
loongarch_16b_atomic_lock_free_p.
(atomic_storeti): Likewise.
(atomic_exchangeti_scq): Likewise.
(atomic_exchangeti): Likewise.
(atomic_compare_and_swapti): Likewise.
(atomic_fetch_<amop_ti_fetch>ti_scq): Likewise.
(atomic_fetch_<amop_ti_fetch>ti): Likewise.
(ALL_SC): Likewise for TImode.
(atomic_storeti_scq): Remove.
libatomic/
* configure.ac (ARCH_LOONGARCH): New AM_CONDITIONAL.
* Makefile.am (IFUNC_OPT): Separate the item from IFUNC_OPTIONS
to allow using multiple options for an ISA variant.
(libatomic_la_LIBADD): Add *_16_1_.lo for LoongArch.
(IFUNC_OPTIONS): Build *_16_1_.lo for LoongArch with -mlsx and
-mscq.
* configure: Regenerate.
* Makefile.in: Regenerate.
* configure.tgt (try_ifunc): Set to yes for LoongArch if the
compiler can produce lock-free 16B atomic with -mlsx -mscq.
* config/loongarch/host-config.h: Implement ifunc selector.
The following makes sure to delete the loads we previously allocated
with new.
gcc/
* config/i386/i386-features.cc (pass_x86_cse::x86_cse): Delete
loads.
The std::is_assignable check should test for assignment to an lvalue,
not an rvalue.
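The difference between the two forms of the check shows up with any type whose assignment only works on lvalues. The sketch below uses a hypothetical type LvalueOnly and a hypothetical helper assignable_to_lvalue; it does not reproduce the actual constraint in forward_list.h.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type whose assignment from int is restricted to lvalues
// by the & ref-qualifier.
struct LvalueOnly {
  LvalueOnly &operator=(int) & { return *this; }
};

// Testing T (an rvalue target) asks whether a temporary can be assigned
// to; testing T& asks about an lvalue, such as a container element.
static_assert(std::is_assignable<LvalueOnly &, int>::value,
              "lvalue target is assignable");
static_assert(!std::is_assignable<LvalueOnly, int>::value,
              "rvalue target is not assignable");
static_assert(std::is_assignable<int &, int>::value,
              "int lvalue is assignable");
static_assert(!std::is_assignable<int, int>::value,
              "int rvalue is not assignable");

// Hypothetical helper expressing the lvalue form of the check.
template <class T, class U>
constexpr bool assignable_to_lvalue()
{
  return std::is_assignable<T &, U>::value;
}
```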
libstdc++-v3/ChangeLog:
PR libstdc++/122661
* include/bits/forward_list.h (forward_list::assign(I, I)): Fix
value category in is_assignable check.
* testsuite/23_containers/forward_list/modifiers/122661.cc:
New test.
SIG_IGN also needs to be defined according to the C++ standard.
This was missing in the test.
* testsuite/18_support/headers/csignal/macros.cc: Check for
SIG_IGN.
Signed-off-by: Xavier Bonaventura <xavibonaventura@gmail.com>
"long long" and "unsigned long long" min and max macros were added in
C++11, but they were not present in the climits test.
libstdc++-v3/ChangeLog:
* testsuite/18_support/headers/climits/values.cc: Check for
LLONG_MIN, LLONG_MAX, and ULLONG_MAX.
Signed-off-by: Xavier Bonaventura <xavibonaventura@gmail.com>
This adds support for using CUDA Managed Memory with omp_alloc. AMD support
will be added in a future patch.
There is one new predefined allocator, "ompx_gnu_managed_mem_alloc", plus a
corresponding memory space, which can be used to allocate memory in the
"managed" space.
The nvptx plugin is modified to make the necessary CUDA calls, via two new
(optional) plugin interfaces.
gcc/fortran/ChangeLog:
* openmp.cc (is_predefined_allocator): Use GOMP_OMP_PREDEF_ALLOC_MAX
and GOMP_OMPX_PREDEF_ALLOC_MIN/MAX instead of hardcoded values in the
comment.
include/ChangeLog:
* cuda/cuda.h (cuMemAllocManaged): Add declaration and related
CU_MEM_ATTACH_GLOBAL flag.
* gomp-constants.h (GOMP_OMPX_PREDEF_ALLOC_MAX): Update to 201.
(GOMP_OMP_PREDEF_MEMSPACE_MAX): New constant.
(GOMP_OMPX_PREDEF_MEMSPACE_MIN): New constant.
(GOMP_OMPX_PREDEF_MEMSPACE_MAX): New constant.
libgomp/ChangeLog:
* allocator.c (ompx_gnu_max_predefined_alloc): Update to
ompx_gnu_managed_mem_alloc.
(_Static_assert): Fix assertion messages for allocators and add
new assertions for memspace constants.
(omp_max_predefined_mem_space): New define.
(ompx_gnu_min_predefined_mem_space): New define.
(ompx_gnu_max_predefined_mem_space): New define.
(MEMSPACE_ALLOC): Add check for non-standard memspaces.
(MEMSPACE_CALLOC): Likewise.
(MEMSPACE_REALLOC): Likewise.
(MEMSPACE_VALIDATE): Likewise.
(predefined_ompx_gnu_alloc_mapping): Add ompx_gnu_managed_mem_space.
(omp_init_allocator): Add ompx_gnu_managed_mem_space validation.
* config/gcn/allocator.c (gcn_memspace_alloc): Add check for
non-standard memspaces.
(gcn_memspace_calloc): Likewise.
(gcn_memspace_realloc): Likewise.
(gcn_memspace_validate): Update to validate standard vs non-standard
memspaces.
* config/linux/allocator.c (linux_memspace_alloc): Add managed
memory space handling.
(linux_memspace_calloc): Likewise.
(linux_memspace_free): Likewise.
(linux_memspace_realloc): Likewise (returns NULL for fallback).
* config/nvptx/allocator.c (nvptx_memspace_alloc): Add check for
non-standard memspaces.
(nvptx_memspace_calloc): Likewise.
(nvptx_memspace_realloc): Likewise.
(nvptx_memspace_validate): Update to validate standard vs non-standard
memspaces.
* env.c (parse_allocator): Add ompx_gnu_managed_mem_alloc,
ompx_gnu_managed_mem_space, and some static asserts so I don't forget
them again.
* libgomp-plugin.h (GOMP_OFFLOAD_managed_alloc): New declaration.
(GOMP_OFFLOAD_managed_free): New declaration.
* libgomp.h (gomp_managed_alloc): New declaration.
(gomp_managed_free): New declaration.
(struct gomp_device_descr): Add managed_alloc_func and
managed_free_func fields.
* libgomp.texi: Document ompx_gnu_managed_mem_alloc and
ompx_gnu_managed_mem_space, add C++ template documentation, and
describe NVPTX and AMD support.
* omp.h.in: Add ompx_gnu_managed_mem_space and
ompx_gnu_managed_mem_alloc enumerators, and gnu_managed_mem C++
allocator template.
* omp_lib.f90.in: Add Fortran bindings for new allocator and
memory space.
* omp_lib.h.in: Likewise.
* plugin/cuda-lib.def: Add cuMemAllocManaged.
* plugin/plugin-nvptx.c (nvptx_alloc): Add managed parameter to
support cuMemAllocManaged.
(GOMP_OFFLOAD_alloc): Move contents to ...
(cleanup_and_alloc): ... this new function, and add managed support.
(GOMP_OFFLOAD_managed_alloc): New function.
(GOMP_OFFLOAD_managed_free): New function.
* target.c (gomp_managed_alloc): New function.
(gomp_managed_free): New function.
(gomp_load_plugin_for_device): Load optional managed_alloc
and managed_free plugin APIs.
* testsuite/lib/libgomp.exp: Add check_effective_target_omp_managedmem.
* testsuite/libgomp.c++/alloc-managed-1.C: New test.
* testsuite/libgomp.c/alloc-managed-1.c: New test.
* testsuite/libgomp.c/alloc-managed-2.c: New test.
* testsuite/libgomp.c/alloc-managed-3.c: New test.
* testsuite/libgomp.c/alloc-managed-4.c: New test.
* testsuite/libgomp.fortran/alloc-managed-1.f90: New test.
Co-authored-by: Kwok Cheung Yeung <kcyeung@baylibre.com>
Co-authored-by: Thomas Schwinge <tschwinge@baylibre.com>
When building trunk on Mac OS X 10.13 with the bundled makeinfo 4.8, I
ran into a couple of errors:
gcc/doc/install.texi:2675: @option expected braces.
gcc/doc/install.texi:2675: Misplaced {.
gcc/doc/install.texi:2675: Misplaced }.
gcc/doc/install.texi:2675: @code missing close brace.
gcc/fortran/gfortran.texi:1842: First argument to cross-reference may not be empty.
gcc/fortran/gfortran.texi:1903: First argument to cross-reference may not be empty.
gcc/fortran/intrinsic.texi:15549: Unknown command `cindex,'.
However, install.texi states that makeinfo >= 4.7 is required, so this
should work.
This patch fixes those errors.
Tested on x86_64-apple-darwin17.7.0 (makeinfo 4.8), i386-pc-solaris2.11
(makeinfo 7.2), and x86_64-pc-linux-gnu (makeinfo 7.1).
2025-10-22 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
gcc:
PR other/122638
* doc/install.texi (Configuration, --enable-x86-64-mfentry): Fix
typo.
gcc/fortran:
PR other/122638
* gfortran.texi (OpenMP): Fix syntax.
* intrinsic.texi (UINT): Fix syntax.
For an instruction sequence like
kmovb %k0, %edx
kmovb %k1, %ecx
orb %cl, %dl
je .L5
if only CCZ is needed, it can be optimized to
kortestb %k1, %k0
je .L5
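As a rough illustration of the kind of source that leads to this pattern, the portable sketch below models the two mask registers as 8-bit integers. The function name is hypothetical, and the new test may well use AVX-512 intrinsics instead; the point is only that the branch depends on nothing but whether the OR of the two masks is zero.

```cpp
#include <cstdint>

// Hypothetical model: k0 and k1 stand in for two 8-bit AVX-512 mask
// registers.  Only the "is the OR zero?" result is consumed, which is
// the situation where a single kortestb can replace kmovb/kmovb/orb.
constexpr bool
masks_both_empty (std::uint8_t k0, std::uint8_t k1)
{
  return (k0 | k1) == 0;
}
```

A caller would branch on masks_both_empty (k0, k1) without using the ORed value itself.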
gcc/ChangeLog:
* config/i386/i386.md (*ior<mode>_ccz_1): New define_insn.
gcc/testsuite/ChangeLog:
* gcc.target/i386/kortest_ccz-1.c: New test.
r16-5132-g6786a073fcead3 added mention of the '=' variant of the
'--param' command line option to gcc/doc/invoke.texi. This confused
contrib/check-params-in-docs.py. Fix that.
Committing as obvious.
contrib/ChangeLog:
* check-params-in-docs.py: Start parsing from
@itemx --param=@var{name}=@var{value} instead of
@item --param @var{name}=@var{value}.
Signed-off-by: Filip Kastl <fkastl@suse.cz>
This implements P3913R1: Optimize for std::optional in range adaptors.
Specifically, for an opt of type optional<T> that is a view:
* views::reverse(opt), views::take(opt, n), and views::drop(opt, n) return
optional<T>.
* for views::as_const(opt), optional<T&> is converted into optional<const T&>.
optional<T const> is not used in the non-reference case because such a
type is not move assignable, and thus not a view.
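The move-assignability point can be checked with plain C++17 type traits. The sketch below does not exercise the range adaptors or optional<T&> at all; it only shows why optional<T const> is ruled out as a view type.

```cpp
#include <optional>
#include <type_traits>

// optional<T> is move assignable for an ordinary object type...
static_assert (std::is_move_assignable_v<std::optional<int>>);

// ...but optional<const T> is not, since assignment may need to assign
// to the const-qualified contained value.  A type that is not move
// assignable is not movable, and so cannot model a view.
static_assert (!std::is_move_assignable_v<std::optional<const int>>);
static_assert (!std::is_copy_assignable_v<std::optional<const int>>);
```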
libstdc++-v3/ChangeLog:
* include/std/optional (__is_optional_ref): Define.
* include/std/ranges (_Take::operator(), _Drop::operator())
(_Reverse::operator()): Handle optional<T> that is a view.
(_AsConst::operator()): Handle optional<T&>.
* testsuite/20_util/optional/range.cc: New tests.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Some SVE features in the toolchain need to be enabled when either of two
different kernel HWCAPS (and corresponding cpuinfo strings) is enabled
(one for non-streaming mode and one for streaming mode).
Add support for using "|" to separate alternative lists of required
features.
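A minimal sketch of how such a feature string could be matched is shown below. It is not the code in driver-aarch64.cc: the function names, the whitespace handling, and the feature names used in the usage note are all assumptions made for illustration.

```cpp
#include <string_view>

// True if WORD is one of the space-separated tokens in LIST.
constexpr bool
has_token (std::string_view list, std::string_view word)
{
  while (!list.empty ())
    {
      auto end = list.find (' ');
      if (list.substr (0, end) == word)
        return true;
      if (end == std::string_view::npos)
        break;
      list.remove_prefix (end + 1);
    }
  return false;
}

// True if every space-separated feature in REQUIRED is in AVAILABLE.
// An empty REQUIRED list has no requirements and always matches.
constexpr bool
has_all (std::string_view available, std::string_view required)
{
  while (!required.empty ())
    {
      auto end = required.find (' ');
      std::string_view tok = required.substr (0, end);
      if (!tok.empty () && !has_token (available, tok))
        return false;
      if (end == std::string_view::npos)
        break;
      required.remove_prefix (end + 1);
    }
  return true;
}

// SPEC is one or more feature lists separated by '|' (hypothetical
// helper); it matches if any one list is fully present in AVAILABLE.
constexpr bool
spec_matches (std::string_view available, std::string_view spec)
{
  while (true)
    {
      auto bar = spec.find ('|');
      if (has_all (available, spec.substr (0, bar)))
        return true;
      if (bar == std::string_view::npos)
        return false;
      spec.remove_prefix (bar + 1);
    }
}
```

With illustrative feature names, spec_matches ("fp asimd sme", "sve | sme") holds because the second alternative is satisfied, while spec_matches ("fp asimd", "sve | sme") does not.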
gcc/ChangeLog:
* config/aarch64/driver-aarch64.cc
(host_detect_local_cpu): Extend feature string syntax.
For non-templated tests, a volatile_<T> alias is used. This alias expands to
volatile T if std::atomic_ref<T>::is_always_lock_free is true, and to T
otherwise. For templated functions, testing is controlled using if constexpr.
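One way such an alias could be written is sketched below. This is an assumption about its shape, not the definition used in the testsuite: std::atomic<T> stands in for std::atomic_ref<T> so that the sketch compiles as C++17, and the extra LockFree parameter is hypothetical, added only so both outcomes can be shown regardless of target.

```cpp
#include <atomic>
#include <type_traits>

// Sketch only.  The tests key off std::atomic_ref<T>::is_always_lock_free;
// std::atomic<T> is used here as a stand-in.  LockFree is a hypothetical
// parameter that lets both branches be demonstrated.
template<typename T, bool LockFree = std::atomic<T>::is_always_lock_free>
using volatile_ = std::conditional_t<LockFree, volatile T, T>;

// Both branches, independent of the target's lock-free guarantees.
static_assert (std::is_same_v<volatile_<int, true>, volatile int>);
static_assert (std::is_same_v<volatile_<int, false>, int>);
```

Inside a template the same decision can instead be made with if constexpr on is_always_lock_free, so the volatile test is discarded when it does not apply.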
PR libstdc++/115402
PR libstdc++/122584
libstdc++-v3/ChangeLog:
* testsuite/29_atomics/atomic_ref/address.cc: Guard test for
volatile with if constexpr.
* testsuite/29_atomics/atomic_ref/deduction.cc: Likewise.
* testsuite/29_atomics/atomic_ref/op_support.cc: Likewise.
* testsuite/29_atomics/atomic_ref/requirements.cc: Likewise.
* testsuite/29_atomics/atomic_ref/bool.cc: Use volatile_t alias.
* testsuite/29_atomics/atomic_ref/generic.cc: Likewise.
* testsuite/29_atomics/atomic_ref/integral.cc: Likewise.
* testsuite/29_atomics/atomic_ref/pointer.cc: Likewise.
* testsuite/29_atomics/atomic_ref/float.cc: Likewise, and remove
non-discarding if constexpr.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
The following teaches simple_dce_from_worklist to remove the LHS
from calls like DCE does.
* tree-ssa-dce.cc (simple_dce_from_worklist): For calls
with side-effects remove their LHS.