mirror of https://github.com/gcc-mirror/gcc.git
synced 2026-05-06 23:25:24 +02:00
f4afefbbbee1414e130ca2f1552216bb702a985c
Currently x86_64's TImode STV pass has the restriction that candidate
chains must start with a TImode load from memory. This patch improves
the functionality of STV to allow zero-extensions and construction of
TImode pseudos from two DImode values (i.e. *concatditi) to both be
considered candidate chain initiators. For example, this allows chains
starting from an __int128 function argument to be processed by STV.
Compiled with -O2 on x86_64:
__int128 m0,m1,m2,m3;
void foo(__int128 m)
{
  m0 = m;
  m1 = m;
  m2 = m;
  m3 = m;
}
Previously generated:
foo:    xchgq   %rdi, %rsi
        movq    %rsi, m0(%rip)
        movq    %rdi, m0+8(%rip)
        movq    %rsi, m1(%rip)
        movq    %rdi, m1+8(%rip)
        movq    %rsi, m2(%rip)
        movq    %rdi, m2+8(%rip)
        movq    %rsi, m3(%rip)
        movq    %rdi, m3+8(%rip)
        ret
With the patch, we now generate:
foo:    movq    %rdi, %xmm0
        movq    %rsi, %xmm1
        punpcklqdq      %xmm1, %xmm0
        movaps  %xmm0, m0(%rip)
        movaps  %xmm0, m1(%rip)
        movaps  %xmm0, m2(%rip)
        movaps  %xmm0, m3(%rip)
        ret
or with -mavx2:
foo:    vmovq   %rdi, %xmm1
        vpinsrq $1, %rsi, %xmm1, %xmm0
        vmovdqa %xmm0, m0(%rip)
        vmovdqa %xmm0, m1(%rip)
        vmovdqa %xmm0, m2(%rip)
        vmovdqa %xmm0, m3(%rip)
        ret
Likewise, for zero-extension:
__int128 m0,m1,m2,m3;
void bar(unsigned long x)
{
  __int128 m = x;
  m0 = m;
  m1 = m;
  m2 = m;
  m3 = m;
}
Previously with -O2:
bar:    movq    %rdi, m0(%rip)
        movq    $0, m0+8(%rip)
        movq    %rdi, m1(%rip)
        movq    $0, m1+8(%rip)
        movq    %rdi, m2(%rip)
        movq    $0, m2+8(%rip)
        movq    %rdi, m3(%rip)
        movq    $0, m3+8(%rip)
        ret
With this patch:
bar:    movq    %rdi, %xmm0
        movaps  %xmm0, m0(%rip)
        movaps  %xmm0, m1(%rip)
        movaps  %xmm0, m2(%rip)
        movaps  %xmm0, m3(%rip)
        ret
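
The two chain initiators described above can also be written out explicitly in C. The following is a minimal sketch, not one of the new test cases: the names g0..g3, store_concat, store_zext and self_check are hypothetical, and it assumes that the shift-and-OR idiom is matched by one of the *concatditi patterns. Whether STV actually converts either function depends on the target options and on the cost model.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical globals, mirroring m0..m3 in the examples above.  */
unsigned __int128 g0, g1, g2, g3;

/* Construct a 128-bit value from two 64-bit halves and store it four
   times.  The shift-and-OR form is assumed to be the kind of
   expression that *concatditi describes.  */
void store_concat (uint64_t hi, uint64_t lo)
{
  unsigned __int128 m = ((unsigned __int128) hi << 64) | lo;
  g0 = m;
  g1 = m;
  g2 = m;
  g3 = m;
}

/* Zero-extend a 64-bit value to 128 bits and store it four times.  */
void store_zext (uint64_t x)
{
  unsigned __int128 m = x;
  g0 = m;
  g1 = m;
  g2 = m;
  g3 = m;
}

/* Check the stored values; this checks semantics only, not which
   instructions the compiler chose.  */
void self_check (void)
{
  store_concat (1, 2);
  assert ((uint64_t) (g0 >> 64) == 1);
  assert ((uint64_t) g0 == 2);
  assert (g3 == g0);

  store_zext (5);
  assert (g0 == 5);
  assert ((uint64_t) (g3 >> 64) == 0);
}
```

Compiling this with -O2 -S, with and without the patch, is one way to compare the generated stores.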
As shown in the examples above, the scalar-to-vector (STV) conversion of
*concatditi has an overhead [in scalar code, treating two DImode registers
as a TImode value is free on x86_64], but specifying this penalty allows
the STV pass to make an informed decision about whether the total
cost/gain of the chain is a net win.
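
The net-win decision can be illustrated with a toy model. This is only a sketch under stated assumptions: the enum, the function names and every numeric gain below are hypothetical and are not taken from i386-features.cc. It shows only the idea that a fixed penalty on the chain initiator has to be outweighed by the gains of the remaining instructions in the chain.

```c
#include <stddef.h>

/* Hypothetical instruction kinds in a candidate chain.  */
enum toy_insn { TOY_STORE, TOY_CONCAT, TOY_ZEXT };

/* Assumed per-instruction gains (positive favours the vector form).
   These numbers are illustrative only.  */
int toy_gain (enum toy_insn kind)
{
  switch (kind)
    {
    case TOY_STORE:
      return 1;   /* assumed: one vector store replaces two scalar stores */
    case TOY_CONCAT:
      return -2;  /* assumed penalty: movq plus punpcklqdq */
    case TOY_ZEXT:
      return -1;  /* assumed penalty: a single movq into an SSE register */
    }
  return 0;
}

/* Sum the gains over the whole chain.  */
int toy_chain_gain (const enum toy_insn *chain, size_t n)
{
  int gain = 0;
  for (size_t i = 0; i < n; i++)
    gain += toy_gain (chain[i]);
  return gain;
}

/* Convert only when the chain as a whole is a net win (assumed rule).  */
int toy_should_convert (const enum toy_insn *chain, size_t n)
{
  return toy_chain_gain (chain, n) > 0;
}
```

Under these assumed numbers, a concatenation followed by four stores is converted, while a concatenation followed by a single store is left in scalar form.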
2025-10-21 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386-features.cc (timode_concatdi_p): New
function to recognize the various variants of *concatditi3_[1-7].
(scalar_chain::add_insn): Like VEC_SELECT, ZERO_EXTEND and
timode_concatdi_p instructions don't require their input
operands to be converted (to TImode).
(timode_scalar_chain::compute_convert_gain): Split/clone XOR and
IOR cases from AND case, to handle timode_concatdi_p costs.
<case PLUS>: Handle timode_concatdi_p conversion costs.
<case ZERO_EXTEND>: Provide costs of DImode to TImode extension.
(timode_convert_concatdi): Helper function to transform
a *concatditi3 instruction into a vec_concatv2di instruction.
(timode_scalar_chain::convert_insn): Split/clone XOR and IOR
cases from AND case, to handle timode_concatdi_p using the new
timode_convert_concatdi helper function.
<case ZERO_EXTEND>: Convert zero_extendditi2 to *vec_concatv2di_0.
<case PLUS>: Handle timode_concatdi_p using the new
timode_convert_concatdi helper function.
(timode_scalar_to_vector_candidate_p): Support timode_concatdi_p
instructions in IOR, XOR and PLUS cases.
<case ZERO_EXTEND>: Consider zero extension of a register from
DImode to TImode to be a candidate.
gcc/testsuite/ChangeLog
* gcc.target/i386/sse4_1-stv-10.c: New test case.
* gcc.target/i386/sse4_1-stv-11.c: Likewise.
* gcc.target/i386/sse4_1-stv-12.c: Likewise.
This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software. See the files whose names start with COPYING for copying permission. The manuals, and some of the runtime libraries, are under different terms; see the individual source files for details.

The directory INSTALL contains copies of the installation information as HTML and plain text. The source of this information is gcc/doc/install.texi. The installation information includes details of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it includes) for usage and porting information. An online readable version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range notation, e.g., 1987-2012, indicating that every year in the range, inclusive, is a copyrightable year that could otherwise be listed individually.
Languages
C++ 30.8%
C 30.2%
Ada 14.4%
D 6.1%
Go 5.7%
Other 12.3%