Files
gcc/libstdc++-v3/include/bits/atomic_timed_wait.h
Jonathan Wakely 5f02ba9258 libstdc++: Fix proxy wait detection in atomic wait
A failed assertion was observed with std::atomic<bool>::wait when the
loop in __atomic_wait_address is entered and calls _M_setup_wait a
second time, after waking from __wait_impl. When the first call to
_M_setup_wait makes a call to _M_setup_proxy_wait that function decides
that a proxy wait is needed for an object of type bool, and it updates
the _M_obj and _M_obj_size members to refer to the futex in the proxy
state, instead of referring to the bool object itself. The next time
_M_setup_wait is called it calls _M_setup_proxy_wait again but now it
sees _M_obj_size == sizeof(futex) and so this time decides a proxy wait
is *not* needed, and then fails the __glibcxx_assert(_M_obj == addr)
check.

The problem is that _M_setup_proxy_wait wasn't correctly handling the
case where it's called a second time, after the decision to use a proxy
wait has already been made. That can be fixed in _M_setup_proxy_wait by
checking if _M_obj != addr, which implies that a proxy wait has already
been set up by a previous call. In that case, _M_setup_proxy_wait should
only update _M_old to the latest value of the proxy _M_ver.

This change means that _M_setup_proxy_wait is safe to call repeatedly
for a proxy wait, and will only update _M_wait_state, _M_obj, and
_M_obj_size on the first call. On the second and subsequent calls, those
variables are already correctly set for the proxy wait so don't need to
be set again.

For non-proxy waits, calling _M_setup_proxy_wait more than once is safe,
but pessimizes performance. The caller shouldn't make a second call to
_M_setup_proxy_wait because we don't need to check again if a proxy wait
should be used (the answer won't change) and we don't need to load a
value from the proxy _M_ver.

However, it was difficult to detect the case of a non-proxy wait,
because _M_setup_wait doesn't know if it's being called the first time
(when _M_setup_proxy_wait is called to make the initial decision) or a
subsequent time (in which case _M_obj == addr implies a non-proxy wait
was already decided on). As a result, _M_setup_proxy_wait was being used
every time to see if it's a proxy wait. We can resolve this by splitting
the _M_setup_wait function into _M_setup_wait and _M_on_wake, where the
former is only called once to do the initial setup and the latter is
called after __wait_impl returns, to prepare to check the predicate and
possibly wait again.  The new _M_on_wake function can avoid unnecessary
calls to _M_setup_proxy_wait by checking _M_obj == addr to identify a
non-proxy wait.

The three callers of _M_setup_wait are updated to use _M_on_wake instead
of _M_setup_wait after waking from a waiting function. This change
revealed a latent performance bug in __atomic_wait_address_for which was
not passing __res to _M_setup_wait, so a new value was always loaded
even when __res._M_has_val was true. By splitting _M_on_wake out of
_M_setup_wait this problem became more obvious, because we no longer
have _M_setup_wait doing two different jobs, depending on whether it was
passed the optional third argument or not.

libstdc++-v3/ChangeLog:

	* include/bits/atomic_timed_wait.h (__atomic_wait_address_until):
	Use _M_on_wake instead of _M_setup_wait after waking.
	(__atomic_wait_address_for): Likewise.
	* include/bits/atomic_wait.h (__atomic_wait_address): Likewise.
	(__wait_args::_M_setup_wait): Remove third parameter and move
	code to update _M_old to ...
	(__wait_args::_M_on_wake): New member function to update _M_old
	after waking, only calling _M_setup_proxy_wait if needed.
	(__wait_args::_M_store): New member function to update _M_old
	from a value, for non-proxy waits.
	* src/c++20/atomic.cc (__wait_args::_M_setup_proxy_wait): If
	_M_obj is not addr, only load a new value and return true.

Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
2026-01-09 19:18:21 +00:00

233 lines
8.1 KiB
C++

// -*- C++ -*- header.
// Copyright (C) 2020-2026 Free Software Foundation, Inc.
//
// This file is part of the GNU ISO C++ Library. This library is free
// software; you can redistribute it and/or modify it under the
// terms of the GNU General Public License as published by the
// Free Software Foundation; either version 3, or (at your option)
// any later version.
// This library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
// Under Section 7 of GPL version 3, you are granted additional
// permissions described in the GCC Runtime Library Exception, version
// 3.1, as published by the Free Software Foundation.
// You should have received a copy of the GNU General Public License and
// a copy of the GCC Runtime Library Exception along with this program;
// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
// <http://www.gnu.org/licenses/>.
/** @file bits/atomic_timed_wait.h
* This is an internal header file, included by other library headers.
* Do not attempt to use it directly. @headername{atomic}
*/
#ifndef _GLIBCXX_ATOMIC_TIMED_WAIT_H
#define _GLIBCXX_ATOMIC_TIMED_WAIT_H 1
#ifdef _GLIBCXX_SYSHDR
#pragma GCC system_header
#endif
#include <bits/atomic_wait.h>
#if __glibcxx_atomic_wait
#include <bits/this_thread_sleep.h>
#include <bits/chrono.h>
#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
#include <sys/time.h>
#endif
namespace std _GLIBCXX_VISIBILITY(default)
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION
namespace __detail
{
using __wait_clock_t = chrono::steady_clock;
template<typename _Clock, typename _Dur>
__wait_clock_t::time_point
__to_wait_clock(const chrono::time_point<_Clock, _Dur>& __atime) noexcept
{
const typename _Clock::time_point __c_entry = _Clock::now();
const __wait_clock_t::time_point __w_entry = __wait_clock_t::now();
const auto __delta = __atime - __c_entry;
using __w_dur = typename __wait_clock_t::duration;
return __w_entry + chrono::ceil<__w_dur>(__delta);
}
template<typename _Dur>
__wait_clock_t::time_point
__to_wait_clock(const chrono::time_point<__wait_clock_t,
_Dur>& __atime) noexcept
{
using __w_dur = typename __wait_clock_t::duration;
if constexpr (is_same_v<__w_dur, _Dur>)
return __atime;
else
return chrono::ceil<__w_dur>(__atime);
}
// This uses a nanoseconds duration for the timeout argument.
// For __abi_version=0 that is the time since the steady_clock's epoch.
// It's possible that in future we will add new __wait_flags constants
// to indicate that the timeout is the time since the system_clock epoch,
// or is a relative timeout not an absolute time.
__wait_result_type
__wait_until_impl(const void* __addr, __wait_args_base& __args,
const chrono::nanoseconds& __timeout);
template<typename _Clock, typename _Dur>
__wait_result_type
__wait_until(const void* __addr, __wait_args_base& __args,
const chrono::time_point<_Clock, _Dur>& __atime) noexcept
{
auto __at = __detail::__to_wait_clock(__atime);
auto __res = __detail::__wait_until_impl(__addr, __args,
__at.time_since_epoch());
if constexpr (!is_same_v<__wait_clock_t, _Clock>)
if (__res._M_timeout)
{
// We got a timeout when measured against __clock_t but
// we need to check against the caller-supplied clock
// to tell whether we should return a timeout.
if (_Clock::now() < __atime)
__res._M_timeout = false;
}
return __res;
}
template<typename _Rep, typename _Period>
__wait_result_type
__wait_for(const void* __addr, __wait_args_base& __args,
const chrono::duration<_Rep, _Period>& __rtime) noexcept
{
if (!__rtime.count())
{
// no rtime supplied, just spin a bit
__args._M_flags |= __wait_flags::__do_spin | __wait_flags::__spin_only;
return __detail::__wait_impl(__addr, __args);
}
auto const __reltime = chrono::ceil<__wait_clock_t::duration>(__rtime);
auto const __atime = chrono::steady_clock::now() + __reltime;
return __detail::__wait_until(__addr, __args, __atime);
}
} // namespace __detail
// returns true if wait ended before timeout
template<typename _Tp,
typename _Pred, typename _ValFn,
typename _Clock, typename _Dur>
bool
__atomic_wait_address_until(const _Tp* __addr, _Pred&& __pred,
_ValFn&& __vfn,
const chrono::time_point<_Clock, _Dur>& __atime,
bool __bare_wait = false) noexcept
{
__detail::__wait_args __args{ __addr, __bare_wait };
_Tp __val = __args._M_setup_wait(__addr, __vfn);
while (!__pred(__val))
{
auto __res = __detail::__wait_until(__addr, __args, __atime);
if (__res._M_timeout)
return false; // C++26 will also return last observed __val
__val = __args._M_on_wake(__addr, __vfn, __res);
}
return true; // C++26 will also return last observed __val
}
template<typename _Clock, typename _Dur>
bool
__atomic_wait_address_until_v(const __detail::__platform_wait_t* __addr,
__detail::__platform_wait_t __old,
int __order,
const chrono::time_point<_Clock, _Dur>& __atime,
bool __bare_wait = false) noexcept
{
// This function must not be used if __wait_impl might use a proxy wait:
__glibcxx_assert(__platform_wait_uses_type<__detail::__platform_wait_t>);
__detail::__wait_args __args{ __addr, __old, __order, __bare_wait };
auto __res = __detail::__wait_until(__addr, __args, __atime);
return !__res._M_timeout; // C++26 will also return last observed __val
}
template<typename _Tp, typename _ValFn,
typename _Clock, typename _Dur>
bool
__atomic_wait_address_until_v(const _Tp* __addr, _Tp&& __old,
_ValFn&& __vfn,
const chrono::time_point<_Clock, _Dur>& __atime,
bool __bare_wait = false) noexcept
{
auto __pfn = [&](const _Tp& __val) {
return !__detail::__atomic_eq(__old, __val);
};
return std::__atomic_wait_address_until(__addr, __pfn, __vfn, __atime,
__bare_wait);
}
template<typename _Tp,
typename _Pred, typename _ValFn,
typename _Rep, typename _Period>
bool
__atomic_wait_address_for(const _Tp* __addr, _Pred&& __pred,
_ValFn&& __vfn,
const chrono::duration<_Rep, _Period>& __rtime,
bool __bare_wait = false) noexcept
{
__detail::__wait_args __args{ __addr, __bare_wait };
_Tp __val = __args._M_setup_wait(__addr, __vfn);
while (!__pred(__val))
{
auto __res = __detail::__wait_for(__addr, __args, __rtime);
if (__res._M_timeout)
return false; // C++26 will also return last observed __val
__val = __args._M_on_wake(__addr, __vfn, __res);
}
return true; // C++26 will also return last observed __val
}
template<typename _Rep, typename _Period>
bool
__atomic_wait_address_for_v(const __detail::__platform_wait_t* __addr,
__detail::__platform_wait_t __old,
int __order,
const chrono::duration<_Rep, _Period>& __rtime,
bool __bare_wait = false) noexcept
{
// This function must not be used if __wait_impl might use a proxy wait:
__glibcxx_assert(__platform_wait_uses_type<__detail::__platform_wait_t>);
__detail::__wait_args __args{ __addr, __old, __order, __bare_wait };
auto __res = __detail::__wait_for(__addr, __args, __rtime);
return !__res._M_timeout; // C++26 will also return last observed __val
}
template<typename _Tp, typename _ValFn,
typename _Rep, typename _Period>
bool
__atomic_wait_address_for_v(const _Tp* __addr, _Tp&& __old, _ValFn&& __vfn,
const chrono::duration<_Rep, _Period>& __rtime,
bool __bare_wait = false) noexcept
{
auto __pfn = [&](const _Tp& __val) {
return !__detail::__atomic_eq(__old, __val);
};
return __atomic_wait_address_for(__addr, __pfn, forward<_ValFn>(__vfn),
__rtime, __bare_wait);
}
_GLIBCXX_END_NAMESPACE_VERSION
} // namespace std
#endif // __cpp_lib_atomic_wait
#endif // _GLIBCXX_ATOMIC_TIMED_WAIT_H