testsuite: fix ICEs in analyzer plugin with CPython >= 3.11 [PR107646,PR112520]

In GCC 14 the testsuite gained a plugin that "teaches" the analyzer
about the CPython API, trying to find common mistakes:
  https://gcc.gnu.org/wiki/StaticAnalyzer/CPython

Unfortunately, this has been crashing for more recent versions of
CPython.

Specifically, in Python 3.11, PyObject's ob_refcnt was moved to an
anonymous union (as part of PEP 683 "Immortal Objects, Using a Fixed
Refcount").  The plugin attempts to find the field and fails, and since
it has no error-handling this leads to a null pointer dereference.
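
The lookup failure described above can be illustrated outside GCC. The
following is a minimal sketch, not the plugin's code: field_model,
find_field, make_old_pyobject and make_new_pyobject are hypothetical
names, and the member "other_view" is a placeholder rather than a real
CPython field. It assumes C++17 (for std::vector of an incomplete type).
Unlike the plugin, which special-cases "ob_refcnt", this sketch descends
into anonymous members for any name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a field declaration: a name (empty when the
// member is anonymous) plus the members of its own type, if any.
struct field_model
{
  std::string name;
  std::vector<field_model> members;
};

// Look up NAME among FIELDS, descending into anonymous members.
// Returns nullptr on failure so that callers can handle it.
static const field_model *
find_field (const std::vector<field_model> &fields, const std::string &name)
{
  for (const field_model &f : fields)
    {
      if (!f.name.empty ())
        {
          if (f.name == name)
            return &f;
          continue;
        }
      // Anonymous member: search inside it.
      if (const field_model *sub = find_field (f.members, name))
        return sub;
    }
  return nullptr;
}

// Older layout: "ob_refcnt" is a direct member.
static std::vector<field_model>
make_old_pyobject ()
{
  return {{"ob_refcnt", {}}, {"ob_type", {}}};
}

// Newer layout: "ob_refcnt" sits inside an anonymous union.
// "other_view" is a made-up sibling member for illustration.
static std::vector<field_model>
make_new_pyobject ()
{
  return {{"", {{"ob_refcnt", {}}, {"other_view", {}}}}, {"ob_type", {}}};
}
```

A lookup that only compares direct member names would miss "ob_refcnt"
in the second layout; the recursive step is what finds it, and the
nullptr return is what a caller must check before using the result.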

Also, https://github.com/python/cpython/pull/101292 moved the "ob_digit"
field from struct _longobject to a new field long_value of a new
struct _PyLongValue, leading to similar analyzer crashes when the
plugin fails to find the field.
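
The "skip what cannot be found" approach to a missing "ob_digit" can be
sketched as follows. This is an illustration under stated assumptions,
not the plugin's implementation: store_model, long_layout,
get_ob_digit_key and model_PyLong_FromLong are hypothetical names, and
the store is a plain map rather than the analyzer's region model.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical model of the values known for one object's fields.
using store_model = std::map<std::string, long>;

// Hypothetical layout description: is "ob_digit" a direct member?
struct long_layout
{
  bool has_direct_ob_digit;
};

// Return the key for "ob_digit" if the layout has it, otherwise nothing.
static std::optional<std::string>
get_ob_digit_key (const long_layout &layout)
{
  if (layout.has_direct_ob_digit)
    return std::string ("ob_digit");
  return std::nullopt;
}

// Model the effect of creating an integer object: always set the
// refcount, but only record the digit when the field was found.
static void
model_PyLong_FromLong (const long_layout &layout, long value,
                       store_model &store)
{
  store["ob_refcnt"] = 1;
  if (auto key = get_ob_digit_key (layout))
    store[*key] = value;
}
```

With a layout lacking a direct "ob_digit", the model loses precision
about the stored value but still records the reference count, instead
of crashing.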

The following patch fixes this by
* looking within the anonymous union for the ob_refcnt field if it can't
find it directly
* gracefully handling the case of not finding "ob_digit" in PyLongObject
* doing more lookups once at plugin startup, rather than continuously on
analyzing API calls
* adding diagnostics and more error-handling to the plugin startup, so that
if it can't find something in the Python headers it emits a useful note
when disabling itself, e.g.
  cc1: note: could not find field 'ob_digit' of CPython type 'PyLongObject' {aka 'struct _longobject'}
* replacing some copy-and-pasted code with member functions of a new
"class api" (though various other cleanups could be done)
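
The "look everything up once at startup, and disable with a note on
failure" idea from the list above can be sketched like this. It is a
simplified illustration, not the plugin's code: header_model,
api_sketch and require_field are hypothetical names, and notes are
collected in a vector rather than emitted through GCC's diagnostics.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the parsed headers:
// type name -> names of its fields.
using header_model = std::map<std::string, std::vector<std::string>>;

class api_sketch
{
public:
  // Resolve everything once.  Returns false, recording a note,
  // as soon as something is missing.
  bool
  init (const header_model &headers)
  {
    m_ok = (require_field (headers, "PyObject", "ob_refcnt")
            && require_field (headers, "PyObject", "ob_type")
            && require_field (headers, "PyVarObject", "ob_size"));
    return m_ok;
  }

  bool ok () const { return m_ok; }
  const std::vector<std::string> &notes () const { return m_notes; }

private:
  bool
  require_field (const header_model &headers, const std::string &type,
                 const std::string &field)
  {
    auto it = headers.find (type);
    if (it == headers.end ())
      {
        m_notes.push_back ("could not find CPython type '" + type + "'");
        return false;
      }
    for (const std::string &f : it->second)
      if (f == field)
        return true;
    m_notes.push_back ("could not find field '" + field
                       + "' of CPython type '" + type + "'");
    return false;
  }

  bool m_ok = false;
  std::vector<std::string> m_notes;
};
```

Later per-call handlers would first check ok () and do nothing when
initialization failed, rather than repeating lookups that might fail
in the middle of analysis.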

Tested with:
* CPython 3.8: all tests continue to PASS
* CPython 3.13: fixes the ICEs, 2 FAILs remain (reference counting false
negatives)

Given that this is already a large patch, I'm opting to only fix the
crashes and defer the 2 remaining FAILs and other cleanups to followup
work.

gcc/analyzer/ChangeLog:
	PR testsuite/112520
	* region-model-manager.cc
	(region_model_manager::get_field_region): Assert that the args are non-null.

gcc/testsuite/ChangeLog:
	PR analyzer/107646
	PR testsuite/112520
	* gcc.dg/plugin/analyzer_cpython_plugin.cc: Move everything from
	namespace ana:: into ana::cpython_plugin.  Move global tree values
	into a new "class api".
	(pyobj_record): Replace with api.m_type_PyObject.
	(pyobj_ptr_tree): Replace with api.m_type_PyObject_ptr.
	(pyobj_ptr_ptr): Replace with api.m_type_PyObject_ptr_ptr.
	(varobj_record): Replace with api.m_type_PyVarObject.
	(pylistobj_record): Replace with api.m_type_PyListObject.
	(pylongobj_record): Replace with api.m_type_PyLongObject.
	(pylongtype_vardecl): Replace with api.m_vardecl_PyLong_Type.
	(pylisttype_vardecl): Replace with api.m_vardecl_PyList_Type.
	(get_field_by_name): Add "complain" param and use it to issue a
	note on failure.  Assert that type and name are non-null.  Don't
	crash on fields that are anonymous unions, and special-case
	looking within them for "ob_refcnt" to work around the
	Python 3.11 change for PEP 683 (immortal objects).
	(get_sizeof_pyobjptr): Convert to...
	(api::get_sval_sizeof_PyObject_ptr): ...this.
	(init_ob_refcnt_field): Convert to...
	(api::init_ob_refcnt_field): ...this.
	(set_ob_type_field): Convert to...
	(api::set_ob_type_field): ...this.
	(api::init_PyObject_HEAD): New.
	(api::get_region_PyObject_ob_refcnt): New.
	(api::do_Py_INCREF): New.
	(api::get_region_PyVarObject_ob_size): New.
	(api::get_region_PyLongObject_ob_digit): New.
	(inc_field_val): Convert to...
	(api::inc_field_val): ...this.
	(refcnt_mismatch::refcnt_mismatch): Add tree params for refcounts
	and initialize corresponding fields.  Fix whitespace.
	(refcnt_mismatch::emit): Use stored tree values, rather than
	assuming we have constants, and crashing on non-constants.  Delete
	commented-out dead code.
	(refcnt_mismatch::foo): Delete.
	(refcnt_mismatch::m_expected_refcnt_tree): New field.
	(refcnt_mismatch::m_actual_refcnt_tree): New field.
	(retrieve_ob_refcnt_sval): Simplify using class api.
	(count_pyobj_references): Likewise.
	(check_refcnt): Likewise.  Don't warn on UNKNOWN values.  Use
	get_representative_tree for the expected and actual values and
	skip the warning if it fails, rather than assuming we have
	constants and crashing on non-constants.
	(count_all_references): Update comment.
	(kf_PyList_Append::impl_call_pre): Simplify using class api.
	(kf_PyList_Append::impl_call_post): Likewise.
	(kf_PyList_New::impl_call_post): Likewise.
	(kf_PyLong_FromLong::impl_call_post): Likewise.
	(get_stashed_type_by_name): Emit note if the type couldn't be
	found.
	(get_stashed_global_var_by_name): Likewise for globals.
	(init_py_structs): Convert to...
	(api::init_from_stashed_types): ...this.  Bail out with an error
	code if anything fails.  Look up more things at startup, rather
	than during analysis of calls.
	(ana::cpython_analyzer_events_subscriber): Rename to...
	(ana::cpython_plugin::analyzer_events_subscriber): ...this.
	(analyzer_events_subscriber::analyzer_events_subscriber):
	Initialize m_init_failed.
	(analyzer_events_subscriber::on_message<on_tu_finished>):
	Update for conversion of init_py_structs to
	api::init_from_stashed_types and bail if it fails.
	(analyzer_events_subscriber::on_message<on_frame_popped>): Don't
	run if plugin initialization failed.
	(analyzer_events_subscriber::m_init_failed): New field.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm
2026-03-06 18:47:05 -05:00
parent d0d3d4dde0
commit c2c64cfcd0
2 changed files with 459 additions and 186 deletions


@@ -1689,6 +1689,8 @@ region_model_manager::get_unknown_symbolic_region (tree region_type)
const region *
region_model_manager::get_field_region (const region *parent, tree field)
{
gcc_assert (parent);
gcc_assert (field);
gcc_assert (TREE_CODE (field) == FIELD_DECL);
/* (*UNKNOWN_PTR).field is (*UNKNOWN_PTR_OF_&FIELD_TYPE). */


@@ -51,72 +51,314 @@ int plugin_is_GPL_compatible;
static GTY (()) hash_map<tree, tree> *analyzer_stashed_types;
static GTY (()) hash_map<tree, tree> *analyzer_stashed_globals;
namespace ana
{
static tree pyobj_record = NULL_TREE;
static tree pyobj_ptr_tree = NULL_TREE;
static tree pyobj_ptr_ptr = NULL_TREE;
static tree varobj_record = NULL_TREE;
static tree pylistobj_record = NULL_TREE;
static tree pylongobj_record = NULL_TREE;
static tree pylongtype_vardecl = NULL_TREE;
static tree pylisttype_vardecl = NULL_TREE;
namespace ana {
namespace cpython_plugin {
static tree
get_field_by_name (tree type, const char *name)
get_field_by_name (tree type, const char *name, bool complain = true)
{
gcc_assert (type);
gcc_assert (name);
for (tree field = TYPE_FIELDS (type); field; field = TREE_CHAIN (field))
{
if (TREE_CODE (field) == FIELD_DECL)
{
const char *field_name = IDENTIFIER_POINTER (DECL_NAME (field));
if (strcmp (field_name, name) == 0)
return field;
}
if (tree id = DECL_NAME (field))
{
const char *field_name = IDENTIFIER_POINTER (id);
if (strcmp (field_name, name) == 0)
return field;
}
/* Prior to Python 3.11, ob_refcnt was a field of PyObject.
In Python 3.11 ob_refcnt was moved to an anonymous union within
PyObject (as part of PEP 683 "Immortal Objects, Using a
Fixed Refcount"). */
if (0 == strcmp (name, "ob_refcnt"))
if (tree subfield = get_field_by_name (TREE_TYPE (field), name, false))
return subfield;
}
if (complain)
inform (UNKNOWN_LOCATION, "could not find field %qs of CPython type %qT",
name, type);
return NULL_TREE;
}
static const svalue *
get_sizeof_pyobjptr (region_model_manager *mgr)
{
tree size_tree = TYPE_SIZE_UNIT (pyobj_ptr_tree);
const svalue *sizeof_sval = mgr->get_or_create_constant_svalue (size_tree);
return sizeof_sval;
}
/* Global state and utils for working with the CPython API.
/* Update MODEL to set OB_BASE_REGION's ob_refcnt to 1. */
static void
init_ob_refcnt_field (region_model_manager *mgr, region_model *model,
const region *ob_base_region, tree pyobj_record,
const call_details &cd)
Attempt to provide some isolation against changes to the API in
different versions of CPython.
This only persists during the analyzer, and thus doesn't need
to interact with the garbage collector. */
class api
{
tree ob_refcnt_tree = get_field_by_name (pyobj_record, "ob_refcnt");
const region *ob_refcnt_region
public:
bool init_from_stashed_types ();
// Get RECORD_TYPE for PyObject
tree
get_type_PyObject () const
{
gcc_assert (m_type_PyObject);
return m_type_PyObject;
}
// Get FIELD_DECL for PyObject's "ob_refcnt"
tree
get_field_PyObject_ob_refcnt () const
{
gcc_assert (m_field_PyObject_ob_refcnt);
return m_field_PyObject_ob_refcnt;
}
// Get FIELD_DECL for PyObject's "ob_type"
tree
get_field_PyObject_ob_type () const
{
gcc_assert (m_field_PyObject_ob_type);
return m_field_PyObject_ob_type;
}
// Get POINTER_TYPE for "PyObject *"
tree
get_type_PyObject_ptr () const
{
gcc_assert (m_type_PyObject_ptr);
return m_type_PyObject_ptr;
}
// Get POINTER_TYPE for "PyObject **"
tree
get_type_PyObject_ptr_ptr () const
{
gcc_assert (m_type_PyObject_ptr_ptr);
return m_type_PyObject_ptr_ptr;
}
// Get RECORD_TYPE for CPython's PyVarObject
tree
get_type_PyVarObject () const
{
gcc_assert (m_type_PyVarObject);
return m_type_PyVarObject;
}
// Get FIELD_DECL for PyVarObject's "ob_size"
tree
get_field_PyVarObject_ob_size () const
{
gcc_assert (m_field_PyVarObject_ob_size);
return m_field_PyVarObject_ob_size;
}
// Get RECORD_TYPE for CPython's PyListObject
tree
get_type_PyListObject () const
{
gcc_assert (m_type_PyListObject);
return m_type_PyListObject;
}
// Get RECORD_TYPE for CPython's PyLongObject
tree
get_type_PyLongObject () const
{
gcc_assert (m_type_PyLongObject);
return m_type_PyLongObject;
}
// Get FIELD_DECL for PyListObject's "ob_item"
tree
get_field_PyListObject_ob_item () const
{
gcc_assert (m_field_PyListObject_ob_item);
return m_field_PyListObject_ob_item;
}
// Get VAR_DECL for CPython's global var PyList_Type
tree
get_vardecl_PyList_Type () const
{
gcc_assert (m_vardecl_PyList_Type);
return m_vardecl_PyList_Type;
}
// Get VAR_DECL for CPython's global var PyLong_Type
tree
get_vardecl_PyLong_Type () const
{
gcc_assert (m_vardecl_PyLong_Type);
return m_vardecl_PyLong_Type;
}
/* Get an ana::svalue for "sizeof (PyObject *)". */
const svalue *
get_sval_sizeof_PyObject_ptr (region_model_manager *mgr) const
{
tree size_tree = TYPE_SIZE_UNIT (m_type_PyObject_ptr);
const svalue *sizeof_sval = mgr->get_or_create_constant_svalue (size_tree);
return sizeof_sval;
}
/* Update MODEL to set OB_BASE_REGION's ob_refcnt to 1. */
void
init_ob_refcnt_field (region_model *model,
const region *ob_base_region,
region_model_context *ctxt)
{
region_model_manager *mgr = model->get_manager ();
tree ob_refcnt_tree = get_field_PyObject_ob_refcnt ();
const region *ob_refcnt_region
= mgr->get_field_region (ob_base_region, ob_refcnt_tree);
const svalue *refcnt_one_sval
const svalue *refcnt_one_sval
= mgr->get_or_create_int_cst (size_type_node, 1);
model->set_value (ob_refcnt_region, refcnt_one_sval, cd.get_ctxt ());
}
model->set_value (ob_refcnt_region, refcnt_one_sval, ctxt);
}
/* Update MODEL to set OB_BASE_REGION's ob_type to point to
PYTYPE_VAR_DECL_PTR. */
static void
set_ob_type_field (region_model_manager *mgr, region_model *model,
const region *ob_base_region, tree pyobj_record,
tree pytype_var_decl_ptr, const call_details &cd)
{
const region *pylist_type_region
/* Update MODEL to set OB_BASE_REGION's ob_type to point to
PYTYPE_VAR_DECL_PTR. */
void
set_ob_type_field (region_model *model,
const region *ob_base_region,
tree pytype_var_decl_ptr,
region_model_context *ctxt)
{
region_model_manager *mgr = model->get_manager ();
const region *pylist_type_region
= mgr->get_region_for_global (pytype_var_decl_ptr);
tree pytype_var_decl_ptr_type
tree pytype_var_decl_ptr_type
= build_pointer_type (TREE_TYPE (pytype_var_decl_ptr));
const svalue *pylist_type_ptr_sval
const svalue *pylist_type_ptr_sval
= mgr->get_ptr_svalue (pytype_var_decl_ptr_type, pylist_type_region);
tree ob_type_field = get_field_by_name (pyobj_record, "ob_type");
const region *ob_type_region
tree ob_type_field = get_field_PyObject_ob_type ();
const region *ob_type_region
= mgr->get_field_region (ob_base_region, ob_type_field);
model->set_value (ob_type_region, pylist_type_ptr_sval, cd.get_ctxt ());
}
model->set_value (ob_type_region, pylist_type_ptr_sval, ctxt);
}
/* Initialize OB_BASE_REGION as a PyObject_HEAD
i.e. set "ob_refcnt" to 1 and "ob_type" to PYTYPE_VAR_DECL_PTR. */
void
init_PyObject_HEAD (region_model *model,
const region *ob_base_region,
tree pytype_var_decl_ptr,
region_model_context *ctxt)
{
// Initialize ob_refcnt field to 1.
init_ob_refcnt_field (model, ob_base_region, ctxt);
/* Get pointer svalue for PYTYPE_VAR_DECL_PTR then
assign it to ob_type field. */
set_ob_type_field (model, ob_base_region,
pytype_var_decl_ptr, ctxt);
}
// Get subregion for ob_refcnt within a PyObject instance
const region *
get_region_PyObject_ob_refcnt (region_model_manager *mgr,
const region *pyobject_instance_reg)
{
const region *ob_refcnt_region
= mgr->get_field_region (pyobject_instance_reg,
m_field_PyObject_ob_refcnt);
return ob_refcnt_region;
}
void
do_Py_INCREF (region_model *model,
const region *pyobject_instance_reg,
region_model_context *ctxt)
{
region_model_manager *mgr = model->get_manager ();
const region *ob_refcnt_region
= get_region_PyObject_ob_refcnt (mgr, pyobject_instance_reg);
inc_field_val (model, ob_refcnt_region, size_type_node, ctxt);
}
// Get subregion for ob_size within a PyVarObject instance
const region *
get_region_PyVarObject_ob_size (region_model_manager *mgr,
const region *pyvarobject_instance_reg)
{
const region *ob_size_region
= mgr->get_field_region (pyvarobject_instance_reg,
m_field_PyVarObject_ob_size);
return ob_size_region;
}
/* Attempt to get the subregion for the "ob_digit" field within
PYLONG_REGION, a PyLongObject instance. */
const region *
get_region_PyLongObject_ob_digit (region_model_manager *mgr,
const region *pylong_region)
{
if (tree ob_digit_field
= get_field_by_name (m_type_PyLongObject, "ob_digit", false))
{
const region *ob_digit_region
= mgr->get_field_region (pylong_region, ob_digit_field);
return ob_digit_region;
}
/* TODO: https://github.com/python/cpython/pull/101292
moved "ob_digit" from struct _longobject to a new field long_value
of a new struct _PyLongValue. */
return nullptr;
}
/* Increment the value of FIELD_REGION in the MODEL by 1. Optionally
capture the old and new svalues if OLD_SVAL and NEW_SVAL pointers are
provided. */
void
inc_field_val (region_model *model,
const region *field_region, const tree type_node,
region_model_context *ctxt,
const svalue **old_sval = nullptr,
const svalue **new_sval = nullptr)
{
region_model_manager *mgr = model->get_manager ();
const svalue *tmp_old_sval = model->get_store_value (field_region, ctxt);
const svalue *one_sval = mgr->get_or_create_int_cst (type_node, 1);
const svalue *tmp_new_sval
= mgr->get_or_create_binop (type_node, PLUS_EXPR, tmp_old_sval, one_sval);
model->set_value (field_region, tmp_new_sval, ctxt);
if (old_sval)
*old_sval = tmp_old_sval;
if (new_sval)
*new_sval = tmp_new_sval;
}
private:
// PyObject
tree m_type_PyObject; // RECORD_TYPE
tree m_field_PyObject_ob_refcnt; // FIELD_DECL
tree m_field_PyObject_ob_type; // FIELD_DECL
// POINTER_TYPE for "PyObject *"
tree m_type_PyObject_ptr;
// POINTER_TYPE for "PyObject **"
tree m_type_PyObject_ptr_ptr;
// PyVarObject
tree m_type_PyVarObject; // RECORD_TYPE
tree m_field_PyVarObject_ob_size; // FIELD_DECL
// PyListObject
tree m_type_PyListObject; // RECORD_TYPE
tree m_field_PyListObject_ob_item; // FIELD_DECL
tree m_vardecl_PyList_Type; // VAR_DECL for CPython's global var PyList_Type
// PyLongObject
tree m_type_PyLongObject; // RECORD_TYPE
tree m_vardecl_PyLong_Type; // VAR_DECL for CPython's global var PyLong_Type
} api;
/* Retrieve the "ob_base" field's region from OBJECT_RECORD within
NEW_OBJECT_REGION and set its value in the MODEL to PYOBJ_SVALUE. */
@@ -132,7 +374,7 @@ get_ob_base_region (region_model_manager *mgr, region_model *model,
return ob_base_region;
}
/* Initialize and retrieve a region within the MODEL for a PyObject
/* Initialize and retrieve a region within the MODEL for a PyObject
and set its value to OBJECT_SVALUE. */
static const region *
init_pyobject_region (region_model_manager *mgr, region_model *model,
@@ -144,30 +386,6 @@ init_pyobject_region (region_model_manager *mgr, region_model *model,
return pyobject_region;
}
/* Increment the value of FIELD_REGION in the MODEL by 1. Optionally
capture the old and new svalues if OLD_SVAL and NEW_SVAL pointers are
provided. */
static void
inc_field_val (region_model_manager *mgr, region_model *model,
const region *field_region, const tree type_node,
const call_details &cd, const svalue **old_sval = nullptr,
const svalue **new_sval = nullptr)
{
const svalue *tmp_old_sval
= model->get_store_value (field_region, cd.get_ctxt ());
const svalue *one_sval = mgr->get_or_create_int_cst (type_node, 1);
const svalue *tmp_new_sval = mgr->get_or_create_binop (
type_node, PLUS_EXPR, tmp_old_sval, one_sval);
model->set_value (field_region, tmp_new_sval, cd.get_ctxt ());
if (old_sval)
*old_sval = tmp_old_sval;
if (new_sval)
*new_sval = tmp_new_sval;
}
class pyobj_init_fail : public failed_call_info
{
public:
@@ -194,11 +412,17 @@ class refcnt_mismatch : public pending_diagnostic_subclass<refcnt_mismatch>
{
public:
refcnt_mismatch (const region *base_region,
const svalue *ob_refcnt,
const svalue *actual_refcnt,
tree reg_tree)
: m_base_region (base_region), m_ob_refcnt (ob_refcnt),
m_actual_refcnt (actual_refcnt), m_reg_tree(reg_tree)
const svalue *ob_refcnt,
const svalue *actual_refcnt,
tree reg_tree,
tree expected_refcnt_tree,
tree actual_refcnt_tree)
: m_base_region (base_region),
m_ob_refcnt (ob_refcnt),
m_actual_refcnt (actual_refcnt),
m_reg_tree (reg_tree),
m_expected_refcnt_tree (expected_refcnt_tree),
m_actual_refcnt_tree (actual_refcnt_tree)
{
}
@@ -225,16 +449,11 @@ public:
emit (diagnostic_emission_context &ctxt) final override
{
bool warned;
// just assuming constants for now
auto actual_refcnt
= m_actual_refcnt->dyn_cast_constant_svalue ()->get_constant ();
auto ob_refcnt = m_ob_refcnt->dyn_cast_constant_svalue ()->get_constant ();
warned = ctxt.warn ("expected %qE to have "
"reference count: %qE but ob_refcnt field is: %qE",
m_reg_tree, actual_refcnt, ob_refcnt);
// location_t loc = rich_loc->get_loc ();
// foo (loc);
m_reg_tree,
m_actual_refcnt_tree,
m_expected_refcnt_tree);
return warned;
}
@@ -245,15 +464,12 @@ public:
}
private:
void foo(location_t loc) const
{
inform(loc, "something is up right here");
}
const region *m_base_region;
const svalue *m_ob_refcnt;
const svalue *m_actual_refcnt;
tree m_reg_tree;
tree m_expected_refcnt_tree;
tree m_actual_refcnt_tree;
};
/* Retrieves the svalue associated with the ob_refcnt field of the base region.
@@ -263,7 +479,7 @@ retrieve_ob_refcnt_sval (const region *base_reg, const region_model *model,
region_model_context *ctxt)
{
region_model_manager *mgr = model->get_manager ();
tree ob_refcnt_tree = get_field_by_name (pyobj_record, "ob_refcnt");
tree ob_refcnt_tree = api.get_field_PyObject_ob_refcnt ();
const region *ob_refcnt_region
= mgr->get_field_region (base_reg, ob_refcnt_tree);
const svalue *ob_refcnt_sval
@@ -310,7 +526,7 @@ count_pyobj_references (const region_model *model,
seen.add (pyobj_region);
if (pyobj_ptr_sval->get_type () == pyobj_ptr_tree)
if (pyobj_ptr_sval->get_type () == api.get_type_PyObject_ptr ())
increment_region_refcnt (region_to_refcnt, pyobj_region);
const auto *curr_store = model->get_store ();
@@ -347,18 +563,30 @@ check_refcnt (const region_model *model,
const svalue *actual_refcnt_sval = mgr->get_or_create_int_cst (
ob_refcnt_sval->get_type (), actual_refcnt);
if (ob_refcnt_sval != actual_refcnt_sval)
if (ob_refcnt_sval != actual_refcnt_sval
&& ob_refcnt_sval->get_kind () != SK_UNKNOWN
&& actual_refcnt_sval->get_kind () != SK_UNKNOWN)
{
const svalue *curr_reg_sval
= mgr->get_ptr_svalue (pyobj_ptr_tree, curr_region);
= mgr->get_ptr_svalue (api.get_type_PyObject_ptr (), curr_region);
tree reg_tree = old_model->get_representative_tree (curr_reg_sval);
if (!reg_tree)
return;
tree expected_refcnt_tree
= old_model->get_representative_tree (ob_refcnt_sval);
if (!expected_refcnt_tree)
return;
tree actual_refcnt_tree
= old_model->get_representative_tree (actual_refcnt_sval);
if (!actual_refcnt_tree)
return;
const auto &eg = ctxt->get_eg ();
auto pd = std::make_unique<refcnt_mismatch> (curr_region, ob_refcnt_sval,
actual_refcnt_sval,
reg_tree);
reg_tree,
expected_refcnt_tree,
actual_refcnt_tree);
if (pd && eg)
ctxt->warn (std::move (pd),
make_ploc_fixer_for_epath_for_leak_diagnostic (*eg,
@@ -417,7 +645,7 @@ count_all_references (const region_model *model,
const svalue *unwrapped_sval
= binding_sval->unwrap_any_unmergeable ();
// if (unwrapped_sval->get_type () != pyobj_ptr_tree)
// if (unwrapped_sval->get_type () != api.m_type_PyObject_ptr)
// continue;
const region *pointee = unwrapped_sval->maybe_get_region ();
@@ -520,15 +748,15 @@ kf_PyList_Append::impl_call_pre (const call_details &cd) const
return;
// PyList_Check
tree ob_type_field = get_field_by_name (pyobj_record, "ob_type");
tree ob_type_field = api.get_field_PyObject_ob_type ();
const region *ob_type_region
= mgr->get_field_region (pylist_reg, ob_type_field);
const svalue *stored_sval
= model->get_store_value (ob_type_region, cd.get_ctxt ());
const region *pylist_type_region
= mgr->get_region_for_global (pylisttype_vardecl);
= mgr->get_region_for_global (api.get_vardecl_PyList_Type ());
tree pylisttype_vardecl_ptr
= build_pointer_type (TREE_TYPE (pylisttype_vardecl));
= build_pointer_type (TREE_TYPE (api.get_vardecl_PyList_Type ()));
const svalue *pylist_type_ptr
= mgr->get_ptr_svalue (pylisttype_vardecl_ptr, pylist_type_region);
@@ -578,7 +806,7 @@ kf_PyList_Append::impl_call_post (const call_details &cd) const
pylist_sval, cd.get_arg_tree (0), cd.get_ctxt ());
/* Identify ob_item field and set it to NULL. */
tree ob_item_field = get_field_by_name (pylistobj_record, "ob_item");
tree ob_item_field = api.get_field_PyListObject_ob_item ();
const region *ob_item_reg
= mgr->get_field_region (pylist_reg, ob_item_field);
const svalue *old_ptr_sval
@@ -592,7 +820,7 @@ kf_PyList_Append::impl_call_post (const call_details &cd) const
model->unset_dynamic_extents (freed_reg);
}
const svalue *null_sval = mgr->get_or_create_null_ptr (pyobj_ptr_ptr);
const svalue *null_sval = mgr->get_or_create_null_ptr (api.get_type_PyObject_ptr_ptr ());
model->set_value (ob_item_reg, null_sval, cd.get_ctxt ());
if (cd.get_lhs_type ())
@@ -633,20 +861,19 @@ kf_PyList_Append::impl_call_post (const call_details &cd) const
const region *newitem_reg = model->deref_rvalue (
newitem_sval, cd.get_arg_tree (1), cd.get_ctxt ());
tree ob_size_field = get_field_by_name (varobj_record, "ob_size");
const region *ob_size_region
= mgr->get_field_region (pylist_reg, ob_size_field);
= api.get_region_PyVarObject_ob_size (mgr, pylist_reg);
const svalue *ob_size_sval = nullptr;
const svalue *new_size_sval = nullptr;
inc_field_val (mgr, model, ob_size_region, integer_type_node, cd,
&ob_size_sval, &new_size_sval);
api.inc_field_val (model, ob_size_region, integer_type_node, ctxt,
&ob_size_sval, &new_size_sval);
const svalue *sizeof_sval = mgr->get_or_create_cast (
ob_size_sval->get_type (), get_sizeof_pyobjptr (mgr));
ob_size_sval->get_type (), api.get_sval_sizeof_PyObject_ptr (mgr));
const svalue *num_allocated_bytes = mgr->get_or_create_binop (
size_type_node, MULT_EXPR, sizeof_sval, new_size_sval);
tree ob_item_field = get_field_by_name (pylistobj_record, "ob_item");
tree ob_item_field = api.get_field_PyListObject_ob_item ();
const region *ob_item_region
= mgr->get_field_region (pylist_reg, ob_item_field);
const svalue *ob_item_ptr_sval
@@ -655,7 +882,7 @@ kf_PyList_Append::impl_call_post (const call_details &cd) const
/* We can only grow in place with a non-NULL pointer and no unknown
*/
{
const svalue *null_ptr = mgr->get_or_create_null_ptr (pyobj_ptr_ptr);
const svalue *null_ptr = mgr->get_or_create_null_ptr (api.get_type_PyObject_ptr_ptr ());
if (!model->add_constraint (ob_item_ptr_sval, NE_EXPR, null_ptr,
cd.get_ctxt ()))
{
@@ -696,14 +923,11 @@ kf_PyList_Append::impl_call_post (const call_details &cd) const
const svalue *offset_sval = mgr->get_or_create_binop (
size_type_node, MULT_EXPR, sizeof_sval, ob_size_sval);
const region *element_region
= mgr->get_offset_region (curr_reg, pyobj_ptr_ptr, offset_sval);
= mgr->get_offset_region (curr_reg, api.get_type_PyObject_ptr_ptr (), offset_sval);
model->set_value (element_region, newitem_sval, cd.get_ctxt ());
// Increment ob_refcnt of appended item.
tree ob_refcnt_tree = get_field_by_name (pyobj_record, "ob_refcnt");
const region *ob_refcnt_region
= mgr->get_field_region (newitem_reg, ob_refcnt_tree);
inc_field_val (mgr, model, ob_refcnt_region, size_type_node, cd);
api.do_Py_INCREF (model, newitem_reg, ctxt);
if (cd.get_lhs_type ())
{
@@ -742,20 +966,19 @@ kf_PyList_Append::impl_call_post (const call_details &cd) const
const region *newitem_reg = model->deref_rvalue (
newitem_sval, cd.get_arg_tree (1), cd.get_ctxt ());
tree ob_size_field = get_field_by_name (varobj_record, "ob_size");
const region *ob_size_region
= mgr->get_field_region (pylist_reg, ob_size_field);
= api.get_region_PyVarObject_ob_size (mgr, pylist_reg);
const svalue *old_ob_size_sval = nullptr;
const svalue *new_ob_size_sval = nullptr;
inc_field_val (mgr, model, ob_size_region, integer_type_node, cd,
&old_ob_size_sval, &new_ob_size_sval);
api.inc_field_val (model, ob_size_region, integer_type_node, ctxt,
&old_ob_size_sval, &new_ob_size_sval);
const svalue *sizeof_sval = mgr->get_or_create_cast (
old_ob_size_sval->get_type (), get_sizeof_pyobjptr (mgr));
old_ob_size_sval->get_type (), api.get_sval_sizeof_PyObject_ptr (mgr));
const svalue *new_size_sval = mgr->get_or_create_binop (
size_type_node, MULT_EXPR, sizeof_sval, new_ob_size_sval);
tree ob_item_field = get_field_by_name (pylistobj_record, "ob_item");
tree ob_item_field = api.get_field_PyListObject_ob_item ();
const region *ob_item_reg
= mgr->get_field_region (pylist_reg, ob_item_field);
const svalue *old_ptr_sval
@@ -765,7 +988,7 @@ kf_PyList_Append::impl_call_post (const call_details &cd) const
const region *new_reg = model->get_or_create_region_for_heap_alloc (
new_size_sval, cd.get_ctxt ());
const svalue *new_ptr_sval
= mgr->get_ptr_svalue (pyobj_ptr_ptr, new_reg);
= mgr->get_ptr_svalue (api.get_type_PyObject_ptr_ptr (), new_reg);
if (!model->add_constraint (new_ptr_sval, NE_EXPR, old_ptr_sval,
cd.get_ctxt ()))
return false;
@@ -780,11 +1003,11 @@ kf_PyList_Append::impl_call_post (const call_details &cd) const
const svalue *copied_size_sval
= get_copied_size (model, old_size_sval, new_size_sval);
const region *copied_old_reg = mgr->get_sized_region (
freed_reg, pyobj_ptr_ptr, copied_size_sval);
freed_reg, api.get_type_PyObject_ptr_ptr (), copied_size_sval);
const svalue *buffer_content_sval
= model->get_store_value (copied_old_reg, cd.get_ctxt ());
const region *copied_new_reg = mgr->get_sized_region (
new_reg, pyobj_ptr_ptr, copied_size_sval);
new_reg, api.get_type_PyObject_ptr_ptr (), copied_size_sval);
model->set_value (copied_new_reg, buffer_content_sval,
cd.get_ctxt ());
}
@@ -797,7 +1020,7 @@ kf_PyList_Append::impl_call_post (const call_details &cd) const
model->unset_dynamic_extents (freed_reg);
}
const svalue *null_ptr = mgr->get_or_create_null_ptr (pyobj_ptr_ptr);
const svalue *null_ptr = mgr->get_or_create_null_ptr (api.get_type_PyObject_ptr_ptr ());
if (!model->add_constraint (new_ptr_sval, NE_EXPR, null_ptr,
cd.get_ctxt ()))
return false;
@@ -808,14 +1031,11 @@ kf_PyList_Append::impl_call_post (const call_details &cd) const
const svalue *offset_sval = mgr->get_or_create_binop (
size_type_node, MULT_EXPR, sizeof_sval, old_ob_size_sval);
const region *element_region
= mgr->get_offset_region (new_reg, pyobj_ptr_ptr, offset_sval);
= mgr->get_offset_region (new_reg, api.get_type_PyObject_ptr_ptr (), offset_sval);
model->set_value (element_region, newitem_sval, cd.get_ctxt ());
// Increment ob_refcnt of appended item.
tree ob_refcnt_tree = get_field_by_name (pyobj_record, "ob_refcnt");
const region *ob_refcnt_region
= mgr->get_field_region (newitem_reg, ob_refcnt_tree);
inc_field_val (mgr, model, ob_refcnt_region, size_type_node, cd);
api.do_Py_INCREF (model, newitem_reg, ctxt);
if (cd.get_lhs_type ())
{
@@ -885,11 +1105,11 @@ kf_PyList_New::impl_call_post (const call_details &cd) const
region_model_manager *mgr = cd.get_manager ();
const svalue *pyobj_svalue
= mgr->get_or_create_unknown_svalue (pyobj_record);
= mgr->get_or_create_unknown_svalue (api.get_type_PyObject ());
const svalue *varobj_svalue
= mgr->get_or_create_unknown_svalue (varobj_record);
= mgr->get_or_create_unknown_svalue (api.get_type_PyVarObject ());
const svalue *pylist_svalue
= mgr->get_or_create_unknown_svalue (pylistobj_record);
= mgr->get_or_create_unknown_svalue (api.get_type_PyListObject ());
const svalue *size_sval = cd.get_arg_svalue (0);
@@ -904,12 +1124,13 @@ kf_PyList_New::impl_call_post (const call_details &cd) const
Py_ssize_t allocated;
} PyListObject;
*/
tree varobj_field = get_field_by_name (pylistobj_record, "ob_base");
tree varobj_field = get_field_by_name (api.get_type_PyListObject (),
"ob_base");
const region *varobj_region
= mgr->get_field_region (pylist_region, varobj_field);
model->set_value (varobj_region, varobj_svalue, cd.get_ctxt ());
tree ob_item_field = get_field_by_name (pylistobj_record, "ob_item");
tree ob_item_field = api.get_field_PyListObject_ob_item ();
const region *ob_item_region
= mgr->get_field_region (pylist_region, ob_item_field);
@@ -925,13 +1146,13 @@ kf_PyList_New::impl_call_post (const call_details &cd) const
integer_one_node))
{
const svalue *null_sval
= mgr->get_or_create_null_ptr (pyobj_ptr_ptr);
= mgr->get_or_create_null_ptr (api.get_type_PyObject_ptr_ptr ());
model->set_value (ob_item_region, null_sval, cd.get_ctxt ());
}
else // calloc
{
const svalue *sizeof_sval = mgr->get_or_create_cast (
size_sval->get_type (), get_sizeof_pyobjptr (mgr));
size_sval->get_type (), api.get_sval_sizeof_PyObject_ptr (mgr));
const svalue *prod_sval = mgr->get_or_create_binop (
size_type_node, MULT_EXPR, sizeof_sval, size_sval);
const region *ob_item_sized_region
@@ -939,7 +1160,8 @@ kf_PyList_New::impl_call_post (const call_details &cd) const
cd.get_ctxt ());
model->zero_fill_region (ob_item_sized_region, cd.get_ctxt ());
const svalue *ob_item_ptr_sval
= mgr->get_ptr_svalue (pyobj_ptr_ptr, ob_item_sized_region);
= mgr->get_ptr_svalue (api.get_type_PyObject_ptr_ptr (),
ob_item_sized_region);
const svalue *ob_item_unmergeable
= mgr->get_or_create_unmergeable (ob_item_ptr_sval);
model->set_value (ob_item_region, ob_item_unmergeable,
@@ -952,27 +1174,17 @@ kf_PyList_New::impl_call_post (const call_details &cd) const
Py_ssize_t ob_size; // Number of items in variable part
} PyVarObject;
*/
const region *ob_base_region = get_ob_base_region (
mgr, model, varobj_region, varobj_record, pyobj_svalue, cd);
const region *ob_base_region
= get_ob_base_region (mgr, model, varobj_region,
api.get_type_PyVarObject (),
pyobj_svalue, cd);
tree ob_size_tree = get_field_by_name (varobj_record, "ob_size");
const region *ob_size_region
= mgr->get_field_region (varobj_region, ob_size_tree);
= api.get_region_PyVarObject_ob_size (mgr, varobj_region);
model->set_value (ob_size_region, size_sval, cd.get_ctxt ());
/*
typedef struct _object {
_PyObject_HEAD_EXTRA
Py_ssize_t ob_refcnt;
PyTypeObject *ob_type;
} PyObject;
*/
// Initialize ob_refcnt field to 1.
init_ob_refcnt_field(mgr, model, ob_base_region, pyobj_record, cd);
// Get pointer svalue for PyList_Type then assign it to ob_type field.
set_ob_type_field(mgr, model, ob_base_region, pyobj_record, pylisttype_vardecl, cd);
api.init_PyObject_HEAD (model, ob_base_region,
api.get_vardecl_PyList_Type (), ctxt);
if (cd.get_lhs_type ())
{
@@ -1019,29 +1231,27 @@ kf_PyLong_FromLong::impl_call_post (const call_details &cd) const
region_model_manager *mgr = cd.get_manager ();
const svalue *pyobj_svalue
= mgr->get_or_create_unknown_svalue (pyobj_record);
= mgr->get_or_create_unknown_svalue (api.get_type_PyObject ());
const svalue *pylongobj_sval
= mgr->get_or_create_unknown_svalue (pylongobj_record);
= mgr->get_or_create_unknown_svalue (api.get_type_PyLongObject ());
const region *pylong_region
= init_pyobject_region (mgr, model, pylongobj_sval, cd);
// Create a region for the base PyObject within the PyLongObject.
const region *ob_base_region = get_ob_base_region (
mgr, model, pylong_region, pylongobj_record, pyobj_svalue, cd);
mgr, model, pylong_region, api.get_type_PyLongObject (), pyobj_svalue, cd);
// Initialize ob_refcnt field to 1.
init_ob_refcnt_field(mgr, model, ob_base_region, pyobj_record, cd);
// Get pointer svalue for PyLong_Type then assign it to ob_type field.
set_ob_type_field(mgr, model, ob_base_region, pyobj_record, pylongtype_vardecl, cd);
api.init_PyObject_HEAD (model, ob_base_region,
api.get_vardecl_PyLong_Type (), ctxt);
// Set the PyLongObject value.
tree ob_digit_field = get_field_by_name (pylongobj_record, "ob_digit");
const region *ob_digit_region
= mgr->get_field_region (pylong_region, ob_digit_field);
const svalue *ob_digit_sval = cd.get_arg_svalue (0);
model->set_value (ob_digit_region, ob_digit_sval, cd.get_ctxt ());
if (const region *ob_digit_region
= api.get_region_PyLongObject_ob_digit (mgr, pylong_region))
{
const svalue *ob_digit_sval = cd.get_arg_svalue (0);
model->set_value (ob_digit_region, ob_digit_sval, cd.get_ctxt ());
}
if (cd.get_lhs_type ())
{
@@ -1138,6 +1348,7 @@ get_stashed_type_by_name (const char *name)
gcc_assert (TREE_CODE (*slot) == RECORD_TYPE);
return *slot;
}
inform (UNKNOWN_LOCATION, "could not find CPython type %qs", name);
return NULL_TREE;
}
@@ -1152,24 +1363,72 @@ get_stashed_global_var_by_name (const char *name)
gcc_assert (TREE_CODE (*slot) == VAR_DECL);
return *slot;
}
inform (UNKNOWN_LOCATION, "could not find CPython global %qs", name);
return NULL_TREE;
}
static void
init_py_structs ()
{
pyobj_record = get_stashed_type_by_name ("PyObject");
varobj_record = get_stashed_type_by_name ("PyVarObject");
pylistobj_record = get_stashed_type_by_name ("PyListObject");
pylongobj_record = get_stashed_type_by_name ("PyLongObject");
pylongtype_vardecl = get_stashed_global_var_by_name ("PyLong_Type");
pylisttype_vardecl = get_stashed_global_var_by_name ("PyList_Type");
if (pyobj_record)
{
pyobj_ptr_tree = build_pointer_type (pyobj_record);
pyobj_ptr_ptr = build_pointer_type (pyobj_ptr_tree);
}
/* Attempt to find the various types for the CPython API, which
we hope were stashed when the frontend ran.
Return true if we found enough to run the plugin,
false if it doesn't make sense to run it. */
bool
api::init_from_stashed_types ()
{
memset (this, 0, sizeof (*this));
m_type_PyObject = get_stashed_type_by_name ("PyObject");
if (!m_type_PyObject)
return false;
gcc_assert (TREE_CODE (m_type_PyObject) == RECORD_TYPE);
m_field_PyObject_ob_refcnt
= get_field_by_name (m_type_PyObject, "ob_refcnt");
if (!m_field_PyObject_ob_refcnt)
return false;
m_field_PyObject_ob_type
= get_field_by_name (m_type_PyObject, "ob_type");
if (!m_field_PyObject_ob_type)
return false;
/* PyVarObject. */
m_type_PyVarObject = get_stashed_type_by_name ("PyVarObject");
if (!m_type_PyVarObject)
return false;
gcc_assert (TREE_CODE (m_type_PyVarObject) == RECORD_TYPE);
m_field_PyVarObject_ob_size
= get_field_by_name (m_type_PyVarObject, "ob_size");
if (!m_field_PyVarObject_ob_size)
return false;
/* PyListObject. */
m_type_PyListObject = get_stashed_type_by_name ("PyListObject");
if (!m_type_PyListObject)
return false;
m_field_PyListObject_ob_item
= get_field_by_name (m_type_PyListObject, "ob_item");
if (!m_field_PyListObject_ob_item)
return false;
m_vardecl_PyList_Type = get_stashed_global_var_by_name ("PyList_Type");
if (!m_vardecl_PyList_Type)
return false;
/* PyLongObject. */
m_type_PyLongObject = get_stashed_type_by_name ("PyLongObject");
if (!m_type_PyLongObject)
return false;
m_vardecl_PyLong_Type = get_stashed_global_var_by_name ("PyLong_Type");
if (!m_vardecl_PyLong_Type)
return false;
m_type_PyObject_ptr = build_pointer_type (m_type_PyObject);
m_type_PyObject_ptr_ptr = build_pointer_type (m_type_PyObject_ptr);
return true;
}
void
@@ -1182,9 +1441,14 @@ sorry_no_cpython_plugin ()
namespace analyzer_events = ::gcc::topics::analyzer_events;
class cpython_analyzer_events_subscriber : public analyzer_events::subscriber
class analyzer_events_subscriber : public analyzer_events::subscriber
{
public:
analyzer_events_subscriber ()
: m_init_failed (false)
{
}
void
on_message (const analyzer_events::on_tu_finished &msg) final override
{
@@ -1198,13 +1462,13 @@ public:
{
LOG_SCOPE (m.get_logger ());
init_py_structs ();
if (pyobj_record == NULL_TREE)
if (!api.init_from_stashed_types ())
{
sorry_no_cpython_plugin ();
m_init_failed = true;
return;
}
gcc_assert (api.get_type_PyObject ());
m.register_known_function ("PyList_Append",
std::make_unique<kf_PyList_Append> ());
@@ -1220,13 +1484,19 @@ public:
void
on_message (const analyzer_events::on_frame_popped &msg) final override
{
if (m_init_failed)
return;
pyobj_refcnt_checker (msg.m_new_model,
msg.m_old_model,
msg.m_retval,
msg.m_ctxt);
}
} cpython_sub;
private:
bool m_init_failed;
} event_sub;
} // namespace ana::cpython_plugin
} // namespace ana
#endif /* #if ENABLE_ANALYZER */
@@ -1239,7 +1509,8 @@ plugin_init (struct plugin_name_args *plugin_info,
const char *plugin_name = plugin_info->base_name;
if (0)
inform (input_location, "got here; %qs", plugin_name);
g->get_channels ().analyzer_events_channel.add_subscriber (ana::cpython_sub);
g->get_channels ().analyzer_events_channel.add_subscriber
(ana::cpython_plugin::event_sub);
#else
sorry_no_analyzer ();
#endif