Draft
81 commits
9a4fb5c
Python: Add self-validating CFG tests
tausbn Apr 16, 2026
cc471fd
Python: Add some CFG-validation queries
tausbn Apr 16, 2026
661fd31
Python: Add BasicBlockOrdering test
tausbn Apr 16, 2026
6d829d6
Python: Add NeverReachable test
tausbn Apr 16, 2026
df6d0ca
Python: Add ConsecutiveTimestamps test
tausbn Apr 16, 2026
019e6f2
Python: Make CFG tests parameterised
tausbn Apr 20, 2026
30f28ba
Python: First stab at shared control-flow
tausbn Apr 16, 2026
b5df188
Python: Use fields everywhere in new AST classes
tausbn Apr 20, 2026
6b3a790
Python: Instantiate CFG module fully
tausbn Apr 20, 2026
e66bf87
Python: Instantiate CFG tests with new CFG library
tausbn Apr 20, 2026
4583244
Python: More AstNodeImpl improvements
tausbn Apr 21, 2026
f89a773
Python: Ignore synthetic CFG nodes
tausbn Apr 21, 2026
4e3a633
Python: Support various literals
tausbn Apr 21, 2026
b8bc230
Python: Assert statements
tausbn Apr 21, 2026
4336b07
Python: Function calls
tausbn Apr 21, 2026
999b8f2
Python: Attributes
tausbn Apr 21, 2026
da408d7
Python: assignments
tausbn Apr 21, 2026
cc09df2
Python: More simple statements
tausbn Apr 21, 2026
9f93d6c
Python: Add `with`
tausbn Apr 21, 2026
5b1de9e
Python: Comprehensions
tausbn Apr 21, 2026
d83d943
Python: More nodes
tausbn Apr 21, 2026
146a3a9
Python: Support `match`
tausbn Apr 21, 2026
cc77f0b
Python: Fix match
tausbn Apr 21, 2026
aaf9cc5
Python: Fix exception issue
tausbn Apr 21, 2026
655f84e
Python: Handle dict unpacking in calls
tausbn Apr 21, 2026
8d814e1
WIP
tausbn Apr 21, 2026
498aece
WIP2
tausbn Apr 28, 2026
41b5589
Cleanup, printCFG
tausbn Apr 28, 2026
768bdb5
Python: add pattern nodes
yoff May 4, 2026
5746ed7
python: add consistency checks
yoff May 4, 2026
7912e1b
Python: collapse two-layer AstNodeImpl into a single Ast module
May 5, 2026
2a04316
Python: compact-renumber FunctionExpr/Lambda defaults
May 5, 2026
c24e476
Shared CFG: support for-else and while-else loops
May 5, 2026
c398f92
Python: include try-else in getChild for completion propagation
May 5, 2026
414ebb9
Python: refactor getChild into per-class OO dispatch
May 5, 2026
3f6d099
Python: adapt to new shared CFG signature
May 5, 2026
a2d6d82
Python: dispatch toString/getLocation/getEnclosingCallable per branch
May 7, 2026
b3f87be
Python: merge T*AstNode wrappers into matching public classes
May 7, 2026
23e278e
Python: unify Py::BoolExpr handling via TBoolExprPair
May 7, 2026
b7fa080
Python: index TBlockStmt by Py::StmtList instead of (parent, slot)
May 7, 2026
e66f53d
Python: document why Assignment subclasses are empty
May 7, 2026
a66463d
Python: use private-abstract + final-alias pattern for AstNode
May 7, 2026
5e86f5b
Python: introduce TStmt union via newtype-branch alias
May 7, 2026
8d4fd93
Python: simplify TBlockStmt char pred via exclusion list
May 7, 2026
e9f14fc
Python: introduce TExpr union via newtype-branch alias
May 7, 2026
2c43ca9
Python: use newtype-branch constructors in characteristic predicates
May 7, 2026
9bdae5a
Python: project via as* helpers outside characteristic predicates
May 7, 2026
93bd4e3
Python: add CFG-binding gap tests (red)
Copilot May 12, 2026
bd20042
Python: wire AnnAssign into the shared CFG (green)
Copilot May 12, 2026
17c6d10
Python: wire parameters into the shared CFG (C# pattern)
Copilot May 12, 2026
8918067
Python: wire import-statement bindings into the shared CFG (green)
Copilot May 12, 2026
f17a662
Python: wire match-pattern bindings into the shared CFG (green)
Copilot May 12, 2026
3cf3420
Python: wire PEP 695 type parameters into the shared CFG (green)
yoff May 18, 2026
a249d58
Python: test dead bindings under no-raise CFG abstraction
yoff May 18, 2026
8f0e3c8
Python: introduce new-CFG facade
yoff May 18, 2026
780995c
Python: introduce shared-SSA adapter on the new CFG
yoff May 18, 2026
2257d1e
Python: fix augstore for the new CFG and add store/load test
yoff May 18, 2026
e74f8c9
Python: bring Cfg.qll's facade to API parity with Flow.qll
yoff May 18, 2026
47532ec
Python: qualify Flow.qll's AST references with Py:: prefix
yoff May 18, 2026
2741492
Python: extend new SSA with ESSA-shaped adapter + baseline comparison…
yoff May 18, 2026
ccd60a5
Python: SSA: handle closure variables via per-scope entry defs
yoff May 19, 2026
d308c88
Python: SSA adapter: add MultiAssignmentDefinition, definedBy, useOfDef
yoff May 19, 2026
f939854
Python: remove getAFlowNode() — bridge AST→CFG only via CFG-side getN…
yoff May 21, 2026
1037f52
Python: migrate dataflow library to new CFG + shared SSA
yoff May 26, 2026
8f6c246
Python: update dataflow tests for new CFG + shared SSA
yoff May 26, 2026
438092f
Python: model `from X import *` as uncertain SSA writes
yoff May 26, 2026
0a4ddf8
Python: treat augmented-assignment targets as both load and store
yoff May 26, 2026
6365b72
Python: drop legacy essa import from ImportResolution
yoff May 26, 2026
81b2f34
Python: adapt AstNodeImpl to upstream shared-CFG signature changes
yoff May 26, 2026
d5f1e09
Python: migrate remaining query-side files to new Cfg::
yoff May 26, 2026
2d997aa
Python: omit PEP 695 type-param names from FunctionDefExpr/ClassDefEx…
yoff May 26, 2026
b8c5e25
Python: migrate src queries to new shared CFG types + reformat
yoff May 26, 2026
ba0f24f
Python: canonicalize CFG nodes for dataflow
yoff May 28, 2026
04b130f
Python: fix library-test compile errors and rebless after CFG migration
yoff May 28, 2026
de77449
Python: migrate remaining tests off getAFlowNode() and fix star-impor…
yoff May 28, 2026
9696ee9
Python: migrate two more test queries off legacy CFG types
yoff May 28, 2026
cf28c32
Python: rebless toString churn from shared-CFG migration
yoff May 28, 2026
1bcaa56
Python: rebless second round after shared-CFG dataflow migration
yoff May 28, 2026
2e82990
Python: rebless CONSISTENCY queries + revert LongPath
yoff May 28, 2026
ef74ec1
Python: fold in evaluation-order review-comment fixes from main
yoff May 28, 2026
03c1f77
Python: revert spurious Py2 hidden-test rebless
yoff May 29, 2026
2 changes: 2 additions & 0 deletions python/ql/consistency-queries/CfgConsistency.ql
@@ -0,0 +1,2 @@
import semmle.python.controlflow.internal.AstNodeImpl
import ControlFlow::Consistency
11 changes: 6 additions & 5 deletions python/ql/consistency-queries/DataFlowConsistency.ql
@@ -9,6 +9,7 @@ private import semmle.python.dataflow.new.internal.DataFlowImplSpecific
private import semmle.python.dataflow.new.internal.DataFlowDispatch
private import semmle.python.dataflow.new.internal.TaintTrackingImplSpecific
private import codeql.dataflow.internal.DataFlowImplConsistency
private import semmle.python.controlflow.internal.Cfg as Cfg

private module Input implements InputSig<Location, PythonDataFlow> {
private import Private
@@ -72,7 +73,7 @@ private module Input implements InputSig<Location, PythonDataFlow> {
// resolve to multiple functions), but we only make _one_ ArgumentNode for each
// argument in the CallNode, we end up violating this consistency check in those
// cases. (see `getCallArg` in DataFlowDispatch.qll)
exists(DataFlowCall other, CallNode cfgCall | other != call |
exists(DataFlowCall other, Cfg::CallNode cfgCall | other != call |
call.getNode() = cfgCall and
other.getNode() = cfgCall and
isArgumentNode(arg, call, _) and
@@ -88,16 +89,16 @@ private module Input implements InputSig<Location, PythonDataFlow> {
// allow it instead.
(
call.getScope() = attr.getScope() and
any(CfgNode n | n.asCfgNode() = call.getNode().(CallNode).getFunction()).getALocalSource() =
attr
any(CfgNode n | n.asCfgNode() = call.getNode().(Cfg::CallNode).getFunction())
.getALocalSource() = attr
or
not exists(call.getScope().(Function).getDefinition()) and
call.getScope().getScope+() = attr.getScope()
) and
(
other.getScope() = attr.getScope() and
any(CfgNode n | n.asCfgNode() = other.getNode().(CallNode).getFunction()).getALocalSource() =
attr
any(CfgNode n | n.asCfgNode() = other.getNode().(Cfg::CallNode).getFunction())
.getALocalSource() = attr
or
not exists(other.getScope().(Function).getDefinition()) and
other.getScope().getScope+() = attr.getScope()
33 changes: 24 additions & 9 deletions python/ql/lib/LegacyPointsTo.qll
@@ -213,9 +213,11 @@ class ExprWithPointsTo extends Expr {
* Gets what this expression might "refer-to" in the given `context`.
*/
predicate refersTo(Context context, Object obj, ClassObject cls, AstNode origin) {
this.getAFlowNode()
.(ControlFlowNodeWithPointsTo)
.refersTo(context, obj, cls, origin.getAFlowNode())
exists(ControlFlowNode this_, ControlFlowNode origin_ |
this_.getNode() = this and origin_.getNode() = origin
|
this_.(ControlFlowNodeWithPointsTo).refersTo(context, obj, cls, origin_)
)
}

/**
@@ -226,7 +228,11 @@ class ExprWithPointsTo extends Expr {
*/
pragma[nomagic]
predicate refersTo(Object obj, AstNode origin) {
this.getAFlowNode().(ControlFlowNodeWithPointsTo).refersTo(obj, origin.getAFlowNode())
exists(ControlFlowNode this_, ControlFlowNode origin_ |
this_.getNode() = this and origin_.getNode() = origin
|
this_.(ControlFlowNodeWithPointsTo).refersTo(obj, origin_)
)
}

/**
@@ -240,16 +246,22 @@ class ExprWithPointsTo extends Expr {
* in the given `context`.
*/
predicate pointsTo(Context context, Value value, AstNode origin) {
this.getAFlowNode()
.(ControlFlowNodeWithPointsTo)
.pointsTo(context, value, origin.getAFlowNode())
exists(ControlFlowNode this_, ControlFlowNode origin_ |
this_.getNode() = this and origin_.getNode() = origin
|
this_.(ControlFlowNodeWithPointsTo).pointsTo(context, value, origin_)
)
}

/**
* Holds if this expression might "point-to" `value`, which is from `origin`.
*/
predicate pointsTo(Value value, AstNode origin) {
this.getAFlowNode().(ControlFlowNodeWithPointsTo).pointsTo(value, origin.getAFlowNode())
exists(ControlFlowNode this_, ControlFlowNode origin_ |
this_.getNode() = this and origin_.getNode() = origin
|
this_.(ControlFlowNodeWithPointsTo).pointsTo(value, origin_)
)
}

/**
@@ -475,7 +487,10 @@ class FunctionMetricsWithPointsTo extends FunctionMetrics {
not non_coupling_method(result) and
exists(Call call | call.getScope() = this |
exists(FunctionObject callee | callee.getFunction() = result |
call.getAFlowNode().getFunction().(ControlFlowNodeWithPointsTo).refersTo(callee)
exists(CallNode call_ |
call_.getNode() = call and
call_.getFunction().(ControlFlowNodeWithPointsTo).refersTo(callee)
)
)
or
exists(Attribute a | call.getFunc() = a |
4 changes: 2 additions & 2 deletions python/ql/lib/analysis/DefinitionTracking.qll
@@ -64,7 +64,7 @@ private predicate jump_to_defn(ControlFlowNode use, Definition defn) {
private predicate preferred_jump_to_defn(Expr use, Definition def) {
not use instanceof ClassExpr and
not use instanceof FunctionExpr and
jump_to_defn(use.getAFlowNode(), def)
exists(ControlFlowNode useNode | useNode.getNode() = use | jump_to_defn(useNode, def))
}

private predicate unique_jump_to_defn(Expr use, Definition def) {
@@ -452,7 +452,7 @@ private predicate self_parameter_jump_to_defn_attribute(
* This exists primarily for testing; use `getPreferredDefinition()` instead.
*/
Definition getADefinition(Expr use) {
jump_to_defn(use.getAFlowNode(), result) and
exists(ControlFlowNode useNode | useNode.getNode() = use | jump_to_defn(useNode, result)) and
not use instanceof Call and
not use.isArtificial() and
// Not the use itself
4 changes: 4 additions & 0 deletions python/ql/lib/change-notes/2026-05-26-shared-cfg-and-ssa.md
@@ -0,0 +1,4 @@
---
category: minorAnalysis
---
* The Python dataflow library is now built on the shared CFG and SSA libraries (`shared/controlflow` and `shared/ssa`), bringing Python in line with the other CodeQL languages. The legacy CFG in `semmle/python/Flow.qll` and the legacy ESSA SSA in `semmle/python/essa/*` remain available for downstream queries but are no longer used by the new dataflow library, type tracking, or API graphs. Most queries should be unaffected; a small number may produce slightly different results because of differences in CFG granularity (e.g. separate pre/post nodes per expression) and in how attribute and tuple-unpacking writes are modelled.
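The migration pattern applied throughout this diff, wherever `getAFlowNode()` was called, can be illustrated with a minimal sketch. It assumes that the legacy `ControlFlowNode.getNode()` and `CallNode.getFunction()` from `semmle/python/Flow.qll` are still exposed by `import python`, as the change note above states. The helper name `flowNodeFor` is hypothetical and is not introduced by this PR.

```ql
import python

/**
 * Gets a legacy CFG node whose AST node is `e`.
 *
 * Hypothetical helper, for illustration only: it stands in for the removed
 * `AstNode.getAFlowNode()` by looking the node up from the CFG side.
 */
ControlFlowNode flowNodeFor(Expr e) { result.getNode() = e }

// Before this PR: `call.getAFlowNode().getFunction()`.
// After: bind the CFG node explicitly, then narrow it to `CallNode`.
from Call call, CallNode node
where node = flowNodeFor(call)
select call, node.getFunction()
```

This mirrors the `exists(ControlFlowNode n | n.getNode() = ... | ...)` rewrites in the `LegacyPointsTo.qll`, `DefinitionTracking.qll`, and `Exprs.qll` hunks. An AST node may correspond to several CFG nodes, so the helper can yield more than one result for a single expression.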
45 changes: 45 additions & 0 deletions python/ql/lib/printCfgNew.ql
@@ -0,0 +1,45 @@
/**
* @name Print CFG (New)
* @description Produces a representation of a file's Control Flow Graph
* using the new shared control flow library.
* This query is used by the VS Code extension.
* @id python/print-cfg
* @kind graph
* @tags ide-contextual-queries/print-cfg
*/

private import python as Py
import semmle.python.controlflow.internal.AstNodeImpl

external string selectedSourceFile();

private predicate selectedSourceFileAlias = selectedSourceFile/0;

external int selectedSourceLine();

private predicate selectedSourceLineAlias = selectedSourceLine/0;

external int selectedSourceColumn();

private predicate selectedSourceColumnAlias = selectedSourceColumn/0;

module ViewCfgQueryInput implements ControlFlow::ViewCfgQueryInputSig<Py::File> {
predicate selectedSourceFile = selectedSourceFileAlias/0;

predicate selectedSourceLine = selectedSourceLineAlias/0;

predicate selectedSourceColumn = selectedSourceColumnAlias/0;

predicate cfgScopeSpan(
Ast::Callable callable, Py::File file, int startLine, int startColumn, int endLine,
int endColumn
) {
exists(Py::Scope scope |
scope = callable.asScope() and
file = scope.getLocation().getFile() and
scope.getLocation().hasLocationInfo(_, startLine, startColumn, endLine, endColumn)
)
}
}

import ControlFlow::ViewCfgQuery<Py::File, ViewCfgQueryInput>
1 change: 1 addition & 0 deletions python/ql/lib/qlpack.yml
@@ -7,6 +7,7 @@ library: true
upgrades: upgrades
dependencies:
codeql/concepts: ${workspace}
codeql/controlflow: ${workspace}
codeql/dataflow: ${workspace}
codeql/mad: ${workspace}
codeql/regex: ${workspace}
13 changes: 7 additions & 6 deletions python/ql/lib/semmle/python/ApiGraphs.qll
@@ -6,8 +6,9 @@
* directed and labeled; they specify how the components represented by nodes relate to each other.
*/

// Importing python under the `py` namespace to avoid importing `CallNode` from `Flow.qll` and thereby having a naming conflict with `API::CallNode`.
// Importing python under the `PY` namespace to avoid importing `CallNode` from `Flow.qll` and thereby having a naming conflict with `API::CallNode`.
private import python as PY
private import semmle.python.controlflow.internal.Cfg as Cfg
import semmle.python.dataflow.new.DataFlow
private import semmle.python.internal.CachedStages

@@ -282,15 +283,15 @@ module API {
index = this.getIndex() and
(
// subscripting
exists(PY::SubscriptNode subscript |
exists(Cfg::SubscriptNode subscript |
subscript.getObject() = this.getAValueReachableFromSource().asCfgNode() and
subscript.getIndex() = index.asSink().asCfgNode()
|
// reading
subscript = result.asSource().asCfgNode()
or
// writing
subscript.(PY::DefinitionNode).getValue() = result.asSink().asCfgNode()
subscript.(Cfg::DefinitionNode).getValue() = result.asSink().asCfgNode()
)
or
// dictionary literals
@@ -684,7 +685,7 @@ module API {
* Ignores relative imports, such as `from ..foo.bar import baz`.
*/
private predicate imports(DataFlow::CfgNode imp, string name) {
exists(PY::ImportExprNode iexpr |
exists(Cfg::ImportExprNode iexpr |
imp.getNode() = iexpr and
not iexpr.getNode().isRelative() and
name = iexpr.getNode().getImportedModuleName()
@@ -775,7 +776,7 @@ module API {
// list literals, from `x` to `[x]`
// TODO: once convenient, this should be done at a higher level than the AST,
// at least at the CFG layer, to take splitting into account.
// Also consider `SequenceNode for generality.
// Also consider `Cfg::SequenceNode` for generality.
exists(PY::List list | list = pred.(DataFlow::ExprNode).getNode().getNode() |
rhs.(DataFlow::ExprNode).getNode().getNode() = list.getAnElt() and
lbl = Label::subscript()
@@ -805,7 +806,7 @@ module API {
subscript = trackUseNode(src).getSubscript(index)
|
// from `x` to a definition of `x[...]`
rhs.asCfgNode() = subscript.asCfgNode().(PY::DefinitionNode).getValue() and
rhs.asCfgNode() = subscript.asCfgNode().(Cfg::DefinitionNode).getValue() and
lbl = Label::subscript()
or
// from `x` to `"key"` in `x["key"]`
11 changes: 0 additions & 11 deletions python/ql/lib/semmle/python/AstExtended.qll
@@ -16,17 +16,6 @@ abstract class AstNode extends AstNode_ {
/** Gets the scope that this node occurs in */
abstract Scope getScope();

/**
* Gets a flow node corresponding directly to this node.
* NOTE: For some statements and other purely syntactic elements,
* there may not be a `ControlFlowNode`
*/
cached
ControlFlowNode getAFlowNode() {
Stages::AST::ref() and
py_flow_bb_node(result, this, _, _)
}

/** Gets the location for this AST node */
cached
Location getLocation() { none() }
5 changes: 3 additions & 2 deletions python/ql/lib/semmle/python/Concepts.qll
@@ -5,6 +5,7 @@
*/

private import python
private import semmle.python.controlflow.internal.Cfg as Cfg
private import semmle.python.dataflow.new.DataFlow
private import semmle.python.dataflow.new.internal.DataFlowImplSpecific
private import semmle.python.dataflow.new.RemoteFlowSources
@@ -214,7 +215,7 @@ module Path {
SafeAccessCheck() { this = DataFlow::BarrierGuard<safeAccessCheck/3>::getABarrierNode() }
}

private predicate safeAccessCheck(DataFlow::GuardNode g, ControlFlowNode node, boolean branch) {
private predicate safeAccessCheck(DataFlow::GuardNode g, Cfg::ControlFlowNode node, boolean branch) {
g.(SafeAccessCheck::Range).checks(node, branch)
}

@@ -223,7 +224,7 @@ module Path {
/** A data-flow node that checks that a path is safe to access in some way, for example by having a controlled prefix. */
abstract class Range extends DataFlow::GuardNode {
/** Holds if this guard validates `node` upon evaluating to `branch`. */
abstract predicate checks(ControlFlowNode node, boolean branch);
abstract predicate checks(Cfg::ControlFlowNode node, boolean branch);
}
}
}
18 changes: 3 additions & 15 deletions python/ql/lib/semmle/python/Exprs.qll
@@ -28,7 +28,9 @@ class Expr extends Expr_, AstNode {
/** Whether this expression may have a side effect (as determined purely from its syntax) */
predicate hasSideEffects() {
/* If an exception raised by this expression is handled, count that as a side effect */
this.getAFlowNode().getASuccessor().getNode() instanceof ExceptStmt
exists(ControlFlowNode n | n.getNode() = this |
n.getASuccessor().getNode() instanceof ExceptStmt
)
or
this.getASubExpression().hasSideEffects()
}
@@ -68,8 +70,6 @@ class Attribute extends Attribute_ {
/* syntax: Expr.name */
override Expr getASubExpression() { result = this.getObject() }

override AttrNode getAFlowNode() { result = super.getAFlowNode() }

/** Gets the name of this attribute. That is the `name` in `obj.name` */
string getName() { result = Attribute_.super.getAttr() }

@@ -96,8 +96,6 @@ class Subscript extends Subscript_ {
}

Expr getObject() { result = Subscript_.super.getValue() }

override SubscriptNode getAFlowNode() { result = super.getAFlowNode() }
}

/** A call expression, such as `func(...)` */
@@ -113,8 +111,6 @@ class Call extends Call_ {

override string toString() { result = this.getFunc().toString() + "()" }

override CallNode getAFlowNode() { result = super.getAFlowNode() }

/** Gets a tuple (*) argument of this call. */
Expr getStarargs() { result = this.getAPositionalArg().(Starred).getValue() }

@@ -200,8 +196,6 @@ class IfExp extends IfExp_ {
override Expr getASubExpression() {
result = this.getTest() or result = this.getBody() or result = this.getOrelse()
}

override IfExprNode getAFlowNode() { result = super.getAFlowNode() }
}

/** A starred expression, such as the `*rest` in the assignment `first, *rest = seq` */
@@ -410,8 +404,6 @@ class PlaceHolder extends PlaceHolder_ {
override Expr getASubExpression() { none() }

override string toString() { result = "$" + this.getId() }

override NameNode getAFlowNode() { result = super.getAFlowNode() }
}

/** A tuple expression such as `( 1, 3, 5, 7, 9 )` */
@@ -478,8 +470,6 @@ class Name extends Name_ {

override string toString() { result = this.getId() }

override NameNode getAFlowNode() { result = super.getAFlowNode() }

override predicate isArtificial() {
/* Artificial variable names in comprehensions all start with "." */
this.getId().charAt(0) = "."
@@ -585,8 +575,6 @@ abstract class NameConstant extends Name, ImmutableLiteral {

override predicate isConstant() { any() }

override NameConstantNode getAFlowNode() { result = Name.super.getAFlowNode() }

override predicate isArtificial() { none() }
}
