Skip to content

pyk: context-aware sort-parameter inference in add_sort_params (port of Java AddSortInjections)#4915

Draft
ehildenb wants to merge 24 commits into
developfrom
sort-handling-improvements
Draft

pyk: context-aware sort-parameter inference in add_sort_params (port of Java AddSortInjections)#4915
ehildenb wants to merge 24 commits into
developfrom
sort-handling-improvements

Conversation

@ehildenb
Copy link
Copy Markdown
Member

@ehildenb ehildenb commented Apr 27, 2026

The K frontend often emits parametric KLabels with their sort parameters left off. This reworks pyk's add_sort_params into a context-aware algorithm mirroring the Java frontend's AddSortInjections: a single recursive sweep threads an expected sort downwards (so a parametric return sort is taken from its context) and resolves the remaining parameters from the argument sorts coming back up.

Changes:

  • add_sort_params now takes an optional expected sort and resolves each parameter either bottom-up (argument-bound) or top-down (return-only); a parameter left genuinely free becomes a unique fresh sort variable (#SortParam{Q0}, {Q1}, …), with one such variable shared down a constraint spine. It asserts that a parametric return sort only stays free on the spine.
  • KDefinition.least_upper_bound: a set-based, order-independent least upper bound over the subsort lattice (resolves siblings that share a common supersort).
  • KSort.contains: a structural "does this sort mention that sort" query (mirrors Java Sort.contains), replacing a module-level helper.
  • sort and resolve_sorts handle KAs and unfilled KApply, with recursive substitution into nested sort params.
  • The #SortParam sentinel is not yet emittable to Kore: _ksort_to_kore rejects it with a clear error rather than producing invalid Kore, and kast_to_kore is left calling add_sort_params without an expected sort for now (opt-in), so pipeline emission is unchanged. Full design in pyk/docs/2026-06-01-sortparam-kore-emission.md.
  • Date-prefix the existing pyk/docs markdown filenames for chronological sorting.

Validation:

  • Unit tests for add_sort_params (bottom-up, top-down expected-sort resolution, spine sharing, the spine-violation assertion, conflict and unsortable-argument handling), plus least_upper_bound, KSort.contains, and _ksort_to_kore sentinel rejection. Full pyk unit suite green; make -C pyk check clean.
  • Not yet run locally: the konvert integration and regression-new suites (the machine's K install is pinned for a benchmark) — relying on CI.

ehildenb and others added 6 commits April 27, 2026 22:39
…(), add_sort_params()

Adjacent tests (already pass): sort for KVariable, KApply with direct result sort,
resolve_sorts direct substitution, add_sort_params with already-filled or direct params.

New-feature tests (fail at HEAD, committed with the fixes that make them pass):
sort for unfilled KApply (should return None not raise), sort for nested result sort
(MInt{N} → MInt{Int}), sort for KAs, resolve_sorts nested substitution, add_sort_params
with nested param (MInt{N} ~ MInt{Int} → N=Int), add_sort_params ML pred sentinel.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three parametrized test functions (test_sort, test_resolve_sorts, test_add_sort_params)
replace the eleven individual test functions.  Each has a *_DATA tuple at module scope
so new cases can be added by appending one tuple element.

New cases added:
- test_sort: ktoken, ksequence, kapply_unknown_label (KeyError → None),
  kas_unsorted_alias (alias with no sort annotation → None)
- test_add_sort_params: unsortable_arg_unchanged (arg with no sort annotation
  prevents param filling — returns term unchanged)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Tests both the positive path (wrap when parent production expects the cell
map sort) and the negative path (no wrap when parent expects the individual
cell element sort) introduced in 78bfdab101.  The cell-map productions are
merged into the existing single DEFN to keep the fixture count low.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ive substitution

sort() gains a KAs case that returns the alias variable's sort, and the KApply case
is now wrapped in try/except so unfilled sort params (e.g. when label.params=[] but
the production requires params) return None instead of raising ValueError.

resolve_sorts() now recurses into nested compound sorts, enabling N → Int to also
substitute inside MInt{N} → MInt{Int}.  The old code used sorts.get(sort, sort)
which only handled the case where sort IS a parameter, not nested inside one.

Makes test_definition tests pass: test_sort_kas, test_sort_kapply_unfilled_params_returns_none,
test_sort_kapply_nested_result_sort, test_resolve_sorts_nested.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…t sort sentinel

The old implementation only extracted sort-param bindings when the production
argument sort was exactly a sort parameter (psort in prod.params).  This missed
cases like MInt{Width} ~ MInt{8} where the parameter is nested.

New helper _unify_sort_params recursively matches a parametric sort against an
actual sort, extracting {Width: 8} from MInt{Width} ~ MInt{8}.  _merge_binding
handles conflicts and LUB computation when the same parameter appears in multiple
argument positions.

ML predicates (#Equals, #Ceil, #Floor, #In) have a context-dependent result sort
that cannot be inferred from arguments alone.  When Sort1 (the operand sort) is
determined but Sort2 (the axiom result sort) is not, Sort2 is filled with the
sentinel KSort('#SortParam').  Downstream code (_ksort_to_kore) maps this sentinel
to SortVar('Q0') to emit the correct axiom{R,Q0} universally-quantified form.

Makes test_definition tests pass: test_add_sort_params_nested_param,
test_add_sort_params_ml_pred_sentinel.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The old _wrap_elements unconditionally wrapped any KApply matching a cell label,
even when the parent production expected the individual cell sort (not the map sort).
This caused double-wrapping in productions like EntryCellMapKey(<entry>(...)) where
the argument sort is EntryCell, not EntryCellMap.

The new implementation inspects the parent production's expected argument sorts and
only wraps when the expected sort matches the cell map sort (e.g. EntryCellMap).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@rv-jenkins rv-jenkins changed the base branch from master to develop April 27, 2026 22:51
Comment thread pyk/src/pyk/kast/outer.py Outdated
ehildenb and others added 6 commits April 28, 2026 23:11
The single KSort('#SortParam') sentinel is only unambiguous when exactly
one sort parameter is unresolvable bottom-up.  Add an assertion that
documents this invariant and crashes loudly if it is ever violated, with
a message pointing toward the Java-style unique fresh-param fix.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ulti-unbound ML pred sort params

The single KSort('#SortParam') sentinel can only represent one unbound sort
parameter at a time.  If an ML predicate ever had two or more unbound params
simultaneously the old assert (disabled by -O) would silently pass or raise a
bare AssertionError with no guidance.

Replace the assert with NotImplementedError that names the fix: generate unique
fresh sentinels analogous to Java's #SortParam{Q0}, #SortParam{Q1}, etc.  Add
a unit test with a hypothetical 3-sort-param #Equals production (two unbound
params) that verifies the error is raised.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… sort params

After the argument loop, if some sort params remain unbound and the label is not
an ML predicate, add_sort_params previously fell through silently.  Add a
_LOGGER.warning that names the label and the unresolved params, matching the
existing warning style for the asort-is-None case.

Add test_add_sort_params_user_label_unresolvable_warns using a pair(S1,S2)
production where S2 does not appear in any argument sort, verifying the warning
fires and the term is returned unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… inference

Extract _sort_contains and _match_sort_params as module-level pure functions,
implementing the same three-strategy matching algorithm as Java's
AddSortInjections.match(): direct sort param binding, structural unification,
and subsort-aware matching (iterate s ≤ actual with same head as parametric).

Add KDefinition.infer_sort_params() mirroring substituteProd(), including the
matchExpected path for binding result-sort-only params from a known expected
sort. Simplify add_sort_params() to delegate to infer_sort_params(), keeping
only the ML predicate sentinel policy inline.

Expand test_definition.py with: test_infer_sort_params (direct, nested,
subsort-aware, matchExpected, unbound cases), test_match_sort_params (all three
strategies in isolation), test_sort_contains, and a subsort_aware case in the
existing test_add_sort_params table.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ehildenb ehildenb changed the title Handle KAs in sort inference, basic sort parameter inference, and better cell map handling Handle KAs in sort inference, add sort parameter inference May 12, 2026
ehildenb and others added 7 commits May 27, 2026 17:51
…_sort) tuple

Return a tuple from infer_sort_params so callers can tell, without inspecting
prod.params separately, whether inference produced a fully-determined result sort:
- bindings: dict[KSort, KSort | None] — None value signals a detected conflict
  (candidates found but no common supersort), absent key means no candidates
- inferred_sort: KSort | None — None when any sort param in prod.sort is unbound
  or conflicted; concrete sort otherwise (always concrete for non-parametric prods)

Add module-level _resolve_sort_partial helper that substitutes successful bindings
into a sort and returns None for unbound params.

Update add_sort_params to unpack the tuple (discards the inferred sort; filters
None bindings before the existing downstream logic, which is unchanged).

Test improvements:
- replace hardcoded _subsorts_fn with DEFN.subsorts directly
- add conflicting_args and expected_sort_mismatch negative cases
- add no_params_trivial case to distinguish "no params needed" from "params exist
  but no candidates" — both were previously indistinguishable as {}
- note in subsort_aware add_sort_params test that variable sorts are not modified

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…sort_params, add_sort_params)

Fold the bug-demonstration harness into the permanent unit suite. These cases
encode the desired (Java AddSortInjections-equivalent) behaviour and currently
FAIL; subsequent commits fix each issue and turn them green.

- expected_sort_param_in_arg: matchExpected must skip params already bound by an
  argument sort (the membership check currently uses concrete actual sorts).
- lub_common_supersort + lub_order_independent: candidate LUB must be set-based
  and order-independent (the pairwise least_common_supersort fold is neither).
- conflicting_ml_pred_args: a genuine sort mismatch on an ML predicate must warn
  and return unchanged, not raise NotImplementedError about the sentinel scheme.

Adds f2/f3 productions and a Number ::= Int | Float subsort lattice to the
shared test definition; all pre-existing cases remain green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…of raising

infer_sort_params maps a param to None when its candidate sorts have no common
supersort (a genuine sort mismatch), versus omitting params that have no
candidates at all. add_sort_params previously discarded that distinction: a
conflicted param looked simply "unbound", so a clash on an ML predicate's
argument sort raised NotImplementedError blaming the single-sentinel scheme.

Detect conflicts explicitly and warn + return the term unchanged (best-effort),
matching the pre-existing behaviour for irreconcilable sorts. The legitimate
multi-unbound-result-param guard is unaffected — it only fires when params have
no candidates, not when they conflict.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…sorts

The matchExpected path infers a sort param from the expected result sort only
when that param appears in the result but in no argument sort. The membership
check compared the concrete actual_sorts against the param, which can never
contain a sort variable, so the condition was always satisfied: params already
bound by an argument were wrongly re-matched against the expected sort, adding a
spurious candidate that could turn a clean binding into a conflict.

Test the production's declared (parametric) argument_sorts instead, mirroring
Java's `nt.sort().contains(param)`.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…resolution

The candidate-resolution step folded sort candidates pairwise through
least_common_supersort, which only succeeds when one sort is a subsort of the
other. That made the result order-dependent and reported a conflict for sibling
candidates that share a common supersort (e.g. Int and Float under Number).

Add KDefinition.least_upper_bound, which computes the unique minimal common
supersort over the whole subsort lattice (or None when none exists or the bound
is ambiguous), mirroring Java AddSortInjections.lub. infer_sort_params now
resolves each param's candidate set through it, so inference is order-independent
and resolves siblings with a shared supersort.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@ehildenb ehildenb changed the title Handle KAs in sort inference, add sort parameter inference pyk: direct sort-parameter inference in add_sort_params (port of Java AddSortInjections) Jun 2, 2026
ehildenb and others added 3 commits June 2, 2026 01:35
…entinel; document full fix

add_sort_params fills an uninferable ML-predicate result sort with the
#SortParam sentinel sort, but _ksort_to_kore had no handling for it: it built
SortApp('Sort#SortParam'), which crashes the Kore Id constructor ('#' is not a
legal identifier char). Emitting it correctly needs a universally-quantified
sort variable bound at the axiom level (Java's sortParams scheme), which is a
larger change requiring integration-level validation.

Until that lands, reject the sentinel in _ksort_to_kore with a clear, actionable
error instead of a cryptic crash, and document the full fix.

- prelude/k: define the shared SORT_PARAM_SENTINEL sort, used by both the
  inference site (add_sort_params) and the emission site (_ksort_to_kore).
- konvert/_kast_to_kore: _ksort_to_kore raises a descriptive ValueError for any
  member of the sentinel family (name startswith match), pointing at the design doc.
- test_konvert: cover the ordinary-sort path and sentinel rejection (bare and
  uniquely-named forms).
- docs/2026-06-01-sortparam-kore-emission.md: design for the full sentinel -> sort-variable fix.
- outer: source the sentinel from prelude/k; note the Kore emission step is not yet implemented.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ological sorting

Prefix existing pyk/docs markdown with their yyyy-mm-dd add date so the directory
sorts chronologically, matching the convention adopted for new design docs. Both
files were added 2026-04-27; pure renames with no content or reference changes
(neither is wired into the Sphinx toctree or linked elsewhere).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…with KSort.contains method

The _sort_contains module-level helper in outer was a pure structural query on a
sort. Make it a KSort.contains method instead: it reads cleanly at the call site
(prod.sort.contains(p)) and mirrors Java's Sort.contains used by the
AddSortInjections algorithm this code ports. The lone caller in infer_sort_params
is updated, and the unit test moves to test_inner alongside KSort.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@ehildenb ehildenb force-pushed the sort-handling-improvements branch from 77a4e83 to a35b570 Compare June 2, 2026 02:21
Comment thread pyk/src/pyk/kast/outer.py
Comment on lines +1029 to +1033
for s in subsorts_fn(actual):
if s.name == parametric.name and len(s.params) == len(parametric.params):
for p_sub, a_sub in zip(parametric.params, s.params, strict=True):
for k, vs in _match_sort_params(p_sub, a_sub, params).items():
result.setdefault(k, []).extend(vs)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we break if the if s.name == ... condition is true? We are looking for the first subsort that matches the given sort name right?

Also, can we factor out the result = {} ... for p_sub, a_sub ... for k, vs ... result.setdefault(...) logic into a helper inner function defined here? Something that computes piecewise the sort parameter bindings on arguments? Then we can reuse it here and the clause above this one.

Comment thread pyk/src/pyk/kast/outer.py Outdated
Comment on lines +1038 to +1054
def _resolve_sort_partial(
sort: KSort,
bindings: dict[KSort, KSort],
params: frozenset[KSort],
) -> KSort | None:
"""Substitute ``bindings`` into ``sort``, returning ``None`` if any param is unbound.

A param in ``params`` is considered unbound if it does not appear in ``bindings``.
"""
if sort in params:
return bindings.get(sort)
if sort.params:
resolved = [_resolve_sort_partial(p, bindings, params) for p in sort.params]
if any(r is None for r in resolved):
return None
return KSort(sort.name, tuple(resolved)) # type: ignore[arg-type]
return sort
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only used in one place. Is it reasonable to inline it to its use site?

… sweep with fresh sort vars

Replace the hardcoded ML-predicate special-casing and single #SortParam sentinel
with a structural, context-aware algorithm mirroring Java's AddSortInjections.

add_sort_params now takes an optional expected `sort` and runs one recursive
sweep that threads the expected sort downwards (so a parametric *return* sort is
taken from its context) and resolves the remaining parameters from the argument
sorts coming back up. Each parameter is either argument-bound (bottom-up) or
return-only (top-down); #SortParam sentinels are low-priority placeholders that a
real sort always beats. A parameter left genuinely free becomes a unique fresh
sort variable (#SortParam{Q0}, {Q1}, ...), and one such variable is shared down a
constraint spine. It is asserted that a parametric return sort only stays free on
the spine — a concrete expected sort it cannot satisfy is an ill-formed term.

This removes the _ML_PRED_LABELS list, the single-sentinel limitation, and the
NotImplementedError guard. kast_to_kore is left calling add_sort_params without
an expected sort for now (opt-in), so pipeline emission is unchanged pending
integration validation; _ksort_to_kore still rejects every #SortParam{Qn}.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…tial; test least_upper_bound directly

The add_sort_params rewrite inlined per-node inference into its recursive sweep,
leaving infer_sort_params (and its only helper _resolve_sort_partial) with no
callers. Remove both so the module presents a single sort-inference algorithm.

Replace the infer_sort_params tests with a direct least_upper_bound test (the
set-based, order-independent LUB those tests had been exercising) and add an
add_sort_params case asserting an argument-bound parameter ignores a conflicting
expected sort. Drop the now-unused baz/f2/f3 fixtures.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@ehildenb ehildenb changed the title pyk: direct sort-parameter inference in add_sort_params (port of Java AddSortInjections) pyk: context-aware sort-parameter inference in add_sort_params (port of Java AddSortInjections) Jun 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant