pyk: context-aware sort-parameter inference in add_sort_params (port of Java AddSortInjections)#4915
Draft
ehildenb wants to merge 24 commits into
Draft
pyk: context-aware sort-parameter inference in add_sort_params (port of Java AddSortInjections)#4915ehildenb wants to merge 24 commits into
ehildenb wants to merge 24 commits into
Conversation
…(), add_sort_params()
Adjacent tests (already pass): sort for KVariable, KApply with direct result sort,
resolve_sorts direct substitution, add_sort_params with already-filled or direct params.
New-feature tests (fail at HEAD, committed with the fixes that make them pass):
sort for unfilled KApply (should return None not raise), sort for nested result sort
(MInt{N} → MInt{Int}), sort for KAs, resolve_sorts nested substitution, add_sort_params
with nested param (MInt{N} ~ MInt{Int} → N=Int), add_sort_params ML pred sentinel.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three parametrized test functions (test_sort, test_resolve_sorts, test_add_sort_params) replace the eleven individual test functions. Each has a *_DATA tuple at module scope so new cases can be added by appending one tuple element. New cases added: - test_sort: ktoken, ksequence, kapply_unknown_label (KeyError → None), kas_unsorted_alias (alias with no sort annotation → None) - test_add_sort_params: unsortable_arg_unchanged (arg with no sort annotation prevents param filling — returns term unchanged) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Tests both the positive path (wrap when parent production expects the cell map sort) and the negative path (no wrap when parent expects the individual cell element sort) introduced in 78bfdab101. The cell-map productions are merged into the existing single DEFN to keep the fixture count low. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ive substitution
sort() gains a KAs case that returns the alias variable's sort, and the KApply case
is now wrapped in try/except so unfilled sort params (e.g. when label.params=[] but
the production requires params) return None instead of raising ValueError.
resolve_sorts() now recurses into nested compound sorts, enabling N → Int to also
substitute inside MInt{N} → MInt{Int}. The old code used sorts.get(sort, sort)
which only handled the case where sort IS a parameter, not nested inside one.
Makes test_definition tests pass: test_sort_kas, test_sort_kapply_unfilled_params_returns_none,
test_sort_kapply_nested_result_sort, test_resolve_sorts_nested.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…t sort sentinel
The old implementation only extracted sort-param bindings when the production
argument sort was exactly a sort parameter (psort in prod.params). This missed
cases like MInt{Width} ~ MInt{8} where the parameter is nested.
New helper _unify_sort_params recursively matches a parametric sort against an
actual sort, extracting {Width: 8} from MInt{Width} ~ MInt{8}. _merge_binding
handles conflicts and LUB computation when the same parameter appears in multiple
argument positions.
ML predicates (#Equals, #Ceil, #Floor, #In) have a context-dependent result sort
that cannot be inferred from arguments alone. When Sort1 (the operand sort) is
determined but Sort2 (the axiom result sort) is not, Sort2 is filled with the
sentinel KSort('#SortParam'). Downstream code (_ksort_to_kore) maps this sentinel
to SortVar('Q0') to emit the correct axiom{R,Q0} universally-quantified form.
Makes test_definition tests pass: test_add_sort_params_nested_param,
test_add_sort_params_ml_pred_sentinel.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The old _wrap_elements unconditionally wrapped any KApply matching a cell label, even when the parent production expected the individual cell sort (not the map sort). This caused double-wrapping in productions like EntryCellMapKey(<entry>(...)) where the argument sort is EntryCell, not EntryCellMap. The new implementation inspects the parent production's expected argument sorts and only wraps when the expected sort matches the cell map sort (e.g. EntryCellMap). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ehildenb
commented
Apr 27, 2026
The single KSort('#SortParam') sentinel is only unambiguous when exactly
one sort parameter is unresolvable bottom-up. Add an assertion that
documents this invariant and crashes loudly if it is ever violated, with
a message pointing toward the Java-style unique fresh-param fix.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ulti-unbound ML pred sort params
The single KSort('#SortParam') sentinel can only represent one unbound sort
parameter at a time. If an ML predicate ever had two or more unbound params
simultaneously the old assert (disabled by -O) would silently pass or raise a
bare AssertionError with no guidance.
Replace the assert with NotImplementedError that names the fix: generate unique
fresh sentinels analogous to Java's #SortParam{Q0}, #SortParam{Q1}, etc. Add
a unit test with a hypothetical 3-sort-param #Equals production (two unbound
params) that verifies the error is raised.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… sort params After the argument loop, if some sort params remain unbound and the label is not an ML predicate, add_sort_params previously fell through silently. Add a _LOGGER.warning that names the label and the unresolved params, matching the existing warning style for the asort-is-None case. Add test_add_sort_params_user_label_unresolvable_warns using a pair(S1,S2) production where S2 does not appear in any argument sort, verifying the warning fires and the term is returned unchanged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… inference Extract _sort_contains and _match_sort_params as module-level pure functions, implementing the same three-strategy matching algorithm as Java's AddSortInjections.match(): direct sort param binding, structural unification, and subsort-aware matching (iterate s ≤ actual with same head as parametric). Add KDefinition.infer_sort_params() mirroring substituteProd(), including the matchExpected path for binding result-sort-only params from a known expected sort. Simplify add_sort_params() to delegate to infer_sort_params(), keeping only the ML predicate sentinel policy inline. Expand test_definition.py with: test_infer_sort_params (direct, nested, subsort-aware, matchExpected, unbound cases), test_match_sort_params (all three strategies in isolation), test_sort_contains, and a subsort_aware case in the existing test_add_sort_params table. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…_sort) tuple
Return a tuple from infer_sort_params so callers can tell, without inspecting
prod.params separately, whether inference produced a fully-determined result sort:
- bindings: dict[KSort, KSort | None] — None value signals a detected conflict
(candidates found but no common supersort), absent key means no candidates
- inferred_sort: KSort | None — None when any sort param in prod.sort is unbound
or conflicted; concrete sort otherwise (always concrete for non-parametric prods)
Add module-level _resolve_sort_partial helper that substitutes successful bindings
into a sort and returns None for unbound params.
Update add_sort_params to unpack the tuple (discards the inferred sort; filters
None bindings before the existing downstream logic, which is unchanged).
Test improvements:
- replace hardcoded _subsorts_fn with DEFN.subsorts directly
- add conflicting_args and expected_sort_mismatch negative cases
- add no_params_trivial case to distinguish "no params needed" from "params exist
but no candidates" — both were previously indistinguishable as {}
- note in subsort_aware add_sort_params test that variable sorts are not modified
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…sort_params, add_sort_params) Fold the bug-demonstration harness into the permanent unit suite. These cases encode the desired (Java AddSortInjections-equivalent) behaviour and currently FAIL; subsequent commits fix each issue and turn them green. - expected_sort_param_in_arg: matchExpected must skip params already bound by an argument sort (the membership check currently uses concrete actual sorts). - lub_common_supersort + lub_order_independent: candidate LUB must be set-based and order-independent (the pairwise least_common_supersort fold is neither). - conflicting_ml_pred_args: a genuine sort mismatch on an ML predicate must warn and return unchanged, not raise NotImplementedError about the sentinel scheme. Adds f2/f3 productions and a Number ::= Int | Float subsort lattice to the shared test definition; all pre-existing cases remain green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…of raising infer_sort_params maps a param to None when its candidate sorts have no common supersort (a genuine sort mismatch), versus omitting params that have no candidates at all. add_sort_params previously discarded that distinction: a conflicted param looked simply "unbound", so a clash on an ML predicate's argument sort raised NotImplementedError blaming the single-sentinel scheme. Detect conflicts explicitly and warn + return the term unchanged (best-effort), matching the pre-existing behaviour for irreconcilable sorts. The legitimate multi-unbound-result-param guard is unaffected — it only fires when params have no candidates, not when they conflict. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…sorts The matchExpected path infers a sort param from the expected result sort only when that param appears in the result but in no argument sort. The membership check compared the concrete actual_sorts against the param, which can never contain a sort variable, so the condition was always satisfied: params already bound by an argument were wrongly re-matched against the expected sort, adding a spurious candidate that could turn a clean binding into a conflict. Test the production's declared (parametric) argument_sorts instead, mirroring Java's `nt.sort().contains(param)`. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…resolution The candidate-resolution step folded sort candidates pairwise through least_common_supersort, which only succeeds when one sort is a subsort of the other. That made the result order-dependent and reported a conflict for sibling candidates that share a common supersort (e.g. Int and Float under Number). Add KDefinition.least_upper_bound, which computes the unique minimal common supersort over the whole subsort lattice (or None when none exists or the bound is ambiguous), mirroring Java AddSortInjections.lub. infer_sort_params now resolves each param's candidate set through it, so inference is order-independent and resolves siblings with a shared supersort. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…entinel; document full fix
add_sort_params fills an uninferable ML-predicate result sort with the
#SortParam sentinel sort, but _ksort_to_kore had no handling for it: it built
SortApp('Sort#SortParam'), which crashes the Kore Id constructor ('#' is not a
legal identifier char). Emitting it correctly needs a universally-quantified
sort variable bound at the axiom level (Java's sortParams scheme), which is a
larger change requiring integration-level validation.
Until that lands, reject the sentinel in _ksort_to_kore with a clear, actionable
error instead of a cryptic crash, and document the full fix.
- prelude/k: define the shared SORT_PARAM_SENTINEL sort, used by both the
inference site (add_sort_params) and the emission site (_ksort_to_kore).
- konvert/_kast_to_kore: _ksort_to_kore raises a descriptive ValueError for any
member of the sentinel family (name startswith match), pointing at the design doc.
- test_konvert: cover the ordinary-sort path and sentinel rejection (bare and
uniquely-named forms).
- docs/2026-06-01-sortparam-kore-emission.md: design for the full sentinel -> sort-variable fix.
- outer: source the sentinel from prelude/k; note the Kore emission step is not yet implemented.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ological sorting Prefix existing pyk/docs markdown with their yyyy-mm-dd add date so the directory sorts chronologically, matching the convention adopted for new design docs. Both files were added 2026-04-27; pure renames with no content or reference changes (neither is wired into the Sphinx toctree or linked elsewhere). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…with KSort.contains method The _sort_contains module-level helper in outer was a pure structural query on a sort. Make it a KSort.contains method instead: it reads cleanly at the call site (prod.sort.contains(p)) and mirrors Java's Sort.contains used by the AddSortInjections algorithm this code ports. The lone caller in infer_sort_params is updated, and the unit test moves to test_inner alongside KSort. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
77a4e83 to
a35b570
Compare
ehildenb
commented
Jun 2, 2026
Comment on lines
+1029
to
+1033
| for s in subsorts_fn(actual): | ||
| if s.name == parametric.name and len(s.params) == len(parametric.params): | ||
| for p_sub, a_sub in zip(parametric.params, s.params, strict=True): | ||
| for k, vs in _match_sort_params(p_sub, a_sub, params).items(): | ||
| result.setdefault(k, []).extend(vs) |
Member
Author
There was a problem hiding this comment.
Should we break if the if s.name == ... condition is true? We are looking for the first subsort that matches the given sort name right?
Also, can we factor out the result = {} ... for p_sub, a_sub ... for k, vs ... result.setdefault(...) logic into a helper inner function defined here? Something that computes piecewise the sort parameter bindings on arguments? Then we can reuse it here and the clause above this one.
ehildenb
commented
Jun 2, 2026
Comment on lines
+1038
to
+1054
| def _resolve_sort_partial( | ||
| sort: KSort, | ||
| bindings: dict[KSort, KSort], | ||
| params: frozenset[KSort], | ||
| ) -> KSort | None: | ||
| """Substitute ``bindings`` into ``sort``, returning ``None`` if any param is unbound. | ||
|
|
||
| A param in ``params`` is considered unbound if it does not appear in ``bindings``. | ||
| """ | ||
| if sort in params: | ||
| return bindings.get(sort) | ||
| if sort.params: | ||
| resolved = [_resolve_sort_partial(p, bindings, params) for p in sort.params] | ||
| if any(r is None for r in resolved): | ||
| return None | ||
| return KSort(sort.name, tuple(resolved)) # type: ignore[arg-type] | ||
| return sort |
Member
Author
There was a problem hiding this comment.
This is only used in one place. Is it reasonable to inline it to its use site?
… sweep with fresh sort vars
Replace the hardcoded ML-predicate special-casing and single #SortParam sentinel
with a structural, context-aware algorithm mirroring Java's AddSortInjections.
add_sort_params now takes an optional expected `sort` and runs one recursive
sweep that threads the expected sort downwards (so a parametric *return* sort is
taken from its context) and resolves the remaining parameters from the argument
sorts coming back up. Each parameter is either argument-bound (bottom-up) or
return-only (top-down); #SortParam sentinels are low-priority placeholders that a
real sort always beats. A parameter left genuinely free becomes a unique fresh
sort variable (#SortParam{Q0}, {Q1}, ...), and one such variable is shared down a
constraint spine. It is asserted that a parametric return sort only stays free on
the spine — a concrete expected sort it cannot satisfy is an ill-formed term.
This removes the _ML_PRED_LABELS list, the single-sentinel limitation, and the
NotImplementedError guard. kast_to_kore is left calling add_sort_params without
an expected sort for now (opt-in), so pipeline emission is unchanged pending
integration validation; _ksort_to_kore still rejects every #SortParam{Qn}.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…tial; test least_upper_bound directly The add_sort_params rewrite inlined per-node inference into its recursive sweep, leaving infer_sort_params (and its only helper _resolve_sort_partial) with no callers. Remove both so the module presents a single sort-inference algorithm. Replace the infer_sort_params tests with a direct least_upper_bound test (the set-based, order-independent LUB those tests had been exercising) and add an add_sort_params case asserting an argument-bound parameter ignores a conflicting expected sort. Drop the now-unused baz/f2/f3 fixtures. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The K frontend often emits parametric
KLabels with their sort parameters left off. This reworks pyk'sadd_sort_paramsinto a context-aware algorithm mirroring the Java frontend'sAddSortInjections: a single recursive sweep threads an expected sort downwards (so a parametric return sort is taken from its context) and resolves the remaining parameters from the argument sorts coming back up.Changes:
add_sort_paramsnow takes an optional expectedsortand resolves each parameter either bottom-up (argument-bound) or top-down (return-only); a parameter left genuinely free becomes a unique fresh sort variable (#SortParam{Q0},{Q1}, …), with one such variable shared down a constraint spine. It asserts that a parametric return sort only stays free on the spine.KDefinition.least_upper_bound: a set-based, order-independent least upper bound over the subsort lattice (resolves siblings that share a common supersort).KSort.contains: a structural "does this sort mention that sort" query (mirrors JavaSort.contains), replacing a module-level helper.sortandresolve_sortshandleKAsand unfilledKApply, with recursive substitution into nested sort params.#SortParamsentinel is not yet emittable to Kore:_ksort_to_korerejects it with a clear error rather than producing invalid Kore, andkast_to_koreis left callingadd_sort_paramswithout an expected sort for now (opt-in), so pipeline emission is unchanged. Full design inpyk/docs/2026-06-01-sortparam-kore-emission.md.pyk/docsmarkdown filenames for chronological sorting.Validation:
add_sort_params(bottom-up, top-down expected-sort resolution, spine sharing, the spine-violation assertion, conflict and unsortable-argument handling), plusleast_upper_bound,KSort.contains, and_ksort_to_koresentinel rejection. Full pyk unit suite green;make -C pyk checkclean.regression-newsuites (the machine's K install is pinned for a benchmark) — relying on CI.