chore: changes cost & latency optimization to post-process #198

Open
andrewklatzke wants to merge 4 commits into
aklatzke/AIC-2599/sdk-field-additions from
aklatzke/AIC-2628/rearch-cost-latency-opt

Contributor

@andrewklatzke andrewklatzke commented Jun 3, 2026

Requirements

  • I have added test coverage for new or changed functionality
  • I have followed the repository's pull request submission guidelines
  • I have validated my changes against all supported platform versions

Describe the solution you've provided

Moves cost and latency optimization into a post-process pass instead of trying to optimize for everything in each loop iteration.

This reduces the amount of noise the LLM has to deal with in a single loop. The flow is now: optimize for quality -> validate with additional samples -> optimize for meta (latency, cost).
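
As a rough illustration of that flow (not the code in this PR), the three stages can be sketched as a small pipeline. `run_pipeline` and its three callables are hypothetical names, and treating a failed validation as a hard stop is a simplifying assumption; the real loop may retry instead.

```python
from typing import Callable, Optional


def run_pipeline(
    optimize_quality: Callable[[], Optional[dict]],
    validate: Callable[[dict], bool],
    optimize_meta: Callable[[dict], Optional[dict]],
) -> Optional[dict]:
    """Hypothetical sketch: quality -> validation -> meta (latency, cost)."""
    # Stage 1: chase quality only; no latency or cost gates here.
    winner = optimize_quality()
    if winner is None:
        return None

    # Stage 2: confirm the winner on additional samples.
    # Assumption: a failed validation simply ends the run in this sketch.
    if not validate(winner):
        return None

    # Stage 3: meta pass. If it finds nothing better, keep the quality winner.
    improved = optimize_meta(winner)
    return improved if improved is not None else winner
```

Because the meta stage only ever sees a validated winner, a poor latency or cost result cannot undo a quality pass in this sketch; the worst case is falling back to the stage 1 result.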

Describe alternatives you've considered

The ultimate goal here is to move to distinct scorers/criteria that can be ranked. For now, this is a better solution than the all-in-one passes we were doing previously, which could regress.


Note

Medium Risk
Changes when optimizations pass/fail, which model/parameters are committed, and callback timing—behavioral regressions are possible despite extensive test updates.

Overview
Cost and latency are no longer mixed into the main optimization loop. Phase 1 only chases judge/validation quality; duration and cost gates are removed from standard turns, validation, and ground-truth samples. When latency or token optimization is enabled and Phase 1 succeeds, _run_cost_latency_phase runs with instructions frozen, reuses the winner’s input/variables, evaluates each distinct model_choices entry, applies latency/cost gates there, and picks the best passing candidate via normalized duration + cost vs baseline.
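
One way to read "normalized duration + cost vs baseline" is sketched below. This is an assumption about the scoring, not the code in `_run_cost_latency_phase`: the `Candidate` type, `combined_score`, `pick_best`, the field names, and the equal weighting of the two ratios are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one evaluated model_choices entry."""
    model: str
    duration_ms: float
    cost_usd: float
    passed_gates: bool  # result of the latency/cost gates applied in Phase 2


def combined_score(candidate: Candidate, baseline: Candidate) -> float:
    """Sum of duration and cost, each expressed as a ratio to the baseline."""
    if baseline.duration_ms <= 0 or baseline.cost_usd <= 0:
        raise ValueError("baseline duration and cost must be positive")
    return (candidate.duration_ms / baseline.duration_ms
            + candidate.cost_usd / baseline.cost_usd)


def pick_best(candidates: Sequence[Candidate], baseline: Candidate) -> Optional[Candidate]:
    """Return the passing candidate with the lowest combined score, if any."""
    passing = [c for c in candidates if c.passed_gates]
    if not passing:
        return None  # caller falls back to the Phase 1 winner
    return min(passing, key=lambda c: combined_score(c, baseline))
```

Dividing by the baseline puts milliseconds and dollars on the same unitless scale, so neither metric dominates purely because of its units. A candidate that fails a gate is never selected, however good its score would be.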

Prompting and variation generation split by phase: build_new_variation_prompt no longer takes cost/latency flags; Phase 2 uses new build_token_latency_variation_prompt (content lock, model/param-only changes). LLM instruction edits in Phase 2 are reverted if they drift from the frozen winner. Judge prompts inject latency/cost guidance only while _in_cost_latency_phase.
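
The Phase 2 content lock can be pictured as a guard applied to each LLM-proposed variation. The sketch below is illustrative only: `enforce_frozen_instructions` and the dict keys `instructions`, `model`, and `parameters` are assumed names, not the SDK's actual data model.

```python
def enforce_frozen_instructions(proposed: dict, frozen_instructions: str) -> dict:
    """Hypothetical guard: keep model/parameter changes, revert instruction edits."""
    result = dict(proposed)  # shallow copy so the caller's dict is not mutated
    if result.get("instructions") != frozen_instructions:
        # The LLM drifted from the Phase 1 winner; restore the frozen text.
        result["instructions"] = frozen_instructions
    return result
```

With this shape, a proposal such as `{"instructions": "reworded", "model": "small", "parameters": {"temperature": 0.2}}` would keep its model and parameters but come back with the frozen winner's instructions.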

Run lifecycle and API surface: on_passing_result fires once with the true final context (Phase 2 winner or Phase 1 fallback); _handle_success can suppress that callback during intermediate success. Every agent turn adds a _meta score entry for raw latency/cost telemetry. auto_commit now persists parameters on the created variation. Tests were updated so Phase 1 success no longer depends on duration gates.
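
The "fires once with the true final context" behavior can be sketched as follows. This is a simplified stand-in, not the client's implementation: `RunSketch`, `finish`, and the `suppress_callback` parameter are hypothetical, and the real `_handle_success` signature may differ.

```python
from typing import Callable, Optional


class RunSketch:
    """Hypothetical model of the callback timing described above."""

    def __init__(self, on_passing_result: Callable[[dict], None],
                 meta_phase_enabled: bool) -> None:
        self.on_passing_result = on_passing_result
        self.meta_phase_enabled = meta_phase_enabled

    def handle_success(self, context: dict, suppress_callback: bool = False) -> dict:
        # Intermediate successes are recorded without notifying the caller.
        if not suppress_callback:
            self.on_passing_result(context)
        return context

    def finish(self, phase1_winner: dict, phase2_winner: Optional[dict]) -> dict:
        if not self.meta_phase_enabled:
            return self.handle_success(phase1_winner)
        # Phase 1 passed, but Phase 2 may still replace the winner: stay silent.
        self.handle_success(phase1_winner, suppress_callback=True)
        # Phase 2 winner if there is one, otherwise the Phase 1 fallback.
        final = phase2_winner if phase2_winner is not None else phase1_winner
        return self.handle_success(final)
```

In every path through `finish` the callback runs exactly once, so a consumer never observes a Phase 1 result that is later superseded.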

Reviewed by Cursor Bugbot for commit 4eb0bb0.

@andrewklatzke andrewklatzke requested a review from a team as a code owner June 3, 2026 18:15
Comment threads on packages/optimization/src/ldai_optimizer/client.py: 6 (1 outdated)

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes using default effort and found 2 potential issues.

Reviewed by Cursor Bugbot for commit aa0a77f.

Comment thread packages/optimization/src/ldai_optimizer/client.py