chore: changes cost & latency optimization to post-process by andrewklatzke · Pull Request #198 · launchdarkly/python-server-sdk-ai

andrewklatzke · 2026-06-03T18:15:46Z

Requirements

I have added test coverage for new or changed functionality
I have followed the repository's pull request submission guidelines
I have validated my changes against all supported platform versions

Describe the solution you've provided

Moves cost and latency optimization into a post-processing pass instead of attempting to optimize for everything in each loop.

This reduces the amount of noise the LLM has to deal with in a single loop. The flow is now: optimize for quality -> validate with additional samples -> optimize for meta (latency, cost).
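As a rough illustration of that flow, here is a minimal sketch. It is not the code in this PR: `Candidate`, `run_optimization`, and the three callables are hypothetical names, and the only behavior taken from the description is the ordering (quality, then validation, then meta) and the fallback to the quality winner when the meta pass finds nothing better.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for an optimized variation."""
    instructions: str
    model: str


def run_optimization(
    optimize_quality: Callable[[], Optional[Candidate]],
    validate: Callable[[Candidate], bool],
    optimize_meta: Callable[[Candidate], Optional[Candidate]],
    meta_enabled: bool,
) -> Optional[Candidate]:
    # Phase 1: chase quality only; no latency or cost gates here.
    winner = optimize_quality()
    if winner is None or not validate(winner):
        return None
    # Phase 2 is optional and only runs after Phase 1 has succeeded.
    if not meta_enabled:
        return winner
    tuned = optimize_meta(winner)
    # Fall back to the Phase 1 winner if no meta candidate is acceptable.
    return tuned if tuned is not None else winner
```

Keeping the meta pass behind a successful quality pass means a latency or cost miss cannot fail a run that already met its quality bar.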

Describe alternatives you've considered

The ultimate goal here is to move to distinct scorers/criteria that can be ranked. For now, this is a better solution than the all-in-one passes we were doing previously, which could regress.

Note

Medium Risk
Changes when optimizations pass or fail, which model/parameters are committed, and callback timing; behavioral regressions are possible despite extensive test updates.

Overview
Cost and latency are no longer mixed into the main optimization loop. Phase 1 only chases judge/validation quality; duration and cost gates are removed from standard turns, validation, and ground-truth samples. When latency or token optimization is enabled and Phase 1 succeeds, _run_cost_latency_phase runs with instructions frozen, reuses the winner’s input/variables, evaluates each distinct model_choices entry, applies latency/cost gates there, and picks the best passing candidate via normalized duration + cost vs baseline.
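The selection step at the end of that paragraph could look roughly like the sketch below. The exact formula and gate semantics are not given here, so this assumes the score is the candidate's duration and cost each divided by the baseline's and then summed (lower is better), and that the gates are simple upper bounds. `Measurement`, `normalized_score`, `pick_best`, and the gate parameters are hypothetical names, not the SDK's API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Measurement:
    """Hypothetical per-model telemetry from the cost/latency phase."""
    model: str
    duration_ms: float
    cost_usd: float


def normalized_score(m: Measurement, baseline: Measurement) -> float:
    # Assumed formula: each axis is expressed relative to the baseline,
    # so a score of 2.0 means "as slow and as expensive as the baseline".
    if baseline.duration_ms <= 0 or baseline.cost_usd <= 0:
        raise ValueError("baseline duration and cost must be positive")
    return m.duration_ms / baseline.duration_ms + m.cost_usd / baseline.cost_usd


def pick_best(
    candidates: Sequence[Measurement],
    baseline: Measurement,
    max_duration_ms: Optional[float] = None,
    max_cost_usd: Optional[float] = None,
) -> Optional[Measurement]:
    # Apply the (assumed) gates first, then rank the survivors.
    passing = [
        m for m in candidates
        if (max_duration_ms is None or m.duration_ms <= max_duration_ms)
        and (max_cost_usd is None or m.cost_usd <= max_cost_usd)
    ]
    if not passing:
        return None  # caller falls back to the Phase 1 winner
    return min(passing, key=lambda m: normalized_score(m, baseline))
```

Normalizing against the baseline keeps milliseconds and dollars on a comparable scale, so neither axis dominates purely because of its units.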

Prompting and variation generation split by phase: build_new_variation_prompt no longer takes cost/latency flags; Phase 2 uses new build_token_latency_variation_prompt (content lock, model/param-only changes). LLM instruction edits in Phase 2 are reverted if they drift from the frozen winner. Judge prompts inject latency/cost guidance only while _in_cost_latency_phase.
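The content lock described above can be pictured with a small sketch. It assumes a variation is just instructions plus a model and parameters; `Variation` and `enforce_content_lock` are hypothetical names used for illustration and do not appear in the PR.

```python
from dataclasses import dataclass, field, replace
from typing import Any, Dict


@dataclass(frozen=True)
class Variation:
    """Hypothetical shape of a generated variation."""
    instructions: str
    model: str
    parameters: Dict[str, Any] = field(default_factory=dict)


def enforce_content_lock(proposed: Variation, frozen_winner: Variation) -> Variation:
    # Phase 2 may change the model and parameters only. If the LLM also
    # edited the instructions, restore the frozen Phase 1 text and keep
    # the proposed model/parameter changes.
    if proposed.instructions == frozen_winner.instructions:
        return proposed
    return replace(proposed, instructions=frozen_winner.instructions)
```

Reverting the drift rather than rejecting the whole proposal keeps a useful model or parameter suggestion even when the LLM ignored the lock.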

Run lifecycle and API surface: on_passing_result fires once with the true final context (Phase 2 winner or Phase 1 fallback); _handle_success can suppress that callback during intermediate success. Every agent turn adds a _meta score entry for raw latency/cost telemetry. auto_commit now persists parameters on the created variation. Tests were updated so Phase 1 success no longer depends on duration gates.
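One way to read the callback change is sketched below. The real `_handle_success` signature is not shown in this summary, so `handle_success`, its `suppress_callback` flag, and `finish_run` are hypothetical; the sketch only shows the intermediate Phase 1 success staying silent and the callback firing once with the final context.

```python
from typing import Callable, Dict, Optional

Context = Dict[str, object]


def handle_success(
    context: Context,
    on_passing_result: Callable[[Context], None],
    suppress_callback: bool = False,
) -> Context:
    # Intermediate successes can record state without notifying the caller.
    if not suppress_callback:
        on_passing_result(context)
    return context


def finish_run(
    phase1_context: Context,
    phase2_context: Optional[Context],
    on_passing_result: Callable[[Context], None],
) -> Context:
    # Phase 1 success is intermediate, so the callback is suppressed.
    handle_success(phase1_context, on_passing_result, suppress_callback=True)
    # The final context is the Phase 2 winner, or the Phase 1 fallback.
    final = phase2_context if phase2_context is not None else phase1_context
    return handle_success(final, on_passing_result)
```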

Reviewed by Cursor Bugbot for commit 4eb0bb0. Bugbot is set up for automated code reviews on this repo.

cursor

Cursor Bugbot has reviewed your changes using default effort and found 2 potential issues.

Reviewed by Cursor Bugbot for commit aa0a77f.

changes cost & latency optimization to post-process

af4ec5d

andrewklatzke requested a review from a team as a code owner June 3, 2026 18:15

cursor Bot reviewed Jun 3, 2026

Comment thread packages/optimization/src/ldai_optimizer/client.py

Comment thread packages/optimization/src/ldai_optimizer/client.py

Comment thread packages/optimization/src/ldai_optimizer/client.py Outdated

cursor feedback

de5f24f

cursor Bot reviewed Jun 3, 2026

Comment thread packages/optimization/src/ldai_optimizer/client.py

Comment thread packages/optimization/src/ldai_optimizer/client.py

Comment thread packages/optimization/src/ldai_optimizer/client.py

more cursor feedback

aa0a77f

cursor Bot reviewed Jun 3, 2026

Comment thread packages/optimization/src/ldai_optimizer/client.py

Comment thread packages/optimization/src/ldai_optimizer/client.py

fix: ensure cost data is persisted

4eb0bb0

andrewklatzke wants to merge 4 commits into aklatzke/AIC-2599/sdk-field-additions from aklatzke/AIC-2628/rearch-cost-latency-opt
