feat(agent-core): rework compaction to keep only user prompts and summary #1192
Open
7Sageer wants to merge 18 commits into
…mary
Compact the whole history, keeping only real user prompts within a 20k token budget followed by a user-role summary prefixed with SUMMARY_PREFIX. Replace the compaction prompt with SUMMARIZATION_PROMPT, trigger auto-compaction at 90% of the context window, and drop assistant/tool messages and deferred injections on compaction.
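The shape this commit describes (kept user prompts within a budget, then one user-role summary) can be sketched roughly as follows. This is an illustration only: the `Message` type, the `isRealUserInput` flag, the four-characters-per-token estimator, the placeholder prefix value, and the newest-first budget walk are all assumptions, not code taken from the PR.

```typescript
// Hypothetical shapes; the real agent-core types are not shown on this page.
type Role = "user" | "assistant" | "tool";
interface Message {
  role: Role;
  text: string;
  isRealUserInput?: boolean; // assumed flag marking prompts typed by the user
}

const SUMMARY_PREFIX = "[summary] "; // placeholder value, not the real constant
const KEPT_USER_TOKEN_BUDGET = 20_000;

// Crude stand-in estimator: roughly four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function compact(
  history: Message[],
  summary: string,
  budget: number = KEPT_USER_TOKEN_BUDGET,
): Message[] {
  const kept: Message[] = [];
  let used = 0;
  // Walk newest to oldest so the most recent prompts win the budget (assumption).
  for (const m of [...history].reverse()) {
    // Assistant and tool messages are dropped, as are non-real user messages.
    if (m.role !== "user" || !m.isRealUserInput) continue;
    const cost = estimateTokens(m.text);
    if (used + cost > budget) break;
    used += cost;
    kept.unshift({ role: "user", text: m.text, isRealUserInput: true });
  }
  // The summary is itself a user-role message, marked by the prefix.
  return [...kept, { role: "user", text: SUMMARY_PREFIX + summary }];
}
```

With a generous budget every real user prompt survives in its original order; with a tight budget the oldest prompts are the first to go.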
🦋 Changeset detected
Latest commit: 4baee98
The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
# Conflicts:
# packages/agent-core/src/agent/compaction/full.ts
# packages/agent-core/src/agent/context/index.ts
# packages/agent-core/test/agent/compaction/full.test.ts
- Revert auto-compaction trigger/block ratio to 0.85
- Rewrite truncateTextToTokens as a single-pass O(n) scan so CJK inputs do not freeze compaction
- Mirror pending tool-exchange and deferred cleanup in the wire transcript reducer
- Append the todo list to the compaction summary again
- Restore the no-tools guard in the compaction prompt
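A single-pass truncation of the kind mentioned above might look like the sketch below. The per-character cost table (one token per CJK-range character, a quarter token otherwise) is an assumption made for illustration; the PR page does not show the real estimator, and only the function name comes from the commit message.

```typescript
// Assumed per-character cost: characters in a few CJK ranges count as one
// token, everything else as a quarter token. The real estimator may differ.
function charCost(codePoint: number): number {
  const isCjk =
    (codePoint >= 0x3040 && codePoint <= 0x30ff) || // Hiragana, Katakana
    (codePoint >= 0x4e00 && codePoint <= 0x9fff) || // CJK Unified Ideographs
    (codePoint >= 0xac00 && codePoint <= 0xd7af); // Hangul syllables
  return isCjk ? 1 : 0.25;
}

function truncateTextToTokens(text: string, maxTokens: number): string {
  let used = 0;
  let i = 0; // UTF-16 index just past the last character that fits
  while (i < text.length) {
    const cp = text.codePointAt(i) ?? 0;
    const width = cp > 0xffff ? 2 : 1; // never split a surrogate pair
    const cost = charCost(cp);
    if (used + cost > maxTokens) break;
    used += cost;
    i += width;
  }
  // One slice at the end; each character is visited exactly once.
  return text.slice(0, i);
}
```

The running total is updated per character and the string is sliced once, so the work grows linearly with input length regardless of script.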
Re-render the cached system prompt with fresh runtime context (cwd listing, AGENTS.md, additional-dirs info, skill list) once compaction finishes, so post-compaction turns do not keep the bootstrap snapshot. Cache the active profile on the Agent and expose refreshSystemPrompt(); FullCompaction invokes it after applyCompaction. This intentionally invalidates the prompt-cache prefix.
- Record keptUserMessageCount on the wire so transcript replay reproduces the live folded length after truncation.
- Flush steered messages after compaction so notifications land in the post-compaction context instead of being dropped.
- Unify real-user-input detection across context, transcript, and vis.
- Reset injector state correctly after compaction.
- Make the overflow compaction retry cap configurable.
- Sync the vis context projector to the kept-users-plus-summary shape.
applyCompaction now preserves the persisted tokensAfter and keptUserMessageCount when replaying a compaction record during resume, so restored bookkeeping matches the wire record instead of being re-derived from replayed history (which can drift when token estimation changes, and breaks replay projections that assert the recorded values). Live compaction still derives both values from the current history. Update the affected compaction, resume, and replay-range tests.
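The replay-versus-live distinction can be illustrated with a small helper. `resolveBookkeeping` and the `Bookkeeping` type are hypothetical names for this sketch; only the two field names, `tokensAfter` and `keptUserMessageCount`, come from the commit message.

```typescript
// Hypothetical types; only the two field names appear in the commit message.
interface Bookkeeping {
  tokensAfter: number;
  keptUserMessageCount: number;
}

// Prefer values persisted in the wire record (replay during resume); fall back
// to deriving them from the current history (live compaction).
function resolveBookkeeping(
  persisted: Partial<Bookkeeping> | undefined,
  derive: () => Bookkeeping,
): Bookkeeping {
  let cached: Bookkeeping | undefined;
  // Derive lazily and at most once, so replay never pays for re-estimation.
  const live = (): Bookkeeping => cached ?? (cached = derive());
  return {
    tokensAfter: persisted?.tokensAfter ?? live().tokensAfter,
    keptUserMessageCount:
      persisted?.keptUserMessageCount ?? live().keptUserMessageCount,
  };
}
```

When a record carries both values, the derive callback is never invoked, which keeps restored bookkeeping identical to the wire record even if token estimation has changed since it was written.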
Resolve conflicts in compaction telemetry: adopt the snake_case telemetry keys from main (MoonshotAI#1196) while keeping this branch's single-round compaction design that retains user messages. 'round' is hard-coded to 1 since this branch compacts in a single round; affected test assertions are updated to match the snake_case keys and this branch's token counts.
…o types
- Reuse the shared isRealUserInput helper in ContextMemory.undo and SessionService.canUndoHistory instead of two local copies.
- Sync the wire-transcript header comment with the new post-compaction shape ([...keptUserMessages, compaction_summary]).
- Tighten memento.ts types by using kosong ContentPart and widening estimateTokensForMessage to a structural subset, dropping the `as never` cast.
Make the keep/drop decision for user-role messages explicit in the compaction memento helpers and cover every PromptOrigin kind. Keep Codex-style semantics: only real user prompts and user-slash skill activations survive compaction; other user-role messages are either re-injected or ephemeral. Add parity coverage across live context, transcript, and vis projector tests.
…pic adapter
Strict Anthropic-compatible backends reject consecutive user messages with HTTP 400, so the adapter collapses them. However, a plain-text user turn and an adjacent tool-result user message carry different semantics and must stay separate. Merge plain-text with plain-text (collapsing the post-compaction run of kept prompts + user-role summary + reminders) and tool-result with tool-result (parallel-tool-use spec), but not across the two kinds.
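The merge rule described above can be sketched as follows. The `Block` and `Msg` shapes are simplified stand-ins for Anthropic-style content blocks, and `mergeConsecutiveUserMessages` is a hypothetical name; treating a message that mixes both block kinds as unmergeable is also an assumption of this sketch.

```typescript
// Simplified stand-ins for Anthropic-style content blocks.
type Block =
  | { type: "text"; text: string }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface Msg {
  role: "user" | "assistant";
  content: Block[];
}

type UserKind = "text" | "tool_result" | "mixed";

function userKind(m: Msg): UserKind {
  const hasText = m.content.some((b) => b.type === "text");
  const hasTool = m.content.some((b) => b.type === "tool_result");
  if (hasText && hasTool) return "mixed"; // assumption: never merge mixed messages
  return hasTool ? "tool_result" : "text";
}

function mergeConsecutiveUserMessages(messages: Msg[]): Msg[] {
  const out: Msg[] = [];
  for (const m of messages) {
    const prev = out[out.length - 1];
    const kind = userKind(m);
    const sameKindUserRun =
      prev !== undefined &&
      prev.role === "user" &&
      m.role === "user" &&
      kind !== "mixed" &&
      userKind(prev) === kind;
    if (prev !== undefined && sameKindUserRun) {
      prev.content.push(...m.content); // text joins text, tool_result joins tool_result
    } else {
      out.push({ role: m.role, content: [...m.content] }); // copy; inputs stay untouched
    }
  }
  return out;
}
```

A run of kept prompts followed by the user-role summary collapses into one user message, while a plain-text turn next to a tool-result message is left as two separate messages.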
Related Issue
No prior issue. This reworks how conversation compaction rebuilds the model context.
Problem
The previous compaction strategy kept a mixed history (user prompts, assistant messages, tool calls, and tool results) and could layer multiple compaction summaries over time. This made the post-compaction context hard to reason about, wasted tokens on tool exchanges that no longer matter, and produced inconsistent results between the live context rewrite and the transcript reducer.
What changed
- … COMPACTION_SUMMARY_PREFIX.
- … (memento.ts) between the live context rewrite and the transcript reducer so both apply the exact same rule.
- … blockRatio equal to triggerRatio so compaction runs synchronously.
- … memento.test.ts and update compaction, context, and transcript tests to cover the new behavior.
Checklist
- … gen-changesets skill, or this PR needs no changeset.
- … gen-docs skill, or this PR needs no doc update.