refactor: decouple TelemetryClient from command_run logic#1246
refactor: decouple TelemetryClient from command_run logic#1246Hweinstock wants to merge 12 commits into
Conversation
TelemetryClient is now a generic metric emitter (emit, flush, shutdown). withCommandRun + recordCommandRun moved to cli-command-run.ts as standalone functions. All sinks accept metricName as first arg to record().
- help: migrated to withCommandRunTelemetry - dev browser mode: migrated to direct client.emit() - withCommandRun and CANCELLED are now module-private - Tests rewritten to go through withCommandRunTelemetry
…→deploy_mode - Remove `& Record<string, unknown>` from emit() — typos in attrs are now compile errors - MetricSink.record() uses MetricName instead of string - Remove unused instrument field from registry (YAGNI) - Rename deploy `mode` attr to `deploy_mode` to avoid collision with session Mode - Registry schema includes all per-command attrs as optional for full type coverage
…ommandGroup enum - Optional fields in registry derived automatically from COMMAND_SCHEMAS (eliminates 55 lines) - CommandGroup reverted to type-level derivation from Command (zero maintenance) - Removed CommandGroup z.enum from common-shapes.ts
…ependency - Derive Command from keyof COMMAND_SCHEMAS (zero maintenance, as before) - Remove Command z.enum from common-shapes.ts (~50 lines) - Remove mode from ATTRIBUTES (resource attr, not metric attr) - Remove unused barrel re-exports (CompositeSink, classifyError, isUserError, CommandGroup) - Revert deploy/utils.ts to simple literal union (no zod dependency)
Package TarballHow to installnpm install https://github.com/aws/agentcore-cli/releases/download/pr-1246-tarball/aws-agentcore-0.13.1.tgz |
agentcore-cli-automation
left a comment
There was a problem hiding this comment.
Nice cleanup overall — splitting the generic emitter from the command-run logic is a good direction, and resilientParse is a reasonable improvement over silently dropping the whole metric.
A few issues I think should be addressed before merging:
1. Same callback can either throw or return a failure Result depending on whether telemetry is initialized (src/cli/telemetry/cli-command-run.ts:86 and 100–107)
When client is unavailable, withCommandRunTelemetry returns fn() directly — if fn() throws, the rejection propagates. When client is available, a thrown exception is caught and converted to { success: false, error }, so the caller never sees the throw.
That means the same callback has two different observable behaviors depending on whether telemetry happens to be initialized:
- Without telemetry:
await withCommandRunTelemetry(...)rejects on throw. - With telemetry:
await withCommandRunTelemetry(...)resolves with a failureResult.
This is a footgun for callers (including the void withCommandRunTelemetry(...) calls in useDeployFlow.ts — those would emit unhandled rejections only when telemetry is disabled).
Options to fix:
- Make the no-client branch also catch and convert:
if (!client) { try { return await fn(); } catch (e) { return { success: false, error: ... } as R; } }. - Make the client branch propagate the throw instead of converting (and let call sites that expect a
Resultuse the existingif (!result.success) throw result.error;pattern). This also more naturally aligns withrunCliCommand, which re-throws.
2. Docstring on withCommandRunTelemetry contradicts the implementation (src/cli/telemetry/cli-command-run.ts:77-78)
The doc says: "If the callback throws, telemetry is recorded and the exception propagates." But lines 100–107 explicitly catch the exception and return a failure Result instead of re-throwing — and the test 'records failure and returns error result when callback throws' confirms the conversion is intentional. Whichever direction is taken to resolve issue #1, please update the docstring to match.
3. Stale comment on standardize() (src/cli/telemetry/schemas/common-shapes.ts:31-34, unchanged in this PR but inaccurate after the changes here)
It says invalid values "pass through to recordCommandRun, where COMMAND_SCHEMAS[command].parse(attrs) validates the full attr object in a try/catch — silently dropping the metric if any field is invalid." That's no longer true: recordCommandRun now uses resilientParse, which defaults invalid fields to 'unknown' rather than dropping the whole metric. This matters for anyone trying to understand whether passing an unvalidated string into standardize is safe.
Description
This PR addresses a few friction points for adding new metrics and attributes and cleans up the implementation to how its being used:
Adding a new metric after this PR
common-shapes.ts+ATTRIBUTESMETRICSinregistry.tswithMetricAttrsbranch for strong typing on emit.client.emit()or build higher-level utilities such aswithCommandRunTelemetry.Follow-Up
Now that attributes are global, we may want to prefix them to avoid ambiguity. This will be done as a follow-up.
Related Issue
Closes #
Documentation PR
N/A
Type of Change
Testing
How have you tested the change?
npm run test:unitandnpm run test:integnpm run typechecknpm run lintsrc/assets/, I rannpm run test:update-snapshotsand committed the updated snapshotsChecklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the
terms of your choice.