refactor: decouple TelemetryClient from command_run logic by Hweinstock · Pull Request #1246 · aws/agentcore-cli

Hweinstock · 2026-05-14T13:53:42Z

Description

This PR addresses a few friction points for adding new metrics and attributes and cleans up the implementation to how its being used:

TelemetryClient is now generic and does not contain command_run specific logic.
Attributes are now global, such that all metric attribute names must be unique across metrics.
Remove unused cancellation logic.

Adding a new metric after this PR

Add attribute schemas to common-shapes.ts + ATTRIBUTES
Add entry to METRICS in registry.ts with MetricAttrs branch for strong typing on emit.
Call client.emit() or build higher-level utilities such as withCommandRunTelemetry.

Follow-Up

Now that attributes are global, we may want to prefix them to avoid ambiguity. This will be done as a follow-up.

Related Issue

Closes #

Documentation PR

N/A

Type of Change

Testing

How have you tested the change?

I ran npm run test:unit and npm run test:integ
I ran npm run typecheck
I ran npm run lint
If I modified src/assets/, I ran npm run test:update-snapshots and committed the updated snapshots

Checklist

I have read the CONTRIBUTING document
I have added any necessary tests that prove my fix is effective or my feature works
I have updated the documentation accordingly
I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
My changes generate no new warnings
Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the
terms of your choice.

TelemetryClient is now a generic metric emitter (emit, flush, shutdown). withCommandRun + recordCommandRun moved to cli-command-run.ts as standalone functions. All sinks accept metricName as first arg to record().

- help: migrated to withCommandRunTelemetry - dev browser mode: migrated to direct client.emit() - withCommandRun and CANCELLED are now module-private - Tests rewritten to go through withCommandRunTelemetry

…→deploy_mode - Remove `& Record<string, unknown>` from emit() — typos in attrs are now compile errors - MetricSink.record() uses MetricName instead of string - Remove unused instrument field from registry (YAGNI) - Rename deploy `mode` attr to `deploy_mode` to avoid collision with session Mode - Registry schema includes all per-command attrs as optional for full type coverage

…ommandGroup enum - Optional fields in registry derived automatically from COMMAND_SCHEMAS (eliminates 55 lines) - CommandGroup reverted to type-level derivation from Command (zero maintenance) - Removed CommandGroup z.enum from common-shapes.ts

…ependency - Derive Command from keyof COMMAND_SCHEMAS (zero maintenance, as before) - Remove Command z.enum from common-shapes.ts (~50 lines) - Remove mode from ATTRIBUTES (resource attr, not metric attr) - Remove unused barrel re-exports (CompositeSink, classifyError, isUserError, CommandGroup) - Revert deploy/utils.ts to simple literal union (no zod dependency)

…vents

github-actions · 2026-05-14T13:54:47Z

Package Tarball

aws-agentcore-0.13.1.tgz

How to install

npm install https://github.com/aws/agentcore-cli/releases/download/pr-1246-tarball/aws-agentcore-0.13.1.tgz

agentcore-cli-automation

Nice cleanup overall — splitting the generic emitter from the command-run logic is a good direction, and resilientParse is a reasonable improvement over silently dropping the whole metric.

A few issues I think should be addressed before merging:

1. Same callback can either throw or return a failure Result depending on whether telemetry is initialized (src/cli/telemetry/cli-command-run.ts:86 and 100–107)

When client is unavailable, withCommandRunTelemetry returns fn() directly — if fn() throws, the rejection propagates. When client is available, a thrown exception is caught and converted to { success: false, error }, so the caller never sees the throw.

That means the same callback has two different observable behaviors depending on whether telemetry happens to be initialized:

Without telemetry: await withCommandRunTelemetry(...) rejects on throw.
With telemetry: await withCommandRunTelemetry(...) resolves with a failure Result.

This is a footgun for callers (including the void withCommandRunTelemetry(...) calls in useDeployFlow.ts — those would emit unhandled rejections only when telemetry is disabled).

Options to fix:

Make the no-client branch also catch and convert: if (!client) { try { return await fn(); } catch (e) { return { success: false, error: ... } as R; } }.
Make the client branch propagate the throw instead of converting (and let call sites that expect a Result use the existing if (!result.success) throw result.error; pattern). This also more naturally aligns with runCliCommand, which re-throws.

2. Docstring on withCommandRunTelemetry contradicts the implementation (src/cli/telemetry/cli-command-run.ts:77-78)

The doc says: "If the callback throws, telemetry is recorded and the exception propagates." But lines 100–107 explicitly catch the exception and return a failure Result instead of re-throwing — and the test 'records failure and returns error result when callback throws' confirms the conversion is intentional. Whichever direction is taken to resolve issue #1, please update the docstring to match.

3. Stale comment on standardize() (src/cli/telemetry/schemas/common-shapes.ts:31-34, unchanged in this PR but inaccurate after the changes here)

It says invalid values "pass through to recordCommandRun, where COMMAND_SCHEMAS[command].parse(attrs) validates the full attr object in a try/catch — silently dropping the metric if any field is invalid." That's no longer true: recordCommandRun now uses resilientParse, which defaults invalid fields to 'unknown' rather than dropping the whole metric. This matters for anyone trying to understand whether passing an unvalidated string into standardize is safe.

Hweinstock added 12 commits May 14, 2026 03:00

refactor: decouple TelemetryClient from command_run logic

5da7e4b

TelemetryClient is now a generic metric emitter (emit, flush, shutdown). withCommandRun + recordCommandRun moved to cli-command-run.ts as standalone functions. All sinks accept metricName as first arg to record().

refactor: make withCommandRun internal, migrate consumers

ffbe31f

- help: migrated to withCommandRunTelemetry - dev browser mode: migrated to direct client.emit() - withCommandRun and CANCELLED are now module-private - Tests rewritten to go through withCommandRunTelemetry

feat: add typed metric registry, MetricName enforced at compile time

992ad6d

docs: update telemetry README with new metric and new command guides

7a1ded1

fix: restore try/catch in recordCommandRun to prevent telemetry crashes

50e781d

fix: remove dead CancelResult schema — no code path produces cancel e…

ab570dd

…vents

fix: rename stale mode→deploy_mode in useDeployFlow diff path

1021dc3

test: add coverage for callback-throws path in withCommandRunTelemetry

36f3c36

docs: rewrite telemetry README to match implementation

bed69e4

Hweinstock temporarily deployed to e2e-testing May 14, 2026 13:53 — with GitHub Actions Inactive

github-actions Bot added size/l PR size: L agentcore-harness-reviewing AgentCore Harness review in progress labels May 14, 2026

github-actions Bot removed the agentcore-harness-reviewing AgentCore Harness review in progress label May 14, 2026

agentcore-cli-automation suggested changes May 14, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor: decouple TelemetryClient from command_run logic#1246

refactor: decouple TelemetryClient from command_run logic#1246
Hweinstock wants to merge 12 commits into
aws:mainfrom
Hweinstock:decouple-telemetry

Hweinstock commented May 14, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 14, 2026

Uh oh!

agentcore-cli-automation left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Hweinstock commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Adding a new metric after this PR

Follow-Up

Related Issue

Documentation PR

Type of Change

Testing

Checklist

Uh oh!

github-actions Bot commented May 14, 2026

Package Tarball

How to install

Uh oh!

agentcore-cli-automation left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Hweinstock commented May 14, 2026 •

edited

Loading