Skip to content

LCORE-2655: OpenTelemetry feature proposal#1981

Open
asimurka wants to merge 1 commit into
lightspeed-core:mainfrom
asimurka:opentelemetry_spike
Open

LCORE-2655: OpenTelemetry feature proposal#1981
asimurka wants to merge 1 commit into
lightspeed-core:mainfrom
asimurka:opentelemetry_spike

Conversation

@asimurka

@asimurka asimurka commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Description

Refined OpenTelemetry feature to reflect recent request updates. Split the OpenTelemetry tracing design into a spike doc and a feature design doc (chosen approach: OTEL_* + opentelemetry-instrument, facade spans, OTLP export), replacing the single monolithic proposal file.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement
  • Benchmarks improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: Cursor

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • Documentation
    • Added new OpenTelemetry tracing design documentation for the Lightspeed Core request path, including environment-based setup, effective OTEL_* exposure via /config (with secrets redacted), default W3C propagation behavior, and trace/span boundary guidance.
    • Added an OpenTelemetry tracing design spike with recommended initialization and propagation choices, plus span coverage and guidance on moving sampling/filtering downstream.
    • Removed the older consolidated OpenTelemetry tracing design document in favor of the new materials.

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Walkthrough

Replaces the prior OpenTelemetry design document with a spike document and a formal design document covering tracing decisions, requirements, span coverage, configuration, failure handling, and implementation guidance.

Changes

OpenTelemetry Design Documentation

Layer / File(s) Summary
Spike: terminology and context
docs/design/observability-opentelemetry/observability-opentelemetry-spike.md
Introduces the spike framing, OpenTelemetry terminology, and the current metrics-only observability context.
Spike: configuration and SDK decisions
docs/design/observability-opentelemetry/observability-opentelemetry-spike.md
Records the OTEL_* configuration model, /config exposure with redaction, SDK initialization, and inbound W3C trace context handling.
Spike: outbound boundary and filtering
docs/design/observability-opentelemetry/observability-opentelemetry-spike.md
Defines the outbound trace boundary, collector scope, and span filtering and sampling decisions.
Design: requirements and tracing boundary
docs/design/observability-opentelemetry/observability-opentelemetry-design.md
Defines the formal tracing requirements, operator controls, use cases, and the LCORE-owned trace model for inbound requests.
Design: span coverage and conventions
docs/design/observability-opentelemetry/observability-opentelemetry-design.md
Specifies export topology, downstream filtering ownership, span coverage across LCORE workflows, and trace content conventions.
Design: failure handling and rollout
docs/design/observability-opentelemetry/observability-opentelemetry-design.md
Covers tracing failure behavior, sensitive-data constraints, OTEL_* variables, migration notes, implementation insertion points, open questions, and references.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested reviewers

  • anik120
  • tisnik
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title is concise and clearly matches the documentation change around the OpenTelemetry feature proposal.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
✨ Simplify code
  • Create PR with simplified code

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@docs/design/observability-opentelemetry/observability-opentelemetry-design.md`:
- Around line 362-364: The "Open Questions" section in the design document
currently contains only a single high-level question about OTEL_* variable
inclusion in the /config scrape, which lacks specificity and leaves several
implementation concerns unaddressed. Expand this section to include more
detailed questions such as: specific OTEL_* variable names to whitelist (e.g.,
OTEL_EXPORTER_OTLP_HEADERS), the mechanism for redacting or handling sensitive
values like bearer tokens and API keys, privacy and security considerations for
the scrape endpoint, and any operational or testing concerns that must be
resolved before implementation begins. These additions will provide clearer
direction for the implementation phase.
- Around line 50-51: Expand and clarify the "New dependencies" section to
reconcile it with the implementation section (line 356). Explicitly state
whether the listed packages are only direct dependencies or the complete set
required. If using the distro approach (opentelemetry-distro), document that
this package automatically includes the SDK and default propagators via
environment variables, making separate propagator packages unnecessary, and
explicitly note that opentelemetry-instrument (the CLI used in the Containerfile
ENTRYPOINT) is provided by the distro. For OTLP export, clarify that
opentelemetry-exporter-otlp is a meta-package and explicitly document which
protocol (gRPC or HTTP) is expected to be used. Update the dependencies list to
match the actual packages needed and add explanatory notes about transitive
dependencies and the distro approach to eliminate discrepancies with the
implementation section.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2510269d-a4d9-44bf-b4d4-185811e5a7a9

📥 Commits

Reviewing files that changed from the base of the PR and between 2c52cc0 and 9c0221e.

📒 Files selected for processing (3)
  • docs/design/observability-opentelemetry/observability-opentelemetry-design.md
  • docs/design/observability-opentelemetry/observability-opentelemetry-spike.md
  • docs/design/observability-opentelemetry/observability-opentelemetry.md
💤 Files with no reviewable changes (1)
  • docs/design/observability-opentelemetry/observability-opentelemetry.md
📜 Review details
⏰ Context from checks skipped due to timeout. (14)
  • GitHub Check: unit_tests (3.12)
  • GitHub Check: Pylinter
  • GitHub Check: integration_tests (3.12)
  • GitHub Check: integration_tests (3.13)
  • GitHub Check: E2E Tests for Lightspeed Evaluation job
  • GitHub Check: E2E: library mode / ci / group 3
  • GitHub Check: E2E: library mode / ci / group 1
  • GitHub Check: E2E: server mode / ci / group 2
  • GitHub Check: E2E: server mode / ci / group 3
  • GitHub Check: E2E: server mode / ci / group 1
  • GitHub Check: E2E: library mode / ci / group 2
  • GitHub Check: build-pr
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-0-6-on-pull-request
🧰 Additional context used
🪛 LanguageTool
docs/design/observability-opentelemetry/observability-opentelemetry-spike.md

[style] ~52-~52: To elevate your writing, try using a synonym here.
Context: ...ving option set; modeling it in YAML is hard to maintain. Incompatible with `opentel...

(HARD_TO)

🪛 markdownlint-cli2 (0.22.1)
docs/design/observability-opentelemetry/observability-opentelemetry-design.md

[warning] 123-123: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (1)
docs/design/observability-opentelemetry/observability-opentelemetry-spike.md (1)

1-185: LGTM!

The spike document is well-structured and technically sound. The six architectural decisions are clearly presented with pros/cons, recommendations align with OpenTelemetry best practices, and the rationale is clear. Cross-document references and appendix links are appropriate.

Comment thread docs/design/observability-opentelemetry/observability-opentelemetry-design.md Outdated
Comment thread docs/design/observability-opentelemetry/observability-opentelemetry-design.md Outdated
@asimurka asimurka marked this pull request as draft June 26, 2026 06:21
@asimurka asimurka force-pushed the opentelemetry_spike branch from 9c0221e to 2c03714 Compare June 30, 2026 07:58
@asimurka asimurka marked this pull request as ready for review June 30, 2026 08:09

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

♻️ Duplicate comments (2)
docs/design/observability-opentelemetry/observability-opentelemetry-design.md (2)

378-380: 📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

Expand the "Open Questions" section.

This issue remains unaddressed from previous review. The single question lacks specificity needed for implementation. Add questions covering:

  • Specific OTEL_* variable whitelist for /config scrape (e.g., should OTEL_EXPORTER_OTLP_HEADERS be included?)
  • Redaction mechanism for secrets: pattern matching on key names, value masking, or complete exclusion?
  • Privacy/security: who can access /config, does auth cover the observability.otel section?
  • Testing: how to verify /config redaction works without exposing real secrets in test output?
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@docs/design/observability-opentelemetry/observability-opentelemetry-design.md`
around lines 378 - 380, The Open Questions section is too vague and needs to be
expanded in the observability-opentelemetry design doc. Update the existing
`/config` scrape question to explicitly cover the `OTEL_*` whitelist, whether
`OTEL_EXPORTER_OTLP_HEADERS` belongs in it, how secrets should be redacted
(key-based patterns, value masking, or exclusion), who can access `/config` and
whether auth applies to the `observability.otel` section, and how redaction will
be tested without leaking real secrets. Refer to the “Open Questions” section
and the `/config` scrape discussion when revising the content.

360-364: 📐 Maintainability & Code Quality | 🟠 Major | ⚖️ Poor tradeoff

Clarify and expand the "New dependencies" section.

This issue remains unaddressed from previous review. The "New dependencies" list (3 packages) does not reconcile with the implementation section (line 372) which mentions "OTel API, SDK, OTLP exporter, FastAPI instrumentor, propagators." Explicitly document:

  • Whether these are direct dependencies only or the complete set
  • That opentelemetry-distro transitively includes the SDK and default propagators (tracecontext, baggage)
  • That opentelemetry-instrument CLI is provided by the distro package
  • Which OTLP protocol variant is expected (opentelemetry-exporter-otlp-proto-grpc or opentelemetry-exporter-otlp-proto-http)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@docs/design/observability-opentelemetry/observability-opentelemetry-design.md`
around lines 360 - 364, The “New dependencies” section is incomplete and
inconsistent with the implementation summary, so update the observability design
doc’s dependency list to explicitly state whether it is only the direct packages
or the full runtime set. Clarify in the same section that `opentelemetry-distro`
brings in the SDK and default propagators (`tracecontext`, `baggage`)
transitively, that the `opentelemetry-instrument` CLI comes from the distro
package, and specify which OTLP exporter variant is expected
(`opentelemetry-exporter-otlp-proto-grpc` or
`opentelemetry-exporter-otlp-proto-http`) using the “New dependencies” heading
and the nearby implementation language as the reference points.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@docs/design/observability-opentelemetry/observability-opentelemetry-design.md`:
- Line 114: Align the observability design wording around collector scope by
updating the design document’s “Export topology” row and any related Spike
Decision 5 references so they no longer imply collector deployment is part of
the feature implementation; clarify that collector guidance is operator-facing
recommendations only, or mark it consistently out of scope everywhere. Use the
observability-opentelemetry design table entry and the Spike Decision 5 text as
the main anchors when revising the language.
- Line 338: The `/config` env-scraping design currently says secret-bearing
variables must be redacted but does not define how; update the tracing/config
section to specify the redaction policy for environment values, including how
secrets are detected (for example via key-name patterns, value heuristics, or an
explicit denylist), what redaction placeholder is emitted, and whether the
denylist is hardcoded or configurable. Keep the guidance tied to the `/config`
scraping behavior and the tracing design text so implementation can follow it
unambiguously.
- Line 352: Clarify the secret handling guidance for the observability design
text by separating environment-based and file-based patterns: update the wording
around OTEL_* and secret mounts so it explicitly states whether sensitive values
are injected into OTEL_* at runtime from secrets, or referenced via mounted
secret files and path-based settings. Use the existing
observability-opentelemetry design section text to revise the sentence so it no
longer conflates bearer tokens, client keys, and sensitive headers with both
approaches, and keep the /config redaction guidance aligned with the chosen
model.
- Around line 195-254: The span tables in observability-opentelemetry-design.md
use “(additional work)” without defining whether those spans are planned or
required now; update the document by either moving the affected spans (for
example in `run_shield_moderation`, `build_agent`,
`summarize_file_search_result`, and the feedback section) into a separate
“Future span coverage” area, or add a legend near the tables that clearly
defines “(additional work)” and includes Jira references where available.
- Around line 54-56: The `/config` observability requirement is describing
behavior that does not yet exist in the current config endpoint implementation,
so update the design text to treat it as a new requirement rather than an
existing contract. In the `/config` section, reference the `config` endpoint
behavior explicitly and note that `src/app/endpoints/config.py` will need new
OTEL env-var scraping and redaction logic before it can return effective
settings. Keep the requirement phrased as implementation work for the endpoint,
not as something already supported.

In
`@docs/design/observability-opentelemetry/observability-opentelemetry-spike.md`:
- Line 103: Update the observability OpenTelemetry note to state the full
default propagator set, since the current wording around OTEL_PROPAGATORS is
incomplete. In the relevant guidance text, clarify that the Python SDK default
includes both tracecontext and baggage, or explicitly list the full default so
readers are not misled; keep the rest of the explanation about FastAPI
auto-instrumentation and upstream trace attachment unchanged.
- Line 66: Clarify that the `/config` endpoint behavior described in the spike
is not currently implemented: `GET /config` in `src/app/endpoints/config.py`
only returns the existing `ConfigurationResponse` from
`app.endpoints.config.configuration` and does not inspect `OTEL_*` environment
variables. Update the `observability-opentelemetry-spike.md` recommendation to
state that exposing `observability.otel` via `/config` is future work or a
planned enhancement, and distinguish it from the current implementation of
`ConfigurationResponse`/`/config`.

---

Duplicate comments:
In
`@docs/design/observability-opentelemetry/observability-opentelemetry-design.md`:
- Around line 378-380: The Open Questions section is too vague and needs to be
expanded in the observability-opentelemetry design doc. Update the existing
`/config` scrape question to explicitly cover the `OTEL_*` whitelist, whether
`OTEL_EXPORTER_OTLP_HEADERS` belongs in it, how secrets should be redacted
(key-based patterns, value masking, or exclusion), who can access `/config` and
whether auth applies to the `observability.otel` section, and how redaction will
be tested without leaking real secrets. Refer to the “Open Questions” section
and the `/config` scrape discussion when revising the content.
- Around line 360-364: The “New dependencies” section is incomplete and
inconsistent with the implementation summary, so update the observability design
doc’s dependency list to explicitly state whether it is only the direct packages
or the full runtime set. Clarify in the same section that `opentelemetry-distro`
brings in the SDK and default propagators (`tracecontext`, `baggage`)
transitively, that the `opentelemetry-instrument` CLI comes from the distro
package, and specify which OTLP exporter variant is expected
(`opentelemetry-exporter-otlp-proto-grpc` or
`opentelemetry-exporter-otlp-proto-http`) using the “New dependencies” heading
and the nearby implementation language as the reference points.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6d2ca6f1-bb9c-407e-a79e-5bf71ac8e6a6

📥 Commits

Reviewing files that changed from the base of the PR and between 9c0221e and 2c03714.

📒 Files selected for processing (3)
  • docs/design/observability-opentelemetry/observability-opentelemetry-design.md
  • docs/design/observability-opentelemetry/observability-opentelemetry-spike.md
  • docs/design/observability-opentelemetry/observability-opentelemetry.md
💤 Files with no reviewable changes (1)
  • docs/design/observability-opentelemetry/observability-opentelemetry.md
📜 Review details
⏰ Context from checks skipped due to timeout. (7)
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-0-6-on-pull-request
  • GitHub Check: E2E: library mode / ci / group 2
  • GitHub Check: E2E: server mode / ci / group 3
  • GitHub Check: E2E: library mode / ci / group 3
  • GitHub Check: E2E: server mode / ci / group 1
  • GitHub Check: E2E: server mode / ci / group 2
⚠️ CI failures not shown inline (2)

GitHub Actions: OpenAPI (Spectral) / 0_spectral.txt: LCORE-2655: OpenTelemetry feature proposal

Conclusion: failure

View job details

##[group]Run set -euo pipefail
 �[36;1mset -euo pipefail�[0m
 �[36;1muv run python scripts/generate_openapi_schema.py /tmp/openapi-generated.json�[0m
 �[36;1mif ! diff -u docs/openapi.json /tmp/openapi-generated.json; then�[0m
 �[36;1m  echo "::error::docs/openapi.json is out of date. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json"�[0m

GitHub Actions: OpenAPI (Spectral) / spectral: LCORE-2655: OpenTelemetry feature proposal

Conclusion: failure

View job details

##[group]Run set -euo pipefail
 �[36;1mset -euo pipefail�[0m
 �[36;1muv run python scripts/generate_openapi_schema.py /tmp/openapi-generated.json�[0m
 �[36;1mif ! diff -u docs/openapi.json /tmp/openapi-generated.json; then�[0m
 �[36;1m  echo "::error::docs/openapi.json is out of date. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json"�[0m
🧰 Additional context used
🪛 LanguageTool
docs/design/observability-opentelemetry/observability-opentelemetry-spike.md

[style] ~54-~54: To elevate your writing, try using a synonym here.
Context: ...ving option set; modeling it in YAML is hard to maintain. Incompatible with automati...

(HARD_TO)

🪛 markdownlint-cli2 (0.22.1)
docs/design/observability-opentelemetry/observability-opentelemetry-design.md

[warning] 130-130: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (17)
docs/design/observability-opentelemetry/observability-opentelemetry-spike.md (5)

70-89: LGTM!


118-139: LGTM!


143-153: LGTM!


156-163: LGTM!


167-174: LGTM!

docs/design/observability-opentelemetry/observability-opentelemetry-design.md (12)

306-306: Verify Containerfile entrypoint matches project structure.

The specified ENTRYPOINT assumes python3.12 and src/lightspeed_stack.py. Verify these paths match the actual Containerfile and project layout, as entrypoint paths vary by repository structure.


264-270: LGTM!


272-276: LGTM!


84-103: LGTM!


278-298: LGTM!


308-310: LGTM!


312-314: LGTM!


340-342: LGTM!


344-347: LGTM!


355-358: LGTM!


366-376: LGTM!


383-396: LGTM!

Comment thread docs/design/observability-opentelemetry/observability-opentelemetry-design.md Outdated
Comment thread docs/design/observability-opentelemetry/observability-opentelemetry-design.md Outdated
@asimurka asimurka requested a review from tisnik June 30, 2026 08:21
@asimurka asimurka force-pushed the opentelemetry_spike branch from 2c03714 to aa3dd58 Compare June 30, 2026 08:47
@asimurka asimurka force-pushed the opentelemetry_spike branch from aa3dd58 to 581a66f Compare June 30, 2026 12:15

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (7)
docs/design/observability-opentelemetry/observability-opentelemetry-spike.md (2)

66-66: 🗄️ Data Integrity & Integration | 🟠 Major

/config enrichment is still presented without a future-work caveat.

The text continues to imply GET /config can expose OTEL_* variables without clarifying that this requires implementation work. The current /config endpoint (src/app/endpoints/config.py:36-74) returns only the loaded ConfigurationResponse and does not scrape environment variables. Please add explicit language that /config enrichment for observability is planned/future work, not current behavior.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/design/observability-opentelemetry/observability-opentelemetry-spike.md`
at line 66, The observability section still reads as if GET /config already
exposes OTEL_* settings, but ConfigEndpoint currently only returns the loaded
ConfigurationResponse. Update the recommendation text to clearly mark /config
enrichment as planned/future work, and reference the existing ConfigEndpoint
behavior so readers know the observability.otel inspection layer is not
implemented yet.

103-103: 🎯 Functional Correctness | 🟡 Minor

Default propagator set still omits baggage.

The standard OpenTelemetry Python SDK default for OTEL_PROPAGATORS is tracecontext,baggage, not just tracecontext. The current wording risks confusing operators who see baggage enabled by default. Update to clarify that the default includes both, or explicitly list the full default.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/design/observability-opentelemetry/observability-opentelemetry-spike.md`
at line 103, The observability design note currently suggests the default
OTEL_PROPAGATORS only includes tracecontext, but the OpenTelemetry Python
default also includes baggage; update the wording around the OTEL_PROPAGATORS
guidance to explicitly state the full default or mention both tracecontext and
baggage, and keep the reference to the OpenTelemetry SDK and FastAPI
auto-instrumentation behavior consistent.
docs/design/observability-opentelemetry/observability-opentelemetry-design.md (5)

372-374: 📐 Maintainability & Code Quality | 🟡 Minor

Expand the "Open Questions" section.

The section still contains only one high-level question. Add specifics: which OTEL_* variables to whitelist (e.g., OTEL_EXPORTER_OTLP_HEADERS), how sensitive values are redacted (pattern matching vs explicit denylist), and any operational or testing concerns that must be resolved before implementation.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@docs/design/observability-opentelemetry/observability-opentelemetry-design.md`
around lines 372 - 374, Expand the Open Questions section in
observability-opentelemetry-design.md to cover the concrete /config scrape
policy details. Add the specific OTEL_* variables that should be whitelisted,
such as OTEL_EXPORTER_OTLP_HEADERS, clarify how sensitive values will be
redacted (pattern matching versus an explicit denylist), and include any
operational or testing concerns that must be resolved before implementation.

346-346: 🔒 Security & Privacy | 🟡 Minor

Clarify secret management pattern for OTEL variables.

The phrasing still conflates two distinct approaches: (a) secrets injected into OTEL_* environment variables from secret mounts at runtime, and (b) secrets kept in mounted files referenced via path variables like OTEL_EXPORTER_OTLP_CERTIFICATE. These have different security postures (env vars are visible in /proc/<pid>/environ to container runtime users; file-based requires path configuration). State which pattern LCORE expects operators to use, or document both with their trade-offs.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@docs/design/observability-opentelemetry/observability-opentelemetry-design.md`
at line 346, Clarify the OTEL secret-handling guidance in the observability
design doc by separating the two patterns: secrets loaded into OTEL_*
environment variables via secret mounts, and secrets kept in mounted files
referenced by path variables such as OTEL_EXPORTER_OTLP_CERTIFICATE. Update the
observability-opentelemetry design section to state which pattern LCORE expects
operators to use, or explicitly document both approaches with their security
trade-offs. Use the OTEL_* configuration and /config redaction language to keep
the guidance consistent.

332-332: 🔒 Security & Privacy | 🟠 Major

Define the redaction mechanism for secret-bearing variables.

The document states secrets "shall be redacted" but provides no implementation guidance: how are secrets identified (key name patterns like *_HEADERS, *_CERTIFICATE, *_KEY? value heuristics? explicit denylist?), what redaction placeholder is emitted ([REDACTED], ***, omission?), and is the denylist hardcoded or configurable?

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@docs/design/observability-opentelemetry/observability-opentelemetry-design.md`
at line 332, The observability design text says secret-bearing variables must be
redacted, but it does not define how redaction works. Update the tracing
environment-variable section to specify the redaction mechanism used by the
request-time reader, including how secrets are identified (for example, key-name
patterns or a denylist), what placeholder is emitted, and whether the denylist
is hardcoded or configurable. Use the existing tracing/environment-variable
wording in this document to place the guidance clearly.

354-358: 📐 Maintainability & Code Quality | 🟠 Major

Clarify "New dependencies" section — discrepancy with implementation table persists.

The "New dependencies" section still lists only three packages, while the implementation table (line 366) mentions "OTel API, SDK, OTLP exporter, FastAPI instrumentor, propagators." Reconcile these lists: specify whether the three listed packages are direct dependencies that transitively pull in the others, or update the dependencies list to match. Also explicitly note that opentelemetry-instrument is provided by opentelemetry-distro, and state which OTLP protocol (gRPC or HTTP/protobuf) is intended.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@docs/design/observability-opentelemetry/observability-opentelemetry-design.md`
around lines 354 - 358, The “New dependencies” section in the observability
design doc is inconsistent with the implementation table, so update the list to
clearly match what will actually be installed and used. In the
observability-opentelemetry design content, clarify whether
`opentelemetry-distro`, `opentelemetry-exporter-otlp`, and
`opentelemetry-instrumentation-fastapi` are the direct dependencies that pull in
the OTel API/SDK/protocol pieces transitively, or expand the section to list
those components explicitly. Also note that `opentelemetry-instrument` comes
from `opentelemetry-distro`, and specify whether OTLP uses gRPC or
HTTP/protobuf.

149-149: 🗄️ Data Integrity & Integration | 🟠 Major

Mark /config enrichment as implementation work, not existing behavior.

The phrasing "GET /v1/config handler reads relevant OTEL_* variables and appends them" implies this already exists. Given src/app/endpoints/config.py:36-74 currently returns only configuration.configuration with no env scraping, update this to explicitly state that new logic must be added to the config endpoint.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@docs/design/observability-opentelemetry/observability-opentelemetry-design.md`
at line 149, Update the `/config` visibility note in the observability design
doc so it clearly describes planned implementation work rather than existing
behavior: change the wording around the `GET /v1/config` handler to indicate
that `config.py` in the config endpoint must be extended to read relevant
`OTEL_*` environment variables and append them under `observability.otel` with
secrets redacted, since the current handler only returns
`configuration.configuration`.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@docs/design/observability-opentelemetry/observability-opentelemetry-design.md`:
- Around line 364-369: The implementation table has an inconsistent path prefix:
`src/app/endpoints/config.py` is correct, but `app/endpoints/*.py` and
`utils/*.py` should match the same `src/` project layout. Update the entries in
the observability design table so the referenced paths are consistent with the
actual source tree, especially the rows covering `config.py`, endpoint handlers,
and utilities.
- Around line 126-139: Add the missing language specifier to the fenced ASCII
diagram block in observability-opentelemetry-design.md by updating the markdown
fence around the Caller/LCORE trace tree diagram to use text. Keep the content
unchanged, and ensure the fenced block in this section renders consistently and
passes markdownlint.
- Around line 205-208: Clarify the observability schema entries for rag.retrieve
and response.generate so they cannot be read as storing sensitive raw content.
Update the documentation around `rag.sources[]` and `llm.response` to explicitly
state they hold source identifiers/URLs and response metadata only, not
retrieved text or generated output, and adjust the symbol names in this table if
needed (for example, in `rag.retrieve` and `response.generate`) to use a safer
label like `llm.response.summary` or equivalent scrubbed metadata.

---

Duplicate comments:
In
`@docs/design/observability-opentelemetry/observability-opentelemetry-design.md`:
- Around line 372-374: Expand the Open Questions section in
observability-opentelemetry-design.md to cover the concrete /config scrape
policy details. Add the specific OTEL_* variables that should be whitelisted,
such as OTEL_EXPORTER_OTLP_HEADERS, clarify how sensitive values will be
redacted (pattern matching versus an explicit denylist), and include any
operational or testing concerns that must be resolved before implementation.
- Line 346: Clarify the OTEL secret-handling guidance in the observability
design doc by separating the two patterns: secrets loaded into OTEL_*
environment variables via secret mounts, and secrets kept in mounted files
referenced by path variables such as OTEL_EXPORTER_OTLP_CERTIFICATE. Update the
observability-opentelemetry design section to state which pattern LCORE expects
operators to use, or explicitly document both approaches with their security
trade-offs. Use the OTEL_* configuration and /config redaction language to keep
the guidance consistent.
- Line 332: The observability design text says secret-bearing variables must be
redacted, but it does not define how redaction works. Update the tracing
environment-variable section to specify the redaction mechanism used by the
request-time reader, including how secrets are identified (for example, key-name
patterns or a denylist), what placeholder is emitted, and whether the denylist
is hardcoded or configurable. Use the existing tracing/environment-variable
wording in this document to place the guidance clearly.
- Around line 354-358: The “New dependencies” section in the observability
design doc is inconsistent with the implementation table, so update the list to
clearly match what will actually be installed and used. In the
observability-opentelemetry design content, clarify whether
`opentelemetry-distro`, `opentelemetry-exporter-otlp`, and
`opentelemetry-instrumentation-fastapi` are the direct dependencies that pull in
the OTel API/SDK/protocol pieces transitively, or expand the section to list
those components explicitly. Also note that `opentelemetry-instrument` comes
from `opentelemetry-distro`, and specify whether OTLP uses gRPC or
HTTP/protobuf.
- Line 149: Update the `/config` visibility note in the observability design doc
so it clearly describes planned implementation work rather than existing
behavior: change the wording around the `GET /v1/config` handler to indicate
that `config.py` in the config endpoint must be extended to read relevant
`OTEL_*` environment variables and append them under `observability.otel` with
secrets redacted, since the current handler only returns
`configuration.configuration`.

In
`@docs/design/observability-opentelemetry/observability-opentelemetry-spike.md`:
- Line 66: The observability section still reads as if GET /config already
exposes OTEL_* settings, but ConfigEndpoint currently only returns the loaded
ConfigurationResponse. Update the recommendation text to clearly mark /config
enrichment as planned/future work, and reference the existing ConfigEndpoint
behavior so readers know the observability.otel inspection layer is not
implemented yet.
- Line 103: The observability design note currently suggests the default
OTEL_PROPAGATORS only includes tracecontext, but the OpenTelemetry Python
default also includes baggage; update the wording around the OTEL_PROPAGATORS
guidance to explicitly state the full default or mention both tracecontext and
baggage, and keep the reference to the OpenTelemetry SDK and FastAPI
auto-instrumentation behavior consistent.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4abfd480-cdb9-4ff2-80aa-b2d9ff297417

📥 Commits

Reviewing files that changed from the base of the PR and between 2c03714 and 581a66f.

📒 Files selected for processing (3)
  • docs/design/observability-opentelemetry/observability-opentelemetry-design.md
  • docs/design/observability-opentelemetry/observability-opentelemetry-spike.md
  • docs/design/observability-opentelemetry/observability-opentelemetry.md
💤 Files with no reviewable changes (1)
  • docs/design/observability-opentelemetry/observability-opentelemetry.md
📜 Review details
⏰ Context from checks skipped due to timeout. (2)
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-0-6-on-pull-request
🧰 Additional context used
🪛 LanguageTool
docs/design/observability-opentelemetry/observability-opentelemetry-spike.md

[style] ~54-~54: To elevate your writing, try using a synonym here.
Context: ...ving option set; modeling it in YAML is hard to maintain. Incompatible with automati...

(HARD_TO)

docs/design/observability-opentelemetry/observability-opentelemetry-design.md

[grammar] ~89-~89: Use a hyphen to join words.
Context: ... latency and errors without storing high volume data in the trace backend. *U4...

(QB_NEW_EN_HYPHEN)


[style] ~182-~182: This phrase is redundant. Consider using “outside”.
Context: ...am telemetry infrastructure are managed outside of LCORE for now. ### Span filtering ...

(OUTSIDE_OF)

🪛 markdownlint-cli2 (0.22.1)
docs/design/observability-opentelemetry/observability-opentelemetry-design.md

[warning] 126-126: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (2)
docs/design/observability-opentelemetry/observability-opentelemetry-spike.md (1)

1-174: LGTM! (with noted issues above)

docs/design/observability-opentelemetry/observability-opentelemetry-design.md (1)

157-157: 🎯 Functional Correctness

Keep OTEL_PROPAGATORS=none here; Python OTel uses none to disable automatic propagation.

			> Likely an incorrect or invalid review comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant