Skip to content

feat OTel: added support rpc and pool metrics#642

Open
medvezhonokok wants to merge 9 commits intoydb-platform:masterfrom
medvezhonokok:ydb-otel-tracing
Open

feat OTel: added support rpc and pool metrics#642
medvezhonokok wants to merge 9 commits intoydb-platform:masterfrom
medvezhonokok:ydb-otel-tracing

Conversation

@medvezhonokok
Copy link
Copy Markdown
Contributor

added metrics

@medvezhonokok medvezhonokok force-pushed the ydb-otel-tracing branch 2 times, most recently from 9bc2b1a to 0ac006e Compare April 30, 2026 10:31
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 30, 2026

Codecov Report

❌ Patch coverage is 38.31169% with 95 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.83%. Comparing base (6922803) to head (7e81c5a).
⚠️ Report is 8 commits behind head on master.

Files with missing lines Patch % Lines
...java/tech/ydb/core/metrics/OpenTelemetryMeter.java 0.00% 74 Missing ⚠️
...src/main/java/tech/ydb/core/metrics/NoopMeter.java 0.00% 8 Missing ⚠️
...src/main/java/tech/ydb/query/impl/SessionImpl.java 77.77% 4 Missing and 2 partials ⚠️
...main/java/tech/ydb/query/impl/QueryClientImpl.java 83.33% 0 Missing and 2 partials ⚠️
...src/main/java/tech/ydb/query/impl/SessionPool.java 91.30% 0 Missing and 2 partials ⚠️
...main/java/tech/ydb/query/impl/TableClientImpl.java 50.00% 2 Missing ⚠️
...able/src/main/java/tech/ydb/table/TableClient.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master     #642      +/-   ##
============================================
- Coverage     71.03%   70.83%   -0.20%     
- Complexity     3352     3364      +12     
============================================
  Files           379      381       +2     
  Lines         15862    16052     +190     
  Branches       1664     1675      +11     
============================================
+ Hits          11267    11371     +104     
- Misses         3950     4032      +82     
- Partials        645      649       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment thread core/src/main/java/tech/ydb/core/tracing/OpenTelemetryTracer.java
Comment thread core/src/main/java/tech/ydb/core/tracing/Span.java
Comment thread core/src/main/java/tech/ydb/core/tracing/OpenTelemetryTracer.java
Comment thread query/src/main/java/tech/ydb/query/impl/SessionImpl.java
medvezhonokok and others added 6 commits May 5, 2026 23:14
Drop the standalone opentelemetry/ module and the redundant SpanFinalizer
helper, revert unrelated regressions in TableClientImpl/SimpleTableClient
and the QueryTracingTest test reduction. Move OpenTelemetryMeter alongside
OpenTelemetryTracer in core (using the optional opentelemetry-api dep) and
relocate the integration test to query/test.

Implements the agreed-on metric set: ydb.client.operation.{duration,failed},
ydb.query.session.{create_time,pending_requests,timeouts,count,min,max} with
operation.name, db.response.status_code, ydb.query.session.{pool.name,state}
attributes. Pool name comes from QueryClient.Builder#sessionPoolName, default
is the connection's database. Retry-policy metrics are intentionally out of
scope for this iteration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Operation metrics now carry only `database`, `endpoint`, `operation.name`
(+ `status_code` for failed). Session metrics carry only
`ydb.query.session.pool.name` (+ `ydb.query.session.state` for count).

OpenTelemetryMeter.fromOpenTelemetry takes (database, endpoint) instead of
(database, host, port). Drops `db.system.name` / `db.namespace` /
`server.address` / `server.port` / `db.response.status_code` keys, replaces
them with `database` / `endpoint` / `status_code`. Descriptions match the
C# reference table verbatim.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Roll back the C#-style short descriptions to the original verbose English
ones. The C# table was a reference for attribute layout, not for the
description strings themselves.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…plete

Drop the recordCommitMetrics / recordRollbackMetrics helpers — they wrapped
the future in an extra .whenComplete just to record duration/failed. Master
already terminates both chains with a .whenComplete handling currentStatus,
so the metric calls fit there directly. For early-return paths (transactionId
== null) the future is already completed, so duration is recorded synchronously
without a wrapper.

SessionPool.Handler#create keeps its dedicated .whenComplete: master has no
existing one for the createSession stage, and reusing Span.endOnResult's
internal one would mean duplicating its span-finalization logic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@KirillKurdyukov KirillKurdyukov changed the title Ydb otel tracing feat OTel: added support rpc and pool metrics May 7, 2026
KirillKurdyukov and others added 2 commits May 7, 2026 17:53
The previous version called .join() on the second createSession future and
discarded the result, then asserted that the timeouts counter is non-zero.
If the waiter ever completed via the queue path (e.g., a session became
available before the 100ms timeout) the test would silently fail later with
a misleading "ydb.query.session.timeouts metric not found" — there'd be no
timeout to record.

Now we explicitly assert that the waiter completed with CLIENT_DEADLINE_EXPIRED
before checking the counter, give it 500ms (still short, but well above
scheduler jitter), and read both pending and timeouts metrics from a single
metricReader collection so the assertion order doesn't matter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants