Diagnose log injection smoke test flakiness instead of masking it #11075

Open
bm1549 wants to merge 6 commits into master from brian.marks/fix-log-injection-smoke-test-flake

Contributor

@bm1549 bm1549 commented Apr 9, 2026

What Does This Do

Adds diagnostic instrumentation to the check raw file injection smoke test so the next CI failure tells us the root cause instead of a bare "Condition not satisfied after 30s" with traceCount=0.

Changes to LogInjectionSmokeTest:

  1. waitForTraceCountAlive — checks process liveness on every poll iteration; if the process dies, fails immediately with exit code + last 20 lines of process output
  2. Enriched timeout errors — on timeout, dumps: process alive?, traceCount, RC polls received, last 30 lines of process output
  3. Reordered waits: waitForTraceCount(4) now runs before waitFor, and the return value of waitFor is asserted
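
The liveness-aware wait in item 1 can be pictured roughly as follows. This is a minimal Java sketch, not the Groovy code in the PR: the class name, the supplier-based signature, and the 100 ms poll interval are assumptions made to keep the example self-contained. It checks the trace count first, so a process that exits cleanly after delivering its traces is not reported as a crash.

```java
import java.util.function.BooleanSupplier;
import java.util.function.IntSupplier;

public class LivenessAwareWait {
  /**
   * Polls until traceCount reaches expected, failing fast if the process has exited.
   * Hypothetical signature: the suppliers stand in for the test's trace counter and Process.
   */
  static void waitForTraceCountAlive(
      IntSupplier traceCount,
      BooleanSupplier processAlive,
      IntSupplier exitCode,
      int expected,
      long timeoutMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (true) {
      int seen = traceCount.getAsInt();
      if (seen >= expected) {
        return; // traces delivered; a clean exit afterwards is not a failure
      }
      if (!processAlive.getAsBoolean()) {
        // The process is gone and can never send the missing traces: stop waiting now.
        throw new IllegalStateException(
            "Process exited with code " + exitCode.getAsInt()
                + " while waiting for " + expected + " traces (received " + seen + ")");
      }
      if (System.nanoTime() - deadline >= 0) {
        throw new AssertionError(
            "Timed out waiting for " + expected + " traces. traceCount=" + seen
                + ", process.alive=true");
      }
      try {
        Thread.sleep(100); // assumed poll interval
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while waiting for traces", e);
      }
    }
  }
}
```

With a real `java.lang.Process`, the suppliers could be `process::isAlive` and `process::exitValue`.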

Motivation

CI Visibility data for the last 30 days on master shows 10 failures of check raw file injection:

| Failure mode | Count | Line | Duration | Root cause |
|---|---|---|---|---|
| traceCount=0 at waitForTraceCount(2) | 9/10 | 368 | 30.3s | Unknown (no diagnostics) |
| logLines.size()=3 at assertRawLogLinesWithInjection | 1/10 | 229 | 8.3s | Incomplete log file |

The failure distribution is bimodal — successful runs complete in 3.5-8.7s (80 data points, zero above 9s), while failures sit at exactly 30.3s. There is nothing in between. This means the process either works or is totally broken — a timeout increase would just delay the same failure.

<9s:  ████████████████████████████████████████  80/80 passes
9-30s:                                           0 runs
30s:  █████████                                  9/10 failures (at timeout)

The current test is blind during the wait — it just polls traceCount in a loop. We don't know if the process crashed, hung during agent init, failed to connect to the test server, or something else entirely. This PR makes the next failure self-diagnosing.

Example output when process crashes:

Process exited with code 1 while waiting for 2 traces (received 0, RC polls: 3).
Last process output:
[dd.trace ...] ERROR ... NullPointerException during instrumentation
...

Example output on timeout (process alive but not sending traces):

Timed out waiting for 2 traces after 30s. traceCount=0, process.alive=true, RC polls received: 142.
Last process output:
[dd.trace ...] DEBUG ... Still loading instrumentations...
...
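
A message in the shape of the timeout example above could be assembled along these lines. This is a hedged Java sketch: `TimeoutDiagnostics`, `tail`, and `timeoutMessage` are illustrative names rather than helpers from the PR, and the process output is assumed to be available as a list of lines.

```java
import java.util.List;

public class TimeoutDiagnostics {
  /** Returns a view of at most the last n lines. */
  static List<String> tail(List<String> lines, int n) {
    return lines.subList(Math.max(0, lines.size() - n), lines.size());
  }

  /** Builds the enriched timeout message from the state gathered during the wait. */
  static String timeoutMessage(
      int expected,
      long timeoutSeconds,
      int traceCount,
      boolean alive,
      int rcPolls,
      List<String> processOutput) {
    return "Timed out waiting for " + expected + " traces after " + timeoutSeconds + "s. "
        + "traceCount=" + traceCount
        + ", process.alive=" + alive
        + ", RC polls received: " + rcPolls + ".\n"
        + "Last process output:\n"
        + String.join("\n", tail(processOutput, 30)); // last 30 lines, as in the PR description
  }
}
```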

Additional Notes

  • Only LogInjectionSmokeTest.groovy is changed
  • No timeout increase — the 30s defaultPoll is kept as-is
  • All 11 historically flaky backends pass locally
  • rcClientMessages.size() tells us whether the agent connected to the test server at all (RC polls hit /v0.7/config every 200ms)
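
As a rough guide to reading the RC poll count: at one poll every 200ms, a full 30s wait should see about 150 polls, which is in line with the 142 in the timeout example. The Java sketch below turns that into a first-pass triage. The category names and the decision order are assumptions for illustration; the PR itself only reports the raw values.

```java
public class RcPollDiagnosis {
  /** Illustrative categories; not defined anywhere in the PR. */
  enum Diagnosis {
    SOME_TRACES_RECEIVED,
    PROCESS_CRASHED,
    NEVER_REACHED_TEST_SERVER,
    CONNECTED_BUT_NO_TRACES
  }

  /** Upper bound on polls seen during a wait, e.g. 30_000 ms / 200 ms = 150. */
  static long expectedPolls(long waitMillis, long pollIntervalMillis) {
    return waitMillis / pollIntervalMillis;
  }

  /** First-pass triage from the three values the enriched error reports. */
  static Diagnosis diagnose(boolean processAlive, int traceCount, int rcPolls) {
    if (traceCount > 0) {
      return Diagnosis.SOME_TRACES_RECEIVED;
    }
    if (!processAlive) {
      return Diagnosis.PROCESS_CRASHED;
    }
    if (rcPolls == 0) {
      // No request ever hit the test server: the agent did not start or could not connect.
      return Diagnosis.NEVER_REACHED_TEST_SERVER;
    }
    // The agent is up and polling, but the application produced no traces.
    return Diagnosis.CONNECTED_BUT_NO_TRACES;
  }
}
```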

Contributor Checklist

tag: no release notes
tag: ai generated

🤖 Generated with Claude Code

The `check raw file injection` test has been flaking across 11+ logging
backend variants for months. CI Visibility data shows 90% of failures are
`traceCount=0` at `waitForTraceCount(2)` after exactly 30s — the JVM +
agent bytecode instrumentation simply takes >30s on overloaded CI machines.

Changes:
- Add `startupPoll` with 120s timeout for the initial `waitForTraceCount(2)`
  that covers JVM startup + agent init, giving 4x headroom over the current
  30s `defaultPoll`
- Add `waitForTraceCountAlive` that checks process liveness on each poll
  iteration, turning silent 30-120s timeouts into instant, actionable errors
  when the process crashes
- Reorder `waitForTraceCount(4)` before `waitFor` to confirm all traces are
  delivered while the process is still alive
- Assert `waitFor` return value for a clear error if the process hangs

tag: no release note

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@bm1549 added the labels type: bug, comp: core, tag: no release notes, and tag: ai generated on Apr 9, 2026
pr-commenter bot commented Apr 9, 2026

Benchmarks

Startup

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master brian.marks/fix-log-injection-smoke-test-flake
git_commit_date 1776263862 1776265033
git_commit_sha caae0f7 31ae072
release_version 1.62.0-SNAPSHOT~caae0f79ce 1.62.0-SNAPSHOT~31ae072eeb
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1776266846 1776266846
ci_job_id 1597965227 1597965227
ci_pipeline_id 107839755 107839755
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-1-j4gb90u3 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-1-j4gb90u3 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module Agent Agent
parent None None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 59 metrics, 12 unstable metrics.

Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.62.0-SNAPSHOT~31ae072eeb, baseline=1.62.0-SNAPSHOT~caae0f79ce

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.057 s) : 0, 1057349
Total [baseline] (11.106 s) : 0, 11105561
Agent [candidate] (1.056 s) : 0, 1056492
Total [candidate] (11.15 s) : 0, 11149694
section appsec
Agent [baseline] (1.25 s) : 0, 1250355
Total [baseline] (11.15 s) : 0, 11150242
Agent [candidate] (1.252 s) : 0, 1252160
Total [candidate] (11.132 s) : 0, 11131514
section iast
Agent [baseline] (1.225 s) : 0, 1225317
Total [baseline] (11.338 s) : 0, 11338197
Agent [candidate] (1.237 s) : 0, 1237167
Total [candidate] (11.411 s) : 0, 11410820
section profiling
Agent [baseline] (1.186 s) : 0, 1186337
Total [baseline] (11.114 s) : 0, 11114498
Agent [candidate] (1.185 s) : 0, 1184524
Total [candidate] (11.09 s) : 0, 11090136
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.057 s -
Agent appsec 1.25 s 193.006 ms (18.3%)
Agent iast 1.225 s 167.968 ms (15.9%)
Agent profiling 1.186 s 128.988 ms (12.2%)
Total tracing 11.106 s -
Total appsec 11.15 s 44.681 ms (0.4%)
Total iast 11.338 s 232.636 ms (2.1%)
Total profiling 11.114 s 8.937 ms (0.1%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.056 s -
Agent appsec 1.252 s 195.668 ms (18.5%)
Agent iast 1.237 s 180.675 ms (17.1%)
Agent profiling 1.185 s 128.032 ms (12.1%)
Total tracing 11.15 s -
Total appsec 11.132 s -18.179 ms (-0.2%)
Total iast 11.411 s 261.126 ms (2.3%)
Total profiling 11.09 s -59.557 ms (-0.5%)
gantt
    title petclinic - break down per module: candidate=1.62.0-SNAPSHOT~31ae072eeb, baseline=1.62.0-SNAPSHOT~caae0f79ce

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.237 ms) : 0, 1237
crashtracking [candidate] (1.247 ms) : 0, 1247
BytebuddyAgent [baseline] (632.102 ms) : 0, 632102
BytebuddyAgent [candidate] (632.192 ms) : 0, 632192
AgentMeter [baseline] (29.372 ms) : 0, 29372
AgentMeter [candidate] (29.366 ms) : 0, 29366
GlobalTracer [baseline] (248.545 ms) : 0, 248545
GlobalTracer [candidate] (248.853 ms) : 0, 248853
AppSec [baseline] (32.15 ms) : 0, 32150
AppSec [candidate] (32.451 ms) : 0, 32451
Debugger [baseline] (59.947 ms) : 0, 59947
Debugger [candidate] (60.045 ms) : 0, 60045
Remote Config [baseline] (597.056 µs) : 0, 597
Remote Config [candidate] (591.707 µs) : 0, 592
Telemetry [baseline] (8.055 ms) : 0, 8055
Telemetry [candidate] (8.045 ms) : 0, 8045
Flare Poller [baseline] (9.107 ms) : 0, 9107
Flare Poller [candidate] (7.447 ms) : 0, 7447
section appsec
crashtracking [baseline] (1.254 ms) : 0, 1254
crashtracking [candidate] (1.253 ms) : 0, 1253
BytebuddyAgent [baseline] (662.488 ms) : 0, 662488
BytebuddyAgent [candidate] (664.197 ms) : 0, 664197
AgentMeter [baseline] (12.095 ms) : 0, 12095
AgentMeter [candidate] (12.038 ms) : 0, 12038
GlobalTracer [baseline] (249.34 ms) : 0, 249340
GlobalTracer [candidate] (249.056 ms) : 0, 249056
AppSec [baseline] (185.544 ms) : 0, 185544
AppSec [candidate] (185.663 ms) : 0, 185663
Debugger [baseline] (65.672 ms) : 0, 65672
Debugger [candidate] (66.197 ms) : 0, 66197
Remote Config [baseline] (606.627 µs) : 0, 607
Remote Config [candidate] (609.855 µs) : 0, 610
Telemetry [baseline] (8.562 ms) : 0, 8562
Telemetry [candidate] (8.371 ms) : 0, 8371
Flare Poller [baseline] (3.567 ms) : 0, 3567
Flare Poller [candidate] (3.577 ms) : 0, 3577
IAST [baseline] (24.621 ms) : 0, 24621
IAST [candidate] (24.63 ms) : 0, 24630
section iast
crashtracking [baseline] (1.238 ms) : 0, 1238
crashtracking [candidate] (1.246 ms) : 0, 1246
BytebuddyAgent [baseline] (802.574 ms) : 0, 802574
BytebuddyAgent [candidate] (809.93 ms) : 0, 809930
AgentMeter [baseline] (11.403 ms) : 0, 11403
AgentMeter [candidate] (11.663 ms) : 0, 11663
GlobalTracer [baseline] (239.087 ms) : 0, 239087
GlobalTracer [candidate] (240.949 ms) : 0, 240949
AppSec [baseline] (32.619 ms) : 0, 32619
AppSec [candidate] (32.228 ms) : 0, 32228
Debugger [baseline] (60.631 ms) : 0, 60631
Debugger [candidate] (60.624 ms) : 0, 60624
Remote Config [baseline] (538.275 µs) : 0, 538
Remote Config [candidate] (558.985 µs) : 0, 559
Telemetry [baseline] (11.725 ms) : 0, 11725
Telemetry [candidate] (13.618 ms) : 0, 13618
Flare Poller [baseline] (3.461 ms) : 0, 3461
Flare Poller [candidate] (3.777 ms) : 0, 3777
IAST [baseline] (25.742 ms) : 0, 25742
IAST [candidate] (26.043 ms) : 0, 26043
section profiling
ProfilingAgent [baseline] (94.078 ms) : 0, 94078
ProfilingAgent [candidate] (93.869 ms) : 0, 93869
crashtracking [baseline] (1.189 ms) : 0, 1189
crashtracking [candidate] (1.177 ms) : 0, 1177
BytebuddyAgent [baseline] (692.232 ms) : 0, 692232
BytebuddyAgent [candidate] (691.684 ms) : 0, 691684
AgentMeter [baseline] (9.119 ms) : 0, 9119
AgentMeter [candidate] (9.117 ms) : 0, 9117
GlobalTracer [baseline] (207.613 ms) : 0, 207613
GlobalTracer [candidate] (207.145 ms) : 0, 207145
AppSec [baseline] (32.882 ms) : 0, 32882
AppSec [candidate] (32.84 ms) : 0, 32840
Debugger [baseline] (65.882 ms) : 0, 65882
Debugger [candidate] (65.498 ms) : 0, 65498
Remote Config [baseline] (582.872 µs) : 0, 583
Remote Config [candidate] (571.224 µs) : 0, 571
Telemetry [baseline] (7.807 ms) : 0, 7807
Telemetry [candidate] (7.741 ms) : 0, 7741
Flare Poller [baseline] (3.584 ms) : 0, 3584
Flare Poller [candidate] (3.541 ms) : 0, 3541
Profiling [baseline] (94.646 ms) : 0, 94646
Profiling [candidate] (94.435 ms) : 0, 94435
Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.62.0-SNAPSHOT~31ae072eeb, baseline=1.62.0-SNAPSHOT~caae0f79ce

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.059 s) : 0, 1058669
Total [baseline] (8.912 s) : 0, 8912376
Agent [candidate] (1.061 s) : 0, 1060736
Total [candidate] (8.873 s) : 0, 8872752
section iast
Agent [baseline] (1.231 s) : 0, 1230772
Total [baseline] (9.573 s) : 0, 9573228
Agent [candidate] (1.239 s) : 0, 1239493
Total [candidate] (9.584 s) : 0, 9584489
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.059 s -
Agent iast 1.231 s 172.103 ms (16.3%)
Total tracing 8.912 s -
Total iast 9.573 s 660.852 ms (7.4%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.061 s -
Agent iast 1.239 s 178.757 ms (16.9%)
Total tracing 8.873 s -
Total iast 9.584 s 711.737 ms (8.0%)
gantt
    title insecure-bank - break down per module: candidate=1.62.0-SNAPSHOT~31ae072eeb, baseline=1.62.0-SNAPSHOT~caae0f79ce

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.259 ms) : 0, 1259
crashtracking [candidate] (1.235 ms) : 0, 1235
BytebuddyAgent [baseline] (634.722 ms) : 0, 634722
BytebuddyAgent [candidate] (633.303 ms) : 0, 633303
AgentMeter [baseline] (29.526 ms) : 0, 29526
AgentMeter [candidate] (29.354 ms) : 0, 29354
GlobalTracer [baseline] (249.295 ms) : 0, 249295
GlobalTracer [candidate] (249.171 ms) : 0, 249171
AppSec [baseline] (32.325 ms) : 0, 32325
AppSec [candidate] (32.387 ms) : 0, 32387
Debugger [baseline] (59.238 ms) : 0, 59238
Debugger [candidate] (59.117 ms) : 0, 59117
Remote Config [baseline] (594.486 µs) : 0, 594
Remote Config [candidate] (600.631 µs) : 0, 601
Telemetry [baseline] (8.068 ms) : 0, 8068
Telemetry [candidate] (8.0 ms) : 0, 8000
Flare Poller [baseline] (7.378 ms) : 0, 7378
Flare Poller [candidate] (11.211 ms) : 0, 11211
section iast
crashtracking [baseline] (1.249 ms) : 0, 1249
crashtracking [candidate] (1.246 ms) : 0, 1246
BytebuddyAgent [baseline] (808.113 ms) : 0, 808113
BytebuddyAgent [candidate] (813.449 ms) : 0, 813449
AgentMeter [baseline] (11.646 ms) : 0, 11646
AgentMeter [candidate] (11.803 ms) : 0, 11803
GlobalTracer [baseline] (239.187 ms) : 0, 239187
GlobalTracer [candidate] (240.897 ms) : 0, 240897
AppSec [baseline] (31.964 ms) : 0, 31964
AppSec [candidate] (32.21 ms) : 0, 32210
Debugger [baseline] (60.241 ms) : 0, 60241
Debugger [candidate] (61.995 ms) : 0, 61995
Remote Config [baseline] (539.727 µs) : 0, 540
Remote Config [candidate] (544.723 µs) : 0, 545
Telemetry [baseline] (12.238 ms) : 0, 12238
Telemetry [candidate] (11.243 ms) : 0, 11243
Flare Poller [baseline] (3.384 ms) : 0, 3384
Flare Poller [candidate] (3.462 ms) : 0, 3462
IAST [baseline] (25.823 ms) : 0, 25823
IAST [candidate] (25.974 ms) : 0, 25974

Load

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master brian.marks/fix-log-injection-smoke-test-flake
git_commit_date 1776263862 1776265033
git_commit_sha caae0f7 31ae072
release_version 1.62.0-SNAPSHOT~caae0f79ce 1.62.0-SNAPSHOT~31ae072eeb
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1776267298 1776267298
ci_job_id 1597965228 1597965228
ci_pipeline_id 107839755 107839755
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-2-gtdg8azb 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-2-gtdg8azb 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 4 performance regressions! Performance is the same for 15 metrics, 17 unstable metrics.

| scenario | Δ mean agg_http_req_duration_p50 | Δ mean agg_http_req_duration_p95 | Δ mean throughput | candidate mean agg_http_req_duration_p50 | candidate mean agg_http_req_duration_p95 | candidate mean throughput | baseline mean agg_http_req_duration_p50 | baseline mean agg_http_req_duration_p95 | baseline mean throughput |
|---|---|---|---|---|---|---|---|---|---|
| scenario:load:petclinic:code_origins:high_load | worse [+0.912ms; +1.703ms] or [+5.172%; +9.655%] | worse [+0.796ms; +2.212ms] or [+2.751%; +7.652%] | unstable [-39.694op/s; +6.381op/s] or [-15.274%; +2.455%] | 18.945ms | 30.419ms | 243.219op/s | 17.637ms | 28.915ms | 259.875op/s |
| scenario:load:petclinic:iast:high_load | worse [+464.918µs; +909.602µs] or [+2.692%; +5.267%] | worse [+0.789ms; +1.787ms] or [+2.796%; +6.329%] | unstable [-32.952op/s; +14.140op/s] or [-12.518%; +5.371%] | 17.956ms | 29.521ms | 253.844op/s | 17.268ms | 28.233ms | 263.250op/s |
Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~31ae072eeb, baseline=1.62.0-SNAPSHOT~caae0f79ce
    dateFormat X
    axisFormat %s
section baseline
no_agent (18.553 ms) : 18361, 18744
.   : milestone, 18553,
appsec (19.974 ms) : 19768, 20181
.   : milestone, 19974,
code_origins (17.956 ms) : 17780, 18132
.   : milestone, 17956,
iast (17.723 ms) : 17552, 17894
.   : milestone, 17723,
profiling (18.261 ms) : 18081, 18442
.   : milestone, 18261,
tracing (17.616 ms) : 17442, 17790
.   : milestone, 17616,
section candidate
no_agent (18.748 ms) : 18556, 18939
.   : milestone, 18748,
appsec (19.303 ms) : 19108, 19497
.   : milestone, 19303,
code_origins (19.188 ms) : 18996, 19381
.   : milestone, 19188,
iast (18.379 ms) : 18195, 18564
.   : milestone, 18379,
profiling (18.426 ms) : 18241, 18610
.   : milestone, 18426,
tracing (17.987 ms) : 17811, 18164
.   : milestone, 17987,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 18.553 ms [18.361 ms, 18.744 ms] -
appsec 19.974 ms [19.768 ms, 20.181 ms] 1.422 ms (7.7%)
code_origins 17.956 ms [17.78 ms, 18.132 ms] -596.908 µs (-3.2%)
iast 17.723 ms [17.552 ms, 17.894 ms] -829.515 µs (-4.5%)
profiling 18.261 ms [18.081 ms, 18.442 ms] -291.082 µs (-1.6%)
tracing 17.616 ms [17.442 ms, 17.79 ms] -936.715 µs (-5.0%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 18.748 ms [18.556 ms, 18.939 ms] -
appsec 19.303 ms [19.108 ms, 19.497 ms] 554.963 µs (3.0%)
code_origins 19.188 ms [18.996 ms, 19.381 ms] 440.586 µs (2.4%)
iast 18.379 ms [18.195 ms, 18.564 ms] -368.16 µs (-2.0%)
profiling 18.426 ms [18.241 ms, 18.61 ms] -322.001 µs (-1.7%)
tracing 17.987 ms [17.811 ms, 18.164 ms] -760.436 µs (-4.1%)
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~31ae072eeb, baseline=1.62.0-SNAPSHOT~caae0f79ce
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.242 ms) : 1230, 1254
.   : milestone, 1242,
iast (3.357 ms) : 3309, 3406
.   : milestone, 3357,
iast_FULL (6.141 ms) : 6077, 6205
.   : milestone, 6141,
iast_GLOBAL (3.659 ms) : 3598, 3719
.   : milestone, 3659,
profiling (2.184 ms) : 2163, 2205
.   : milestone, 2184,
tracing (1.904 ms) : 1888, 1920
.   : milestone, 1904,
section candidate
no_agent (1.235 ms) : 1224, 1246
.   : milestone, 1235,
iast (3.273 ms) : 3226, 3321
.   : milestone, 3273,
iast_FULL (5.995 ms) : 5933, 6056
.   : milestone, 5995,
iast_GLOBAL (3.61 ms) : 3556, 3665
.   : milestone, 3610,
profiling (2.155 ms) : 2132, 2178
.   : milestone, 2155,
tracing (1.881 ms) : 1865, 1897
.   : milestone, 1881,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.242 ms [1.23 ms, 1.254 ms] -
iast 3.357 ms [3.309 ms, 3.406 ms] 2.115 ms (170.3%)
iast_FULL 6.141 ms [6.077 ms, 6.205 ms] 4.899 ms (394.5%)
iast_GLOBAL 3.659 ms [3.598 ms, 3.719 ms] 2.417 ms (194.6%)
profiling 2.184 ms [2.163 ms, 2.205 ms] 942.133 µs (75.9%)
tracing 1.904 ms [1.888 ms, 1.92 ms] 662.216 µs (53.3%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.235 ms [1.224 ms, 1.246 ms] -
iast 3.273 ms [3.226 ms, 3.321 ms] 2.038 ms (165.0%)
iast_FULL 5.995 ms [5.933 ms, 6.056 ms] 4.76 ms (385.4%)
iast_GLOBAL 3.61 ms [3.556 ms, 3.665 ms] 2.375 ms (192.3%)
profiling 2.155 ms [2.132 ms, 2.178 ms] 920.029 µs (74.5%)
tracing 1.881 ms [1.865 ms, 1.897 ms] 645.723 µs (52.3%)

Dacapo

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master brian.marks/fix-log-injection-smoke-test-flake
git_commit_date 1776263862 1776265033
git_commit_sha caae0f7 31ae072
release_version 1.62.0-SNAPSHOT~caae0f79ce 1.62.0-SNAPSHOT~31ae072eeb
See matching parameters
Baseline Candidate
application biojava biojava
ci_job_date 1776267208 1776267208
ci_job_id 1597965230 1597965230
ci_pipeline_id 107839755 107839755
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-2-9re15zna 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-2-9re15zna 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.

Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~31ae072eeb, baseline=1.62.0-SNAPSHOT~caae0f79ce
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.483 ms) : 1472, 1495
.   : milestone, 1483,
appsec (3.782 ms) : 3565, 3999
.   : milestone, 3782,
iast (2.266 ms) : 2196, 2335
.   : milestone, 2266,
iast_GLOBAL (2.278 ms) : 2208, 2347
.   : milestone, 2278,
profiling (2.103 ms) : 2047, 2158
.   : milestone, 2103,
tracing (2.077 ms) : 2023, 2130
.   : milestone, 2077,
section candidate
no_agent (1.485 ms) : 1473, 1497
.   : milestone, 1485,
appsec (3.824 ms) : 3603, 4045
.   : milestone, 3824,
iast (2.262 ms) : 2193, 2331
.   : milestone, 2262,
iast_GLOBAL (2.306 ms) : 2236, 2376
.   : milestone, 2306,
profiling (2.096 ms) : 2041, 2151
.   : milestone, 2096,
tracing (2.084 ms) : 2030, 2139
.   : milestone, 2084,
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.483 ms [1.472 ms, 1.495 ms] -
appsec 3.782 ms [3.565 ms, 3.999 ms] 2.299 ms (155.0%)
iast 2.266 ms [2.196 ms, 2.335 ms] 782.255 µs (52.7%)
iast_GLOBAL 2.278 ms [2.208 ms, 2.347 ms] 794.238 µs (53.5%)
profiling 2.103 ms [2.047 ms, 2.158 ms] 619.52 µs (41.8%)
tracing 2.077 ms [2.023 ms, 2.13 ms] 593.3 µs (40.0%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.485 ms [1.473 ms, 1.497 ms] -
appsec 3.824 ms [3.603 ms, 4.045 ms] 2.339 ms (157.5%)
iast 2.262 ms [2.193 ms, 2.331 ms] 777.484 µs (52.4%)
iast_GLOBAL 2.306 ms [2.236 ms, 2.376 ms] 821.111 µs (55.3%)
profiling 2.096 ms [2.041 ms, 2.151 ms] 611.075 µs (41.2%)
tracing 2.084 ms [2.03 ms, 2.139 ms] 599.499 µs (40.4%)
Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~31ae072eeb, baseline=1.62.0-SNAPSHOT~caae0f79ce
    dateFormat X
    axisFormat %s
section baseline
no_agent (14.939 s) : 14939000, 14939000
.   : milestone, 14939000,
appsec (14.51 s) : 14510000, 14510000
.   : milestone, 14510000,
iast (18.062 s) : 18062000, 18062000
.   : milestone, 18062000,
iast_GLOBAL (18.05 s) : 18050000, 18050000
.   : milestone, 18050000,
profiling (14.778 s) : 14778000, 14778000
.   : milestone, 14778000,
tracing (14.906 s) : 14906000, 14906000
.   : milestone, 14906000,
section candidate
no_agent (15.668 s) : 15668000, 15668000
.   : milestone, 15668000,
appsec (14.722 s) : 14722000, 14722000
.   : milestone, 14722000,
iast (18.433 s) : 18433000, 18433000
.   : milestone, 18433000,
iast_GLOBAL (17.983 s) : 17983000, 17983000
.   : milestone, 17983000,
profiling (15.262 s) : 15262000, 15262000
.   : milestone, 15262000,
tracing (14.924 s) : 14924000, 14924000
.   : milestone, 14924000,
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 14.939 s [14.939 s, 14.939 s] -
appsec 14.51 s [14.51 s, 14.51 s] -429.0 ms (-2.9%)
iast 18.062 s [18.062 s, 18.062 s] 3.123 s (20.9%)
iast_GLOBAL 18.05 s [18.05 s, 18.05 s] 3.111 s (20.8%)
profiling 14.778 s [14.778 s, 14.778 s] -161.0 ms (-1.1%)
tracing 14.906 s [14.906 s, 14.906 s] -33.0 ms (-0.2%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.668 s [15.668 s, 15.668 s] -
appsec 14.722 s [14.722 s, 14.722 s] -946.0 ms (-6.0%)
iast 18.433 s [18.433 s, 18.433 s] 2.765 s (17.6%)
iast_GLOBAL 17.983 s [17.983 s, 17.983 s] 2.315 s (14.8%)
profiling 15.262 s [15.262 s, 15.262 s] -406.0 ms (-2.6%)
tracing 14.924 s [14.924 s, 14.924 s] -744.0 ms (-4.7%)

The `check raw file injection` test flakes across 11+ logging backend
variants. CI Visibility data shows the failure is bimodal — successful
runs complete in 3-9s, but failures sit at exactly 30s (the
PollingConditions timeout) with traceCount=0. Nothing in between. This
means the process either works or is totally broken — no amount of
timeout increase will help.

The current test is blind during the 30s wait — it just polls
traceCount with no diagnostics when the process crashes or hangs.

Changes:
- Add `waitForTraceCountAlive` that checks process liveness on every
  poll iteration. If the process dies, it fails immediately with the
  exit code, RC poll count, and last 20 lines of process output.
- On timeout, enrich the error with diagnostic state (process alive?,
  traceCount, RC polls received, last 30 lines of output) so the next
  CI failure tells us whether it's a crash, a hang, or a connectivity
  issue.
- Reorder `waitForTraceCount(4)` before `waitFor` to confirm all
  traces are delivered while the process is still alive.
- Assert `waitFor` return value for a clear error if the process hangs.

tag: no release notes

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@bm1549 changed the title from "Fix log injection smoke test flakiness from startup timeout" to "Diagnose log injection smoke test flakiness instead of masking it" on Apr 9, 2026
bm1549 and others added 3 commits April 10, 2026 11:58
The liveness check fired before the trace count check, so a normal
process exit after delivering all traces was treated as a failure.
Check traceCount >= count first and return early if satisfied.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
PollingConditions.eventually only retries AssertionError. The liveness
check was throwing AssertionError, so a dead process still waited the
full 30s timeout. Switch to RuntimeException so it propagates
immediately. Also narrow the catch from Throwable to AssertionError.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
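
The exception-type point in the commit message above can be illustrated with a stand-in retry helper. This Java sketch is not Spock's PollingConditions; it only models the behaviour the commit message describes, where AssertionError is retried and anything else propagates at once. The `eventually` name and the attempt-count parameter are assumptions, and a real helper would sleep between attempts and stop on a time limit.

```java
public class RetryOnAssertionOnly {
  /**
   * Runs the condition until it passes or maxAttempts is reached.
   * Only AssertionError is treated as "not yet"; other exceptions escape immediately.
   * Returns the attempt number on which the condition passed.
   */
  static int eventually(Runnable condition, int maxAttempts) {
    AssertionError last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        condition.run();
        return attempt;
      } catch (AssertionError e) {
        last = e; // condition not satisfied yet: try again
      }
    }
    throw new AssertionError("Condition not satisfied after " + maxAttempts + " attempts", last);
  }
}
```

Under this model, a liveness check that throws AssertionError for a dead process is retried until the limit, whereas one that throws a RuntimeException ends the wait on the first attempt.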
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@bm1549 bm1549 marked this pull request as ready for review April 14, 2026 23:29
@bm1549 bm1549 requested a review from a team as a code owner April 14, 2026 23:30
@bm1549 bm1549 requested review from mhlidd and removed request for a team April 14, 2026 23:30