DAOS-18797 build: disable Go compiler optimizations in debug builds #18185

Open
knard38 wants to merge 3 commits into master from ckochhof/fix/master/daos-18797/patch-001
@knard38 knard38 commented May 6, 2026

Description

When building DAOS in debug mode, the Go binaries produced by the build system
are compiled with optimizations enabled. This makes it effectively impossible to
use Go debuggers (e.g. dlv, gdb with Go support) because:

  • Debuggers warn: "Warning: debugging optimized function"
  • Source lines are not correctly associated with the execution pointer
  • Variables may be inlined or eliminated, making inspection unreliable

This PR adds -gcflags "all=-N -l" to the non-release build flags in
src/control/SConscript. These flags tell the Go compiler to disable
optimizations (-N) and inlining (-l), as recommended by the Go debugger
documentation. The fix also covers sanitizer builds (-asan), which previously
lacked these debugger-friendly flags as well.

This change only affects debug builds; release builds are completely unaffected.
The resulting binaries will be larger and slower, which is expected and
acceptable in a debug context.
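
As a rough illustration of the flag selection described above, here is a shell sketch. It is not the actual SConscript change: the function name `go_build_flags` and the `release`/`debug` argument values are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the extra `go build` flag for a build type.
# Release builds get nothing extra; every other build type gets the
# debugger-friendly compiler flags.
go_build_flags() {
    build_type="$1"
    if [ "$build_type" = "release" ]; then
        printf '%s\n' ""
    else
        # -N disables optimizations, -l disables inlining;
        # the all= prefix applies them to every package in the build.
        printf '%s\n' '-gcflags=all=-N -l'
    fi
}

# Usage for a debug build. The flag must reach `go build` as a single
# argument, hence the quotes:
#   go build "$(go_build_flags debug)" ./...
```

The `all=` pattern matters: without it, the flags only apply to the packages named on the command line, so dependencies would still be optimized.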

src/control/run_go_tests.sh is also extended with --dlv and --run flags
to launch an interactive dlv session directly from the test runner, with the
CGo environment already set up — no manual configuration needed.


Reviewer Validation — dlv Examples

The following examples demonstrate the impact of this PR and can be reproduced
by any reviewer to validate it. All examples assume a debug DAOS build is
available and dlv is installed.

Example 1 — Observe the difference: before vs after

This is the most direct validation. With the old build, setting a breakpoint in
dmg triggers the optimization warning; with this PR it does not.

# Run "dmg pool list" under dlv and set a breakpoint at main.main
dlv exec $(which dmg) -- pool list

# At the (dlv) prompt:
(dlv) break main.main
(dlv) continue

Before this PR, dlv outputs:

Warning: debugging optimized function

Source lines jump erratically and local variables are often shown as
optimized away.

After this PR, no warning is shown, breakpoints map correctly to source
lines, and all local variables are inspectable.
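
A reviewer who wants a quick check without starting a debugger could also look at the build settings embedded in the binary. The sketch below is a heuristic, assuming a Go toolchain recent enough to record build settings and that the `-gcflags` setting was recorded for the binary in question; the helper name `has_debug_gcflags` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `go version -m <binary>` output on stdin and
# succeeds only if a recorded -gcflags build setting contains -N
# (optimizations disabled).
has_debug_gcflags() {
    grep -E '^[[:space:]]*build[[:space:]]+-gcflags=' | grep -Eq -- '[= ]-N( |$)'
}

# Usage (assumes the Go toolchain is on PATH):
#   go version -m "$(which dmg)" | has_debug_gcflags && echo "debug build"
```

If the setting is absent, the check simply fails, so a negative result is not proof that the binary is optimized.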

Example 2 — Inspect variables in dmg pool list

dlv exec $(which dmg) -- pool list
(dlv) break github.com/daos-stack/daos/src/control/lib/control.ListPools
(dlv) continue
(dlv) args        # shows all function arguments
(dlv) locals      # shows all local variables — empty/wrong before this PR

Expected after this PR: req, ctx, and rpcClient are all fully
inspectable with their concrete values.
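
To avoid retyping the same commands on every run, the session above can be scripted with a Delve init file. This is a minimal sketch: the helper name `write_dlv_init` and the `/tmp/listpools.dlv` path are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: write a Delve init file that sets one breakpoint,
# runs to it, and dumps the function arguments and local variables.
write_dlv_init() {
    breakpoint="$1"
    init_file="$2"
    cat > "$init_file" <<EOF
break $breakpoint
continue
args
locals
EOF
}

# Usage:
#   write_dlv_init \
#     github.com/daos-stack/daos/src/control/lib/control.ListPools \
#     /tmp/listpools.dlv
#   dlv exec "$(which dmg)" --init /tmp/listpools.dlv -- pool list
```

The init file is executed by the dlv terminal client at startup, after which the interactive prompt remains available.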

Example 3 — Attach to a running daos_server and trace pool creation

Start daos_server normally, then in another terminal:

# Find the PID
pgrep -x daos_server

# Attach dlv to the running process
sudo dlv attach <PID>
(dlv) break github.com/daos-stack/daos/src/control/server.(*mgmtSvc).resolvePoolID
(dlv) continue

# In another terminal, trigger the code path:
dmg pool list

# Back in dlv — after this PR, the breakpoint is hit and you can inspect:
(dlv) args         # shows 'id' string argument
(dlv) locals       # shows local variables inside resolvePoolID
(dlv) step         # steps correctly line by line through the function

Before this PR: the breakpoint may be hit, but step jumps to unrelated
lines and locals reports variables as optimized away.
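
When more than one daos_server process matches, pgrep prints several PIDs and the attach step becomes ambiguous. The sketch below guards against that; the helper name `single_pid` is hypothetical and not part of this PR.

```shell
#!/bin/sh
# Hypothetical helper: read candidate PIDs on stdin (one per line) and
# print the PID only if exactly one process matched; otherwise fail.
single_pid() {
    pids=$(cat)
    # grep -c exits non-zero when the count is 0, so ignore its status.
    count=$(printf '%s\n' "$pids" | grep -c '^[0-9][0-9]*$' || true)
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one PID, got $count" >&2
        return 1
    fi
    printf '%s\n' "$pids" | grep '^[0-9][0-9]*$'
}

# Usage:
#   pid=$(pgrep -x daos_server | single_pid) && sudo dlv attach "$pid"
```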

Example 4 — Debug daos_agent startup

dlv exec $(which daos_agent) -- -o /etc/daos/daos_agent.yml
(dlv) break main.(*startCmd).Execute
(dlv) continue
(dlv) locals       # inspect startup configuration values
(dlv) print cmd    # print the entire startCmd struct

After this PR: the cmd struct prints with all populated fields, making it
easy to verify that configuration was loaded correctly.

Example 5 — Use run_go_tests.sh --dlv to debug a unit test

src/control/run_go_tests.sh now supports a --dlv flag that sets up the full
CGo environment and launches an interactive dlv session for any package:

# From the repo root
src/control/run_go_tests.sh --dlv \
  --run TestFaultComparison \
  github.com/daos-stack/daos/src/control/fault
(dlv) break github.com/daos-stack/daos/src/control/fault.(*Fault).Equals
(dlv) continue
(dlv) args        # shows receiver 'f' and argument 'raw' with their concrete values
(dlv) step        # steps correctly to the next source line

This requires no additional setup — the script sources .build_vars.sh
automatically. For packages with CGo dependencies (e.g. server), it also
exports the correct CGO_CFLAGS/CGO_LDFLAGS/LD_LIBRARY_PATH.
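
For readers curious what such a mode amounts to, the sketch below assembles a comparable `dlv test` command line. It is an assumption about the shape of the command, not the script's actual code: the function name `dlv_test_cmd` and the `DLV_TAGS` variable are hypothetical, and the build tags are the ones named in the commit message.

```shell
#!/bin/sh
# Hypothetical sketch of the command a --dlv mode could assemble.
# Build tags as listed in the commit message for the full test suite.
DLV_TAGS="firmware,fault_injection,test_stubs,spdk"

# Print a `dlv test` command for a package, optionally restricted to one
# test via the test binary's -test.run flag.
dlv_test_cmd() {
    pkg="$1"
    test_name="$2"
    if [ -n "$test_name" ]; then
        printf 'dlv test --build-flags=-tags=%s %s -- -test.run %s\n' \
            "$DLV_TAGS" "$pkg" "$test_name"
    else
        printf 'dlv test --build-flags=-tags=%s %s\n' "$DLV_TAGS" "$pkg"
    fi
}

# Usage (prints the command instead of running it):
#   dlv_test_cmd github.com/daos-stack/daos/src/control/fault TestFaultComparison
```

Everything after `--` is passed to the compiled test binary, which is why the test filter is spelled `-test.run` and not `-run`.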


Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • Appropriate Functional Test Stages were run.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).

@knard38 knard38 self-assigned this May 6, 2026
github-actions Bot commented May 6, 2026

Ticket title is 'Go binaries not debuggable with dlv due to compiler optimizations'
Status is 'In Progress'
https://daosio.atlassian.net/browse/DAOS-18797

knard38 added 2 commits May 6, 2026 14:32
When building DAOS in debug mode, the Go binaries were compiled with
optimizations enabled, making it effectively impossible to use Go
debuggers:

- Debuggers warn about optimized functions
- Source lines are not correctly mapped to the execution pointer
- Variables may be inlined or eliminated, making inspection unreliable

Add '-gcflags "all=-N -l"' to the non-release build flags to disable
compiler optimizations (-N) and inlining (-l), as recommended by the
Go debugger documentation.

This change only affects debug builds; release builds are unaffected.
The resulting binaries will be larger and slower, which is expected and
acceptable in a debug context.
…ebugging

Add --dlv and --run flags to src/control/run_go_tests.sh to allow
launching an interactive Delve (dlv) debug session directly from the
test runner script without any additional environment setup.

  src/control/run_go_tests.sh --dlv --run <TestName> <package>

The script already handles CGo environment setup (CGO_CFLAGS,
CGO_LDFLAGS, LD_LIBRARY_PATH) via .build_vars.sh, so no manual
configuration is needed. The same build tags used for the full test
suite (firmware, fault_injection, test_stubs, spdk) are applied to
the dlv build as well.

A --help / -h flag is also added documenting all supported options.
@knard38 knard38 force-pushed the ckochhof/fix/master/daos-18797/patch-001 branch from d563ec0 to 9d2ee28 Compare May 6, 2026 14:33
@daosbuild3

Test stage Unit Test completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos/job/PR-18185/2/display/redirect

@daosbuild3

Test stage Unit Test bdev with memcheck completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos/job/PR-18185/2/display/redirect

@daosbuild3

Test stage Unit Test with memcheck completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos/job/PR-18185/2/display/redirect

@daosbuild3

Test stage Unit Test bdev completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos/job/PR-18185/2/display/redirect

@knard38 knard38 force-pushed the ckochhof/fix/master/daos-18797/patch-001 branch from a301ef9 to 014d90b Compare May 6, 2026 14:47
@knard38 knard38 marked this pull request as ready for review May 6, 2026 14:51
@knard38 knard38 requested review from a team as code owners May 6, 2026 14:51
@knard38 knard38 requested review from kjacque, mjmac and tanabarr May 6, 2026 14:52
@kjacque kjacque left a comment

LGTM. Nice improvement, thanks!
