pkg/aflow: pass fault injection info to LLM #7023
officialasishkumar wants to merge 3 commits into google:master
Extract fault injection reports from the kernel console output produced during crash reproduction, and pass them to the debugger LLM agent.

Fault injection is an important debugging signal: when the reproducer uses fault injection, the FAULT_INJECTION trace in the kernel log shows exactly which allocation was forced to fail and the call path that led to it. This context helps the LLM understand the root cause more accurately, especially for bugs triggered by allocation failures.

Changes:
- Add ExtractFaultInjectionInfo() to parse FAULT_INJECTION blocks from raw kernel console output (up to 5 blocks, each up to 50 lines).
- Capture RawOutput from test results in RunTest.
- Add ReproducedFaultInjection field to the reproduce result and cache.
- Include fault injection info in the debugger LLM prompt when present.
- Bump cache version to invalidate entries missing this field.

Fixes google#6762
    // ExtractFaultInjectionInfo extracts fault injection reports from kernel console output.
    // Fault injection is an important debugging signal: it shows which specific allocation
    // was forced to fail and the call path that led to it.
    func ExtractFaultInjectionInfo(output []byte) string {

Unexport this function, it's not used outside of the package.
Moved the fault-injection extraction into pkg/report, so the helper in pkg/aflow is gone now. Fixed in officialasishkumar@2dce7f1.
    {{.SimplifiedCRepro}}
    {{if .ReproducedFaultInjection}}
    The reproducer uses fault injection to force allocation failure at a specific point.
    The following fault injection report(s) show what was injected:

I would add a sentence saying that fault injection triggers rarely executed error-handling code paths, and that the bug is frequently related to these code paths. At least that's what I would say to a human debugging this.
Added that note to the prompt, so it now explicitly says fault injection often exercises rarely hit error-handling paths and the bug is frequently there. Fixed in officialasishkumar@2dce7f1.
    // Fault injection is an important debugging signal: it shows which specific allocation
    // was forced to fail and the call path that led to it.
    func ExtractFaultInjectionInfo(output []byte) string {
    	const marker = "FAULT_INJECTION: forcing a failure"

Convert it to []byte here instead of converting and generating garbage in the loop below.
Switched the marker handling to a shared []byte value, so there is no repeated conversion in the scan anymore. Fixed in officialasishkumar@2dce7f1.
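A minimal sketch of the shared `[]byte` marker idea follows. It is not the PR's final code (which moved into pkg/report): `extractFaultBlocks`, `maxBlocks`, and `maxLines` are hypothetical names, and the sketch assumes a block simply runs from one marker line to the next marker or to the line cap.

```go
package main

import (
	"bytes"
	"fmt"
)

// faultMarker is converted to []byte once at package level, so the scan below
// does not allocate a new slice per line. The marker text is taken from the diff.
var faultMarker = []byte("FAULT_INJECTION: forcing a failure")

// extractFaultBlocks is a hypothetical helper. It returns at most maxBlocks
// blocks; each block starts at a marker line and holds at most maxLines lines.
func extractFaultBlocks(output []byte, maxBlocks, maxLines int) [][]byte {
	var blocks [][]byte
	var cur [][]byte // lines of the block being collected; nil outside a block
	flush := func() {
		if cur != nil && len(blocks) < maxBlocks {
			blocks = append(blocks, bytes.Join(cur, []byte{'\n'}))
		}
		cur = nil
	}
	for _, line := range bytes.Split(output, []byte{'\n'}) {
		switch {
		case bytes.Contains(line, faultMarker):
			flush() // a new marker closes the previous block
			cur = [][]byte{line}
		case cur != nil && len(cur) < maxLines:
			cur = append(cur, line)
		}
	}
	flush()
	return blocks
}

func main() {
	log := []byte("boot message\n" +
		"FAULT_INJECTION: forcing a failure.\n" +
		"name failslab, interval 1, probability 0, space 0, times 0\n")
	for i, b := range extractFaultBlocks(log, 5, 50) {
		fmt.Printf("block %d:\n%s\n", i, b)
	}
}
```

Because the sketch has no notion of where a report really ends, a block can include unrelated log lines up to the cap; that limitation is one reason the review asks for pkg/report's parsing instead.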
    }
    block = append(block, l)
    }
    if len(block) > 0 {

Can blocks be empty here? I don't see how.
    block = append(block, l)
    }
    if len(block) > 0 {
    	blocks = append(blocks, string(bytes.Join(block, []byte{'\n'})))

Are real reports delimited by new line? I don't think so.
    if len(block) > 0 {
    	blocks = append(blocks, string(bytes.Join(block, []byte{'\n'})))
    }
    if len(blocks) >= 5 {

I think we need to deduplicate blocks because, for reproducers that run the program repeatedly, we can get dozens of copies of the same report. I'm not sure whether all reports will be exactly the same, though (are there any varying fields?).
Added deduplication after compacting the extracted fault report down to the stable marker, name, and call-trace content, so repeated runs do not spam the prompt. Fixed in officialasishkumar@2dce7f1.
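The deduplication step might look roughly like the sketch below. It is an assumption-laden illustration, not the code from the commit: `normalizeBlock` and `dedupBlocks` are hypothetical names, and the choice of varying fields (printk timestamps and the `CPU: ... PID: ...` line) is a guess at what differs between runs.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// timestampRe matches a leading printk timestamp such as "[   12.345678] ".
var timestampRe = regexp.MustCompile(`^\[\s*\d+\.\d+\]\s?`)

// normalizeBlock is a hypothetical normalizer: it strips timestamps and drops
// the "CPU: ... PID: ..." line, which is assumed to vary between runs.
func normalizeBlock(block string) string {
	var kept []string
	for _, line := range strings.Split(block, "\n") {
		line = timestampRe.ReplaceAllString(line, "")
		if strings.HasPrefix(line, "CPU:") {
			continue
		}
		kept = append(kept, line)
	}
	return strings.Join(kept, "\n")
}

// dedupBlocks keeps the first occurrence of each normalized block, in order.
func dedupBlocks(blocks []string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, b := range blocks {
		n := normalizeBlock(b)
		if seen[n] {
			continue
		}
		seen[n] = true
		out = append(out, n)
	}
	return out
}

func main() {
	blocks := []string{
		"[   12.000001] FAULT_INJECTION: forcing a failure.\n[   12.000002] CPU: 0 PID: 100 Comm: syz-executor",
		"[   99.000001] FAULT_INJECTION: forcing a failure.\n[   99.000002] CPU: 1 PID: 200 Comm: syz-executor",
	}
	fmt.Println(len(dedupBlocks(blocks)), "unique block(s)")
}
```

Normalizing before comparing is what lets repeated runs collapse into one entry even when timestamps and PIDs differ.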
    // was forced to fail and the call path that led to it.
    func ExtractFaultInjectionInfo(output []byte) string {
    	const marker = "FAULT_INJECTION: forcing a failure"
    	if !bytes.Contains(output, []byte(marker)) {

Two important things:
- We need to symbolize the output: raw console output does not contain line numbers and is hard to interpret.
- We need to use the same line context analysis logic we use in pkg/report to analyze raw output. For example, two threads can print fault reports at the same time, and the output may be intermixed at the line level.
Reworked this to use pkg/report for extraction and symbolization, so the fault-injection trace now goes through the same line-context handling as Linux report parsing. Fixed in officialasishkumar@2dce7f1.
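To make the interleaving concern concrete, here is a generic sketch that is explicitly not pkg/report's implementation. It assumes the kernel was built with CONFIG_PRINTK_CALLER, so each console line carries a caller id such as `[ T1234]` (task) or `[   C0]` (CPU); `splitByCaller` is a hypothetical helper that regroups intermixed lines per caller.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// callerRe matches an optional printk timestamp followed by a caller id such
// as "[ T1234]" (task) or "[   C0]" (CPU), as printed with CONFIG_PRINTK_CALLER.
var callerRe = regexp.MustCompile(`^(?:\[\s*\d+\.\d+\])?\[\s*([TC]\d+)\]\s?(.*)$`)

// splitByCaller is a hypothetical helper that groups console lines by caller id.
// Lines without a caller id are collected under the empty key.
func splitByCaller(output string) map[string][]string {
	groups := make(map[string][]string)
	for _, line := range strings.Split(output, "\n") {
		if line == "" {
			continue
		}
		if m := callerRe.FindStringSubmatch(line); m != nil {
			groups[m[1]] = append(groups[m[1]], m[2])
			continue
		}
		groups[""] = append(groups[""], line)
	}
	return groups
}

func main() {
	log := strings.Join([]string{
		"[   12.000001][ T100] FAULT_INJECTION: forcing a failure.",
		"[   12.000002][ T200] FAULT_INJECTION: forcing a failure.",
		"[   12.000003][ T100] name failslab, interval 1, probability 0, space 0, times 0",
		"[   12.000004][ T200] name fail_page_alloc, interval 1, probability 0, space 0, times 0",
	}, "\n")
	groups := splitByCaller(log)
	for _, id := range []string{"T100", "T200"} {
		fmt.Printf("%s:\n  %s\n", id, strings.Join(groups[id], "\n  "))
	}
}
```

A naive scan of the same log would pair the first marker with the second thread's lines, which is the failure mode the review comment describes.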
Please rebase and remove the merge commit; we don't use merge commits.