fix: stop flow inference from hanging on large self-assignments#1116
fix: stop flow inference from hanging on large self-assignments#1116lewis6991 wants to merge 1 commit into
Conversation
Log a warning and return unknown when a single flow query exceeds the step budget. This keeps semantic model construction moving for pathological repeated assignments while preserving normal flow precision. Add a development/test switch that can turn the fallback off when a runaway flow query needs to be reproduced. Fixes EmmyLuaLs#1114 Assisted-by: Codex
There was a problem hiding this comment.
Here's my code review of the changes:
Issues Found:
1. Potential Logic Error: cache_options field not used in get_infer_cache
- File:
infer_cache_manager.rs - Issue: The
cache_optionsfield is stored but onlyanalysis_phaseis being modified. Other fields inCacheOptions(likedisable_flow_inference_step_budget) are being overridden/ignored. - Suggestion: Either use the full
cache_optionsor document why onlyanalysis_phaseneeds to be overridden.
2. Missing Clone implementation for CacheOptions
- File:
cache_options.rs - Issue:
CacheOptionsisClone + CopybutLuaAnalysisPhaseis onlyClone + Copy. This is fine, but ensure all consumers expect this behavior. - Suggestion: Verify that no code depends on
CacheOptionsbeing mutable after creation.
3. Potential Performance Issue: Cloning cache_options on every cache access
- File:
infer_cache_manager.rsline 22 - Issue:
let mut cache_options = self.cache_options;creates a copy on every cache lookup, which is unnecessary since it'sCopy. - Suggestion: Use
self.cache_optionsdirectly or reference it without copying.
4. Inconsistent Budget Handling
- File:
get_type_at_flow.rs - Issue: The budget check in
evaluate_walkreturnsContinueWalkwithout consuming a step, whilestart_queryconsumes a step. This could lead to inconsistent budget tracking. - Suggestion: Ensure step consumption is consistent across all code paths.
5. Missing Error Handling for Budget Exhaustion
- File:
get_type_at_flow.rs - Issue: When budget is exhausted, the function returns
LuaType::Unknownsilently (except for a warning log). This could mask real issues. - Suggestion: Consider returning a more specific error type or adding a metric/counter for budget exhaustion events.
6. Test Coverage Gap
- File:
flow.rs - Issue: The test only checks if
get_semantic_modelreturnsSome, not the actual type inference results. - Suggestion: Add assertions to verify the inferred types are correct (e.g.,
string,number,boolean).
Recommendations:
-
Document the budget constant - Add a comment explaining why
50_000was chosen and how to tune it. -
Consider making budget configurable - The
disable_flow_inference_step_budgetflag is good for debugging, but consider making the budget value configurable too. -
Add budget exhaustion metrics - Track how often the budget is exceeded to help tune the constant.
-
Review thread safety - Ensure
CacheOptionsbeingCopydoesn't cause issues in concurrent scenarios.
Overall, the changes look well-structured and address the performance issue with flow inference. The budget mechanism is a good addition to prevent infinite loops or excessive computation.
There was a problem hiding this comment.
Code Review
This pull request introduces a step budget (FLOW_INFERENCE_STEP_BUDGET) for flow inference queries to prevent semantic model construction from stalling on complex or deeply nested self-dependent assignments. It propagates CacheOptions through the analysis pipeline to allow disabling this budget during testing, and adds a regression test for issue 1114. I have no feedback to provide as there are no review comments to address.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
|
This is a more general mitigation for issues like #1114 , it loses precision but avoids flow getting stuck. #1115 is more specific solution for #1114 that doesn't lose precision. I'm not 100% sure about this. It improves user experience, but will mask issues in the flow engine. I added a flag so we can at least disable this for tests. |
|
I tested this PR locally, and it works for me. |
|
Did you try #1115 ? That is a real fix. |
I observed that the CPU usage of the emmylua_ls process rose above 100%. After a long time, the CPU remained at 100%, exactly the same as before. I think #1115 has not been resolved. |
|
Observation: After startup, the workspace analysis enters the "flow analyze" stage. The single-threaded CPU usage remains at 100% for over 5 minutes without decreasing. This is a pure CPU calculation (without file I/O). GDB backtrace (LWP 3795686, the thread consuming CPU resources): strace (3-second sampling, confirming that there is no file I/O on the pure CPU): log (The last few lines, the analysis has stopped after reaching the "flow analyze" stage and there has been no further progress): My AI's speculation on this (I'm sorry, I don't fully understand the source code; the speculation of AI may be misleading.): |
This please. |
We obviously need a reproducible example that can be run independently. In practice, this kind of freeze is usually related to the shape of the code in a single file. Of course, much of the internal code cannot be made public. If you are willing to continue helping to investigate the issue, you can follow the general approach I have suggested to others for testing: copy the project's code out, cut half of it away, open the editor to test whether it loads properly. If it does not, cut away half again. If it does load properly, then the problem may lie in the other half that was cut away. Repeat this process until you find the smallest set of one or a few Lua files that can reliably reproduce the issue. If the relevant code is not convenient to disclose, you can obfuscate it manually, remove sensitive information, keep the issue reproducible, and then package and submit those files. |
|
OK, I will find some time to reproduce it. |
This should be fixed in #1115 now. |
Problem
Solution
unknownfor active/pending queries, and keep analysis moving.Tests
cargo test -p emmylua_code_analysis test_issue_1114_repeated_self_dependent_assignments_build_semantic_model -- --nocapturecargo test -p emmylua_code_analysis flowcargo fmt --all --checkgit -c core.fsmonitor=false diff --checkFixes #1114