fix: stop flow inference from hanging on large self-assignments by lewis6991 · Pull Request #1116 · EmmyLuaLs/emmylua-analyzer-rust

lewis6991 · 2026-06-15T16:26:15Z

Problem

Repeated self-dependent assignments can make flow inference spend too long in a single query.
The issue is not limited to concat; similar repeated arithmetic or comparison assignments can hit the same path.
When this happens, semantic model construction can stall.
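
To make the shape of the problem concrete, here is a small sketch that generates Lua source with many repeated self-dependent assignments. The helper name `repeated_self_assignments` and its parameters are hypothetical, and the actual regression test for issue 1114 is not shown in this thread, so treat this only as an assumption about what such input might look like.

```rust
/// Builds a Lua snippet containing `count` self-dependent assignments.
/// `op` is a Lua binary operator such as ".." (concat) or "+" (arithmetic).
/// Hypothetical helper for illustration; not part of emmylua_code_analysis.
fn repeated_self_assignments(op: &str, operand: &str, count: usize) -> String {
    let mut src = format!("local x = {}\n", operand);
    for _ in 0..count {
        // Each line reads the previous value of `x` and writes it back,
        // so the type of `x` at line N depends on its type at line N-1.
        src.push_str(&format!("x = x {} {}\n", op, operand));
    }
    src
}
```

Calling it with `".."` and a string operand produces a concat chain, and calling it with `"+"` and a number produces the arithmetic variant mentioned above.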

Solution

Add a per-flow-query step budget to the iterative flow scheduler.
When the budget is exceeded, log a warning, clear in-progress flow cache guards, cache unknown for active/pending queries, and keep analysis moving.
Add an internal development/test option to disable the fallback when a runaway query needs to be reproduced.
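
The idea can be sketched roughly as follows. This is a simplified model, not the code in this PR: `Ty`, `run_query`, `deps`, and `STEP_BUDGET` are hypothetical names, the value 50_000 is taken from the review comment further down, and the clearing of in-progress cache guards is not modeled.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an inferred type.
#[derive(Clone, Debug, PartialEq)]
enum Ty {
    Unknown,
    Number,
}

/// Illustrative per-query budget (assumed value).
const STEP_BUDGET: usize = 50_000;

/// Resolves query `start` iteratively. `deps[q]` is the query that `q`
/// depends on; `None` marks a base case that resolves to `Ty::Number`.
/// Passing `budget = None` models the switch that disables the fallback.
fn run_query(
    start: usize,
    deps: &[Option<usize>],
    cache: &mut HashMap<usize, Ty>,
    budget: Option<usize>,
) -> Ty {
    let mut pending = vec![start];
    let mut steps = 0usize;
    while let Some(&q) = pending.last() {
        steps += 1;
        if budget.map_or(false, |limit| steps > limit) {
            eprintln!("warning: flow query {} exceeded its step budget", start);
            // Cache Unknown for every active/pending query so later
            // lookups return immediately instead of retrying.
            for p in pending.drain(..) {
                cache.insert(p, Ty::Unknown);
            }
            return Ty::Unknown;
        }
        if cache.contains_key(&q) {
            pending.pop();
            continue;
        }
        match deps[q] {
            None => {
                cache.insert(q, Ty::Number);
                pending.pop();
            }
            Some(d) => match cache.get(&d).cloned() {
                Some(t) => {
                    cache.insert(q, t);
                    pending.pop();
                }
                // Dependency not resolved yet: schedule it first.
                None => pending.push(d),
            },
        }
    }
    cache[&start].clone()
}
```

In this model a well-formed dependency chain resolves normally, while a cycle such as `deps = [Some(1), Some(0)]` exhausts the budget and yields `Ty::Unknown` instead of looping forever.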

Tests

cargo test -p emmylua_code_analysis test_issue_1114_repeated_self_dependent_assignments_build_semantic_model -- --nocapture
cargo test -p emmylua_code_analysis flow
cargo fmt --all --check
git -c core.fsmonitor=false diff --check

Log a warning and return unknown when a single flow query exceeds the step budget. This keeps semantic model construction moving for pathological repeated assignments while preserving normal flow precision.

Add a development/test switch that can turn the fallback off when a runaway flow query needs to be reproduced.

Fixes EmmyLuaLs#1114

Assisted-by: Codex

github-actions

Here's my code review of the changes:

Issues Found:

1. Potential Logic Error: `cache_options` field not used in `get_infer_cache`

File: infer_cache_manager.rs
Issue: The cache_options field is stored but only analysis_phase is being modified. Other fields in CacheOptions (like disable_flow_inference_step_budget) are being overridden/ignored.
Suggestion: Either use the full cache_options or document why only analysis_phase needs to be overridden.

2. Missing `Clone` implementation for `CacheOptions`

File: cache_options.rs
Issue: CacheOptions is Clone + Copy but LuaAnalysisPhase is only Clone + Copy. This is fine, but ensure all consumers expect this behavior.
Suggestion: Verify that no code depends on CacheOptions being mutable after creation.

3. Potential Performance Issue: Cloning `cache_options` on every cache access

File: infer_cache_manager.rs line 22
Issue: let mut cache_options = self.cache_options; creates a copy on every cache lookup, which is unnecessary since it's Copy.
Suggestion: Use self.cache_options directly or reference it without copying.

4. Inconsistent Budget Handling

File: get_type_at_flow.rs
Issue: The budget check in evaluate_walk returns ContinueWalk without consuming a step, while start_query consumes a step. This could lead to inconsistent budget tracking.
Suggestion: Ensure step consumption is consistent across all code paths.

5. Missing Error Handling for Budget Exhaustion

File: get_type_at_flow.rs
Issue: When budget is exhausted, the function returns LuaType::Unknown silently (except for a warning log). This could mask real issues.
Suggestion: Consider returning a more specific error type or adding a metric/counter for budget exhaustion events.

6. Test Coverage Gap

File: flow.rs
Issue: The test only checks if get_semantic_model returns Some, not the actual type inference results.
Suggestion: Add assertions to verify the inferred types are correct (e.g., string, number, boolean).

Recommendations:

Document the budget constant - Add a comment explaining why 50_000 was chosen and how to tune it.
Consider making budget configurable - The disable_flow_inference_step_budget flag is good for debugging, but consider making the budget value configurable too.
Add budget exhaustion metrics - Track how often the budget is exceeded to help tune the constant.
Review thread safety - Ensure CacheOptions being Copy doesn't cause issues in concurrent scenarios.

Overall, the changes look well-structured and address the performance issue with flow inference. The budget mechanism is a good addition to prevent infinite loops or excessive computation.
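
For readers weighing items 1 and 3 above, the following sketch shows the general Rust behavior involved. It is an assumption-laden model, not the PR's code: the struct and field names come from the review text, while the enum variants `Early` and `Late`, the `InferCacheManager` type, and the `options_for` method are hypothetical. Assigning a `Copy` struct to a local and overriding one field leaves the other fields intact and does not touch the stored value.

```rust
/// Hypothetical variants; the real LuaAnalysisPhase may differ.
#[derive(Clone, Copy, Debug, PartialEq)]
enum LuaAnalysisPhase {
    Early,
    Late,
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct CacheOptions {
    analysis_phase: LuaAnalysisPhase,
    disable_flow_inference_step_budget: bool,
}

/// Hypothetical holder of the stored options.
struct InferCacheManager {
    cache_options: CacheOptions,
}

impl InferCacheManager {
    /// Returns the stored options with only the phase overridden.
    fn options_for(&self, phase: LuaAnalysisPhase) -> CacheOptions {
        // A plain copy of a small Copy struct; no heap allocation.
        let mut cache_options = self.cache_options;
        cache_options.analysis_phase = phase;
        // disable_flow_inference_step_budget is carried over unchanged.
        cache_options
    }
}
```

If the real code follows this pattern, the other fields of `CacheOptions` are preserved rather than ignored, and the per-lookup copy costs about as much as copying a couple of small values.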

gemini-code-assist

Code Review

This pull request introduces a step budget (FLOW_INFERENCE_STEP_BUDGET) for flow inference queries to prevent semantic model construction from stalling on complex or deeply nested self-dependent assignments. It propagates CacheOptions through the analysis pipeline to allow disabling this budget during testing, and adds a regression test for issue 1114. I have no feedback to provide as there are no review comments to address.


lewis6991 · 2026-06-15T16:30:14Z

This is a more general mitigation for issues like #1114: it loses precision but avoids flow getting stuck. #1115 is a more specific solution for #1114 that doesn't lose precision.

I'm not 100% sure about this. It improves user experience, but will mask issues in the flow engine. I added a flag so we can at least disable this for tests.

tzssangglass · 2026-06-19T16:47:48Z

I tested this PR locally, and it works for me.

lewis6991 · 2026-06-19T16:52:29Z

Did you try #1115? That is a real fix.

tzssangglass · 2026-06-20T01:51:29Z

1. Build EmmyLuaLS based on #1115 (fix: stop repeated assignments from hanging).
2. Delete the .emmy folder under the large project.
3. Start the large project.

I observed that the CPU usage of the emmylua_ls process rose above 100%. After a long time, the CPU remained at 100%, exactly the same as before.

I think #1115 does not resolve the problem.

lewis6991 · 2026-06-20T08:15:31Z

It's not just about the project being large. There are specific forms of code that can push analysis to work hard. #1115 fixes that for a specific case. This PR just tells analysis to give up after a while.

Are you able to provide a test case that doesn't work with #1115?

tzssangglass · 2026-06-20T09:13:01Z

Observation: After startup, the workspace analysis enters the "flow analyze" stage. The single-threaded CPU usage remains at 100% for over 5 minutes without decreasing. This is a pure CPU calculation (without file I/O).

GDB backtrace (LWP 3795686, the thread consuming CPU resources):

  #0  mi_theap_malloc_zero_aligned_at_overalloc
  #1  hashbrown::raw::RawTableInner::fallible_with_capacity
  #2  hashbrown::raw::RawTable<T,A>::reserve_rehash
  #3  hashbrown::map::HashMap<K,V,S,A>::insert
  #4  emmylua_code_analysis::semantic::infer::infer_expr
  #5  emmylua_code_analysis::semantic::cache::LuaInferCache::with_replay_overlay
  #6  emmylua_code_analysis::semantic::infer::narrow::get_type_at_flow::FlowReplayQuery::replay_type
  #7  emmylua_code_analysis::semantic::infer::narrow::get_type_at_flow::FlowTypeEngine::start_expr_replay
  #8  emmylua_code_analysis::semantic::infer::narrow::infer_expr_narrow_type
  #9  emmylua_code_analysis::semantic::infer::infer_name::infer_var_ref_type
  #10 emmylua_code_analysis::semantic::infer::infer_name::infer_name_expr
  #11 emmylua_code_analysis::semantic::infer::infer_expr
  #12 emmylua_code_analysis::semantic::infer::infer_index::try_infer_expr_for_index
  #13 emmylua_code_analysis::semantic::infer::infer_index::infer_index_expr
  #14 emmylua_code_analysis::semantic::infer::infer_expr
  #15 emmylua_code_analysis::semantic::infer::infer_index::try_infer_expr_for_index
  #16 emmylua_code_analysis::semantic::infer::infer_index::infer_index_expr
  #17 emmylua_code_analysis::semantic::infer::infer_expr
  #18 emmylua_code_analysis::semantic::infer::infer_expr
  #19 emmylua_code_analysis::semantic::infer::infer_expr_list_types
  #20 emmylua_code_analysis::semantic::overload_resolve::resolve_signature
  #21 emmylua_code_analysis::semantic::infer::infer_call::infer_call_expr_func
  #22 emmylua_code_analysis::semantic::infer::infer_expr
  #23 emmylua_code_analysis::compilation::analyzer::lua::stats::analyze_local_stat
  #24 <LuaAnalysisPipeline as AnalysisPipeline>::analyze
  #25 emmylua_code_analysis::compilation::analyzer::analyze
  #26 emmylua_code_analysis::compilation::LuaCompilation::update_index
  #27 emmylua_code_analysis::EmmyLuaAnalysis::update_files_by_uri
  #28 emmylua_code_analysis::EmmyLuaAnalysis::reload_workspace_files
  #29 emmylua_ls::handlers::initialized::initialized_handler::{{closure}}

strace (3-second sample, confirming that this is pure CPU work with no file I/O):

  % time     seconds  usecs/call     calls    errors syscall
    0.00    0.000000           0        78           write
  100.00    0.000000           0        78           total

Log (last few lines; analysis stops making progress after reaching the "flow analyze" stage):

  load files from workspace count: 9118
  update files: cost 2.480971976s
  module analyze: cost 19.9563ms
  decl analyze: cost 2.524074206s
  doc analyze: cost 436.86232ms
  flow analyze: cost 628.259041ms

My AI's speculation on this (sorry, I don't fully understand the source code, and the AI's speculation may be misleading):

  Call stack by frame number, top to bottom:
  - #4 infer_expr
  - #5 try_infer_expr_for_index
  - #6 infer_index_expr
  - #7 infer_expr
  - #8 try_infer_expr_for_index
  - #9 infer_index_expr
  - #10 infer_expr
  - #12 instantiate_func_generic::infer_call_arg_type
  - #13 instantiate_func_generic
  - #14 resolve_signature

  Interpretations I added (not verified):

  - "mutual recursion": Because infer_expr and infer_index_expr alternate in the stack, I inferred they are mutually recursive. However, a single gdb bt is a snapshot at one instant; it cannot 100% prove it's an infinite loop (it could be a very deep but finite call, or just happened to stop at this frame). To confirm it's a loop, you'd need multiple consecutive bt snapshots to see if the stack keeps growing or stays constant.
  - "on chained index a.b.c...": A guess. infer_index might be analyzing something like a.b.c, but it could also be some other Lua construct triggering index inference. No evidence.
  - "triggered during generic function instantiation": #12-#14 do contain instantiate_func_generic/resolve_signature in their names, so this is relatively reliable (the function names say so). But whether it's the trigger or just a frame on the recursion path, I cannot determine.
  - "NOT the self_dependent_assignment path that #1115 fixes": Inference. #1115 modifies self_dependent_assignment_operator_type, and this stack doesn't show that function name, so I said "not the same path." But that only means this stack doesn't show it; it doesn't prove the triggering logic is entirely unrelated.

lewis6991 · 2026-06-20T10:09:31Z

> Are you able to provide a test case that doesn't work with #1115

This please.

CppCXY · 2026-06-22T09:06:06Z

> Observation: After startup, the workspace analysis enters the "flow analyze" stage. The single-threaded CPU usage remains at 100% for over 5 minutes without decreasing. This is a pure CPU calculation (without file I/O).
>
> GDB backtrace (LWP 3795686, the thread consuming CPU resources):
> ...

We obviously need a reproducible example that can be run independently. In practice, this kind of freeze is usually related to the shape of the code in a single file. Of course, much of the internal code cannot be made public. If you are willing to continue helping to investigate the issue, you can follow the general approach I have suggested to others for testing: copy the project's code out, cut half of it away, open the editor to test whether it loads properly. If it does not, cut away half again. If it does load properly, then the problem may lie in the other half that was cut away. Repeat this process until you find the smallest set of one or a few Lua files that can reliably reproduce the issue. If the relevant code is not convenient to disclose, you can obfuscate it manually, remove sensitive information, keep the issue reproducible, and then package and submit those files.
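
The halving procedure described above can be sketched as a small function. This is only an illustration of the manual process: `bisect_files` and the `reproduces` callback are hypothetical names, and in practice the check is a person opening the editor and observing whether the workspace loads.

```rust
/// Narrows `files` to a smaller set that still reproduces the problem.
/// `reproduces` is a hypothetical check, e.g. "does analysis hang when
/// only these files are present?".
fn bisect_files<T: Clone>(files: &[T], reproduces: impl Fn(&[T]) -> bool) -> Vec<T> {
    let mut current = files.to_vec();
    while current.len() > 1 {
        let (left, right) = current.split_at(current.len() / 2);
        if reproduces(left) {
            // The problem survives in the first half; discard the rest.
            current = left.to_vec();
        } else if reproduces(right) {
            // The problem lies in the half that was cut away.
            current = right.to_vec();
        } else {
            // Neither half alone reproduces it, so the issue needs files
            // from both halves; stop and report the current set.
            break;
        }
    }
    current
}
```

When a single file is responsible, this converges to that file in roughly log2(n) checks; when several files must be present together, it stops early with the smallest set it could confirm.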

tzssangglass · 2026-06-23T02:06:48Z

OK, I will find some time to reproduce it.

lewis6991 · 2026-06-23T13:38:54Z

> Observation: After startup, the workspace analysis enters the "flow analyze" stage. The single-threaded CPU usage remains at 100% for over 5 minutes without decreasing. This is a pure CPU calculation (without file I/O).
>
> GDB backtrace (LWP 3795686, the thread consuming CPU resources):
>   #0  mi_theap_malloc_zero_aligned_at_overalloc
>   #1  hashbrown::raw::RawTableInner::fallible_with_capacity
>   #2  hashbrown::raw::RawTable<T,A>::reserve_rehash
>   #3  hashbrown::map::HashMap<K,V,S,A>::insert
> ...
> ...

This should be fixed in #1115 now.

github-actions Bot reviewed Jun 15, 2026


gemini-code-assist Bot reviewed Jun 15, 2026

lewis6991 wants to merge 1 commit into EmmyLuaLs:main from lewis6991:flow-budget-fallback


3 participants
