WS5: LPIM / IFI / memory dumps as advise-only server-health recommendations #1076

Merged
erikdarlingdata merged 2 commits into dev from feature/ws5-lpim-ifi-dump-facts on Jun 8, 2026

@erikdarlingdata

What

Part of the recommendations rebuild. Surfaces three standing server-health gaps as advise-only recommendation cards on the existing recommendations path, with no Apply (these need OS / service-account / investigation work, not an sp_configure flip):

| Fact key | Condition |
| --- | --- |
| CONFIG_IFI_DISABLED | Instant File Initialization off |
| CONFIG_LPIM_DISABLED | Lock Pages in Memory off |
| SERVER_MEMORY_DUMPS | sys.dm_server_memory_dumps count > 0 |

Vertical slice across the Dashboard SQL collector + both apps' C#, mirroring the WS3 config-advisory pattern (CONFIG_MAXDOP / CONFIG_CTFP / …) exactly.

No new Agent job / no live runs

  • Extends the existing collect.server_properties_collector, already dispatched by the one Collection job's master collector (install/42_…, ~line 330). No job, schedule, or new proc.
  • Code + unit tests only — not run against any live SQL Server. The install/upgrade live test is for the reviewer (see "Not verified live").

Changes

Schema — 3 nullable columns on collect.server_properties (lock_pages_in_memory bit, instant_file_initialization_enabled bit, memory_dump_count integer):

  • install/02_create_tables.sql — added to the CREATE TABLE (fresh installs).
  • upgrades/2.11.0-to-2.12.0/04_add_server_health_columns.sql — idempotent guarded IF NOT EXISTS (sys.columns) … ALTER … ADD (existing installs), appended to upgrade.txt.
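
The guarded "add the column only if it is missing" pattern used by the upgrade script can be sketched as follows. This is a minimal illustration against SQLite through Python's `sqlite3` module, not the project's T-SQL; the table name, the `collection_id` column, and the `add_missing_columns` helper are hypothetical, and SQLite's `PRAGMA table_info` stands in for the `sys.columns` check.

```python
import sqlite3

# Hypothetical column list mirroring the three nullable columns in this PR.
# SQLite has no bit type, so INTEGER stands in for both bit and integer.
NEW_COLUMNS = {
    "lock_pages_in_memory": "INTEGER",
    "instant_file_initialization_enabled": "INTEGER",
    "memory_dump_count": "INTEGER",
}


def add_missing_columns(conn, table="server_properties"):
    """Add each column only if it is absent; return the names actually added."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, sql_type in NEW_COLUMNS.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {sql_type}")
            added.append(name)
    return added


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE server_properties (collection_id INTEGER PRIMARY KEY)")
first_run = add_missing_columns(conn)   # adds all three columns
second_run = add_missing_columns(conn)  # finds them present and adds nothing
```

Running the helper twice is the point of the sketch: the second pass is a no-op, which is what lets an upgrade script be re-run safely on an install that already has the columns.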

Collector (install/53_collect_server_properties.sql): reads LPIM from sys.dm_os_sys_info.sql_memory_model (2/3 → on, 1 → off); IFI from sys.dm_server_services.instant_file_initialization_enabled (guarded on DMV + column existence, read via dynamic SQL so older builds compile); dumps from COUNT_BIG(*) over sys.dm_server_memory_dumps (DMV-guarded). Each is engine-edition-guarded + wrapped in TRY/CATCH and left NULL when unavailable so the collector never fails; all three are folded into the row_hash so a change re-collects. Dedup/skip + collection_log behavior untouched.
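
As a rough illustration of two of the rules above (the LPIM mapping and folding the new values into the row hash), here is a minimal Python sketch. It is not the collector's T-SQL: the function names, the SHA-256 digest, and the `<null>` marker are assumptions made for the example, and the real row_hash computation is not shown in this PR description.

```python
import hashlib


def lpim_from_memory_model(sql_memory_model):
    """Map sys.dm_os_sys_info.sql_memory_model to a nullable LPIM bit."""
    if sql_memory_model in (2, 3):
        return 1  # 2/3 -> Lock Pages in Memory is on
    if sql_memory_model == 1:
        return 0  # 1 -> off
    return None   # unavailable or unrecognised: leave NULL


def row_hash(existing_fields, lpim, ifi, dump_count):
    """Hypothetical hash: a change in any of the three new values changes the digest."""
    parts = list(existing_fields) + [lpim, ifi, dump_count]
    # Encode None distinctly from 0 so "unknown" and "off" never collide.
    text = "|".join("<null>" if p is None else str(p) for p in parts)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


before = row_hash(["Standard", 16], lpim_from_memory_model(1), 1, 0)
after = row_hash(["Standard", 16], lpim_from_memory_model(2), 1, 0)
changed = before != after  # True: enabling LPIM would trigger a re-collect
```

Keeping NULL distinct from 0 in the hash input matters here because the collector deliberately leaves a value NULL when it cannot be read, and that state should not be mistaken for "off".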

Shared facts (PerformanceMonitor.Analysis, advise-only):

  • FactScorer.ScoreConfigFact — each key scores a 0.4 advisory base only when bad (IFI/LPIM Value==0, dumps Value>0), else 0.
  • InferenceEngine.ConfigAdvisoryRootKeys — each roots a standalone card below the 0.5 incident threshold (surfaces on a quiet, healthy server).
  • FactAdvice — one advice block per key (headline + investigation + copy-paste guidance; no generated Apply T-SQL).
  • No RemediationAction / handler / Apply.
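
The scoring rule described above is small enough to sketch directly. The following is a Python illustration of the stated behavior only, not the C# in FactScorer.ScoreConfigFact; the function name, constants, and the fallback of 0 for other keys are assumptions for the example.

```python
ADVISORY_BASE = 0.4        # advisory score stated in this PR
INCIDENT_THRESHOLD = 0.5   # incident threshold stated in this PR


def score_config_fact(key, value):
    """Return the advisory base when the fact is 'bad', otherwise 0."""
    if key in ("CONFIG_IFI_DISABLED", "CONFIG_LPIM_DISABLED"):
        # For these two keys the value is the setting's bit: 0 means off.
        return ADVISORY_BASE if value == 0 else 0.0
    if key == "SERVER_MEMORY_DUMPS":
        # Here the value is the dump count: anything above 0 is worth a look.
        return ADVISORY_BASE if value > 0 else 0.0
    return 0.0  # assumption: keys outside this PR are not scored by the sketch


ifi_off = score_config_fact("CONFIG_IFI_DISABLED", 0)
ifi_on = score_config_fact("CONFIG_IFI_DISABLED", 1)
dumps = score_config_fact("SERVER_MEMORY_DUMPS", 3)
```

Because 0.4 sits below the 0.5 incident threshold, a bad value yields a standalone advisory card rather than an incident, which is why these findings can surface on an otherwise quiet server.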

Both collectors emit the facts with identical noise-control gating (Dashboard SqlServerFactCollector, Lite DuckDbFactCollector + RemoteCollectorService.ServerProperties + DuckDB schema v27 migration).

Thresholds chosen — ⚠️ please tune

  • IFI off → always advisory (low-noise, universally good advice).
  • LPIM off → only on non-Express editions with ≥ 32 GB RAM — don't flag tiny instances. The 32 GB floor is the main tuning knob (a shared LpimAdvisoryMinPhysicalMemoryMb const in both collectors).
  • Memory dumps → advisory whenever count > 0 (a dump always warrants a look).
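
To make the gating concrete for whoever tunes it, the three rules can be sketched as one function. This is a Python illustration, not the collectors' C#: the function and parameter names are hypothetical, the floor is assumed to be 32 × 1024 MB (the PR names the constant LpimAdvisoryMinPhysicalMemoryMb but does not quote its value), and treating NULL inputs as "no advisory" is also an assumption.

```python
# Assumption: the 32 GB floor expressed in MB.
LPIM_ADVISORY_MIN_PHYSICAL_MEMORY_MB = 32 * 1024


def advisories(ifi_enabled, lpim_enabled, dump_count, is_express, physical_memory_mb):
    """Return the fact keys that would be advised for one server snapshot.

    None models a NULL column (value could not be collected).
    """
    keys = []
    if ifi_enabled == 0:
        keys.append("CONFIG_IFI_DISABLED")  # always advisory when off
    big_enough = (
        physical_memory_mb is not None
        and physical_memory_mb >= LPIM_ADVISORY_MIN_PHYSICAL_MEMORY_MB
    )
    if lpim_enabled == 0 and not is_express and big_enough:
        keys.append("CONFIG_LPIM_DISABLED")  # skip Express and small instances
    if dump_count is not None and dump_count > 0:
        keys.append("SERVER_MEMORY_DUMPS")  # any dump warrants a look
    return keys


large = advisories(0, 0, 2, is_express=False, physical_memory_mb=64 * 1024)
small = advisories(0, 0, 0, is_express=False, physical_memory_mb=16 * 1024)
```

In this sketch, raising or lowering the single constant is the only change needed to move the LPIM floor, which matches the intent of sharing one const across both collectors.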

Tests

14 new cases in Dashboard.Tests (the shared-library home for the WS3 analogues): scoring (bad → 0.4, fine → 0), rooting (each roots a standalone finding below 0.5, mirroring ServerConfigFact_RootsStandalone_BelowMinimumSeverity), and advice presence.

  • Dashboard.Tests: 420 → 434 (all pass)
  • Lite.Tests: 362 → 362 (unchanged — no regression)
  • Both apps build clean (0 Error(s), 0 Warning(s) from the changed code).

Not verified live

No run against a live SQL Server, so the actual DMV reads and the upgrade ALTER are unverified on a real instance. Version-gating: IFI guarded on DMV + column existence (+ dynamic SQL); dumps guarded on DMV existence; both skipped on Azure SQL DB (engine edition 5); LPIM left NULL on Azure. Reviewer to run the install + upgrade test.

🤖 Generated with Claude Code

…ations

Surface three standing server-health gaps as advise-only recommendation cards
on the existing recommendations path (no Apply — these need OS / service-account
/ investigation work, not a setting flip):

  - CONFIG_IFI_DISABLED  — Instant File Initialization off
  - CONFIG_LPIM_DISABLED — Lock Pages in Memory off
  - SERVER_MEMORY_DUMPS  — sys.dm_server_memory_dumps count > 0

Vertical slice across the Dashboard SQL collector and both apps' C#, mirroring
the WS3 config-advisory pattern exactly.

Schema (collect.server_properties): three new nullable columns
lock_pages_in_memory bit, instant_file_initialization_enabled bit,
memory_dump_count integer — added to the CREATE TABLE in install/02 (fresh
installs) AND via an idempotent guarded ALTER in
upgrades/2.11.0-to-2.12.0/04_add_server_health_columns.sql (existing installs).

Collector (install/53_collect_server_properties.sql): extends the existing
collect.server_properties_collector (dispatched by the one Collection job's
master collector — no new job/schedule/proc). Reads LPIM from
sys.dm_os_sys_info.sql_memory_model, IFI from
sys.dm_server_services.instant_file_initialization_enabled (DMV + column guarded,
dynamic SQL so older builds compile), dumps from sys.dm_server_memory_dumps
(DMV guarded). Each is defensive (engine-edition guard + TRY/CATCH) and left NULL
when unavailable so the collector never fails; all three are folded into the
row_hash so a change re-collects.

Facts (advise-only, mirrors WS3):
  - FactScorer.ScoreConfigFact: each key scores a 0.4 advisory base only when
    bad (IFI/LPIM Value==0, dumps Value>0), 0 otherwise.
  - InferenceEngine.ConfigAdvisoryRootKeys: each roots a standalone card below
    the 0.5 incident threshold (surfaces on a quiet, healthy server).
  - FactAdvice: one advice block per key (headline + investigation + copy-paste
    remediation guidance; no generated Apply T-SQL).

Collectors emit the facts with shared noise-control gating (Dashboard
SqlServerFactCollector + Lite DuckDbFactCollector + Lite
RemoteCollectorService.ServerProperties + DuckDB schema v27 migration):
  - IFI off: always advisory (low-noise, universally good advice).
  - LPIM off: only on non-Express editions with >= 32 GB RAM (don't flag tiny
    instances). [THRESHOLD FLAGGED FOR TUNING]
  - Memory dumps: advisory whenever count > 0.

Tests (Dashboard.Tests, shared-library home for the WS3 analogues):
14 new cases — scoring (bad -> 0.4, fine -> 0), rooting (each roots standalone
below 0.5), and advice presence. Dashboard.Tests 420 -> 434, Lite.Tests 362
(unchanged). Both apps build clean.

NOT verified live (no run against a live SQL Server): the actual DMV reads and
the upgrade ALTER are untested on a real instance — reviewer to run the
install/upgrade test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ump-facts

# Conflicts:
#	Dashboard.Tests/InferenceEngineTests.cs
@erikdarlingdata erikdarlingdata merged commit 6c76eb5 into dev Jun 8, 2026
6 checks passed