WS5: LPIM / IFI / memory dumps as advise-only server-health recommendations #1076
Merged
Surface three standing server-health gaps as advise-only recommendation cards
on the existing recommendations path (no Apply — these need OS / service-account
/ investigation work, not a setting flip):
- CONFIG_IFI_DISABLED — Instant File Initialization off
- CONFIG_LPIM_DISABLED — Lock Pages in Memory off
- SERVER_MEMORY_DUMPS — sys.dm_server_memory_dumps count > 0
Vertical slice across the Dashboard SQL collector and both apps' C#, mirroring
the WS3 config-advisory pattern exactly.
Schema (collect.server_properties): three new nullable columns
lock_pages_in_memory bit, instant_file_initialization_enabled bit,
memory_dump_count integer — added to the CREATE TABLE in install/02 (fresh
installs) AND via an idempotent guarded ALTER in
upgrades/2.11.0-to-2.12.0/04_add_server_health_columns.sql (existing installs).
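The guarded ALTER described above can be sketched as follows. This is only an illustration of the "add the column if sys.columns does not list it yet" pattern, not the contents of the actual upgrade script: the helper names (guarded_add_column, build_upgrade_script) are hypothetical, and the T-SQL is generated from Python purely so the shape is easy to inspect.

```python
# Hypothetical sketch: the real upgrade script is plain T-SQL and may differ.
# Column names and types are taken from the commit message.
COLUMNS = [
    ("lock_pages_in_memory", "bit"),
    ("instant_file_initialization_enabled", "bit"),
    ("memory_dump_count", "integer"),
]


def guarded_add_column(table: str, column: str, sql_type: str) -> str:
    """Return a T-SQL batch that adds a nullable column only when it is missing."""
    return (
        "IF NOT EXISTS (\n"
        "    SELECT 1 FROM sys.columns\n"
        f"    WHERE object_id = OBJECT_ID(N'{table}') AND name = N'{column}'\n"
        ")\n"
        "BEGIN\n"
        f"    ALTER TABLE {table} ADD {column} {sql_type} NULL;\n"
        "END;"
    )


def build_upgrade_script(table: str = "collect.server_properties") -> str:
    """Join one guarded batch per new column; re-running the script is a no-op."""
    return "\n\n".join(guarded_add_column(table, c, t) for c, t in COLUMNS)


if __name__ == "__main__":
    print(build_upgrade_script())
```

Because each batch checks sys.columns first, running the script twice leaves the table unchanged, which is what makes it safe for existing installs.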
Collector (install/53_collect_server_properties.sql): extends the existing
collect.server_properties_collector (dispatched by the one Collection job's
master collector — no new job/schedule/proc). Reads LPIM from
sys.dm_os_sys_info.sql_memory_model, IFI from
sys.dm_server_services.instant_file_initialization_enabled (DMV + column guarded,
dynamic SQL so older builds compile), dumps from sys.dm_server_memory_dumps
(DMV guarded). Each is defensive (engine-edition guard + TRY/CATCH) and left NULL
when unavailable so the collector never fails; all three are folded into the
row_hash so a change re-collects.
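A minimal sketch of the defensive-read idea, assuming the collector's behavior is as described above. The 2/3 → on, 1 → off mapping for sql_memory_model comes from the PR description; the function names, the SHA-256 hash, and the "|" separator are assumptions made for illustration and are not taken from the collector itself.

```python
import hashlib
from typing import Callable, Optional


def lpim_from_memory_model(sql_memory_model: Optional[int]) -> Optional[int]:
    """Map sys.dm_os_sys_info.sql_memory_model to a bit: 2/3 -> 1, 1 -> 0, else None."""
    if sql_memory_model in (2, 3):
        return 1
    if sql_memory_model == 1:
        return 0
    return None  # unknown or unavailable: leave NULL


def safe_read(reader: Callable[[], Optional[int]]) -> Optional[int]:
    """TRY/CATCH analogue: a failing read yields None instead of failing collection."""
    try:
        return reader()
    except Exception:
        return None


def row_hash(*values: object) -> str:
    """Hash all collected values (assumed SHA-256) so any change triggers a re-collect."""
    # None gets its own token so a NULL -> 0 transition changes the hash.
    parts = ["<NULL>" if v is None else str(v) for v in values]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def unavailable_dmv() -> Optional[int]:
    """Stand-in for a read against a DMV that does not exist on this build."""
    raise RuntimeError("DMV not available on this build")


if __name__ == "__main__":
    lpim = lpim_from_memory_model(2)
    ifi = safe_read(unavailable_dmv)  # read fails, value stays None
    dumps = safe_read(lambda: 0)
    print(lpim, ifi, dumps, row_hash(lpim, ifi, dumps))
```

Folding the three values into the hash means a server that later gains IFI, for example, produces a different hash and is collected again rather than skipped as a duplicate.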
Facts (advise-only, mirrors WS3):
- FactScorer.ScoreConfigFact: each key scores a 0.4 advisory base only when
bad (IFI/LPIM Value==0, dumps Value>0), 0 otherwise.
- InferenceEngine.ConfigAdvisoryRootKeys: each roots a standalone card below
the 0.5 incident threshold (surfaces on a quiet, healthy server).
- FactAdvice: one advice block per key (headline + investigation + copy-paste
remediation guidance; no generated Apply T-SQL).
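The scoring and rooting rules above can be sketched like this. The 0.4 base, the 0.5 threshold, and the three keys come from the text; everything else (function names, the exact rooting rule) is an assumption about how FactScorer and InferenceEngine behave, written in Python rather than the project's C#.

```python
ADVISORY_BASE = 0.4       # advisory score from the PR text
INCIDENT_THRESHOLD = 0.5  # incident threshold from the PR text

# Keys that are "bad" when the value is 0 (the feature is off).
ZERO_IS_BAD = {"CONFIG_IFI_DISABLED", "CONFIG_LPIM_DISABLED"}
# Keys that are "bad" when the value is above 0 (dumps exist).
POSITIVE_IS_BAD = {"SERVER_MEMORY_DUMPS"}

# Assumed analogue of InferenceEngine.ConfigAdvisoryRootKeys.
CONFIG_ADVISORY_ROOT_KEYS = ZERO_IS_BAD | POSITIVE_IS_BAD


def score_config_fact(key: str, value: float) -> float:
    """Return the advisory base only when the fact is in its bad state, else 0."""
    if key in ZERO_IS_BAD:
        return ADVISORY_BASE if value == 0 else 0.0
    if key in POSITIVE_IS_BAD:
        return ADVISORY_BASE if value > 0 else 0.0
    return 0.0


def can_root_card(key: str, score: float) -> bool:
    """Assumed rule: incidents always root; advisory keys root below the threshold."""
    if score >= INCIDENT_THRESHOLD:
        return True
    return key in CONFIG_ADVISORY_ROOT_KEYS and score > 0


if __name__ == "__main__":
    for key, value in [("CONFIG_IFI_DISABLED", 0), ("SERVER_MEMORY_DUMPS", 0)]:
        score = score_config_fact(key, value)
        print(key, score, can_root_card(key, score))
```

Keeping the advisory score below the incident threshold is what lets these cards appear on an otherwise quiet server without being treated as incidents.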
Collectors emit the facts with shared noise-control gating (Dashboard
SqlServerFactCollector + Lite DuckDbFactCollector + Lite
RemoteCollectorService.ServerProperties + DuckDB schema v27 migration):
- IFI off: always advisory (low-noise, universally good advice).
- LPIM off: only on non-Express editions with >= 32 GB RAM (don't flag tiny
instances). [THRESHOLD FLAGGED FOR TUNING]
- Memory dumps: advisory whenever count > 0.
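The gating rules in the list above might look roughly like the following. The 32 GB threshold and the three conditions are from the text; the ServerProperties shape, the substring check for Express, and the choice to emit a fact only when it is flagged are assumptions made for this sketch (the real collectors are C# and may emit facts differently).

```python
from dataclasses import dataclass
from typing import Dict, Optional

# 32 GB expressed in MB; mirrors the LPIM threshold flagged for tuning.
LPIM_ADVISORY_MIN_PHYSICAL_MEMORY_MB = 32 * 1024


@dataclass
class ServerProperties:
    """Hypothetical stand-in for a collected server_properties row."""
    edition: str
    physical_memory_mb: int
    instant_file_initialization_enabled: Optional[int]
    lock_pages_in_memory: Optional[int]
    memory_dump_count: Optional[int]


def advisory_facts(p: ServerProperties) -> Dict[str, float]:
    """Return the advisory facts to emit; NULL (None) inputs never produce a fact."""
    facts: Dict[str, float] = {}

    # IFI off: always advisory.
    if p.instant_file_initialization_enabled == 0:
        facts["CONFIG_IFI_DISABLED"] = 0

    # LPIM off: only on non-Express editions with at least 32 GB of RAM.
    is_express = "express" in p.edition.lower()  # assumed detection method
    big_enough = p.physical_memory_mb >= LPIM_ADVISORY_MIN_PHYSICAL_MEMORY_MB
    if p.lock_pages_in_memory == 0 and not is_express and big_enough:
        facts["CONFIG_LPIM_DISABLED"] = 0

    # Memory dumps: advisory whenever the count is above zero.
    if p.memory_dump_count is not None and p.memory_dump_count > 0:
        facts["SERVER_MEMORY_DUMPS"] = p.memory_dump_count

    return facts


if __name__ == "__main__":
    row = ServerProperties("Standard Edition (64-bit)", 65536, 0, 0, 2)
    print(advisory_facts(row))
```

Sharing one gating function (or one constant) between both collectors is one way to keep the Dashboard and Lite apps from drifting apart when the threshold is tuned.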
Tests (Dashboard.Tests, shared-library home for the WS3 analogues):
14 new cases — scoring (bad -> 0.4, fine -> 0), rooting (each roots standalone
below 0.5), and advice presence. Dashboard.Tests 420 -> 434, Lite.Tests 362
(unchanged). Both apps build clean.
NOT verified live (no run against a live SQL Server): the actual DMV reads and
the upgrade ALTER are untested on a real instance — reviewer to run the
install/upgrade test.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ump-facts # Conflicts: # Dashboard.Tests/InferenceEngineTests.cs
What
Part of the recommendations rebuild. Surfaces three standing server-health gaps as advise-only recommendation cards on the existing recommendations path. There is no Apply: these need OS / service-account / investigation work, not an sp_configure flip.
- CONFIG_IFI_DISABLED: Instant File Initialization off
- CONFIG_LPIM_DISABLED: Lock Pages in Memory off
- SERVER_MEMORY_DUMPS: sys.dm_server_memory_dumps count > 0
Vertical slice across the Dashboard SQL collector and both apps' C#, mirroring the WS3 config-advisory pattern (CONFIG_MAXDOP / CONFIG_CTFP / …) exactly.
No new Agent job / no live runs
Extends the existing collect.server_properties_collector, already dispatched by the one Collection job's master collector (install/42_…, ~line 330). No job, schedule, or new proc.
Changes
Schema: 3 nullable columns on collect.server_properties (lock_pages_in_memory bit, instant_file_initialization_enabled bit, memory_dump_count integer):
- install/02_create_tables.sql: added to the CREATE TABLE (fresh installs).
- upgrades/2.11.0-to-2.12.0/04_add_server_health_columns.sql: idempotent guarded IF NOT EXISTS (sys.columns) … ALTER … ADD (existing installs), appended to upgrade.txt.
Collector (install/53_collect_server_properties.sql):
- LPIM: read from sys.dm_os_sys_info.sql_memory_model (2/3 → on, 1 → off).
- IFI: read from sys.dm_server_services.instant_file_initialization_enabled (guarded on DMV + column existence, read via dynamic SQL so older builds compile).
- Dumps: COUNT_BIG(*) over sys.dm_server_memory_dumps (DMV-guarded).
Each read is engine-edition-guarded, wrapped in TRY/CATCH, and left NULL when unavailable so the collector never fails. All three are folded into the row_hash so a change re-collects. Dedup/skip and collection_log behavior are untouched.
Shared facts (PerformanceMonitor.Analysis, advise-only):
- FactScorer.ScoreConfigFact: each key scores a 0.4 advisory base only when bad (IFI/LPIM Value==0, dumps Value>0), else 0.
- InferenceEngine.ConfigAdvisoryRootKeys: each roots a standalone card below the 0.5 incident threshold (surfaces on a quiet, healthy server).
- FactAdvice: one advice block per key (headline + investigation + copy-paste guidance; no generated Apply T-SQL).
- No RemediationAction / handler / Apply.
Both collectors emit the facts with identical noise-control gating (Dashboard SqlServerFactCollector, Lite DuckDbFactCollector + RemoteCollectorService.ServerProperties + DuckDB schema v27 migration).
Thresholds chosen (⚠️ please tune)
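- IFI off: always advisory.
- LPIM off: only on non-Express editions with >= 32 GB RAM (LpimAdvisoryMinPhysicalMemoryMb const in both collectors).
- Memory dumps: advisory whenever count > 0.
Tests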
14 new cases in Dashboard.Tests (the shared-library home for the WS3 analogues): scoring (bad → 0.4, fine → 0), rooting (each roots a standalone finding below 0.5, mirroring ServerConfigFact_RootsStandalone_BelowMinimumSeverity), and advice presence. Dashboard.Tests 420 → 434, Lite.Tests 362 (unchanged). Both apps build clean (0 Error(s), 0 Warning(s) from the changed code).
Not verified live
No run against a live SQL Server, so the actual DMV reads and the upgrade ALTER are unverified on a real instance. Version-gating: IFI guarded on DMV + column existence (+ dynamic SQL); dumps guarded on DMV existence; both skipped on Azure SQL DB (engine edition 5); LPIM left NULL on Azure. Reviewer to run the install + upgrade test.
🤖 Generated with Claude Code