perf(core): eliminate avoidable per-call allocation in hot logging/config paths#36150
perf(core): eliminate avoidable per-call allocation in hot logging/config paths#36150swicken wants to merge 4 commits into
Conversation
…nfig paths Allocation profiling (Pyroscope, TLAB profiles) attributed several hotspots to four mechanical issues in core utility code; Config.get* subtrees alone accounted for ~19% of sampled allocation in a mixed-load window. - Logger: lazy Supplier overloads (debug/info) no longer evaluate the message supplier when the level is disabled. ~900 call sites use suppliers specifically to avoid message-building cost, and all of them evaluated eagerly (including reflectionToString on Contentlets via PermissionBitFactoryImpl debug logging on the reindex path). - Config: getInt/getLong/getFloat/getBoolean env-alias probes are now guarded with containsKey instead of throwing and discarding a NoSuchElementException on every read of a non-env-overridden key (same guard getBooleanProperty(name, default) already had). - Logger.isVelocityMessage: memoized per class name; previously allocated a lowercased class name on every Class-keyed log call. - LeakyTokenBucketImpl: isEnabled/getMaximumBucketSize/getRefillPerSecond return the values resolved at construction instead of walking Config on every request-cost increment. Fixes #36145
…ization Returning constructor-captured values from the application-scoped LeakyTokenBucketImpl meant RATE_LIMIT_ENABLED / RATE_LIMIT_REFILL_PER_SECOND / RATE_LIMIT_MAX_BUCKET_SIZE changes (system table or config refresh) were ignored until restart - enabling rate limiting during an incident would silently do nothing. Memoize the Config lookups with a 30s TTL instead: runtime overrides still take effect (bounded staleness) while the per-request-cost-increment Config walk is gone. Refs #36145
- Pin the runtime-override contract with a test: a TTL-injectable constructor lets the new LeakyTokenBucketImplTest assert that a runtime Config change reaches a running bucket and that reads within the TTL serve the cached value (guards against a future cleanup reverting the suppliers to the constructor-captured fields). - Cover Config.getLongProperty missing-key default and pin the env-alias-present-but-unparseable fallback for int/long/float. - Memoize the RATE_LIMIT_ADD_HEADER_INFO read in getHeaderInfo(): it runs unconditionally per request via RequestCostFilter, same hot path. - Make the allow() debug message lazy via the new Supplier overload. - Replace the name-keyed Caffeine cache in Logger.isVelocityMessage with a ClassValue: allocation-free lookups (the capturing lambda allocated per call), per-Class keying (no FQN conflation across OSGi classloaders), and entries released with their class. Refs #36145
🤖 Codex Review —
|
|
Claude finished @swicken's task in 1m 11s —— View job Rollback Safety Analysis
Result: ✅ Safe To RollbackAll changes in this PR are pure in-memory performance optimizations with no impact on stored data, database schema, Elasticsearch mappings, or API contracts. Files analyzed:
Categories checked and cleared:
The label |
Not even sure how it came up with this :) There is no retry loop, it must be hallucinating. |
Proposed Changes
Supplieroverloads are now actually lazy — the threedebug(…, Supplier)and twoinfo(…, Supplier)overloads calledmessage.get()unconditionally; they now check the level first. ~900 call sites use suppliers specifically to skip message building, including ones that didreflectionToStringon Contentlets viaPermissionBitFactoryImpldebug logging on the reindex path.Configint/long/float/boolean getters no longer throw per read — theDOT_*env-alias probe used commons-configuration's throwing 1-arg getter wrapped in vavrTry, so every read of a non-env-overridden key allocated and discarded aNoSuchElementException(with full stack trace). Applied thecontainsKeyguard thatgetBooleanProperty(name, default)already had.Logger.isVelocityMessagememoized viaClassValue— previously allocated a lowercased class name on every Class-keyed log call;ClassValuelookups are allocation-free, keyed per actual Class (no FQN conflation across OSGi classloaders), and entries are released with their class.LeakyTokenBucketImplno longer walks Config per request-cost increment — the four config reads (enabled,refill,max bucket,add-header-info) are TTL-memoized (30s). Runtime overrides via system table / config refresh still take effect within the TTL, preserved deliberately so rate limiting can be toggled during an incident; a test with an injectable TTL pins that contract.Why
Allocation profiling of a running instance (Pyroscope TLAB profiles under mixed REST/GraphQL/page-render load) attributed the
Throwable.fillInStackTrace,Method.copy/Field.copy, andStringLatin1.toLowerCaseallocation hotspots to these four spots;Config.get*subtrees alone were ~19% of sampled allocation in a mixed window (e.g. one full Config walk — exception included — per expired session viaClickstreamListener, per 250ms reindex-pause iteration, per request-cost increment). Expected effect is reduced allocation → less GC pressure, not a single-endpoint latency win.Checklist
Additional Info
New/extended unit tests:
LoggerTest(suppliers not evaluated when level disabled),LeakyTokenBucketImplTest(runtime config override reaches a running bucket; reads within TTL stay cached),ConfigTest(getLongPropertymissing-key default; env-alias-present-but-unparseable falls back to default for int/long/float). The deprecated no-defaultConfiggetters still throwNoSuchElementExceptionfor missing base keys — that documented contract is unchanged (and pinned by existing tests).Closes #36145