minor: add projectionTrace query context for diagnosing projection match failures#19613
minor: add projectionTrace query context for diagnosing projection match failures#19613cecemei wants to merge 3 commits into
Conversation
| public static final String NO_PROJECTIONS = "noProjections"; | ||
| public static final String FORCE_PROJECTION = "forceProjections"; | ||
| public static final String USE_PROJECTION = "useProjection"; | ||
| public static final String PROJECTION_TRACE = "projectionTrace"; |
There was a problem hiding this comment.
i wonder if instead of adding a new flag if this should just check ENABLE_DEBUG context flag
There was a problem hiding this comment.
i've thought about that as well, but decided not because debug=true already triggers a lot of things at once, e.x SQL planner rule, CalciteRulesManager in particular is very noisy when it logs planner rules. It'd be very noisy if someone is just trying to understand why a projection isn't matching.
There was a problem hiding this comment.
yea, i considered the noise briefly too, but then said to myself it was ok though because the noisy SQL logs are all on the broker, while these logs are only going to be on the historicals and realtime servers, per segment.
If you really want to add a new flag, i still think we should also do this logging if debug is set, so that way someone wanting "all the logs" doesn't have to set a bunch of different flags, so the logTrace method could check either projectionTrace or debug
There was a problem hiding this comment.
good point, i added the logging if debug is set
| { | ||
| private static final Logger log = new Logger(Projections.class); | ||
|
|
||
| private static void logTrace(QueryContext context, String format, Object... args) |
There was a problem hiding this comment.
why not just pass the CursorBuildSpec into this method? it looks like all of the callers are getting it from there anyway
There was a problem hiding this comment.
a narrower input param would be more declarative? yes all callers are getting it from CursorBuildSpec.
There was a problem hiding this comment.
i guess it was just jarring to see logTrace(queryCursorBuildSpec.getQueryContext()... like 20 times in the same file with all the stuff sourced from a CursorBuildSpec.
| // match virtual columns first, which will populate the 'remapColumns' of the match builder | ||
| matchBuilder = matchQueryVirtualColumns(projection, queryCursorBuildSpec, physicalColumnChecker, matchBuilder); | ||
| if (matchBuilder == null) { | ||
| logTrace(queryCursorBuildSpec.getQueryContext(), "matchAggregateProjection: projection [%s] rejected — virtual column mismatch", projection.getName()); |
There was a problem hiding this comment.
this and some of these other top level logs seem kind of redundant since the methods they call also log when returning nulls, but i guess we sort of need these here though because otherwise i'm not sure its obvious which caller of which match method was the failing one.
This is probably ok for now since its just logs, but in the future we might consider collecting these failure reasons instead of logging them when they happen so that we can produce more concise messaging.
There was a problem hiding this comment.
we could consider collecting failure reasons when forceProjection or useProjection is set in the future, and maybe add it as a system flag.
even though many messages are added here, only 1-2 lines would be logged so it should not be too verbose.
There was a problem hiding this comment.
yea, i think its fine for now, though 'only 1-2 lines' is per projection, per segment of the query, so its going to be pretty verbose anywhere that is touching more than a few segments if there are any projections at all.
There was a problem hiding this comment.
yea it would be suboptimal if logging for a lot of segments, maybe could be helpful in the case when only several segments out of all are missing proper projections, it gives some amount of visibility and operatibility to projection at least.
Description
Debugging why
matchAggregateProjectionreturns null is difficult today because there are 14 distinct rejection sites spread across the matching pipeline, and there's no visibility into which one fired without a debugger or global DEBUG logging.This PR adds a
projectionTracequery context flag. When set to true, rejection reasons are logged at INFO level so they're visible without changing log4j config. When unset, the same messages are emitted at DEBUG level.Usage:
Rejection messages include the projection name and the specific reason (e.g., incompatible ordering, missing column, granularity mismatch), making it straightforward to identify which projection was rejected and why.
This PR has: