[IOTDB-17797] Support lateral column aliases in table SELECT list #17960
DaZuiZui wants to merge 6 commits into
Conversation
I have completed the implementation for Issue #17797 Part 2: supporting lateral column aliases in the table-model SELECT list. The PR is available here:
This PR adds left-to-right SELECT-list alias resolution, keeps local input columns at higher priority than SELECT aliases, avoids rewriting qualified expressions and subqueries, preserves WHERE/HAVING semantics, and keeps the existing ORDER BY alias behavior. It also includes analyzer tests and a plan-level test to ensure reused aliases do not cause duplicated projection computation.
I have verified the changes with:
./mvnw spotless:apply -pl iotdb-core/datanode
        outputPosition++;
      }
    } else {
      SelectAliasLookup visibleAliases = visibleAliasBuilder.build();
Nit / follow-up suggestion: this builds a fresh immutable SelectAliasLookup for every SingleColumn. Since the SelectAliasLookup constructor copies the whole current alias map and each alias list, analyzing a very wide SELECT list with many reusable aliases ends up doing repeated prefix copies, i.e. roughly O(N^2) alias-map copying.
This is probably fine for normal SELECT lists, but it may be worth avoiding the repeated snapshots by using an incremental/shared immutable lookup structure, or by exposing a read-only lookup view over the builder whose visible state grows left-to-right.
Could you also add IT/regression coverage for this area? In particular, it would be useful to cover a wide SELECT-list LCA chain/reuse case, plus the edge cases around delimited/quoted alias case sensitivity, named WINDOW definitions not seeing SELECT aliases, DISTINCT with LCA and no ORDER BY, and LCA references in window frame bounds.
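To make the cost concern above concrete, here is a minimal sketch that counts how many alias entries get copied while walking a SELECT list of N items, once with a fresh snapshot per item and once with a read-only view over the growing table. `AliasTableSketch` and its methods are hypothetical stand-ins for illustration; they are not the `SelectAliasLookup` or builder classes from this PR.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical alias table used only to illustrate copy cost; not an IoTDB class. */
final class AliasTableSketch {
  private final Map<String, List<String>> aliases = new LinkedHashMap<>();
  long copiedEntries = 0; // total map entries copied by snapshot()

  void register(String alias, String expression) {
    aliases.computeIfAbsent(alias, k -> new ArrayList<>()).add(expression);
  }

  /** Snapshot strategy: copies every entry and its list, so cost grows with the prefix. */
  Map<String, List<String>> snapshot() {
    Map<String, List<String>> copy = new LinkedHashMap<>();
    for (Map.Entry<String, List<String>> e : aliases.entrySet()) {
      copy.put(e.getKey(), new ArrayList<>(e.getValue()));
      copiedEntries++;
    }
    return Collections.unmodifiableMap(copy);
  }

  /** View strategy: wraps the live map without copying anything. */
  Map<String, List<String>> readOnlyView() {
    return Collections.unmodifiableMap(aliases);
  }

  /** Walks n SELECT items and returns how many entries were copied in total. */
  static long copiesForSelectList(int n, boolean snapshotPerItem) {
    AliasTableSketch table = new AliasTableSketch();
    for (int i = 0; i < n; i++) {
      Map<String, List<String>> visible =
          snapshotPerItem ? table.snapshot() : table.readOnlyView();
      // A real analyzer would resolve item i against `visible` here.
      if (visible.size() != i) {
        throw new IllegalStateException("item " + i + " should see exactly " + i + " aliases");
      }
      table.register("a" + i, "expr" + i);
    }
    return table.copiedEntries;
  }
}
```

With one snapshot per item, item i copies the i aliases registered before it, so the total is N(N-1)/2 entries; the view variant copies none. The view in this sketch still exposes the inner lists as mutable, so a production version would need to guard those as well.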
Thanks for the suggestion. Addressed in 3bd3a8d: SELECT-list LCA rewriting now reads from a builder-backed SelectAliasResolver while walking the SELECT list, so it avoids rebuilding immutable snapshots for each SingleColumn and only snapshots once for later clauses. I also added regression coverage for a wide SELECT-list LCA chain, delimited alias case sensitivity, named WINDOW definitions not seeing SELECT aliases, DISTINCT with LCA and no ORDER BY, and LCA references in window frame bounds.
Verified with:
./mvnw spotless:apply -pl iotdb-core/datanode
./mvnw -nsu test -pl iotdb-core/datanode -Dtest=SelectAliasReuseTest
Hi @JackieTien97, I have addressed the latest review comments in PR #17960. The SELECT-list LCA alias lookup now avoids rebuilding immutable snapshots for each SingleColumn, and I added regression coverage for the wide SELECT-list LCA chain, delimited alias case sensitivity, named WINDOW definitions, DISTINCT without ORDER BY, and LCA references in window frame bounds.
I verified the changes with:
./mvnw spotless:apply -pl iotdb-core/datanode
Description
Support lateral column aliases in table SELECT lists
This PR implements Part 2 of #17797 for the table-model analyzer: later SingleColumn SELECT items can reference aliases explicitly defined by earlier SingleColumn items in the same SELECT list.
Examples now supported:
Name-resolution behavior
The LCA rewrite is left-to-right and only applies to unqualified identifiers in later SELECT items. It does not rewrite qualified references such as t.x, dereference expressions such as x.y, or identifiers inside subqueries.
Resolution order is:
If multiple previous aliases with the same canonical name are visible and no local source column wins, the analyzer raises a clear ambiguity error.
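The rules stated so far (local source columns win, then a unique earlier alias, otherwise an ambiguity error) can be sketched roughly as follows. This is an illustration under assumptions, not the analyzer's code: `LcaResolutionSketch` and its methods are hypothetical, expressions are plain strings rather than AST nodes, and canonicalization is assumed to be lower-casing of unquoted identifiers only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Illustrative only: names and types here are hypothetical, not IoTDB classes. */
final class LcaResolutionSketch {
  private final Set<String> sourceColumns = new HashSet<>();
  private final Map<String, List<String>> earlierAliases = new HashMap<>();

  LcaResolutionSketch(String... columns) {
    for (String c : columns) {
      sourceColumns.add(canonical(c));
    }
  }

  // Assumption: unquoted identifiers are canonicalized by lower-casing.
  private static String canonical(String name) {
    return name.toLowerCase(Locale.ROOT);
  }

  /** Makes an alias visible to SELECT items that come after it. */
  void registerAlias(String alias, String rewrittenExpression) {
    earlierAliases
        .computeIfAbsent(canonical(alias), k -> new ArrayList<>())
        .add(rewrittenExpression);
  }

  /** Resolves one unqualified identifier that appears in a later SELECT item. */
  String resolve(String identifier) {
    String key = canonical(identifier);
    if (sourceColumns.contains(key)) {
      return identifier; // a local source column wins over any SELECT alias
    }
    List<String> candidates = earlierAliases.get(key);
    if (candidates == null) {
      return identifier; // not an alias; left untouched for normal analysis
    }
    if (candidates.size() > 1) {
      throw new IllegalStateException("Ambiguous SELECT alias: " + identifier);
    }
    return "(" + candidates.get(0) + ")"; // substitute the earlier expression
  }
}
```

For example, with source columns s1 and s2 and an earlier item `s1 + 1 AS x`, resolving `x` yields `(s1 + 1)`, while an earlier alias named `s2` is never substituted because the column of the same name takes priority.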
AllColumns and COLUMNS(...) do not register reusable LCA aliases. GROUP BY alias reuse uses the LCA-rewritten SELECT expression, while ORDER BY keeps the existing output-alias precedence. WHERE and HAVING still do not see SELECT aliases.
For CAST(value AS type), only value participates in LCA rewriting. Type names are not treated as alias references.
Implementation notes
The original AST is kept unchanged. The analyzer records each SingleColumn's semantic expression after LCA rewriting and uses it for type analysis, source-column tracking, output expression analysis, and GROUP BY alias reuse. Output field names without explicit aliases are still derived from the original SELECT item text rather than from the rewritten expression.
Inline window specifications in later SELECT items are rewritten and registered against the rewritten FunctionCall nodes before expression analysis. Referencing an alias whose expression contains a window function is rejected explicitly.
Fixes #17797
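One way to picture the implementation notes above (the AST stays unchanged, the rewritten expression is recorded on the side, and output names still come from the original text) is an identity-keyed side table. The sketch below is an assumption about the general shape only: `RewriteSideTableSketch` and `Item` are hypothetical, and strings stand in for expression nodes.

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical side table; does not reflect the actual StatementAnalyzer fields. */
final class RewriteSideTableSketch {
  /** Immutable stand-in for a SELECT item node. */
  static final class Item {
    final String text; // original expression text
    final String alias; // null when the item has no explicit alias

    Item(String text, String alias) {
      this.text = text;
      this.alias = alias;
    }
  }

  // Keyed by node identity so two textually equal items keep separate entries.
  private final Map<Item, String> rewritten = new IdentityHashMap<>();

  void recordRewrite(Item item, String rewrittenText) {
    rewritten.put(item, rewrittenText);
  }

  /** Expression used for semantic analysis: the rewritten form if recorded, else the original. */
  String semanticExpression(Item item) {
    String r = rewritten.get(item);
    return r != null ? r : item.text;
  }

  /** Output name: the explicit alias, else the original (not rewritten) text. */
  String outputName(Item item) {
    return item.alias != null ? item.alias : item.text;
  }
}
```

Because the node itself is never mutated, anything that prints or re-parses the original statement keeps seeing the user's text, while type analysis and GROUP BY reuse can read the recorded rewritten form.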
This PR has:
Key changed/added classes (or packages if there are too many classes) in this PR
StatementAnalyzer: SELECT-list LCA rewrite, rewritten SELECT expression tracking, output-scope name handling, and window metadata registration for rewritten expressions.
ExpressionRewriter/ExpressionTreeRewriter: support for leaf expression rewrite hooks used by LCA expression copying.
SelectAliasReuseTest: coverage for LCA chaining, collisions, ambiguity, aggregates, GROUP BY/ORDER BY reuse, window specs, CAST type-name collisions, WHERE/HAVING isolation, and output field names.
relational/analyzer/README.md: documented table-model SELECT alias resolution rules.
Tests
./mvnw spotless:apply -pl iotdb-core/datanode
./mvnw test -pl iotdb-core/datanode -Dtest=SelectAliasReuseTest