[AURON #2295] Replace hardcoded config strings with typed key() references#2296
[AURON #2295] Replace hardcoded config strings with typed key() references#2296guixiaowen wants to merge 1 commit into
Conversation
|
@cxzl25 Could you please help review another issue, #2252? I noticed that many places in the codebase are still using hardcoded config strings instead of the corresponding .key() values. So I've made changes to unify them. Please take a look when you have a chance. Thank you very much for your help! |
There was a problem hiding this comment.
Pull request overview
Refactors several user-facing “conversion disabled / set config …” messages to avoid hardcoded spark.auron.* configuration key strings by referencing SparkAuronConfiguration typed config options, aligning with issue #2295’s goal of safer config key usage.
Changes:
- Updated conversion-disabled assertion text in Iceberg conversion path to reference
SparkAuronConfiguration. - Updated shuffle manager requirement message to reference
SparkAuronConfiguration. - Updated multiple “Conversion disabled …” and “Set … to enable …” messages in
AuronConvertersto referenceSparkAuronConfiguration.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 16 comments.
| File | Description |
|---|---|
| thirdparty/auron-iceberg/src/main/scala/org/apache/spark/sql/auron/iceberg/IcebergConvertProvider.scala | Adjusts conversion-disabled message to reference typed config option. |
| spark-extension/src/main/scala/org/apache/spark/sql/auron/AuronSparkSessionExtension.scala | Adjusts shuffleExchange requirement message to reference typed config option. |
| spark-extension/src/main/scala/org/apache/spark/sql/auron/AuronConverters.scala | Adjusts multiple conversion-disabled/config-hint messages to reference typed config options. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| case _: BatchScanExec => | ||
| val enabled = SparkAuronConfiguration.ENABLE_ICEBERG_SCAN.get() | ||
| assert(enabled, "Conversion disabled: auron.enable.iceberg.scan=false.") | ||
| assert(enabled, s"Conversion disabled: ${SparkAuronConfiguration.ENABLE_ICEBERG_SCAN}=false.") |
| assert( | ||
| AuronConverters.supportedShuffleManager, | ||
| "spark.auron.enable.shuffleExchange=true requires an Auron shuffle manager. " + | ||
| s"spark.${SparkAuronConfiguration.ENABLE_SHUFFLE_EXCHANGE}=true requires an Auron shuffle manager. " + |
| case _: FileSourceScanExec if !enableScan => | ||
| "Conversion disabled: spark.auron.enable.scan=false." | ||
| s"Conversion disabled: spark.${SparkAuronConfiguration.ENABLE_SCAN}=false." | ||
| case _: ProjectExec if !enableProject => | ||
| "Conversion disabled: spark.auron.enable.project=false." | ||
| s"Conversion disabled: spark.${SparkAuronConfiguration.ENABLE_PROJECT}=false." |
| case _: FilterExec if !enableFilter => | ||
| "Conversion disabled: spark.auron.enable.filter=false." | ||
| s"Conversion disabled: spark.${SparkAuronConfiguration.ENABLE_FILTER}=false." | ||
| case _: SortExec if !enableSort => | ||
| "Conversion disabled: spark.auron.enable.sort=false." | ||
| s"Conversion disabled: spark.${SparkAuronConfiguration.ENABLE_SORT}=false." |
| case _: UnionExec if !enableUnion => | ||
| "Conversion disabled: spark.auron.enable.union=false." | ||
| s"Conversion disabled: spark.${SparkAuronConfiguration.ENABLE_UNION}=false." | ||
| case _: SortMergeJoinExec if !enableSmj => | ||
| "Conversion disabled: spark.auron.enable.smj=false." | ||
| s"Conversion disabled: spark.${SparkAuronConfiguration.ENABLE_SMJ}=false." |
| case _: UnaryExecNode | ||
| if exec.getClass.getSimpleName == "WindowGroupLimitExec" && !enableWindowGroupLimit => | ||
| "Conversion disabled: spark.auron.enable.window.group.limit=false." | ||
| s"Conversion disabled: spark.${SparkAuronConfiguration.ENABLE_WINDOW_GROUP_LIMIT}=false." | ||
| case _: GenerateExec if !enableGenerate => | ||
| "Conversion disabled: spark.auron.enable.generate=false." | ||
| s"Conversion disabled: spark.${SparkAuronConfiguration.ENABLE_GENERATE}=false." |
| case _: LocalTableScanExec if !enableLocalTableScan => | ||
| "Conversion disabled: spark.auron.enable.local.table.scan=false." | ||
| s"Conversion disabled: spark.${SparkAuronConfiguration.ENABLE_LOCAL_TABLE_SCAN}=false." | ||
| case _: DataWritingCommandExec if !enableDataWriting => | ||
| "Conversion disabled: spark.auron.enable.data.writing=false." | ||
| s"Conversion disabled: spark.${SparkAuronConfiguration.ENABLE_DATA_WRITING}=false." |
| s"Conversion disabled: spark.${SparkAuronConfiguration.ENABLE_SCAN_PARQUET}=false." | ||
| } else if (!enableScanOrc) { | ||
| "Conversion disabled: spark.auron.enable.scan.orc=false." | ||
| s"Conversion disabled: spark.${SparkAuronConfiguration.ENABLE_SCAN_ORC}=false." |
| s"Set spark.${SparkAuronConfiguration.ENABLE_SCAN_PARQUET_TIMESTAMP}=true to enable timestamp support " + | ||
| "or remove timestamp columns from the query.") |
| s"Set spark.${SparkAuronConfiguration.ENABLE_SCAN_ORC_TIMESTAMP}=true to enable timestamp support " + | ||
| "or remove timestamp columns from the query.") |
|
Would it be better if there was a public method that could splice Spark prefixed keys? |
…ences
Which issue does this PR close?
Closes #2295
Rationale for this change
Background
Currently, in the AURON codebase, many feature flag configurations (e.g., auron.enable.hudi.scan, auron.enable.orc.scan, etc.) are referenced as hardcoded strings scattered throughout the implementation code.
For example:
`
case _: FileSourceScanExec if !enableScan =>
`
Problem
Using hardcoded strings for configuration keys introduces several issues:
Maintainability risk – If a configuration key name needs to be changed, developers must manually search and replace all occurrences, which is error-prone.
Typo vulnerability – Hardcoded strings are prone to spelling mistakes that cannot be caught at compile time (e.g., "auron.enable.hu di.scan" with an extra space).
Code inconsistency – Some parts of the codebase use the typed .key() approach, while others still rely on raw strings, leading to inconsistent coding patterns.
Refactoring difficulty – IDE refactoring tools cannot automatically rename hardcoded string literals, making large-scale changes tedious and risky.
What changes are included in this PR?
Scope of Changes
This refactoring should cover all auron.enable.* configuration keys in the codebase, including but not limited to:
auron.enable.hudi.scan
auron.enable.orc.scan
auron.enable.parquet.scan
Any other auron.enable.* feature flags currently defined as typed constants
Are there any user-facing changes?
Nothing.
How was this patch tested?
UT.