[SPARK-52812][CONNECT] Preserve spark.sql.sources.default for eager createTable(tableName, path) #56211

Open
haoyangeng-db wants to merge 1 commit into apache:master from haoyangeng-db:spark-52812-followup-createtable-default-source

What changes were proposed in this pull request?

SPARK-52812 (#56064) made Spark Connect Catalog.createTable eager by re-routing the two-argument createTable(tableName, path) overload through createTable(tableName, path, "parquet"). That hardcodes the parquet provider and drops the spark.sql.sources.default fallback that the overload previously relied on.

This PR restores the original behavior: the two-argument overload again leaves the source unset so the server resolves spark.sql.sources.default, while keeping the eager execution introduced by SPARK-52812. A regression test is added to CatalogSuite.
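
The difference between the two behaviors can be illustrated with a small toy model. This is a sketch only, not the Spark Connect source: `CreateTableRequest`, `Server`, `createTableHardcoded`, and `createTableUnset` are hypothetical names invented for illustration, and the fallback to `parquet` reflects the stock default of `spark.sql.sources.default`.

```scala
// Toy model, not Spark source: every type and method name here is hypothetical.
object CreateTableDefaultSource {
  // Stand-in for what the client sends; `source = None` means "left unset".
  final case class CreateTableRequest(
      tableName: String,
      path: String,
      source: Option[String])

  // Stand-in for the server, which falls back to spark.sql.sources.default
  // only when the request carries no source.
  final class Server(conf: Map[String, String]) {
    def resolveProvider(req: CreateTableRequest): String =
      req.source.getOrElse(conf.getOrElse("spark.sql.sources.default", "parquet"))
  }

  // Behavior described for SPARK-52812: the provider is fixed on the client,
  // so the server never consults the configuration.
  def createTableHardcoded(tableName: String, path: String): CreateTableRequest =
    CreateTableRequest(tableName, path, Some("parquet"))

  // Behavior this PR restores: the source stays unset and the server decides.
  def createTableUnset(tableName: String, path: String): CreateTableRequest =
    CreateTableRequest(tableName, path, None)

  def main(args: Array[String]): Unit = {
    val server = new Server(Map("spark.sql.sources.default" -> "json"))
    println(server.resolveProvider(createTableHardcoded("t", "/tmp/t"))) // parquet
    println(server.resolveProvider(createTableUnset("t", "/tmp/t")))     // json
  }
}
```

In this model the hardcoded variant ignores the configured default, while the unset variant picks it up. That is the distinction the change is meant to restore for the two-argument overload.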

Why are the changes needed?

The two-argument createTable(tableName, path) overload is documented as "It will use the default data source configured by spark.sql.sources.default." After SPARK-52812 it always used parquet regardless of that configuration, contradicting its own contract and the classic Catalog behavior.

Does this PR introduce any user-facing change?

Yes, within the unreleased master branch. spark.catalog.createTable(tableName, path) on Spark Connect once again honors spark.sql.sources.default instead of always creating a parquet table. The eager-execution behavior from SPARK-52812 is preserved.

How was this patch tested?

Added a regression test in CatalogSuite that sets spark.sql.sources.default to json, writes JSON data, creates the table via the two-argument overload, and asserts the resulting table uses the json provider and is readable. The test fails on the previous hardcoded-parquet behavior.

Was this patch authored or co-authored using generative AI tooling?

Co-authored with Claude Code.

Generated-by: Claude Code (Opus 4.8)