Skip to content

[CALCITE-7624] Support BigDecimal for FETCH and OFFSET in Enumerable#5050

Open
tkalkirill wants to merge 4 commits into
apache:mainfrom
tkalkirill:calcite-7624
Open

[CALCITE-7624] Support BigDecimal for FETCH and OFFSET in Enumerable#5050
tkalkirill wants to merge 4 commits into
apache:mainfrom
tkalkirill:calcite-7624

Conversation

@tkalkirill

Copy link
Copy Markdown

Jira Link

CALCITE-7624

Changes Proposed

This PR proposes using BigDecimal for FETCH and OFFSET, so that the following SQL queries work:

SELECT * FROM person OFFSET 1.5 ROWS FETCH NEXT 1.5 ROWS ONLY;
SELECT * FROM person ORDER BY name OFFSET 1.5 ROWS FETCH NEXT 1.5 ROWS ONLY;

SELECT * FROM person OFFSET ? ROWS FETCH NEXT ? ROWS ONLY;
SELECT * FROM person ORDER BY name OFFSET ? ROWS FETCH NEXT ? ROWS ONLY;
--?1 - int, long, BigDecimal
--?2 - int, long, BigDecimal

@tkalkirill tkalkirill changed the title [CALCITE-7592] Support BigDecimal for FETCH and OFFSET in Enumerable [CALCITE-7624] Support BigDecimal for FETCH and OFFSET in Enumerable Jun 26, 2026
@tkalkirill

tkalkirill commented Jun 26, 2026

Copy link
Copy Markdown
Author

@xiedeyantu Can you take a look?
I have a few questions:

  1. Regarding negative values, I noticed that if OFFSET/FETCH are specified as literals, a validation error occurs; however, if passed as a parameter, execution proceeds without errors. At first glance, this seems inconsistent; perhaps the parameters themselves should also be checked and trigger an error.

  2. For org.apache.calcite.linq4j.ExtendedEnumerable, should we add skip/take methods that accept long/BigDecimal arguments? In my opinion, it is worth adding them, but with default behavior that uses EnumerableDefaults.

  3. It is now possible to pass a fractional value such as 1.5 for OFFSET/FETCH. In Oracle, the fractional part is truncated (resulting in 1.0), whereas in PostgreSQL, it is rounded to the nearest integer (resulting in 2.0). I verified this behavior for both databases in https://onecompiler.com/. Currently, our implementation matches the PostgreSQL behavior. Perhaps we could handle this using a dialect-based approach?

  4. Should the documentation (visible in https://calcite.apache.org/docs/reference.html) be updated? If so, how is that done? Is it a separate task, part of this PR, or handled some other way?

  5. Regarding java.sql.ParameterMetaData: there is a test that checks for the parameters types org.apache.calcite.test.JdbcTest#checkPreparedOffsetFetch- shouldn't that be changed to DECIMAL/NUMERIC? On one hand, this might break backward compatibility; on the other, it introduces inconsistency. I'm not sure how to handle this.

  6. What about the adapters do they need to support BigDecimal? In my view, they do.

Of course, I could create a task for each question and resolve them there. Is that standard practice, or should I address them via the dev list first?

@mihaibudiu

Copy link
Copy Markdown
Contributor

In general we should push as much as possible in the validation and code generation layers.
If the runtime does not support fractional values, the code generation should generate a rounding operation.
I would pick a reasonable implementation and do that, if other people need different semantics, they can file issues or override behaviors. Every new configuration flag increases exponentially the test surface, so I would avoid them unless they are really needed.

@xiedeyantu

Copy link
Copy Markdown
Member

@tkalkirill Thanks for spinning this feature out into a separate piece of work!

For the first point, I agree with Mihai’s suggestion—we should intercept invalid formats as early as possible in the pre-processing logic, and we can also refer to how errors are gracefully handled in the existing execution flow. This would also serve as good groundwork for your earlier PR.

For the second point, I think we should add something similar to what you implemented in your previous PR. I’m wondering whether we should deprecate the old approach and support only  BigDecimal  going forward? This might be worth continuing to discuss in the Jira ticket.

On the third point, I also agree with Mihai. We could consider following PostgreSQL’s behavior, since Calcite often takes Postgres as a reference first; if Postgres doesn’t support something, then other mainstream implementations are considered. That said, if you believe another approach is more reasonable, as long as we document the rationale clearly where needed, that should be acceptable.

For the fourth point, I do think it makes sense to allow updates.

Regarding the fifth point, I personally feel we should unify the behavior—widening the type shouldn’t break compatibility.

On the sixth point, for adapters, I think we could either apply different validation logic per backend or let each backend handle its own checks. Both options seem reasonable to me; let’s see if others have better ideas. If not, as long as we provide accurate documentation, that should be fine.

@sonarqubecloud

Copy link
Copy Markdown

@tkalkirill

Copy link
Copy Markdown
Author

@xiedeyantu I’ve tried to make the requested changes on all points. We are continuing to discuss the second point in the ticket.

result.format);

Expression v = builder.append("child", result.block);
Expression roundingPolicyExp = getRoundingPolicy(implementor);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need such a complex strategy; we can control the rounding method using expressions instead. We simply need to select a standard approach and document it with a comment. I suspect @mihaibudiu would also prefer a simpler implementation.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this approach, users can quite easily override this behavior.
For instance, someone might need it to work like it does in Oracle.
If that can also be achieved using expressions, please let me know how.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't your previous PR add support for expressions(like limit round(?))? My point is that we can support various types of logic using any syntax we choose; if you want to include this feature, I certainly wouldn't object. We could also get Mihai's opinion on this.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every flag like this increases the space of configurations to test. I would keep this one simple by hardwiring the rounding policy.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiedeyantu And how can we override the expression later if necessary?
@mihaibudiu I propose to leave the option to override the rounding policy.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I really meant was that if we could support expressions in limit and offset, we could do a lot of things, even support partial rounding logic, thus eliminating the need for additional configuration options. As for specific implementation details, I haven't considered that in detail yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants