Skip to content

fix(query): clarify condition resolution semantics for label queries#2994

Open
contrueCT wants to merge 3 commits into
apache:masterfrom
contrueCT:task/improve-condition-query-semantics
Open

fix(query): clarify condition resolution semantics for label queries#2994
contrueCT wants to merge 3 commits into
apache:masterfrom
contrueCT:task/improve-condition-query-semantics

Conversation

@contrueCT
Copy link
Copy Markdown
Contributor

Purpose of the PR

ConditionQuery.condition() currently mixes several different meanings in one API, including:

  • no condition
  • conflicting conditions resolved to empty
  • a unique resolved value
  • a raw multi-value result
  • an exception for ambiguous resolved values

This PR keeps the legacy condition() behavior unchanged, adds explicit condition-resolution APIs, and migrates the high-risk LABEL call sites to use the clearer semantics.

Main Changes

  • Add explicit condition-resolution APIs to ConditionQuery
    • containsCondition(Object key)
    • conditionValues(Object key)
    • conditionValue(Object key)
  • Keep the legacy condition() method backward-compatible
  • Document the semantic differences between the legacy API and the new explicit APIs
  • Migrate LABEL-related high-risk callers to the new APIs in:
    • graph/index transactions
    • serializers
    • traversers
    • in-memory / hstore paths
  • Preserve the old behavior for non-LABEL legacy usages in this first step

Verifying these changes

Added and extended regression coverage for the new semantics:

  • QueryTest#testConditionWithoutLabel
  • QueryTest#testConditionWithEqAndIn
  • QueryTest#testConditionWithSingleInValues
  • QueryTest#testConditionWithConflictingEqAndIn
  • QueryTest#testConditionWithMultipleMatchedInValues

Added a targeted regression for the label-index fallback path:

  • VertexCoreTest#testCollectMatchedIndexesByJointLabelsWithIndexedProperties

This test verifies:

  • a multi-label query can conservatively fall back and still match the indexed label
  • conflicting label conditions produce no matched indexes

Existing label-query regressions were also rechecked to ensure no behavior regression:

  • EdgeCoreTest#testQueryInEdgesOfVertexByLabels
  • EdgeCoreTest#testQueryInEdgesOfVertexByConflictingLabels
  • EdgeCoreTest#testQueryInEdgesOfVertexBySortkey
  • VertexCoreTest#testQueryByJointLabels
  • Trivial rework / code cleanup without any test coverage. (No Need)
  • Already covered by existing tests, such as (please modify tests here).
  • Need tests and can be verified as follows:
    • mvn -pl hugegraph-server/hugegraph-test -am -P core-test,memory -DfailIfNoTests=false -Dtest='QueryTest#testConditionWithoutLabel+testConditionWithEqAndIn+testConditionWithSingleInValues+testConditionWithConflictingEqAndIn+testConditionWithMultipleMatchedInValues' test
    • mvn -pl hugegraph-server/hugegraph-test -am -P core-test,memory -DfailIfNoTests=false -Dtest='EdgeCoreTest#testQueryInEdgesOfVertexByLabels+testQueryInEdgesOfVertexByConflictingLabels+testQueryInEdgesOfVertexBySortkey' test
    • mvn -pl hugegraph-server/hugegraph-test -am -P core-test,memory -DfailIfNoTests=false -Dtest='VertexCoreTest#testQueryByJointLabels+testCollectMatchedIndexesByJointLabelsWithIndexedProperties' test

Does this PR potentially affect the following parts?

Documentation Status

  • Doc - TODO
  • Doc - Done
  • Doc - No Need

@dosubot dosubot Bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Apr 13, 2026
@contrueCT contrueCT changed the title improve(query): clarify condition resolution semantics for label queries fix(query): clarify condition resolution semantics for label queries Apr 19, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.25%. Comparing base (4704354) to head (e16a8ce).
⚠️ Report is 7 commits behind head on master.

Additional details and impacted files
@@              Coverage Diff              @@
##             master    #2994       +/-   ##
=============================================
+ Coverage     35.62%   93.25%   +57.63%     
+ Complexity      333       65      -268     
=============================================
  Files           801        9      -792     
  Lines         67595      267    -67328     
  Branches       8790       22     -8768     
=============================================
- Hits          24082      249    -23833     
+ Misses        40955        8    -40947     
+ Partials       2558       10     -2548     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

contrueCT added 2 commits May 26, 2026 20:26
Add explicit condition resolution APIs to ConditionQuery while preserving the legacy condition() behavior. Introduce containsCondition(Object), conditionValues(Object), and conditionValue(Object) so callers can distinguish missing, empty, unique, and multi-value results without overloading null semantics.

Migrate LABEL-specific consumers in graph/index transactions, serializers, traversers, and stores to use the new APIs for unique-label resolution and conservative fallback behavior. Extend QueryTest and VertexCoreTest to cover absent, conflicting, and multi-value label conditions as well as collectMatchedIndexes() behavior for multi-label and conflicting label queries.
@contrueCT contrueCT force-pushed the task/improve-condition-query-semantics branch from e16a8ce to 4c42786 Compare May 26, 2026 12:28
@contrueCT contrueCT force-pushed the task/improve-condition-query-semantics branch from 4c42786 to cc9af24 Compare May 26, 2026 12:29
Copy link
Copy Markdown
Member

@imbajin imbajin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found one correctness issue in the latest revision. The CI failures were posted separately as a PR-level reminder.

CI/status checks are failing on the latest head (cc9af24929e42af1c90e1f55f3e60adc351e0318). Could you check the failed jobs before the next review round?

Failed checks include:

List<? extends SchemaLabel> schemaLabels;
if (label != null) {
// Query has LABEL condition
if (hasLabel && labels.isEmpty()) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

‼️ This now treats non-EQ/IN label predicates as an empty label intersection.

containsCondition(HugeKeys.LABEL) returns true for any top-level LABEL relation, but conditionValues(HugeKeys.LABEL) only collects EQ/IN relations. For a query such as has(T.label, P.neq("author")).has("name", ...), TraversalUtil can produce a Condition.neq(HugeKeys.LABEL, ...); hasLabel becomes true, labels is empty, and this branch returns no matched indexes immediately.

Before this change, query.condition(HugeKeys.LABEL) would not resolve a single label for this shape, so collectMatchedIndexes() fell back to scanning candidate schema labels and let the remaining conditions filter results. With the new branch, indexed user-property queries combined with a non-EQ/IN label predicate can incorrectly fail with no matched index / no result path.

Please distinguish "LABEL EQ/IN conditions resolved to an empty intersection" from "LABEL exists but has no EQ/IN-resolvable value". The latter should keep the conservative fallback, or add regression coverage for a label neq / non-EQ label predicate combined with an indexed user property.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Improve]: clarify ConditionQuery.condition() semantics for missing, conflicting, and multi-value conditions

2 participants