branch-4.0: [feature](cloud) Add table-level event-driven warm up #64544
Open
bobhan1 wants to merge 2 commits into
Issue Number: None
Problem Summary:
This PR adds table-level event-driven cloud warm-up support and improves
active incremental warm-up progress observability.
Before this change, event-driven warm-up was only controlled at
compute-group granularity. Once a load-event warm-up job was enabled for
a source and target compute group pair, all source-side table writes
could trigger warm-up to the target compute group. That is inefficient
for workloads where only selected core tables, high-frequency query
tables, or selected async materialized views need to stay warm.
This PR lets users define the warm-up scope with `ON TABLES` when
creating an event-driven load warm-up job. FE persists the normalized
table filter in the warm-up job, resolves matched table ids dynamically,
sends the table ids to BE, and lets BE filter warm-up rowsets by table
id.
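The last step of that flow, filtering rowsets by table id, can be sketched roughly as follows. This is an illustrative Python model only, not the BE implementation (which is C++); the `Rowset` type, the `select_rowsets_for_warm_up` name, and the convention that `None` means "no table filter" are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Rowset:
    """Hypothetical stand-in for a rowset that knows its owning table."""
    rowset_id: str
    table_id: int


def select_rowsets_for_warm_up(
    rowsets: Iterable[Rowset],
    matched_table_ids: Optional[FrozenSet[int]],
) -> List[Rowset]:
    """Keep only rowsets whose table id is in the job's matched set.

    Assumption: None models a cluster-level job with no table filter,
    so every rowset stays eligible. An empty set matches nothing.
    """
    if matched_table_ids is None:
        return list(rowsets)
    return [r for r in rowsets if r.table_id in matched_table_ids]
```

Because the filter is a plain set of ids, FE can refresh the set when tables change without BE needing to understand the wildcard rules.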
User-visible behavior:
- `WARM UP ... ON TABLES` supports table-level event-driven warm-up.
- Table filters support `INCLUDE` and `EXCLUDE` rules.
- Rules support `*` and `?` wildcards, for example `db.table`, `db.*`,
`*.orders_*`, and `log_db.log_?`.
- `INCLUDE` defines the candidate warm-up scope, and `EXCLUDE` removes
tables from that included scope.
- Rules are canonicalized before duplicate checks, so semantically
equivalent filters do not create duplicate jobs just because rule order
differs.
- Matching covers both regular OLAP tables and async materialized views.
- Matched table ids are refreshed as tables or async materialized views
are created, dropped, or renamed.
- A single source compute group can have several independent table-level
warm-up jobs, each targeting a different target compute group with its
own table filter.
- `SHOW WARM UP JOB` exposes the table-level job type, table filter,
matched tables, and SyncStats.
- `SHOW WARM UP JOB` list output keeps compact SyncStats, while
single-job lookup keeps detailed windowed SyncStats.
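The canonicalization step mentioned above can be illustrated with a small sketch. It assumes a rule is a `(kind, pattern)` pair and that the canonical form is "deduplicated, `INCLUDE` before `EXCLUDE`, then sorted by pattern"; the actual normalization in FE may differ, and the `canonicalize` name is hypothetical.

```python
from typing import Iterable, List, Tuple

Rule = Tuple[str, str]  # (kind, pattern); kind is "INCLUDE" or "EXCLUDE"


def canonicalize(rules: Iterable[Rule]) -> List[Rule]:
    """Return an order-independent form of a table filter.

    Assumed normalization: upper-case the kind, strip whitespace,
    drop exact duplicates, then sort INCLUDE rules before EXCLUDE
    rules and by pattern within each kind.
    """
    normalized = {(kind.strip().upper(), pattern.strip()) for kind, pattern in rules}
    for kind, _ in normalized:
        if kind not in ("INCLUDE", "EXCLUDE"):
            raise ValueError(f"unknown rule kind: {kind}")
    return sorted(normalized, key=lambda rule: (rule[0] != "INCLUDE", rule[1]))
```

Two filters that differ only in rule order or repeated rules then compare equal, which is what lets a duplicate check treat them as the same job.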
Example:
```sql
WARM UP COMPUTE GROUP query_cg WITH COMPUTE GROUP write_cg
ON TABLES (
    INCLUDE 'core_db.config',
    INCLUDE 'report_db.monthly_*',
    INCLUDE '*.sales_*',
    EXCLUDE '*.*_archive'
)
PROPERTIES (
    "sync_mode" = "event_driven",
    "sync_event" = "load"
);
```
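To show how the filter in this example is expected to behave, here is a minimal Python sketch of the matching semantics. It assumes that patterns match the full `db.table` name, that `*` matches any run of characters (including none), that `?` matches exactly one character, and that matching is case-sensitive. The function names are hypothetical and this is not the FE code.

```python
import re
from typing import Iterable, Tuple

Rule = Tuple[str, str]  # (kind, pattern); kind is "INCLUDE" or "EXCLUDE"


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate '*' and '?' wildcards into a regex; escape everything else."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def is_warm_up_target(qualified_name: str, rules: Iterable[Rule]) -> bool:
    """A table is in scope if some INCLUDE matches and no EXCLUDE matches."""
    rules = list(rules)
    included = any(
        kind == "INCLUDE" and wildcard_to_regex(p).fullmatch(qualified_name)
        for kind, p in rules
    )
    excluded = any(
        kind == "EXCLUDE" and wildcard_to_regex(p).fullmatch(qualified_name)
        for kind, p in rules
    )
    return included and not excluded


EXAMPLE_RULES = [
    ("INCLUDE", "core_db.config"),
    ("INCLUDE", "report_db.monthly_*"),
    ("INCLUDE", "*.sales_*"),
    ("EXCLUDE", "*.*_archive"),
]
```

Under these assumptions, `shop.sales_eu` is in scope, while `shop.sales_archive` is included by `*.sales_*` but then removed by the `EXCLUDE` rule.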
Conflict and virtual compute group behavior:
- Table-level load-event warm-up and cluster-level load-event warm-up
are mutually exclusive for the same source and target compute group
pair.
- If a conflicting job already exists, creation returns an error that
includes the conflicting job id; table-level conflicts also include the
table filter.
- Duplicate checks within the same job type still follow the existing
duplicate-check logic.
- Creating a cluster-level load-event warm-up job managed by a virtual
compute group (VCG) does not fail on conflict. Because VCG jobs are
created through the MS HTTP API path, FE first cancels any existing
table-level load-event warm-up jobs with the same source and target,
then recreates the VCG-managed cluster-level job.
- Manually creating a table-level load-event warm-up job is rejected
only when both source and target compute groups are owned by the same
VCG.
- SQL still cannot use a virtual compute group directly as the source or
target compute group.
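The mutual-exclusion rule for manually created jobs can be modeled as below. This is a simplified sketch: the `LoadEventJob` type, the `WarmUpConflict` exception, and the message wording are all hypothetical, and the VCG-specific cancel-and-recreate path is deliberately left out.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class LoadEventJob:
    """Hypothetical model of a load-event warm-up job."""
    job_id: int
    source: str
    target: str
    # None models a cluster-level job; a tuple of rules models a table-level job.
    table_filter: Optional[Tuple[Tuple[str, str], ...]] = None

    @property
    def is_table_level(self) -> bool:
        return self.table_filter is not None


class WarmUpConflict(Exception):
    """Raised when a new job conflicts with an existing job of the other type."""


def check_conflict(new_job: LoadEventJob, existing: Iterable[LoadEventJob]) -> None:
    for job in existing:
        if (job.source, job.target) != (new_job.source, new_job.target):
            continue  # different compute group pair: no conflict
        if job.is_table_level == new_job.is_table_level:
            continue  # same job type: left to the existing duplicate check
        message = f"conflicts with warm-up job {job.job_id}"
        if job.is_table_level:
            message += f" (table filter: {job.table_filter})"
        raise WarmUpConflict(message)
```

The error carries the conflicting job id, and the table filter when the existing job is table-level, so the user can tell which job to cancel.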
Warm-up progress observation:
- BE records per-job windowed requested, finished, and failed warm-up
statistics.
- BE exposes per-job warm-up statistics through
`/api/warmup_event_driven_stats`.
- FE aggregates BE statistics and caches the aggregated result in the
warm-up job.
- SyncStats includes source-side and target-side warm-up size/count
progress across windows.
- SyncStats includes trigger-time progress, so users can observe whether
the target compute group is behind the latest source-side warm-up
trigger.
- FE `/metrics` exposes per-job active warm-up metadata, synchronized
size, and trigger gap metrics for cloud event-driven warm-up jobs.
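As a rough illustration of the FE-side aggregation and the trigger gap, consider the sketch below. The field names, the millisecond timestamps, and the definition of the gap (latest source-side trigger time minus the latest trigger time the target has finished) are assumptions for this example; the real SyncStats layout and the response format of `/api/warmup_event_driven_stats` are not described here.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class BeJobStats:
    """Hypothetical per-BE, per-job counters for one observation window."""
    requested: int = 0
    finished: int = 0
    failed: int = 0
    last_trigger_ms: int = 0           # assumed: latest source-side trigger time
    last_finished_trigger_ms: int = 0  # assumed: trigger time of latest finished warm-up


def aggregate(per_be: Iterable[BeJobStats]) -> BeJobStats:
    """Sum counters across BEs and keep the most recent timestamps."""
    total = BeJobStats()
    for stats in per_be:
        total.requested += stats.requested
        total.finished += stats.finished
        total.failed += stats.failed
        total.last_trigger_ms = max(total.last_trigger_ms, stats.last_trigger_ms)
        total.last_finished_trigger_ms = max(
            total.last_finished_trigger_ms, stats.last_finished_trigger_ms
        )
    return total


def trigger_gap_ms(stats: BeJobStats) -> int:
    """How far the target lags the latest source-side trigger (never negative)."""
    return max(0, stats.last_trigger_ms - stats.last_finished_trigger_ms)
```

A gap that stays near zero suggests the target compute group is keeping up with source-side loads; a growing gap suggests it is falling behind.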
Release note:
Support table-level event-driven cloud warm-up with `ON TABLES` filters
and per-job warm-up sync statistics.
- Test
    - [x] Regression test
    - [x] Unit Test
    - [x] Manual test
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason
- Behavior changed:
    - [ ] No.
    - [x] Yes. `WARM UP` supports table-level `ON TABLES` filters for
      event-driven load warm-up, and warm-up job output/metrics expose table
      filter, matched tables, SyncStats, and trigger-gap information.
- Does this need documentation?
    - [ ] No.
    - [x] Yes. apache/doris-website#3829
Force-pushed from 0516b80 to d1ed1e8.
Summary
Backport #63832 to branch-4.0.
This adds table-level `ON TABLES` support for event-driven cloud warm-up, including FE parser/command handling, warm-up job table filters, BE warm-up stats plumbing, and the HTTP stats action needed by the regression coverage.
For this branch-4.0 backport, `CloudMetaMgr::commit_rowset` was adjusted to take `table_id` as an explicit parameter from the caller. Call sites now pass the value from the owning tablet instead of deriving it from rowset metadata.
Validation
- `git diff --check`
- `./run-be-ut.sh --run --filter='CloudWarmUpManagerFilterTest.*:CloudWarmUpManagerTest.*:MBvarWindowedAdderTest.*' -j100`
- `./run-fe-ut.sh --run org.apache.doris.cloud.OnTablesFilterTest,org.apache.doris.cloud.CloudWarmUpJobTableFilterTest,org.apache.doris.cloud.WarmUpClusterOnTablesParseTest,org.apache.doris.cloud.WarmUpStatsTest,org.apache.doris.cloud.CacheHotspotManagerTableFilterTest,org.apache.doris.cloud.catalog.CloudInstanceStatusCheckerTest,org.apache.doris.metric.MetricsTest`
- `./build.sh --be --fe --cloud -j100`
- `docker build -f docker/runtime/doris-compose/Dockerfile -t bh-cluster-2 .`
- `env -u HTTP_PROXY -u HTTPS_PROXY -u http_proxy -u https_proxy -u ALL_PROXY -u all_proxy ./run-regression-test.sh --run -d regression-test/suites/cloud_p0/cache/multi_cluster/warm_up/on_tables -g docker -runMode=cloud -dockerSuiteParallel 1`