feat: Release Tracks — floating Docker tags (latest/standard/trailing) promotion engine (#36160) #36161
sfreudenthaler wants to merge 18 commits into
(cherry picked from commit db22adbc8975bee8e43e2a7143fb978d45f2a1d0)
(cherry picked from commit 50591149121ec21388c99f1534ac9bcdcb8b8822)
(cherry picked from commit 295eab6f1e3b6a1b7adb3111a88e3e083cbaf0c9)
(cherry picked from commit 8d3884d56d6745585a716b1b46682634edf63db4)
(cherry picked from commit b1936fc5c88ee4d66d416cb37ec7055ac78672aa)
…tags The previous test named 'test_list_tags_paginates_and_returns_name_digest' had next=null and did not exercise the pagination loop. Replace it with a test that serves two mocked responses (page 1 with next pointing to page 2, page 2 with next=null) and asserts tags from both pages appear in the result. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> (cherry picked from commit acbe5a4e62db88e6bcb36f1b6b3b210315c509cf)
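The two-page scenario that commit describes can be sketched as follows. This is a minimal illustration, not the engine's actual code: the `list_tags` signature, the injectable `fetch` callable, the page shape (`next` plus `results` with `name` and `digest`), and the `hub.example` URLs are all assumptions made for the example.

```python
from typing import Callable, Optional

# Hypothetical page shape, assumed for illustration:
# {"next": <url or None>, "results": [{"name": ..., "digest": ...}, ...]}
Page = dict
Fetch = Callable[[str], Page]


def list_tags(first_url: str, fetch: Fetch) -> list[tuple[str, str]]:
    """Follow `next` links until exhausted and return (name, digest) pairs."""
    tags: list[tuple[str, str]] = []
    url: Optional[str] = first_url
    while url:
        page = fetch(url)
        for item in page.get("results", []):
            tags.append((item["name"], item["digest"]))
        url = page.get("next")  # None on the last page ends the loop
    return tags


# Two mocked pages: page 1 points at page 2, page 2 has next=None.
PAGES = {
    "https://hub.example/tags?page=1": {
        "next": "https://hub.example/tags?page=2",
        "results": [{"name": "latest", "digest": "sha256:aaa"}],
    },
    "https://hub.example/tags?page=2": {
        "next": None,
        "results": [{"name": "standard", "digest": "sha256:bbb"}],
    },
}

result = list_tags("https://hub.example/tags?page=1", PAGES.__getitem__)
```

A test with a single page whose `next` is null never re-enters the loop body with a second URL, which is why asserting on tags from both pages is the meaningful check.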
(cherry picked from commit 7c8da73bfbbfc840004c865a5d982fabbe8ed441)
(cherry picked from commit f5c72e96613b362946dfe5da57cfd2f4acfc86e1)
(cherry picked from commit 073f0b99109d1d61d9b0e653eb1e37ce5334a61c)
…uard env vars
- Move `--apply` flag from top-level parser onto each subparser (promote and admin) so `evergreen-tracks promote --repo foo/bar --apply` works as expected
- Add tests/test_cli.py with 25 tests covering parser behaviour, cmd_promote, and all cmd_admin paths (taint/untaint/hold/release-hold, guards, return codes)
- Replace bare os.environ[] lookups for DOCKER_USERNAME/DOCKER_TOKEN with os.environ.get() guards that return code 2 with a clear error message instead of crashing with an unhandled KeyError
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
(cherry picked from commit 41543da383b66b3402e828d2d93d329fd61ce659)
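The two CLI changes in that commit (per-subparser `--apply` and guarded credential lookups) can be sketched roughly like this. It is a simplified sketch under stated assumptions: `build_parser`, `main`, the injectable `env` mapping, and the reduced argument set are hypothetical, and the real `admin` subcommand takes more arguments than shown.

```python
import argparse
import os
import sys
from typing import Mapping, Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="evergreen-tracks")
    sub = parser.add_subparsers(dest="command", required=True)
    for name in ("promote", "admin"):
        p = sub.add_parser(name)
        p.add_argument("--repo", required=True)
        # --apply lives on each subparser so it can follow the subcommand.
        p.add_argument("--apply", action="store_true")
    return parser


def main(argv: Sequence[str], env: Optional[Mapping[str, str]] = None) -> int:
    env = os.environ if env is None else env  # injectable for tests (assumption)
    args = build_parser().parse_args(argv)
    username = env.get("DOCKER_USERNAME")
    token = env.get("DOCKER_TOKEN")
    if not username or not token:
        # Return code 2 with a clear message instead of an unhandled KeyError.
        print("error: DOCKER_USERNAME and DOCKER_TOKEN must be set", file=sys.stderr)
        return 2
    mode = "apply" if args.apply else "dry-run"
    print(f"{args.command} {args.repo} ({mode})")
    return 0
```

With the flag on the subparsers, `main(["promote", "--repo", "foo/bar", "--apply"])` parses cleanly; had `--apply` stayed only on the top-level parser, argparse would reject it in that position.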
(cherry picked from commit 2ca500f882b9dc0fd36c5d078217c02c3d073d14)
(cherry picked from commit 4dd9ea1c72b6234b595c87bcea5d78da6ef8cc2f)
…cile held tracks Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> (cherry picked from commit 8ed72a269078cc8686ff8f5854d8d91d7f1c7afb)
…nt no-op A promote run with a held track and no other moves previously produced no output. Log "held at <marker>, skipping promotion" for consistent held tracks and always report when idle. Adds a caplog test. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> (cherry picked from commit 046a73ba1d6f7b789d831e136117f9f84b2bf0be)
…36160)
- Promote/admin workflows default to dotcms/dotcms.
- Daily cron promotion is gated behind the EVERGREEN_TRACKS_APPLY repo variable (dry-run until explicitly enabled) so merging does not start mutating production tags.
- Workflow inputs moved into env to avoid expression injection in run steps.
- Document Release Tracks in the root README, including the CalVer-date age rationale and the taint/hold model.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude finished @sfreudenthaler's task in 1m 3s
Rollback Safety Analysis
Result: ✅ Safe To Rollback
All 20 changed files are pure additions (+1659 / -0) in three buckets:
Category-by-category verdict:
Rolling back dotCMS from a release that included this PR to the previous release is fully safe: the promotion engine lives entirely in CI/CD and leaves no persistent state in the database, Elasticsearch, or any other shared datastore. The only external side-effect is Docker Hub tag mutations, which are controlled by the separate
Label added: AI: Safe To Rollback
revision = 3
requires-python = ">=3.12"

[[package]]
Legal Risk
certifi 2026.5.20 was released under the MPL-2.0 license, a license that
has been flagged by your organization for consideration.
Recommendation
While merging is not directly blocked, it's best to pause and consider what it means to use this license before continuing. If you are unsure, reach out to your security team or Semgrep admin to address this issue.
The promote workflow had no concurrency control, so a scheduled run and a manual promote/admin dispatch could overlap. Both read live registry state and then apply tag moves, so concurrent runs risk acting on stale state and overwriting a hold/taint or moving a track on outdated data. Add a workflow-level concurrency group keyed by workflow + ref with cancel-in-progress: false so runs queue rather than abort mid-promote. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Addressed in 15f171a. Added a workflow-level
wezell left a comment
Wait, so :latest is going to become 2 weeks old? Should we maintain latest as is and add another tag called... :current? latest is a well known convention.
After live discussion, the decision is to hook our releaser into this as an on-demand invoke that way
…line
Addresses @wezell's review: `latest` is a well-known convention and must not lag behind a GA. Per the live decision, hook the releaser into the evergreen-tracks engine so latest moves immediately, and make that engine the single controller of latest/standard/trailing.
- cli: add `--tracks` subset filter to `promote` so a caller can scope to one track (e.g. `--tracks latest`).
- cicd_6-release.yml: stop moving latest via deploy-docker (`latest: false`) and add a `promote-latest` job that invokes the engine on-demand once the release images are published, for dotcms/dotcms and dotcms/dotcms-dev. Applies unconditionally for a real latest release from main on dotcms/core (restores always-on behavior; NOT behind EVERGREEN_TRACKS_APPLY, which still gates only the aged standard/trailing tracks during rollout).
- Serialize all registry mutations under one static concurrency group (`evergreen-tracks-registry`) shared by promote, admin, and the release promote-latest job, closing the remaining cross-workflow race.
- Tests + README updated.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Good catch —
So |
The gate's only durable value was a global pause switch, which is already covered by `hold` (per-track freeze), `taint` (block a release), and disabling the scheduled workflow. `latest` no longer depends on it (it's release-driven), and standard/trailing have no consumers yet so applying the daily cron is low-risk. Cron now always applies; manual dispatch keeps its dry-run default. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Resolves #36160 · Epic #35693
Proposed Changes
- `cicd/evergreen-tracks/` (managed with `uv`) that advances three floating Docker tags (`latest` / `standard` / `trailing`) across the linear GA CalVer stream by release age (newest GA, ~14d, ~28d; thresholds configurable).
- `<version>_tainted` (forward-only block: a bad release can't propagate to more conservative tracks) and `<track>_hold` (sticky manual freeze). No separate datastore; the audit trail is the Actions run logs.
- `docker buildx imagetools create` (no layer re-push). Age is read from the CalVer date in the version, not build/publish date, so emergency backports of older releases can't be swept into a future promotion.
- `cicd_evergreen-tracks-promote.yml` (daily cron + dispatch) and `cicd_evergreen-tracks-admin.yml` (manual taint/hold).
- `README.md`.

Checklist
- `cd cicd/evergreen-tracks && uv run pytest`.

Additional Info

Tag control: `latest` is moved on-demand by the release pipeline (`promote-latest` job in `cicd_6-release.yml`, `--tracks latest`) the moment a GA's images publish, for `dotcms/dotcms` and `dotcms/dotcms-dev`; the old `deploy-docker` `latest: true` path is unwired (`latest: false`). The daily cron ages `standard`/`trailing` forward and always applies (no separate enable gate). To pause promotion, disable the scheduled workflow or `hold` the track; to block a bad release, `taint` it. All registry mutations (release-driven latest, cron, admin) serialize under one concurrency group `evergreen-tracks-registry`.

Credential scope (important): promotion needs only write, but `untaint` and `release-hold` call the Hub delete API, so the `DOCKER_USERNAME`/`DOCKER_TOKEN` used here must have Read/Write/Delete scope, or those two admin actions fail. (Verified both ways: write-only 403s on delete; RWD succeeds.)

Validation: the full lifecycle (promote-by-age, taint→skip, hold→freeze, release-hold→resume, untaint→restore, teardown) was exercised end-to-end against the `dotcms/dotcms-test` sandbox and verified by digest. The live smoke can't run in `core-workflow-test` CI (it intentionally carries no live Docker secrets), so it should run here in `core` CI / on dispatch.

Security notes: free-text workflow inputs are passed via `env:` (not interpolated into `run:`) to avoid expression injection; no tokens are logged; least-privilege `permissions: contents: read` on the promote workflow.

Out of scope (per epic): LTS-line tracks, Java-variant track tags, the Cloud control-plane UI, and update cadence changes.
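The age-by-CalVer rule and taint skipping described under Proposed Changes might look roughly like the sketch below. Everything here is an assumption for illustration: the `YY.MM.DD` version format, the `THRESHOLDS` values (taken from the "newest GA, ~14d, ~28d" wording), and the `calver_date` and `pick_target` helpers are hypothetical, and holds are not modeled.

```python
from datetime import date
from typing import Iterable, Optional

# Assumed minimum age in days per track; the PR describes these as configurable.
THRESHOLDS = {"latest": 0, "standard": 14, "trailing": 28}


def calver_date(version: str) -> date:
    """Parse an assumed YY.MM.DD CalVer prefix, e.g. '25.07.10' or '25.07.10-01'."""
    yy, mm, dd = version.split("-")[0].split(".")[:3]
    return date(2000 + int(yy), int(mm), int(dd))


def pick_target(
    track: str,
    versions: Iterable[str],
    tainted: set[str],
    today: date,
) -> Optional[str]:
    """Return the newest untainted version that is old enough for `track`."""
    min_age = THRESHOLDS[track]
    eligible = [
        v
        for v in versions
        # Age comes from the date encoded in the version, not the publish date.
        if v not in tainted and (today - calver_date(v)).days >= min_age
    ]
    return max(eligible, key=calver_date, default=None)


versions = ["25.06.01", "25.06.20", "25.07.01"]
today = date(2025, 7, 5)
standard_target = pick_target("standard", versions, set(), today)
```

Because the age is derived from the version string, a backport published today under an older CalVer date is treated as old rather than new, and a tainted version is simply never eligible, so the more conservative tracks fall back to the next older release.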
🤖 Generated with Claude Code