ci: stop #freenet-dev alert spam from chronic gateway backpressure #57
Open
sanity wants to merge 1 commit into
Both demo-maintenance workflows have been paging #freenet-dev on nearly every run for weeks. The cause is upstream Freenet gateway capacity, not anything these workflows can fix: the gateway behind FREENET_GIT_WS_URL returns "contract queue full, try again later" on the repo-state probe and "put timed out" on the PUTs. The freenet-git-side robustness fixes (#4291 mirror stabilization, #4253 queue-full amplification) already shipped in v0.2.64/v0.2.69 and did not resolve it, so the alerts are non-actionable noise that trains everyone to ignore the channel.

- rescue-demos: pause the schedule (workflow_dispatch retained). Every scheduled run is a guaranteed no-op plus two 🚨 alerts/day. Re-enable once the gateway can service rescue again (e.g. after the 0.2.79 #4499 event-loop wedge fix propagates): dispatch manually, confirm green, uncomment the cron.
- mirror-repo (reusable): classify the push failure. Transient gateway backpressure now suppresses the Matrix alert (the next push / daily safety-net cron is the retry); genuine failures, job-timeout cancellations, and pre-push step failures still alert. One edit covers freenet-core, freenet-stdlib, and the self-mirror.

The run still goes red in the Actions tab on a transient failure (honest: the demo URL is behind); only the channel page is suppressed.

[AI-assisted - Claude]
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01P5syMzUfC5Zk4fv5ivaYrx
Problem
rescue-demos and the reusable mirror-repo workflow have been paging #freenet-dev on nearly every run for weeks. Confirmed root cause across recent runs: upstream Freenet gateway backpressure, not anything these workflows can fix. The gateway behind FREENET_GIT_WS_URL returns:

- "contract queue full, try again later" on the repo-state probe (per-contract fair-queue saturation, #4251), and
- "put timed out after N peer attempt(s)" on the PUTs.

The freenet-git-side robustness fixes that targeted exactly this (#4291 mirror stabilization → v0.2.69, #4253 queue-full amplification → v0.2.64) already shipped and did not resolve it. So the alerts are non-actionable noise: classic alert fatigue that trains everyone to ignore the channel.
Approach
- rescue-demos: pause the schedule: trigger (keep workflow_dispatch). Every scheduled run is currently a guaranteed no-op plus two 🚨 alerts per day. Re-enable once the gateway can service rescue again (e.g. after the 0.2.79 #4499 event-loop wedge fix propagates): dispatch manually, confirm green, uncomment the cron.
- mirror-repo (reusable): classify the push failure. A push log that matches transient gateway backpressure (contract queue full / try again later / put timed out / host backpressure or timeout) sets transient=true and the Matrix-notify step is skipped; the next push or daily safety-net cron is the retry. Genuine failures (auth, pack corruption, helper bugs), job-timeout cancellations, and pre-push step failures still alert.

The run still goes red in the Actions tab on a transient failure (honest: the demo URL is behind); only the channel page is suppressed.
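The classification and alert gating described above could be sketched roughly as follows. This is an illustration under assumptions, not the workflow's actual code: the function names classify_push_log and should_alert, the regular expression, and the job-status values are hypothetical. Only the matched phrases and the transient=true / transient=false values come from the description. "Host backpressure or timeout" reads as a category rather than a literal log string, so the sketch does not match it.

```shell
#!/usr/bin/env bash
# Hypothetical sketch only: function names and the regex are assumptions,
# not the workflow's actual implementation.

# Phrases the description lists as signs of transient gateway backpressure.
TRANSIENT_RE='contract queue full|try again later|put timed out'

# classify_push_log LOGFILE
# Prints "transient=true" when the push log contains a backpressure phrase,
# otherwise "transient=false". An unreadable log also yields "transient=false",
# so the alert still fires when in doubt.
classify_push_log() {
  local log_file="$1"
  if grep -Eiq "$TRANSIENT_RE" "$log_file" 2>/dev/null; then
    echo "transient=true"
  else
    echo "transient=false"
  fi
}

# should_alert JOB_STATUS TRANSIENT
# Exit status 0 means "page the channel". JOB_STATUS is assumed to be one of
# success, failure, or cancelled; TRANSIENT is the classifier's value, or empty
# when the classifier never ran (pre-push failure, job-timeout cancellation).
should_alert() {
  local job_status="$1" transient="${2:-}"
  [ "$job_status" != "success" ] && [ "$transient" != "true" ]
}
```

In a workflow step, a line such as classify_push_log push.log >> "$GITHUB_OUTPUT" would expose the value to a later notify step (push.log is a placeholder path). Defaulting to transient=false whenever the classifier cannot decide keeps the behavior conservative: only a positively identified backpressure failure suppresses the page.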
Note
The caller repos (freenet-core, freenet-stdlib) pin mirror-repo.yml to a fixed commit SHA, so the mirror transient-guard does not take effect for them until their pinned SHA is bumped to this commit. This PR stops the rescue spam on merge; stopping the per-repo mirror spam needs a follow-up SHA bump in each caller (or pausing their daily schedule:).
Testing
- yamllint / python -c yaml.safe_load parse clean on both files.
- shellcheck + behavior tests: transient-backpressure log → transient=true (alert suppressed); clean push → transient=false; genuine failure → transient=false (alert fires).

[AI-assisted - Claude]
🤖 Generated with Claude Code