Fix Julia cache flood in CI workflows #14410

cderv merged 2 commits into main from feature/julia-cache-cleanup on Apr 22, 2026



cderv commented Apr 22, 2026

The repository hit its 10 GB GitHub Actions cache cap (9.81 GB actual), with julia-actions/cache entries accounting for 9.42 GB of it. The 37 julia caches on refs/heads/main alone took 7.78 GB, evicting useful caches for other CI workflows (R renv, deno_std).

Root cause: per-run accumulation on main

julia-actions/cache@v2 is a composite action whose post-save and delete-old-caches steps do not run reliably. Each cache key contains run_id and run_attempt, so every CI run creates a new ~250 MB entry and older entries are never deleted. 37 entries were observed on main across only 4 days.
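To make the accumulation concrete, here is a minimal Python sketch of what happens when the key changes on every run and nothing deletes old entries. The key layout and the `Linux` label are hypothetical; only the presence of run_id and run_attempt in the key comes from the observation above.

```python
def cache_key(workflow: str, os_name: str, run_id: int, run_attempt: int) -> str:
    # Hypothetical key layout; only the run_id/run_attempt suffix matters here.
    return (
        f"julia-cache;workflow={workflow};os={os_name};"
        f"run_id={run_id};run_attempt={run_attempt}"
    )


def simulate_saves(run_ids, per_run_key: bool) -> dict:
    """Return the cache store after one save per run, with no cleanup step."""
    store = {}
    for run_id in run_ids:
        if per_run_key:
            key = cache_key("test-smokes", "Linux", run_id, 1)
        else:
            key = "julia-cache;workflow=test-smokes;os=Linux"
        # Cache entries are immutable: saving under an existing key is skipped.
        store.setdefault(key, run_id)
    return store


if __name__ == "__main__":
    print(len(simulate_saves(range(37), per_run_key=True)), "entries with per-run keys")
    print(len(simulate_saves(range(37), per_run_key=False)), "entry with a fixed key")
```

The fixed key is shown only as a contrast. Because entries are immutable, a fixed key would never refresh its content, so a per-run key paired with a working delete step is the more plausible design.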

v3 is a JavaScript rewrite with a proper post hook. Combined with the actions: write permission already granted in test-smokes.yml, the action now keeps only the newest entry per (workflow, os) tuple. v3 also uses Node.js 24 directly, dropping the transitive Node 20 dependency from v2's bundled actions/cache. This partially reduces the remaining scope in #14201 but does not close it.
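The retention rule (newest entry per (workflow, os) tuple) can be sketched as below. This is an illustration of the rule, not the action's actual code; `CacheEntry` and `stale_entries` are hypothetical names, and timestamps are assumed to be ISO 8601 strings in the same time zone so that they sort as text.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class CacheEntry:
    id: int
    workflow: str
    os: str
    created_at: str  # ISO 8601, same zone for all entries


def stale_entries(entries: List[CacheEntry]) -> List[CacheEntry]:
    """Return every entry except the newest one in each (workflow, os) group."""
    newest: Dict[Tuple[str, str], CacheEntry] = {}
    for entry in entries:
        group = (entry.workflow, entry.os)
        current = newest.get(group)
        if current is None or entry.created_at > current.created_at:
            newest[group] = entry
    keep = {entry.id for entry in newest.values()}
    # Preserve the input order for the entries that should be deleted.
    return [entry for entry in entries if entry.id not in keep]
```

Under this rule the number of surviving entries is bounded by the number of (workflow, os) combinations rather than by the number of runs.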

Root cause: fork PR cleanup silently failed

cleanup-caches.yml triggered on pull_request: closed, but pull_request events from forks ship a read-only GITHUB_TOKEN regardless of the permissions: block. Verified on merged PR #14374: all 8 gh cache delete calls returned HTTP 403: Resource not accessible by integration, leaving 1.98 GB orphaned. set +e swallowed the failures so the run reported success.
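For illustration, a cleanup loop that attempts every deletion but still reports failure could look like the following Python sketch. It is not the workflow's actual script: the function names are hypothetical, and the only external call assumed is `gh cache delete <cache-id>`.

```python
import subprocess
import sys
from typing import Callable, List


def delete_cache(cache_id: str) -> int:
    """Delete one cache with the gh CLI and return the process exit code."""
    return subprocess.run(["gh", "cache", "delete", cache_id]).returncode


def failed_deletions(
    cache_ids: List[str], delete: Callable[[str], int] = delete_cache
) -> List[str]:
    # Try every deletion, as set +e would, but remember which ones failed.
    return [cache_id for cache_id in cache_ids if delete(cache_id) != 0]


def cleanup_exit_code(
    cache_ids: List[str], delete: Callable[[str], int] = delete_cache
) -> int:
    """Return 0 only if every deletion succeeded, so the job fails visibly otherwise."""
    failed = failed_deletions(cache_ids, delete)
    if failed:
        print(
            f"{len(failed)} of {len(cache_ids)} cache deletions failed: {failed}",
            file=sys.stderr,
        )
        return 1
    return 0
```

A script built this way would end with `sys.exit(cleanup_exit_code(ids))`, so eight 403 responses would turn the run red instead of reporting success.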

pull_request_target runs in the base-branch context with full write permissions. This is safe here because the workflow does not check out PR code; it only calls gh cache list/delete using the PR number from the event payload. The fix applies only to PRs closed after this merges, since pull_request_target reads its workflow from the base branch.

A one-off manual cleanup of pre-existing stale caches (32 entries, ~7.7 GB) has already been run; this PR prevents recurrence.

Refs #14201

cderv added 2 commits April 22, 2026 17:22
v2 is a composite action whose post-save and delete-old-caches steps
don't run reliably, letting per-run julia caches (~250 MB each)
accumulate on refs/heads/main until the repository hits its 10 GB
cache cap. v3 is a JavaScript rewrite with a proper post hook. It also
uses Node.js 24 directly, dropping the latent Node 20 exposure from
v2's transitive actions/cache dependency.

pull_request events from forks ship a read-only GITHUB_TOKEN regardless
of the permissions: block, so gh cache delete fails with HTTP 403 and
the fork PR's ~1-2 GB of caches leak into the repo's 10 GB cache budget.
Observed on PR #14374 (8/8 deletes returned 403, run succeeded because
set +e swallowed the failures).

pull_request_target runs in the base-branch context with full write
permissions. Safe for this workflow: no PR code is checked out, the
steps only call gh cache list/delete.
cderv merged commit 2b025d5 into main on Apr 22, 2026
91 of 93 checks passed
cderv deleted the feature/julia-cache-cleanup branch on April 22, 2026 17:24