Skip to content

src, scripts: use kv to cache directories#756

Open
flakey5 wants to merge 3 commits intomainfrom
flakey5/159
Open

src, scripts: use kv to cache directories#756
flakey5 wants to merge 3 commits intomainfrom
flakey5/159

Conversation

@flakey5
Copy link
Copy Markdown
Member

@flakey5 flakey5 commented Nov 15, 2025

Makes use of KV for directory listings. This aims to replace the build-r2-symlinks.mjs file with two different ones:

  • build-directory-cache.mjs - This reads the entire dist-prod bucket and caches it into the KV namespace. This is pretty much only for the initial setup of the cache and for anytime we need to rebuild all of it.
  • update-directory-cache.mjs - This reads only the necessary paths from the dist-prod bucket and updates them in the KV namespace. This will be the one that we will call in the workflow

A USE_KV environment variable is added to the worker that acts as a toggle between KV and S3 listings. This acts as a safeguard in case something goes wrong while we're still testing it, so we can easily switch between the two. Once we're more confident in KV, we can remove it + the entire S3Provider from the worker.

Closes #159
Closes #314

TODO:

@flakey5 flakey5 force-pushed the flakey5/159 branch 2 times, most recently from 0ade6ef to 7fb3816 Compare January 6, 2026 04:00
@flakey5 flakey5 marked this pull request as ready for review January 6, 2026 04:28
@flakey5 flakey5 requested a review from a team as a code owner January 6, 2026 04:28
Comment thread wrangler.jsonc Outdated
Comment thread .github/workflows/update-links.yml
@ovflowd
Copy link
Copy Markdown
Member

ovflowd commented Jan 6, 2026

cc @flakey5 some of the CI steps are failing

Comment thread lib/listR2Directory.mjs
Copy link
Copy Markdown
Member

@ovflowd ovflowd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't finish the review yet, but leaving initial comments here :)

Comment thread lib/README.md
Comment thread lib/listR2Directory.mjs
Comment thread e2e-tests/directory.test.ts Outdated
Comment thread e2e-tests/util.ts Outdated
Comment thread e2e-tests/util.ts Outdated
Comment thread scripts/utils/addSymlinksToDirectoryCache.mjs Outdated
Comment thread scripts/utils/addSymlinksToDirectoryCache.mjs
Comment thread scripts/utils/addSymlinksToDirectoryCache.mjs
Comment thread scripts/utils/addSymlinksToDirectoryCache.mjs Outdated
Comment thread scripts/utils/addSymlinksToDirectoryCache.mjs
Comment thread src/providers/r2Provider.ts Outdated
@flakey5 flakey5 requested a review from a team February 7, 2026 19:42
Comment thread e2e-tests/util.ts Outdated
Comment thread scripts/utils/addSymlinksToDirectoryCache.mjs Outdated
Comment thread src/providers/r2Provider.ts
Comment thread wrangler.jsonc Outdated
Comment thread lib/listR2Directory.mjs
Comment thread scripts/utils/kv.mjs Outdated
Comment thread scripts/utils/r2.mjs
@cursor
Copy link
Copy Markdown

cursor bot commented Apr 9, 2026

PR Summary

Medium Risk
Introduces a new KV-backed directory listing path plus automation that writes large batches into a production KV namespace; mistakes could cause stale/incorrect directory listings or failed release automation, though a USE_KV toggle and S3 fallback reduce blast radius.

Overview
Moves directory listings toward a KV-backed cache (with a safety toggle). The worker Env gains DIRECTORY_CACHE and USE_KV, and R2Provider.readDirectory() can now serve listings from a new KvProvider (with a temporary Sentry-reported parity check against the existing cached/S3 results). S3Provider is simplified to delegate listing to a shared lib/listR2Directory.mjs helper, and contentTypeOverrides is converted from JSON to a typed TS constant.

Replaces the old symlink/listing generation script with KV cache build/update tooling and CI automation. Adds scripts/build-directory-cache.mjs (full rebuild + optional purge) and scripts/update-directory-cache.mjs (incremental per-release update), plus supporting utilities for listing R2, computing latestVersions.json, and injecting synthetic symlinks into cached listings (also updating cachedDirectories.json, docsDirectory.json, and fileSymlinks.json). GitHub Actions gains a new Build Directory Cache workflow and updates Update Redirect Links to require a version input and run the KV update script; Wrangler config and Vitest/miniflare are updated to bind the KV namespace, with new unit/e2e tests covering KV directory behavior.

Reviewed by Cursor Bugbot for commit 42eae4e. Bugbot is set up for automated code reviews on this repo. Configure here.

Comment thread scripts/utils/getLatestVersionMapping.mjs Fixed
Comment thread scripts/utils/addSymlinksToDirectoryCache.mjs Fixed
Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 3 potential issues.

Autofix Details

Bugbot Autofix prepared fixes for all 3 issues found in the latest run.

  • ✅ Fixed: Version argument crashes before null check executes
    • I now validate the raw version argument before calling toLowerCase() so missing input throws the intended explicit error.
  • ✅ Fixed: Subdirectories cached without trailing slashes in e2e tests
    • The e2e cache population now appends / to each subdirectory name so test data matches production directory-cache shape.
  • ✅ Fixed: latest key incorrectly added as directory symlink
    • I restricted added directory symlinks to keys starting with latest-, which excludes both latest and node-latest.tar.gz.

Create PR

Or push these changes by commenting:

@cursor push 29199a84ef
Preview (29199a84ef)
diff --git a/e2e-tests/util.ts b/e2e-tests/util.ts
--- a/e2e-tests/util.ts
+++ b/e2e-tests/util.ts
@@ -47,7 +47,9 @@
   });
 
   const cachedDirectory: ReadDirectoryResult = {
-    subdirectories: Object.keys(directory.subdirectories),
+    subdirectories: Object.keys(directory.subdirectories).map(
+      subdirectory => `${subdirectory}/`
+    ),
     files,
     hasIndexHtmlFile,
     lastModified: new Date(),

diff --git a/scripts/update-directory-cache.mjs b/scripts/update-directory-cache.mjs
--- a/scripts/update-directory-cache.mjs
+++ b/scripts/update-directory-cache.mjs
@@ -20,12 +20,13 @@
 import { createCloudflareClient, writeKeysToKv } from './utils/kv.mjs';
 import { createS3Client, listR2DirectoryRecursive } from './utils/r2.mjs';
 
-const VERSION = process.argv[2].toLowerCase();
-
-if (!VERSION) {
+const VERSION_ARG = process.argv[2];
+if (!VERSION_ARG) {
   throw new TypeError('version missing from args');
 }
 
+const VERSION = VERSION_ARG.toLowerCase();
+
 if (!VERSION.startsWith('v')) {
   throw new TypeError(
     'provided version not in correct format, expected vX.X.X'

diff --git a/scripts/utils/addSymlinksToDirectoryCache.mjs b/scripts/utils/addSymlinksToDirectoryCache.mjs
--- a/scripts/utils/addSymlinksToDirectoryCache.mjs
+++ b/scripts/utils/addSymlinksToDirectoryCache.mjs
@@ -55,8 +55,8 @@
 ) {
   releaseDirectory.subdirectories.push(
     ...Object.keys(latestVersionMap)
-      // We only want directories, remove the only file in the map
-      .filter(version => version !== NODE_LATEST_FILE_NAME)
+      // We only want the `latest-*` directory symlinks.
+      .filter(version => version.startsWith('latest-'))
       .map(version => `${version}/`)
   );
   releaseDirectory.subdirectories.sort();

diff --git a/scripts/utils/addSymlinksToDirectoryCache.test.ts b/scripts/utils/addSymlinksToDirectoryCache.test.ts
--- a/scripts/utils/addSymlinksToDirectoryCache.test.ts
+++ b/scripts/utils/addSymlinksToDirectoryCache.test.ts
@@ -97,6 +97,7 @@
   const latestVersionMap: LatestVersionMapping = {
     'latest-v20.x': 'v20.1.2',
     'latest-v22.x': 'v22.2.1',
+    latest: 'v22.2.1',
     'node-latest.tar.gz': 'latest/node-v22.2.1',
   };

This Bugbot Autofix run was free. To enable autofix for future PRs, go to the Cursor dashboard.

Comment thread scripts/update-directory-cache.mjs
Comment thread e2e-tests/util.ts
Comment thread scripts/utils/addSymlinksToDirectoryCache.mjs
Copy link
Copy Markdown
Member

@ovflowd ovflowd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said before, it is hard to review so many complex changes in a timely manner and one pass.

@flakey5 can you do me a favor? Can you address BugBot's reviews and maybe also ask Claude Code to scrutinize your code and do a 2nd pass in terms of code smell and what not? And then let's get this merged.

Comment thread scripts/utils/r2.mjs Outdated
Comment thread scripts/utils/getLatestVersionMapping.mjs
Comment thread scripts/utils/getLatestVersionMapping.mjs Fixed
Comment thread scripts/utils/getLatestVersionMapping.mjs Outdated
flakey5 added a commit to flakey5/node that referenced this pull request Apr 16, 2026
Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>
flakey5 added a commit to flakey5/node that referenced this pull request Apr 16, 2026
Comment thread .github/workflows/update-links.yml
Comment thread scripts/constants.mjs Outdated
Comment thread scripts/utils/kv.mjs Outdated
nodejs-github-bot pushed a commit to nodejs/node that referenced this pull request Apr 17, 2026
Re nodejs/release-cloudflare-worker#159 nodejs/release-cloudflare-worker#756

Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>
PR-URL: #62777
Reviewed-By: Claudio Wunder <cwunder@gnome.org>
Reviewed-By: Rich Trott <rtrott@gmail.com>
Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>
Comment thread scripts/utils/addSymlinksToDirectoryCache.mjs
Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>
Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 42eae4e. Configure here.

if (!sourceFile) {
throw new TypeError(
`symlink '${symlink}' points to invalid file '${sourceFile}'`
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error message displays undefined instead of file path

Low Severity

The error message in addStaticFileSymlinksToCache interpolates sourceFile (which is undefined at this point, since the !sourceFile check just passed) instead of the actual target path. The message will always read "points to invalid file 'undefined'". The variable sourceFilePath (or fileSymlinks[symlink]) holds the meaningful path that was looked up and not found.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 42eae4e. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Serve /docs from Worker Using KV for listing directories

4 participants