Skip to content

Validate downloaded theme assets against extension and MIME allowlists#852

Open
mikachan wants to merge 33 commits into
trunkfrom
fix/validate-downloaded-theme-assets
Open

Validate downloaded theme assets against extension and MIME allowlists#852
mikachan wants to merge 33 commits into
trunkfrom
fix/validate-downloaded-theme-assets

Conversation

@mikachan

Copy link
Copy Markdown
Member

Add a per-sink extension allowlist and post-download MIME validation to the four places the plugin calls download_url() and writes the response to disk: media downloads from template/pattern content (add_media_to_local), font downloads from user global styles (copy_font_assets_to_theme), and the zip-export equivalents of both (add_media_to_zip, add_activated_fonts_to_zip). Also rejects multi-extension polyglot URLs (evil.php.jpg).

To test, ensure the following tests pass:

  npm run test:unit:php -- --filter "Test_Create_Block_Theme_(Media|Fonts|Zip)"

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens Create Block Theme’s asset-downloading code paths by adding pre-download URL extension allowlists and post-download content-type validation for media and font assets, including the zip-export flows, to reduce the risk of downloading/extracting unsafe payloads.

Changes:

  • Add URL allowlist validators to reject disallowed and multi-extension polyglot URLs before calling download_url().
  • Add post-download validation: MIME allowlist for media and magic-byte validation for font files.
  • Add PHPUnit coverage for the new validators and for “skip without downloading” behavior; update contributor test commands in docs.

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
includes/create-theme/theme-media.php Adds media URL allowlist + post-download MIME allowlist; wires checks into local media download flow.
includes/create-theme/theme-fonts.php Adds font URL allowlist + magic-byte validation; wires checks into font asset copying.
includes/create-theme/theme-zip.php Wires the same validations into zip export for media and activated fonts.
tests/test-theme-media.php Adds tests for media URL/file validation and for short-circuiting before HTTP.
tests/test-theme-fonts.php Adds tests for font URL/file validation and for short-circuiting before HTTP.
tests/test-theme-zip.php Adds zip-specific test ensuring disallowed media URLs don’t trigger HTTP.
tests/data/evil.php.txt Adds malicious PHP fixture used by validator tests.
AGENTS.md Updates documented npm commands for running PHP unit tests.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread includes/create-theme/theme-zip.php
Comment thread includes/create-theme/theme-zip.php
Comment thread includes/create-theme/theme-zip.php Outdated
Comment thread includes/create-theme/theme-media.php
Comment thread includes/create-theme/theme-media.php
Comment thread includes/create-theme/theme-fonts.php
Comment thread includes/create-theme/theme-zip.php Outdated
Comment thread includes/create-theme/theme-media.php Outdated
mikachan and others added 12 commits June 16, 2026 14:01
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Two recent Copilot suggestion commits accidentally removed structural
code while making single-line changes:
- c045da7 ("Pass $font_src to download_url") dropped the `} else {`
  and the `is_wp_error( $tmp_file )` guard.
- 617f6ec ("Stream downloaded media into zip") dropped the closing
  brace of the `foreach` in add_media_to_zip().

Both broke PHP parse on every CI matrix. Restoring the missing
statements so the file is valid again.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Two more Copilot suggestion commits dropped surrounding context lines:
- ad31fe4 ("Strip query string from filename before renaming downloaded
  media") removed the closing brace of the foreach loop in
  add_media_to_local(), causing fatal lint errors.
- 860f1bc ("Strip query string from font URL before deriving filename
  in zip") removed the `$font_family_dir_name = sanitize_title(...)`
  assignment, leaving an undefined variable used immediately below.
`$font_filename = basename( $font_face['src'] )` threw TypeError on
PHP 8+ when src was already an array (valid per Theme JSON spec) and
was redefined inside the inner src loop anyway. The adjacent
`$font_dir = wp_get_font_dir()` was likewise redundant — also
redefined inside the inner loop. Removing both.
wp_check_filetype_and_ext() was inconsistent with the URL allowlist:
SVG isn't in WP Core's default mime registry at all, and WP maps wmv
to video/x-ms-wmv / avi to video/avi while the post-check allowlist
had only video/x-msvideo. Legitimate SVG/WMV/AVI URLs passed preflight
and then silently failed post-download.

Switching media to magic-byte verification (matching the font sink)
removes the dependency on WP Core's mime registry and libmagic
versions. Recognised formats: JPEG, PNG, GIF, WebP, SVG, MP4/M4V/MOV/
3GP/3G2 (via ftyp), WebM, OGV, WMV (ASF), AVI (RIFF), MPEG.
@scruffian

Copy link
Copy Markdown
Contributor

I found 3 issues:

  1. The temp file is now deleted before ZipArchive has read it.
  2. URL rewriting still uses the raw URL basename, so query-string slashes can produce broken or dangerous-looking asset references even though the download sink now uses the parsed path. For example http://example.com/cat.jpg?/evil.php now becomes cat.jpg for validation/download, but exported template markup points at /assets/images/evil.php.
  3. is_allowed_media_file() does not verify that the downloaded bytes match the URL extension. e.g. a .jpg URL with SVG/MP4/WebM bytes passes and gets saved as .jpg.
    '
    I added unit tests to demonstrate the issues (but not fixes)

mikachan added 2 commits June 16, 2026 17:30
Three fixes for issues found via regression tests in 8da66ff:

1. is_allowed_media_file() now requires the downloaded bytes to match
   the URL extension specifically — a .jpg URL with SVG/MP4/anything-else
   bytes is rejected. Previously any allowed magic was accepted, so a
   browser MIME-sniffing the on-disk .jpg containing SVG could render
   it as SVG (with all the script-execution surface that brings).

2. make_relative_media_url() now derives the filename from the parsed
   URL path instead of raw basename(). A URL like cat.jpg?/evil.php
   previously produced `/assets/images/evil.php` in exported template
   markup because basename() treats query-string slashes as path
   separators.

3. add_media_to_zip() reads the downloaded bytes into memory and
   unlinks the tmp file BEFORE adding to the archive via
   addFromStringToTheme(). The previous `addFileToTheme() + unlink`
   sequence broke because ZipArchive defers reading files added via
   addFile() until close() — by which time the tmp file was gone.
@mikachan

Copy link
Copy Markdown
Member Author

Thanks! These should all be fixed now.

  1. add_media_to_zip() now reads the downloaded bytes into memory and unlinks the tmp file before adding via addFromStringToTheme().
  2. make_relative_media_url() now uses basename(wp_parse_url($url, PHP_URL_PATH)) instead of raw basename($url). The example cat.jpg?/evil.php now produces /assets/images/cat.jpg as expected.
  3. is_allowed_media_file() now switches on the URL extension and requires the bytes to match that specific format. A .jpg URL with SVG/MP4/anything-else bytes is rejected so we never persist a file whose extension lies about its content.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 4 comments.

Comment thread tests/test-theme-zip.php
Comment thread includes/create-theme/theme-zip.php
Comment thread includes/create-theme/theme-media.php
Comment thread includes/create-theme/theme-fonts.php Outdated
mikachan and others added 3 commits June 17, 2026 13:39
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
parse_url() omits the 'port' key when the URL has no explicit port
(e.g. `http://localhost/foo`). The retry-on-localhost block in both
add_media_to_zip() and add_media_to_local() read $parsed_url['port']
unconditionally, raising an undefined-index notice when the initial
download failed for such a URL. Guard with isset() on both keys.
Previously any recognised font magic was accepted, so a .woff2 URL
returning TTF/WOFF/OTF/EOT bytes would pass and be saved under the
.woff2 filename. This defeated the extension+MIME allowlist intent
and would produce broken assets in the exported theme/zip (browsers
would fail to load a WOFF2-named file containing TTF content).

The check now switches on the URL extension and validates the magic
matches THAT specific format. Mirrors the same fix already applied
to is_allowed_media_file().
@scruffian

Copy link
Copy Markdown
Contributor

is_allowed_media_file has an mpeg branch, but is_allowed_media_url and the folder mapping only allow mpg. A template containing a .mpeg video would get rewritten to a local asset URL, then skipped before download, leaving the exported theme pointing at a missing file.

Comment thread includes/create-theme/theme-zip.php Outdated
@unlink( $tmp_file );
continue;
}
$zip->addFileToTheme( $tmp_file, $font_face_path );

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have the same problem as in add_media_to_zip, where unlink is being called before the sources are added to the zip, so we need the same fix here too I think.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Applied the same fix here in ef205f7.

is_allowed_media_file() accepted both .mpg and .mpeg, but
is_allowed_media_url() and get_media_folder_path_from_url() only
listed .mpg. A template with a .mpeg video would get rewritten to a
local asset URL, then skipped before download — leaving the exported
theme pointing at a missing file.

Add .mpeg to the URL allowlist and the folder mapping so all three
lists agree (matching WP Core's wp_get_mime_types which maps
mpeg|mpg|mpe to video/mpeg).

Also drops a pre-existing duplicate 'ogv' entry in the folder mapping
that was noticed while editing.
@mikachan

Copy link
Copy Markdown
Member Author

is_allowed_media_file has an mpeg branch, but is_allowed_media_url and the folder mapping only allow mpg.

Good spot, thank you. This should be fixed in 375663c.

mikachan added 2 commits June 17, 2026 16:20
Same fix as add_media_to_zip: ZipArchive::addFile() defers reading
the file until close() is called, so unlinking the download_url tmp
file immediately after addFileToTheme() left an empty entry in the
final archive when the zip was finalised.

Read the bytes into memory, unlink, then addFromStringToTheme() —
mirrors the pattern in add_media_to_zip(). Only the remote-download
branch is affected; the local-copy branch points at a permanent
file in the font library so its addFileToTheme() call stays.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants