[BlockMarkupURLProcessor] Support URLs in CSS #195
components/DataLiberation/BlockMarkup/class-blockmarkupurlprocessor.php
This seems to be looking good.
@dmsnell would you mind taking a look at this one?
dmsnell
left a comment
Thanks for the ping, @adamziel. I started reviewing, but it quickly turned into constantly guessing which code was intentional and which was generated, making for a dizzying review experience. I hope I haven’t misjudged, but there are a number of fairly evident questions about the code.
It would help me to know how much effort you put into reviewing this so I can better assess the level of attention to give it.
It’s nice having the URLs inside of style attribute values remapped.
Thank you @dmsnell! I've spent some time simplifying the initial LLM implementation, but it seems like I haven't gone deeply enough. You've asked some great questions that I should have caught earlier. Let me do another pass the slow, methodical way before asking you for another review.
Added memory peak usage checks and assertions for CSS URL processing.
With the dedicated CSSProcessor class, this PR became much simpler. It's now a pretty natural extension of the BlockMarkupURLProcessor that merely adds another specialized handler for another subsyntax. Thank you @dmsnell for reviewing and giving me that extra push!
Adds support for rewriting URLs inside CSS syntax, e.g. here:

```html
<div style="background-image:url(/wp-content/uploads/2025/09/image-2-766x1024.jpeg)">
```

Before this PR, the `style` attributes in, e.g., the cover block were skipped by the URL rewriter and continued pointing to the old site.

Fixes #223

## Implementation details

This PR backports `CSSProcessor`, `CSSURLProcessor`, and a few related PRs around Unicode handling from the WordPress/php-toolkit repo:

* WordPress/php-toolkit#197
* WordPress/php-toolkit#195
* WordPress/php-toolkit#199
* WordPress/php-toolkit#200
* WordPress/php-toolkit#201
* WordPress/php-toolkit#202

Note the CSSProcessor and CSSURLProcessor are tested against 300 test cases containing various tricky inputs, quoted and unquoted URLs, strings, comments, unicode escape sequences, and more.

## Testing instructions

This PR comes with a new test case specifically for various tricky CSS inputs. You're also welcome to try and import a WXR file that contains an inline background-image reference and confirm the URL is correctly rewritten.
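
For readers who want a feel for what rewriting a URL inside a `style` value involves, here is a rough, illustrative sketch in Python. It is not the PR's PHP code and does not mirror the `CSSURLProcessor` API. The names `rewrite_css_urls` and `to_new_site` and the host `new-site.example` are hypothetical. The sketch scans character by character without regular expressions, copies comments and ordinary strings untouched, leaves `data:` URIs alone, and hands every other `url(...)` value to a callback. As a simplification, it neither decodes nor re-encodes CSS escape sequences.

```python
from typing import Callable


def _skip_string(css: str, start: int) -> int:
    """Return the index just past the quoted string opening at `start`."""
    quote = css[start]
    i = start + 1
    while i < len(css):
        if css[i] == "\\":
            i += 2  # a backslash escapes the next character
        elif css[i] == quote:
            return i + 1
        else:
            i += 1
    return len(css)  # unterminated string: treat the rest as string content


def _at_url_function(css: str, i: int) -> bool:
    """True if `url(` starts at i and is not the tail of a longer identifier."""
    if css[i:i + 4].lower() != "url(":
        return False
    return i == 0 or not (css[i - 1].isalnum() or css[i - 1] in "-_")


def rewrite_css_urls(css: str, rewrite: Callable[[str], str]) -> str:
    """Hypothetical helper: pass each url(...) value in `css` through `rewrite`."""
    out = []
    i, n = 0, len(css)
    while i < n:
        if css.startswith("/*", i):
            # Copy comments verbatim so a url( inside them is left alone.
            end = css.find("*/", i + 2)
            end = n if end == -1 else end + 2
            out.append(css[i:end])
            i = end
        elif css[i] in "\"'":
            # Copy ordinary strings verbatim for the same reason.
            end = _skip_string(css, i)
            out.append(css[i:end])
            i = end
        elif _at_url_function(css, i):
            j = i + 4
            while j < n and css[j].isspace():
                j += 1
            if j < n and css[j] in "\"'":
                # Quoted URL: the value sits between the two quotes.
                after = _skip_string(css, j)
                value_start, value_end = j + 1, after - 1
            else:
                # Unquoted URL: runs until whitespace or the closing paren.
                after = j
                while after < n and css[after] != ")" and not css[after].isspace():
                    after += 2 if css[after] == "\\" else 1
                after = min(after, n)
                value_start, value_end = j, after
            close = after
            while close < n and css[close].isspace():
                close += 1
            if close < n and css[close] == ")":
                value = css[value_start:value_end]
                if not value.lower().startswith("data:"):
                    value = rewrite(value)  # data: URIs are kept as-is
                out.append(css[i:value_start] + value + css[value_end:close + 1])
                i = close + 1
            else:
                # Malformed url(: copy the prefix and keep scanning.
                out.append(css[i:i + 4])
                i += 4
        else:
            out.append(css[i])
            i += 1
    return "".join(out)


def to_new_site(url: str) -> str:
    """Hypothetical mapping: prefix root-relative paths with the new host."""
    if url.startswith("/"):
        return "https://new-site.example" + url
    return url


style = "background-image:url(/wp-content/uploads/2025/09/image-2-766x1024.jpeg)"
print(rewrite_css_urls(style, to_new_site))
# background-image:url(https://new-site.example/wp-content/uploads/2025/09/image-2-766x1024.jpeg)
```

The original quoting, whitespace, and casing of `url(` are preserved, and only the value itself is replaced. A production parser would additionally follow the CSS tokenization rules for escapes and bad URLs, which this sketch deliberately skips.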
Description
Adds support for migrating URLs within CSS syntax in the `style` HTML attribute during WXR imports. For example, this markup:
Would be rewritten as:
Motivation
When importing WordPress sites via WXR, URLs embedded in CSS (like `background: url("/old-site.com/image.jpg")`) need to be migrated to the new site. Previously, these URLs were missed, leading to broken images and assets after import.
Cover blocks are a good example. Without this PR, the background image in this cover block would not be rewritten:
Implementation
The implementation introduces a new `CSSUrlProcessor` class that can parse CSS `url()` functions, handle CSS escape sequences, and efficiently skip over large data URIs. It uses the same design principles as `WP_HTML_Tag_Processor`: simple state-machine API, no regexps, minimal allocations. The `CSSUrlProcessor` is integrated with `BlockMarkupURLProcessor` and can be used as follows:
Testing instructions