HTML API: Refactor wp_kses_hair() #9248

Open. Wants to merge 1 commit into base: trunk.

dmsnell (Member) commented Jul 11, 2025

Trac ticket: Core-63694

Replaces #7407, dmsnell#5
Coordination in #9256

wp_kses_hair() is built around an impressive state machine for parsing the $attr of an HTML tag, that is, the span of text after the tag name and before the closing >. Unfortunately, that parsing code doesn’t fully implement the HTML specification and may be prone to mis-parsing.

This patch replaces the existing state machine with a straightforward use of the HTML API to parse the attributes for us: it constructs a shell tag around the $attr string and reads the attributes structurally. The shell is necessary because a previous stage of the pipeline has already separated what it thinks is the so-called “attribute list” from a tag.
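
As a rough illustration of the shell-tag idea only, and not the code in this patch, the sketch below wraps an already-separated attribute string in a throwaway tag and lets an existing HTML parser extract the attributes. It uses Python's standard-library `html.parser` as a stand-in for the WordPress HTML API; the `shell` tag name, the `_ShellAttributeReader` class, and `parse_attribute_list()` are hypothetical names chosen for this example, and Python's parser does not claim full conformance with the HTML specification.

```python
from html.parser import HTMLParser


class _ShellAttributeReader(HTMLParser):
    """Records the attributes of the first start tag it encounters."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.attributes = None

    def handle_starttag(self, tag, attrs):
        # Only the shell tag matters; ignore anything that follows it.
        if self.attributes is None:
            self.attributes = attrs


def parse_attribute_list(attr: str) -> dict:
    """Parse a detached attribute list by wrapping it in a shell tag.

    Hypothetical helper: the "shell" tag name is arbitrary and exists
    only so the parser sees a complete start tag.
    """
    reader = _ShellAttributeReader()
    reader.feed(f"<shell {attr}>")
    reader.close()

    result = {}
    for name, value in reader.attributes or []:
        # Names arrive lowercased; keep the first of any duplicates,
        # which is how HTML treats repeated attributes on one tag.
        result.setdefault(name, value)
    return result


print(parse_attribute_list('href="/a" hidden'))
# {'href': '/a', 'hidden': None}
```

Boolean attributes come back with a value of `None`, and character references in values are decoded by the parser rather than by hand-written state-machine code, which is the kind of detail the patch delegates to the HTML API.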

Dependencies

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props dmsnell.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 11, 2025
Trac ticket: Core-63694

`wp_kses_hair()` is built around an impressive state machine for parsing
the `$attr` of an HTML tag, that is, the span of text after the tag name
and before the closing `>`. Unfortunately, that parsing code doesn’t
fully implement the HTML specification and may be prone to mis-parsing.

This patch replaces the existing state machine with a straightforward
use of the HTML API to parse the attributes for us, constructing a shell
tag for the `$attr` string and reading the attributes structurally.
This shell is necessary because a previous stage of the pipeline has
already separated what it thinks is the so-called “attribute list” from
a tag.

Props: dmsnell
@dmsnell dmsnell force-pushed the html-api/refactor-wp-kses-hair-take-3 branch from 68c7746 to b476339 Compare July 11, 2025 22:37
@dmsnell dmsnell force-pushed the html-api/refactor-wp-kses-hair-take-3 branch from b476339 to 6146ecd Compare July 11, 2025 22:45
@dmsnell dmsnell force-pushed the html-api/refactor-wp-kses-hair-take-3 branch from 6146ecd to d64f56e Compare July 11, 2025 22:46
Test using WordPress Playground

The changes in this pull request can be previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • The Plugin and Theme Directories cannot be accessed within Playground.
  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

dmsnell added commits to dmsnell/wordpress-develop that referenced this pull request: 1 on Jul 12, 2025; 14 on Jul 13, 2025; 1 on Jul 14, 2025.
@dmsnell dmsnell force-pushed the html-api/refactor-wp-kses-hair-take-3 branch from d64f56e to d8147fb Compare July 14, 2025 17:07
dmsnell added a commit to dmsnell/wordpress-develop that referenced this pull request Jul 14, 2025
@dmsnell dmsnell force-pushed the html-api/refactor-wp-kses-hair-take-3 branch from d8147fb to d119749 Compare July 14, 2025 17:23
dmsnell added 23 commits to dmsnell/wordpress-develop that referenced this pull request on Jul 15, 2025.
1 participant