Invalid ID attribute with php8.4's \Dom\HTMLDocument #18316

edent · 2025-04-12T21:05:17Z

Description

The following code:

<?php
$html = '<p id="example ">';
$dom = \Dom\HTMLDocument::createFromString($html, LIBXML_NOERROR, "UTF-8");
echo $dom->saveHTML();

Resulted in this output:

<html><head></head><body><p id="example "></p></body></html>

But I expected this output instead:

<html><head></head><body><p id="example"></p></body></html>

As per https://html.spec.whatwg.org/multipage/dom.html#global-attributes:the-id-attribute-2

When specified on HTML elements, the id attribute value must be unique amongst all the IDs in the element's tree and must contain at least one character. The value must not contain any ASCII whitespace.

Further detail at https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Global_attributes/id#syntax

I'd suggest trimming whitespace from the IDs. It may also be sensible to do that on the other attributes. Of course, the behaviour does depend on how closely you want to follow the input's mistakes. If the intention is to closely replicate the input (whether their original code was valid or not) then please close this bug.

PHP Version

PHP 8.4.6 (cli) (built: Apr 11 2025 02:19:14) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.4.6, Copyright (c) Zend Technologies
with Zend OPcache v8.4.6, Copyright (c), by Zend Technologies

Operating System

Pop!_OS 22.04 LTS

nielsdos · 2025-04-13T09:15:38Z

The docs you cited describe how the developer should author HTML documents, not how they should be parsed.
The parser spec does not contain a rule to strip the whitespace and diverging from that would be dangerous.
Browsers also don't do this:

dom=(new DOMParser).parseFromString('<p id="example ">', 'text/html')
dom.querySelector('p') // <p id="example ">
dom.querySelector('p').id // "example "
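
An application that does want normalized IDs can trim them itself after parsing, leaving the parser's behaviour untouched. A minimal sketch of that step, in the same language as the browser snippet above: `trimAsciiWhitespace` and `normalizeIds` are hypothetical helper names, not part of any DOM API, and the character set is assumed to be the HTML standard's ASCII whitespace (tab, LF, FF, CR, space).

```javascript
// Hypothetical helper, not part of the DOM API: strips leading and trailing
// ASCII whitespace (tab, LF, FF, CR, space) from an attribute value.
function trimAsciiWhitespace(value) {
  return value.replace(/^[\t\n\f\r ]+|[\t\n\f\r ]+$/g, "");
}

// Hypothetical helper: normalizes every id in an already-parsed document.
// Assumes `doc` is a Document, e.g. the result of DOMParser.parseFromString.
function normalizeIds(doc) {
  for (const el of doc.querySelectorAll("[id]")) {
    el.id = trimAsciiWhitespace(el.id);
  }
}
```

Trimming can make two previously distinct IDs collide (for example "a" and "a "), so whether to do it is a decision for the application rather than the parser.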

edent · 2025-04-13T09:58:36Z

Thank you - that's a very clear explanation.

edent added the Bug and Status: Needs Triage labels Apr 12, 2025

devnexen added the Extension: dom label Apr 12, 2025

devnexen assigned nielsdos Apr 12, 2025

nielsdos closed this as not planned Apr 13, 2025

nielsdos added the Status: Invalid label and removed the Status: Needs Triage label Apr 13, 2025
