-
-
Notifications
You must be signed in to change notification settings - Fork 32.2k
gh-102555 Increase HTML standard compliance for closing comment tags #117406
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
We might as well handle the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. Before reviewing and merging, tests should be added.
@@ -9,7 +9,7 @@ | |||
|
|||
_declname_match = re.compile(r'[a-zA-Z][-_.a-zA-Z0-9]*\s*').match | |||
_declstringlit_match = re.compile(r'(\'[^\']*\'|"[^"]*")\s*').match | |||
_commentclose = re.compile(r'--\s*>') | |||
_commentclose = re.compile(r'--!?>') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would leave the \s*
, even though I should double check what the HTML5 specs say exactly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would leave the
\s*
, even though I should double check what the HTML5 specs say exactly.
I provided the links to HTML5 specification earlier and "\s*" mentioned nowhere, moreover, my tests with latest versions of Firefox and Chrome has shown that it's in fact an incorrect behaviour and is not considered a closing tag by modern browsers. Thus I see no reason in keeping it (nor spec, nor common practice).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://html.spec.whatwg.org/#comment-end-state is the section of the specs I was looking for. It does indeed mention the !
but not the spaces, so updating the code accordingly sounds good to me.
Do you want to add tests to check these (-->
, --!>
, -- >
, --x>
, --->
, etc.) cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://html.spec.whatwg.org/#comment-end-state is the section of the specs I was looking for. It does indeed mention the
!
but not the spaces, so updating the code accordingly sounds good to me.Do you want to add tests to check these (
-->
,--!>
,-- >
,--x>
,--->
, etc.) cases?
I am thinking about improving the solution to even include <!-->
, unexpected EOF and similar other test cases (that were mentioned in a similar PR), but at the moment, unfortunately, I am lacking time to work on this PR. Hopefully, in the week (or at the weekend at worst) I can add the test cases and change a few other parts of the code to handle even wider variety of edge cases.
@ezio-melotti EOF edge-case (described here: https://html.spec.whatwg.org/multipage/parsing.html#parse-error-eof-in-comment ) appears to be a bit more complicated, I will try to resolve it today as well anyway. Currently added short comment ( |
New EOF behaviour seems to be consistent with chromium-based browser. There is still an edge case with EOF-ending abrupt comment case, but the case is relatively hard to handle and quite rare (html after tag that starts with http://www.w3.org/TR/html5/tokenization.html#bogus-comment-state Ready for review. |
@ezio-melotti it would be nice if you can review the change soon :) |
I would like to give this a proper review before merging, but unfortunately I probably won't have time to look at this until the end of May. If I haven't replied by then, feel free to ping me again, and thanks for working on this! |
Thanks for updating me on the review status, glad to know that it's planned 👍🏻 |
Hi. It's the last day of May, so I decided to ping you :) |
@ezio-melotti I wonder if you might have some time this time? :) |
@ezio-melotti I appreciate your earlier willingness to review the PR, and I understand that sometimes time is tight. Would you like to review it now? :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I completely forgot about this issue and this PR and created #135664.
The part related to unclosed comments was already included in #135464.
There are few flaws in this PR.
<!--!>
and<!---!>
should not be parsed as abruptly ended empty comments.--!>
does not work here.- When unclosed comment ends with
-
,--
or--!
just before EOF, they should not be appended to the comment token's data.
You can try to fix these issues in this PR, but I'll just add you as co-author of #135664 -- this will be simpler.
Thanks for deep dive. I double-checked the specs and confirm that you are right. I think that we can just go with your PR as it would be much simpler. Thanks for the honor of co-authorship :) |
Now code follows the recommendation.