RFC 3966 "The tel URI for Telephone Numbers" defines the URI scheme "tel". The OWASP sanitizer currently does not handle these URIs correctly: it escapes certain characters that "MUST NOT" be escaped, and it fails to escape others that "MUST" be. Testcases at the end.
The most important difference is described in these two sentences of Section 3:
If the reserved characters "+", ";", "=", and "?" are used as delimiters between components of
the "tel" URI, they MUST NOT be percent encoded.
These characters MUST be percent encoded if they appear in tel URI parameter values.
Unfortunately, determining whether one of these characters is used as a delimiter or appears inside a parameter value is not straightforward, and probably requires parsing the full syntax (also described in Section 3 in ABNF form).
Basic example: href="tel:+1234567890" -> href="tel:+1234567890". The leading plus is a separator and "MUST NOT" be percent encoded according to RFC 3966.
But it depends on where the character appears in the URI, e.g.: href="tel:tel:890;phone-context=;+123-4-567" -> href="tel:tel:890;phone-context=+123-4-567". Here the equals sign is a separator and "MUST NOT" be escaped, but the plus inside the parameter is not a separator and thus "MUST" be percent encoded according to RFC 3966.
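To illustrate why position matters, here is a minimal Java sketch of the two quoted sentences, read literally. It assumes the components are already known separately (number, parameter names, parameter values), so delimiters are written raw and only parameter values are percent encoded. The class `TelUriSketch`, its methods, and the parameter name `x-label` are hypothetical and not part of the OWASP sanitizer. It does not validate the number and does not attempt the full ABNF of Section 3.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of any library. */
final class TelUriSketch {
    private TelUriSketch() {}

    /**
     * Percent-encodes every byte outside a conservative unreserved set
     * (letters, digits, '-', '.', '_', '~'). This covers "+", ";", "="
     * and "?" and deliberately encodes more than strictly required.
     */
    static String encodeParamValue(String value) {
        StringBuilder out = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean unreserved =
                (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                || (c >= '0' && c <= '9')
                || c == '-' || c == '.' || c == '_' || c == '~';
            if (unreserved) {
                out.append((char) c);
            } else {
                out.append('%').append(String.format("%02X", c));
            }
        }
        return out.toString();
    }

    /**
     * Assembles "tel:" + number + ";name=value" pairs. The number is
     * appended as-is (assumption: the caller supplies a valid number),
     * so a leading "+" stays unencoded. The ";" and "=" delimiters are
     * written raw; only the values go through encodeParamValue.
     */
    static String build(String number, Map<String, String> params) {
        StringBuilder uri = new StringBuilder("tel:").append(number);
        for (Map.Entry<String, String> p : params.entrySet()) {
            uri.append(';').append(p.getKey());
            if (p.getValue() != null) {
                uri.append('=').append(encodeParamValue(p.getValue()));
            }
        }
        return uri.toString();
    }
}
```

With this sketch, `TelUriSketch.build("+1234567890", new LinkedHashMap<>())` leaves the leading plus alone, while a value such as `a=b;c?d+e` under the hypothetical parameter `x-label` comes out as `tel:890;x-label=a%3Db%3Bc%3Fd%2Be`. A sanitizer that only sees the finished string has to recover the same component boundaries by parsing, which is the hard part described above.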
This does matter in the real world: while browsers (tested Chrome & Firefox) will still recognize some misencoded telephone numbers, email clients (Outlook, Apple Mail, Gmail) and telephony software (MS Teams) are more restrictive and no longer recognize the numbers. In any case, I would argue that an HTML sanitizer library should aim for correctness here, even if the output still works in some browsers.
jmiserez changed the title on Apr 25, 2022:
Old: tel URIs: Incorrect escaping due to missing RFC 3966 "tel:" URI support & parsing
New: tel URIs: Incorrect escaping due to missing RFC 3966 "tel:" URI syntax support/parsing
This is not due to a bug in the OWASP sanitizer, but due to missing support for "tel" URI syntax described in Section 3 of RFC 3966. While the "tel" URI syntax is based on the generic syntax outlined in RFC 2396 "Uniform Resource Identifiers (URI): Generic Syntax", there are important differences.
Test cases demonstrating the issue, with comments:
Using .allowUrlProtocols("tel") on the current main branch:
See testcase for main: jmiserez@a0f5bd4
See testcases for integrate-url-classifier: jmiserez@b321f6b
Simply add the testcases to your SanitizersTest and run.
I already alluded to this before in a comment last year (#21 (comment)), but did not follow up then with proper testcases. I hope this report is slightly more helpful in understanding the issue.