Percent encoding | in paths
#3479
Unanswered
nathaniel-daniel asked this question in Potential Issue
Replies: 1 comment
Should I open an issue/PR?
- OS: Windows 11
- `python --version`: Python 3.12.8
- `httpx` version: 0.28.1

I believe the `|` character should be percent-encoded in paths, which is not currently the case. If I'm understanding RFC 3986 correctly, path characters are `pchar`, which can be `unreserved`, `pct-encoded`, `sub-delims`, `":"`, or `"@"`:

- `unreserved` can be composed of `ALPHA`, `DIGIT`, `"-"`, `"."`, `"_"`, or `"~"`.
- `pct-encoded` is the percent-encoding sequences.
- `sub-delims` can be `"!"`, `"$"`, `"&"`, `"'"`, `"("`, `")"`, `"*"`, `"+"`, `","`, `";"`, or `"="`.

Nowhere in this set is the `|` character present, meaning it has to be percent-encoded.

Simplifying my problem, `httpx` seems to call its internal `urlparse` function to process URLs, so I tested that function directly. It normally percent-encodes characters as needed, such as spaces.
However, this does not happen for `|`, which comes back unencoded.
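To make the difference concrete, here is a minimal sketch of safe-set percent-encoding that uses only plain Python. The helper `encode_path` is hypothetical and is not `httpx`'s implementation; `PATH_SAFE` is copied from the value quoted in this post, and the input path `/a b|c` is made up for illustration.

```python
# Hypothetical helper, not httpx's actual code: percent-encode every
# character of `path` that is not listed in `safe`.
def encode_path(path: str, safe: str) -> str:
    out = []
    for ch in path:
        if ch in safe:
            out.append(ch)
        else:
            # Encode each UTF-8 byte of the character as %XX.
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)


# Value of PATH_SAFE as quoted in this post.
PATH_SAFE = "!$%&'()*+,-./0123456789:;=@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_abcdefghijklmnopqrstuvwxyz|~"

# With "|" in the safe set, only the space is encoded.
current = encode_path("/a b|c", PATH_SAFE)
# With "|" removed from the safe set, it is encoded as well.
proposed = encode_path("/a b|c", PATH_SAFE.replace("|", ""))

print(current)   # /a%20b|c
print(proposed)  # /a%20b%7Cc
```

Under this sketch, whether `|` is encoded depends only on its membership in the safe set, which is why the change suggested in this post is a one-character edit.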
In Firefox and Google Chrome, `|` is percent-encoded.

In the `requests` library, `|` is also percent-encoded.

The `rfc3986` library also percent-encodes `|`.
Using `urllib` itself, `|` also seems to be percent-encoded for path components; the result is `'/%7C'`.

I'm fairly certain that I've interpreted this RFC correctly, and I think that `|` should be excluded from the `PATH_SAFE` set. Here is its current value: `"!$%&'()*+,-./0123456789:;=@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_abcdefghijklmnopqrstuvwxyz|~"`.

Potential Fix: nathaniel-daniel@a2f327f
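The reasoning in this post can be cross-checked with the standard library alone. The sketch below builds the literal `pchar` characters from the RFC 3986 grammar, compares them against the quoted `PATH_SAFE` value, and calls `urllib.parse.quote`. The names `PCHAR_LITERALS` and `extra` are illustrative, and `"/"` and `"%"` are set aside on the assumption that they serve as the segment separator and the `pct-encoded` introducer. The `quote("/|")` call is one way to get the `'/%7C'` result and is not necessarily the snippet originally used.

```python
import string
from urllib.parse import quote

# RFC 3986 grammar, as summarized in this post.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
SUB_DELIMS = set("!$&'()*+,;=")
PCHAR_LITERALS = UNRESERVED | SUB_DELIMS | set(":@")

# Value of PATH_SAFE as quoted in this post.
PATH_SAFE = "!$%&'()*+,-./0123456789:;=@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_abcdefghijklmnopqrstuvwxyz|~"

# Assumption: "/" separates path segments and "%" introduces pct-encoded
# triplets, so neither is counted as falling outside the grammar.
extra = sorted(set(PATH_SAFE) - PCHAR_LITERALS - set("/%"))
print(extra)  # ['[', '\\', ']', '^', '|']

# urllib.parse.quote keeps "/" by default and percent-encodes "|".
print(quote("/|"))  # /%7C
```

Under these assumptions, `|` shows up among the characters that `PATH_SAFE` allows but `pchar` does not, which matches the reading of the RFC given above.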