Numerical IP addresses are not canonicalized #68

@gflohr

I found the following with URI version 1.76.

These (valid!) URLs should all be equivalent:

But URI->new(...)->canonical() does not modify any of them, and consequently URI->new('http://120.144.171.205/')->eq(...) reports false for all of them. Is that on purpose? If http://localhost/ and http://lOcAlHoSt are equal, then http://127.0.0.1 and http://0x7f.0.0.01 should be equal as well because they point to the same resource.
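To make the expected behavior concrete, here is a minimal sketch of the canonicalization I have in mind, written in Python only for illustration. It assumes the usual inet_aton-style rules (one to four dot-separated parts, each decimal, octal with a leading 0, or hex with a leading 0x, with the last part filling the remaining bytes). The names `parse_ipv4_part` and `canonical_ipv4` are hypothetical and are not part of the URI module, and the sample inputs are my own illustrations.

```python
# Illustrative sketch only: inet_aton-style IPv4 canonicalization.
# Function names are hypothetical, not an existing API.

_DIGITS = {8: "01234567", 10: "0123456789", 16: "0123456789abcdefABCDEF"}


def parse_ipv4_part(part: str) -> int:
    """Parse one part as hex (0x prefix), octal (leading 0) or decimal."""
    if part[:2].lower() == "0x":
        digits, base = part[2:], 16
    elif len(part) > 1 and part[0] == "0":
        digits, base = part[1:], 8
    else:
        digits, base = part, 10
    # Check digits explicitly: int() alone would also accept signs,
    # underscores and surrounding whitespace.
    if not digits or any(c not in _DIGITS[base] for c in digits):
        raise ValueError(f"invalid IPv4 part: {part!r}")
    return int(digits, base)


def canonical_ipv4(host: str) -> str:
    """Return the dotted-decimal form of host, or raise ValueError."""
    parts = host.split(".")
    if not 1 <= len(parts) <= 4:
        raise ValueError(f"wrong number of parts: {host!r}")
    *leading, last = [parse_ipv4_part(p) for p in parts]
    # Every part except the last must fit into a single byte.
    if any(n > 255 for n in leading):
        raise ValueError(f"part out of range: {host!r}")
    # The last part fills all remaining bytes and must not overflow them.
    if last >= 256 ** (4 - len(leading)):
        raise ValueError(f"integer overflow: {host!r}")
    value = last
    for i, n in enumerate(leading):
        value += n << (8 * (3 - i))
    return ".".join(str((value >> s) & 0xFF) for s in (24, 16, 8, 0))


print(canonical_ipv4("0x7f.0.0.01"))  # 127.0.0.1
print(canonical_ipv4("0x7890abcd"))   # 120.144.171.205
```

With a helper like this, two hosts could be compared by their canonical forms, and hosts whose parts overflow would be rejected instead of silently accepted.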

Conversely, all of these URLs are invalid because of integer overflow, yet URI->new() accepts them:

A similar problem appears to exist with IPv6 addresses. These are equivalent:

  • http://[::1]/
  • http://[::0000:1]/
  • http://[::0000:0001]/

I think the following decimal-to-hex normalizations should also be performed by URI->canonical():

  • http://[::ffff:192.168.66.77] => http://[::ffff:c0a8:424d]/
  • http://[::ffff:0:192.168.66.77] => http://[::ffff:0:c0a8:424d]/
  • http://[64:ff9b::192.168.66.77] => http://[64:ff9b::c0a8:424d]/
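The normalizations above can be sketched as follows, again in Python purely for illustration. The sketch leans on the standard `ipaddress` module for parsing and then formats the 128-bit value by hand: lowercase hex groups without leading zeros, with the longest run of two or more zero groups replaced by `::`. The name `canonical_ipv6` is hypothetical, and the function expects the host without the surrounding brackets.

```python
# Illustrative sketch only: normalize an IPv6 host to compressed hex form.
import ipaddress


def canonical_ipv6(host: str) -> str:
    """Return the compressed, all-hex form of an IPv6 address string."""
    value = int(ipaddress.IPv6Address(host))
    groups = [(value >> (112 - 16 * i)) & 0xFFFF for i in range(8)]

    # Find the longest run of zero groups (leftmost wins on ties).
    best_start, best_len = -1, 0
    run_start, run_len = -1, 0
    for i, group in enumerate(groups):
        if group == 0:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0

    hextets = [format(group, "x") for group in groups]
    if best_len < 2:
        # A single zero group is not worth compressing.
        return ":".join(hextets)
    head = ":".join(hextets[:best_start])
    tail = ":".join(hextets[best_start + best_len:])
    return f"{head}::{tail}"


print(canonical_ipv6("::0000:0001"))             # ::1
print(canonical_ipv6("::ffff:192.168.66.77"))    # ::ffff:c0a8:424d
print(canonical_ipv6("64:ff9b::192.168.66.77"))  # 64:ff9b::c0a8:424d
```

Formatting from the integer value rather than relying on the library's string conversion keeps the embedded dotted-quad from reappearing in the output.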

And these IPv6 addresses are invalid:

  • http://[::10000]/ (overflow)
  • http://[f:f:f:f:f:f:f:f:f]/ (9 instead of a maximum of 8 groups)
  • http://[f::f::f]/ (a run of consecutive zero groups may be compressed with :: only once)
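For comparison, a strict parser rejects all three of these. The following is a small sketch using Python's standard `ipaddress` module; the helper name `is_valid_ipv6` is hypothetical, and the host is passed without the surrounding brackets.

```python
# Illustrative sketch only: strict IPv6 validation with the standard library.
import ipaddress


def is_valid_ipv6(host: str) -> bool:
    """Return True if host (without brackets) is a well-formed IPv6 address."""
    try:
        ipaddress.IPv6Address(host)
    except ipaddress.AddressValueError:
        return False
    return True


for host in ("::1", "::10000", "f:f:f:f:f:f:f:f:f", "f::f::f"):
    print(host, is_valid_ipv6(host))
```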

My blog post http://www.guido-flohr.net/the-gory-details-of-url-validation/ contains a little bit more information about the topic.

The following list of URLs may be more convenient for testing browser behavior:

If you have a web server running on port 8080, all of these URLs should open the exact same page and show the canonical form http://127.0.0.1:8080/ in the browser address bar.
