Move and refactor remove_base to unresolve function in iri_resolver #216
Merged
8 commits
0995d3b Move unresolve function (mielvds)
c0c4147 Remove incorrect test, throw error when base IRI is invalid and impro… (mielvds)
6616de0 Adjust jsonld.py to use unresolve (mielvds)
3335511 Simplify unresolve code by using stdlib url(un)parse (mielvds)
db3a2d1 Add type annotation for parsed_iri (mielvds)
67b8ffc Append to changelog (mielvds)
9ae042c Relative IRIs must not have the form of a keyword (mielvds)
4c79719 Remove succeeding test from skipped tests (mielvds)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,7 +1,11 @@ | ||
| """ | ||
| The functions 'remove_dot_segments()', 'resolve()' and 'is_character_allowed_after_relative_path_segment()' are direct ports from [relative-to-absolute-iri.js](https://github.com/rubensworks/relative-to-absolute-iri.js) | ||
| - The functions 'remove_dot_segments()', 'resolve()' and 'is_character_allowed_after_relative_path_segment()' are direct ports from [relative-to-absolute-iri.js](https://github.com/rubensworks/relative-to-absolute-iri.js) (c) Ruben Taelman | ||
| - The 'unresolve()' function is a move and rename of the 'remove_base()' function from 'jsonld.py' | ||
| """ | ||
|
|
||
| from urllib.parse import ParseResult, urlparse, urlunparse | ||
|
|
||
|
|
||
| def is_character_allowed_after_relative_path_segment(ch: str) -> bool: | ||
| """Return True if a character is valid after '.' or '..' in a path segment.""" | ||
| return not ch or ch in ('#', '?', '/') | ||
|
|
@@ -204,4 +208,81 @@ def resolve(relative_iri: str, base_iri: str = None) -> str: | |
| relative_iri = base_path + relative_iri | ||
| relative_iri = remove_dot_segments(relative_iri) | ||
|
|
||
| return base_iri[:base_slash_after_colon_pos] + relative_iri | ||
|
|
||
| def unresolve(absolute_iri: str, base_iri: str = ""): | ||
| """ | ||
| Unresolves a given absolute IRI to an IRI relative to the given base IRI. | ||
|
|
||
| :param absolute_iri: the absolute IRI. | ||
| :param base_iri: the base IRI. | ||
|
|
||
| :return: the relative IRI if relative to base, otherwise the absolute IRI. | ||
| """ | ||
| # skip IRI processing | ||
| if not base_iri: | ||
| return absolute_iri | ||
|
|
||
| base = urlparse(base_iri) | ||
|
|
||
| if not base.scheme: | ||
| raise ValueError(f"Found invalid baseIRI '{base_iri}' for value '{absolute_iri}'") | ||
|
|
||
| # compute authority (netloc) and strip default ports | ||
| base_authority = parse_authority(base) | ||
|
|
||
| rel = urlparse(absolute_iri) | ||
| # compute authority (netloc) and strip default ports | ||
| rel_authority = parse_authority(rel) | ||
|
|
||
| # schemes and network locations (authorities) don't match, don't alter IRI | ||
| if not (base.scheme == rel.scheme and base_authority == rel_authority): | ||
| return absolute_iri | ||
|
|
||
| # remove path segments that match (do not remove last segment unless there | ||
| # is a hash or query) | ||
| base_segments = remove_dot_segments(base.path).split('/') | ||
| iri_segments = remove_dot_segments(rel.path).split('/') | ||
| last = 0 if (rel.fragment or rel.query) else 1 | ||
| while (len(base_segments) and len(iri_segments) > last and | ||
| base_segments[0] == iri_segments[0]): | ||
| base_segments.pop(0) | ||
| iri_segments.pop(0) | ||
|
|
||
| # use '../' for each non-matching base segment | ||
| rval = '' | ||
| if len(base_segments): | ||
| # don't count the last segment (if it ends with '/' last path doesn't | ||
| # count and if it doesn't end with '/' it isn't a path) | ||
| base_segments.pop() | ||
| rval += '../' * len(base_segments) | ||
|
|
||
| # prepend remaining segments | ||
| rval += '/'.join(iri_segments) | ||
|
|
||
| # relative IRIs must not have the form of a keyword | ||
| if rval and rval[0] == '@': | ||
| rval = './' + rval | ||
|
|
||
| # build relative IRI using urlunparse with empty scheme/netloc | ||
| return urlunparse(('', '', rval, '', rel.query or '', rel.fragment or '')) or './' | ||
|
|
||
| def parse_authority(parsed_iri: ParseResult) -> str: | ||
| """ | ||
| Compute authority (netloc) and strip default ports | ||
|
|
||
| :param parsed_iri: the parsed IRI, as returned by urlparse. | ||
| :return: the authority without a default port, or None if the IRI has no authority. | ||
| :rtype: str | ||
| """ | ||
| base_authority = parsed_iri.netloc or None | ||
|
|
||
| try: | ||
| base_port = parsed_iri.port | ||
| except Exception: | ||
| base_port = None | ||
|
|
||
| if base_authority is not None and base_port is not None: | ||
| if (parsed_iri.scheme == 'https' and base_port == 443) or (parsed_iri.scheme == 'http' and base_port == 80): | ||
| base_authority = base_authority.rsplit(':', 1)[0] | ||
| return base_authority | ||
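
The three steps in the diff above (authority comparison with default ports stripped, the path segment walk, and reassembly with `urlunparse`) can be tried in isolation. The following is a minimal standalone sketch using only the standard library, not the merged code: the function names are hypothetical, and unlike `unresolve()` it assumes the paths contain no `.` or `..` segments, so it skips `remove_dot_segments()`.

```python
from urllib.parse import urlparse, urlunparse


def authority_without_default_port(iri: str):
    """Return the netloc of `iri`, dropping :80 for http and :443 for https."""
    parsed = urlparse(iri)
    authority = parsed.netloc or None
    try:
        port = parsed.port  # raises ValueError for a malformed port
    except ValueError:
        port = None
    defaults = {"http": 80, "https": 443}
    if authority is not None and port is not None:
        if defaults.get(parsed.scheme) == port:
            authority = authority.rsplit(":", 1)[0]
    return authority


def relative_path(base_path: str, target_path: str,
                  has_query_or_fragment: bool = False) -> str:
    """Walk both paths segment by segment and emit '../' for unmatched base segments.

    Assumption: neither path contains '.' or '..' segments.
    """
    base_segments = base_path.split("/")
    target_segments = target_path.split("/")
    # Keep the last target segment unless a query or fragment follows it.
    last = 0 if has_query_or_fragment else 1
    while (base_segments and len(target_segments) > last
           and base_segments[0] == target_segments[0]):
        base_segments.pop(0)
        target_segments.pop(0)
    result = ""
    if base_segments:
        base_segments.pop()  # the last base segment is not a directory
        result += "../" * len(base_segments)
    return result + "/".join(target_segments)


def relative_reference(path: str, query: str = "", fragment: str = "") -> str:
    """Assemble a relative reference; fall back to './' when it would be empty."""
    if path.startswith("@"):
        path = "./" + path  # avoid looking like a JSON-LD keyword
    return urlunparse(("", "", path, "", query, fragment)) or "./"
```

With these helpers, `http://example.org:80/a` and `http://example.org/a` yield the same authority, so the scheme and authority check passes and the path walk is reached; a non-default port such as `:8080` is kept, and such an IRI would be returned unchanged.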
Should class-based or function-based `pytest` test cases be chosen as uniform practice for this project?
I personally prefer class-based, but I'm fine with both. The only reason why the specs are function-based, is because I couldn't figure out how to make it work otherwise :) What do you think?
I must confess I never saw `pytest` class-based tests in prod code. Everyone just writes functions after `unittest` ceased to be the de facto standard. But that's just my anecdotal evidence.
I have, so I guess it's personal preference. I suggest we keep it for now, being `tests/test_*.py` for every `lib/*.py` (with the exception of `test_manifests.py` ofc)... and having a bigger discussion on this some time later
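
For reference, the two styles under discussion look roughly like this. It is only a sketch: `join_segments` is a hypothetical helper invented for the illustration and is not part of this project.

```python
def join_segments(segments):
    """Hypothetical helper under test: join path segments with '/'."""
    return "/".join(segments)


# Function-based style: pytest collects module-level functions named test_*.
def test_join_two_segments():
    assert join_segments(["a", "b"]) == "a/b"


def test_join_empty():
    assert join_segments([]) == ""


# Class-based style: pytest also collects test_* methods on classes named Test*.
# The class needs no base class and must not define __init__.
class TestJoinSegments:
    def test_join_two_segments(self):
        assert join_segments(["a", "b"]) == "a/b"

    def test_join_empty(self):
        assert join_segments([]) == ""
```

Both forms are discovered by a plain `pytest` run; the class form mainly adds grouping and a shared namespace for related cases.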