Update tests to Unicode 16.0 #1045

Merged
merged 3 commits into from
May 8, 2025

Conversation

hsivonen
Collaborator

@hsivonen hsivonen commented May 8, 2025

This updates the tests to Unicode 16.0. The test harness needs changes because the earlier test suite had a bug concerning trailing dots. The test suite now matches the spec text, but the deprecated idna API retains the behavior that was written to match the old test suite bug.
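
For readers unfamiliar with why trailing dots need special attention: when a domain name written with a trailing dot is split on `.`, the last label is empty (it stands for the DNS root), so a test harness has to decide explicitly how to treat that empty label. The sketch below only illustrates where the empty label comes from. `split_labels` and `has_trailing_dot` are hypothetical helpers, not part of the idna crate, and the sketch neither reproduces UTS 46 processing nor says which treatment the spec requires.

```rust
/// Splits a domain name into labels on ASCII '.' only.
/// Hypothetical helper for illustration; not the idna crate's API.
fn split_labels(domain: &str) -> Vec<&str> {
    domain.split('.').collect()
}

/// True if the name ends with a dot, i.e. splitting it yields an empty last label.
fn has_trailing_dot(domain: &str) -> bool {
    domain.ends_with('.')
}

fn main() {
    // Without a trailing dot there are two non-empty labels.
    println!("{:?}", split_labels("example.com"));
    // With a trailing dot a third, empty label appears at the end.
    println!("{:?}", split_labels("example.com."));
    println!("{}", has_trailing_dot("example.com."));
}
```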

It is somewhat unfortunate that the test suite lives in this repo, but whether the code implements Unicode 16.0 behavior is up to the dependencies. Therefore, the expected landing sequence is this:

  1. This PR (hopefully!) gets approved.
  2. Publish idna_adapter 1.2.1 from its main branch.
  3. Publish idna_mapping 1.1.0 from its main branch.
  4. Land this PR.

@hsivonen hsivonen requested a review from valenting May 8, 2025 07:22
Collaborator Author

hsivonen commented May 8, 2025

And the tests here are, of course, failing because the new versions of the dependencies haven't been published yet.

Also, once idna_adapter 1.2.1 is published, the rust-url CI Rust version with default idna_adapter needs to be raised to 1.82.

Collaborator Author

hsivonen commented May 8, 2025

And, of course, ICU4X 1.x doesn't work with Unicode 16.0 test data, so that can't be tested.

@hsivonen hsivonen marked this pull request as draft May 8, 2025 08:00
Collaborator Author

hsivonen commented May 8, 2025

And, indeed, there are enough changes that the old test suite does not pass with Unicode 16.0 implementation internals.

@hsivonen hsivonen marked this pull request as ready for review May 8, 2025 08:29
Collaborator Author

hsivonen commented May 8, 2025

Timings building reqwest trunk on M3 Pro:

| Back end | Debug | Release |
|---|---|---|
| ICU4X 2.0 | 6.0 s | 7.7 s |
| ICU4X 1.5 | 6.2 s | 8.1 s |
| unicode-rs | 5.2 s | 7.1 s |
| no-unicode | 5.0 s | 6.8 s |

Collaborator Author

hsivonen commented May 8, 2025

Updating from ICU4X 1.5 (Unicode 15.1) to 2.0 (Unicode 16.0) increases the Brotli-compressed wasm-opt-optimized wasm footprint of rust-url by 2273 bytes.

Collaborator Author

hsivonen commented May 8, 2025

ICU4X 1.5:

test to_ascii_already_puny_label ... bench:         114 ns/iter (+/- 1)
test to_ascii_cow_hyphen         ... bench:          30 ns/iter (+/- 1)
test to_ascii_cow_leading_digit  ... bench:          57 ns/iter (+/- 0)
test to_ascii_cow_plain          ... bench:          11 ns/iter (+/- 0)
test to_ascii_cow_punycode_ltr   ... bench:         253 ns/iter (+/- 7)
test to_ascii_cow_punycode_mixed ... bench:         155 ns/iter (+/- 6)
test to_ascii_cow_punycode_rtl   ... bench:         242 ns/iter (+/- 4)
test to_ascii_cow_unicode_ltr    ... bench:         293 ns/iter (+/- 14)
test to_ascii_cow_unicode_mixed  ... bench:         208 ns/iter (+/- 9)
test to_ascii_cow_unicode_rtl    ... bench:         283 ns/iter (+/- 3)
test to_ascii_merged             ... bench:         227 ns/iter (+/- 4)
test to_ascii_puny_label         ... bench:         133 ns/iter (+/- 1)
test to_ascii_simple             ... bench:          28 ns/iter (+/- 1)
test to_unicode_ascii            ... bench:          26 ns/iter (+/- 0)
test to_unicode_merged_label     ... bench:         257 ns/iter (+/- 1)
test to_unicode_puny_label       ... bench:         120 ns/iter (+/- 1)

ICU4X 2.0:

test to_ascii_already_puny_label ... bench:         104 ns/iter (+/- 3)
test to_ascii_cow_hyphen         ... bench:          26 ns/iter (+/- 0)
test to_ascii_cow_leading_digit  ... bench:          52 ns/iter (+/- 1)
test to_ascii_cow_plain          ... bench:           8 ns/iter (+/- 0)
test to_ascii_cow_punycode_ltr   ... bench:         223 ns/iter (+/- 12)
test to_ascii_cow_punycode_mixed ... bench:         141 ns/iter (+/- 1)
test to_ascii_cow_punycode_rtl   ... bench:         225 ns/iter (+/- 13)
test to_ascii_cow_unicode_ltr    ... bench:         261 ns/iter (+/- 3)
test to_ascii_cow_unicode_mixed  ... bench:         197 ns/iter (+/- 4)
test to_ascii_cow_unicode_rtl    ... bench:         261 ns/iter (+/- 4)
test to_ascii_merged             ... bench:         199 ns/iter (+/- 4)
test to_ascii_puny_label         ... bench:         122 ns/iter (+/- 0)
test to_ascii_simple             ... bench:          23 ns/iter (+/- 0)
test to_unicode_ascii            ... bench:          22 ns/iter (+/- 0)
test to_unicode_merged_label     ... bench:         202 ns/iter (+/- 2)
test to_unicode_puny_label       ... bench:         112 ns/iter (+/- 1)
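
The two listings are easier to compare as per-benchmark percentage changes. The sketch below copies the ns/iter figures from the listings above and computes the relative change from ICU4X 1.5 to ICU4X 2.0. The names `RESULTS` and `percent_change` are illustrative and not part of the benchmark harness, and the +/- noise figures are ignored, so small differences should be read with caution.

```rust
/// (benchmark name, ns/iter with ICU4X 1.5, ns/iter with ICU4X 2.0),
/// copied from the two listings above.
const RESULTS: [(&str, u32, u32); 16] = [
    ("to_ascii_already_puny_label", 114, 104),
    ("to_ascii_cow_hyphen", 30, 26),
    ("to_ascii_cow_leading_digit", 57, 52),
    ("to_ascii_cow_plain", 11, 8),
    ("to_ascii_cow_punycode_ltr", 253, 223),
    ("to_ascii_cow_punycode_mixed", 155, 141),
    ("to_ascii_cow_punycode_rtl", 242, 225),
    ("to_ascii_cow_unicode_ltr", 293, 261),
    ("to_ascii_cow_unicode_mixed", 208, 197),
    ("to_ascii_cow_unicode_rtl", 283, 261),
    ("to_ascii_merged", 227, 199),
    ("to_ascii_puny_label", 133, 122),
    ("to_ascii_simple", 28, 23),
    ("to_unicode_ascii", 26, 22),
    ("to_unicode_merged_label", 257, 202),
    ("to_unicode_puny_label", 120, 112),
];

/// Percentage change going from `old` to `new`; negative means faster.
fn percent_change(old: u32, new: u32) -> f64 {
    (f64::from(new) - f64::from(old)) / f64::from(old) * 100.0
}

fn main() {
    for &(name, old, new) in RESULTS.iter() {
        let change = percent_change(old, new);
        println!("{name:<28} {old:>4} -> {new:>4} ns/iter ({change:+.1}%)");
    }
}
```

In this run every benchmark reports a lower ns/iter figure with ICU4X 2.0 than with ICU4X 1.5.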

codecov bot commented May 8, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (main@7cff874). Learn more about missing BASE report.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1045   +/-   ##
=======================================
  Coverage        ?   80.11%           
=======================================
  Files           ?       24           
  Lines           ?     4355           
  Branches        ?        0           
=======================================
  Hits            ?     3489           
  Misses          ?      866           
  Partials        ?        0           


@hsivonen hsivonen added this pull request to the merge queue May 8, 2025
Merged via the queue into servo:main with commit 68f151c May 8, 2025
22 of 33 checks passed
@hsivonen hsivonen deleted the unicode16 branch May 8, 2025 15:57
@hsivonen hsivonen mentioned this pull request May 9, 2025
kodiakhq bot pushed a commit to pdylanross/fatigue that referenced this pull request Aug 21, 2025
Bumps url from 2.5.4 to 2.5.5.

Release notes
Sourced from url's releases.

v2.5.5
What's Changed

ci: downgrade crates when building for Rust 1.67.0 by @​mxinden in servo/rust-url#1003
ci: run unit tests with sanitizers by @​mxinden in servo/rust-url#1002
fix small typo by @​hkBst in servo/rust-url#1011
chore: fix clippy errors on main by @​dsherret in servo/rust-url#1019
perf: remove heap allocation in parse_query by @​dsherret in servo/rust-url#1020
perf: slightly improve parsing a port by @​dsherret in servo/rust-url#1022
perf: improve to_file_path() by @​dsherret in servo/rust-url#1018
perf: make parse_scheme slightly faster by @​dsherret in servo/rust-url#1025
update LICENSE-MIT by @​wmjae in servo/rust-url#1029
perf: url encode path segments in longer string slices by @​dsherret in servo/rust-url#1026
Disable the default features on serde by @​rilipco in servo/rust-url#1033
docs: base url relative join by @​tisonkun in servo/rust-url#1013
perf: remove heap allocation in parse_host by @​dsherret in servo/rust-url#1021
Update tests to Unicode 16.0 by @​hsivonen in servo/rust-url#1045
Add some some basic functions to Mime by @​mrobinson in servo/rust-url#1047
ran cargo clippy --fix -- -Wclippy::use_self by @​mrobinson in servo/rust-url#1048
Fix MSRV and clippy CI by @​Manishearth in servo/rust-url#1058
Update Url::domain docs to show that it includes subdomain by @​supercoolspy in servo/rust-url#1057
set_hostname should error when encountering colon ':' by @​edgul in servo/rust-url#1060
version bump to 2.5.5 by @​edgul in servo/rust-url#1061

New Contributors

@​mxinden made their first contribution in servo/rust-url#1003
@​hkBst made their first contribution in servo/rust-url#1011
@​wmjae made their first contribution in servo/rust-url#1029
@​rilipco made their first contribution in servo/rust-url#1033
@​tisonkun made their first contribution in servo/rust-url#1013
@​supercoolspy made their first contribution in servo/rust-url#1057

Full Changelog: servo/[email protected]



Commits

a40f904 version bump to 2.5.5 (#1061)
cf305db set_hostname should error when encountering colon ':' (#1060)
88826bd Update Url::domain docs to show that it includes subdomain (#1057)
c3bbf66 Fix MSRV and clippy CI (#1058)
dbd5261 ran cargo clippy --fix -- -Wclippy::use_self (#1048)
9f6e92e Add some some basic functions to Mime (#1047)
68f151c Update tests to Unicode 16.0 (#1045)
7cff874 perf: remove heap allocation in parse_host (#1021)
968e862 docs: base url relative join (#1013)
2ce2e12 Disable the default features on serde. (#1033)
Additional commits viewable in compare view



