Add new iterators: core::str::{chars_uppercase,chars_lowercase}
#58
Comments
Essentially, this is useful because the naïve solution here -- something like We handle this in It also adds an iterator over the uppercased chars. This is probably mostly for symmetry, since currently, there is no situation where this differs from |
Hi! I stumbled across this issue and I think a crate I maintain is relevant here: https://crates.io/crates/roe. Roe is a no_std crate that does case mapping for ASCII/full/Turkic modes to upper/lower/title case. I find it hard to grasp what std does in its string lowercase operation. It mentions the Unicode lowercase property and maybe some special cases for Greek. Case mapping and case folding are Unicode transforms that the "to lowercase" operation in std differs from. I'd love for case mapping to be better supported in core, but would prefer that it be via a Unicode Case Mapping API.
I think this assumes "full" case mapping mode (or something like it). Turkic mapping mode would map |
This issue is about adding new iterators that keep existing behavior from You seem interested in locale-specific mapping; that requires designing new APIs. This is the right repo for that discussion: you can create an issue with a problem description and API suggestions.
Yeah, we shouldn't add any locale-dependent mappings. (But this is a good argument that yes, it would be completely legal for Unicode to add a locale-independent uppercase mapping to SpecialCasing.txt in some future version). |
I realize this is quite old, but it doesn't appear to be implemented yet. What about adding |
I'm planning on getting to this soon (I already have a patch, I'll finish it up this weekend). I think the approach that got accepted in the ACP is fine but don't feel that strongly. |
I just worry about the ballooning number of |
Eh, I've seen a few people use |
That's a good point, but I think we can just add a clippy lint for that pattern and clearly document A clippy lint would be valuable regardless of approach, actually. |
Closing this as this was already seconded. |
Proposal
Problem statement
There is no way to iterate over lowercase/uppercase chars of a &str without creating a full copy of it with str::to_lowercase or str::to_uppercase.
But there exists a bad way: str::chars with char::to_lowercase. The problem is that the latter API cannot match the output of str::to_lowercase or str::to_uppercase, as some Unicode conversions require context and that API cannot provide it.
Motivation, use-cases
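To make the mismatch concrete, here is a small sketch using the Greek capital sigma, whose lowercase form depends on whether it ends a word. The helper name lowercase_per_char is made up for this illustration; everything else is existing std API.

```rust
/// Hypothetical helper: lowercases `s` one `char` at a time,
/// so each character is converted without seeing its neighbours.
fn lowercase_per_char(s: &str) -> String {
    s.chars().flat_map(char::to_lowercase).collect()
}

fn main() {
    let word = "ὈΔΥΣΣΕΎΣ";

    // `str::to_lowercase` sees the whole string and emits a final sigma.
    let whole = word.to_lowercase();
    // The per-char route has no context and always emits a medial sigma.
    let per_char = lowercase_per_char(word);

    assert!(whole.ends_with('ς'));
    assert!(per_char.ends_with('σ'));
    assert_ne!(whole, per_char);

    println!("{whole} vs {per_char}");
}
```

The two results differ only in the last character, which is exactly the kind of context-dependent conversion that char::to_lowercase cannot express.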
Reason for avoiding call to str::to_lowercase() and str::to_uppercase: no_std
Reasons to avoid char::to_uppercase and char::to_lowercase:
Solution sketches
Solution A:
str::chars_lowercase() -> CharsLowercase
str::chars_uppercase() -> CharsUppercase
Solution B:
char::to_lowercase_with_context(&str, usize) -> ToLowercase
char::to_uppercase_with_context(&str, usize) -> ToUppercase
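As a rough illustration of the shape Solution A could take, here is a user-land sketch of a lazy lowercase iterator that needs no allocation. It is not the implementation from the linked PR: the names LowercaseChars and lowercase_chars are hypothetical, and the final-sigma check is a deliberate simplification that only looks at the immediately adjacent characters and uses char::is_alphabetic in place of the Unicode Cased and Case_Ignorable properties.

```rust
use std::char::ToLowercase;
use std::str::CharIndices;

/// Hypothetical stand-in for the proposed `CharsLowercase` iterator.
struct LowercaseChars<'a> {
    source: &'a str,
    indices: CharIndices<'a>,
    /// Remaining output of the current char's (possibly multi-char) mapping.
    pending: Option<ToLowercase>,
}

/// Hypothetical free-function version of `str::chars_lowercase`.
fn lowercase_chars(source: &str) -> LowercaseChars<'_> {
    LowercaseChars { source, indices: source.char_indices(), pending: None }
}

impl Iterator for LowercaseChars<'_> {
    type Item = char;

    fn next(&mut self) -> Option<char> {
        loop {
            // Drain whatever is left of the previous character's mapping.
            if let Some(c) = self.pending.as_mut().and_then(|it| it.next()) {
                return Some(c);
            }
            let (i, c) = self.indices.next()?;
            if c == 'Σ' {
                // Simplified assumption: sigma is word-final when a letter
                // precedes it and no letter follows it directly.
                let before = self.source[..i].chars().next_back();
                let after = self.source[i + c.len_utf8()..].chars().next();
                let is_final = before.map_or(false, |ch| ch.is_alphabetic())
                    && !after.map_or(false, |ch| ch.is_alphabetic());
                self.pending = None;
                return Some(if is_final { 'ς' } else { 'σ' });
            }
            // Every other character uses the context-free mapping.
            self.pending = Some(c.to_lowercase());
        }
    }
}

fn main() {
    let word = "ὈΔΥΣΣΕΎΣ";
    let lazy: String = lowercase_chars(word).collect();
    assert_eq!(lazy, word.to_lowercase());
}
```

Because the iterator borrows the source string, it can look backwards and forwards around each character, which is the context that a bare char::to_lowercase call lacks. A real implementation would need the full Final_Sigma rule rather than the adjacent-letter shortcut used here.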
Links and related work
PR for solution A: rust-lang/rust#98490