Commit 4f087ac

A type representing an owned C-compatible wide string
1 parent 5b785bc commit 4f087ac

1 file changed: +72 -0 lines changed

text/0000-cwstring.md

Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
- Feature Name: A type representing an owned C-compatible wide string
- Start Date: 2016-10-20
- RFC PR:
- Rust Issue:

# Summary

Add `CWideString`/`CWideStr` to simplify interaction with external APIs that use
potentially ill-formed UTF-16 (for example, the Windows API).

# Motivation

This RFC was born from the issue [rust-lang/rust#36671](https://github.com/rust-lang/rust/issues/36671).

Many Windows API functions use UTF-16 strings that are not guaranteed to be well-formed.
Some of these functions use null-terminated strings.
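
To make "not well-formed UTF-16" concrete, here is a minimal sketch using only the standard library. The helper names `is_well_formed_utf16` and `lossy_round_trip` are hypothetical and exist only for this illustration; they are not part of the proposal.

```rust
use std::char::decode_utf16;

/// Hypothetical helper: returns true if `units` is well-formed UTF-16,
/// i.e. it contains no unpaired surrogates.
fn is_well_formed_utf16(units: &[u16]) -> bool {
    decode_utf16(units.iter().copied()).all(|r| r.is_ok())
}

/// Hypothetical helper: decodes lossily into a `String` and re-encodes.
/// Unpaired surrogates are replaced by U+FFFD, so ill-formed input
/// does not survive this round trip unchanged.
fn lossy_round_trip(units: &[u16]) -> Vec<u16> {
    String::from_utf16_lossy(units).encode_utf16().collect()
}
```

For the input `[0x0068, 0x0069, 0xD800]` ("hi" followed by an unpaired high surrogate), `String::from_utf16` returns an error and the lossy round trip yields `[0x0068, 0x0069, 0xFFFD]`. A plain `String` therefore cannot carry such data faithfully.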

Rust lacks simple conversions for null-terminated UTF-16 strings, so working with such
APIs currently requires a lot of copy-pasted boilerplate code like:
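```
use std::ffi::OsString;
use std::os::windows::ffi::OsStringExt; // Windows-only extension trait
use std::slice;

// Reads a null-terminated UTF-16 string.
// The caller must ensure `ptr` points to valid, null-terminated data.
unsafe fn from_wide_null(ptr: *const u16) -> OsString {
    let mut len = 0;
    while *ptr.offset(len) != 0 {
        len += 1;
    }
    OsStringExt::from_wide(slice::from_raw_parts(ptr, len as usize))
}

// Encodes `s` as UTF-16 and appends the terminating null.
fn to_wide_null(s: &str) -> Vec<u16> {
    s.encode_utf16().chain(Some(0)).collect()
}
```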

`OsString` also cannot be used with the Windows API directly, because it stores its contents
internally in WTF-8 encoding.

So this RFC proposes a simple and efficient way to work with UTF-16 strings.

It can also be useful:

* Inside the Rust `OsString` implementation on the Windows platform;
* With Java FFI (Java uses UTF-16 strings internally).

# Detailed design

Copy `CStr`/`CString` as `CWideStr`/`CWideString`, using `u16` instead of `u8`.

A preliminary implementation can be found at: https://github.com/bozaro/rust-cwstring
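
As a rough illustration of what "`CString` with `u16`" could look like, here is a minimal sketch of the owned type only. It is an assumption-laden outline, not the preliminary implementation linked above: the method names `from_str`, `as_units` and `as_units_with_nul` and the layout of `WideNulError` are hypothetical, and the borrowed `CWideStr` counterpart is omitted.

```rust
/// Hypothetical error type (named `WideNulError` after the RFC text):
/// holds the position of the first interior null code unit.
#[derive(Debug, PartialEq)]
pub struct WideNulError(pub usize);

/// Sketch of an owned, null-terminated buffer of `u16` code units.
/// The contents are NOT validated as UTF-16.
pub struct CWideString {
    inner: Vec<u16>, // invariant: exactly one 0, in the last position
}

impl CWideString {
    /// Takes ownership of `units`, rejects interior nulls,
    /// and appends the terminating null.
    pub fn new<T: Into<Vec<u16>>>(units: T) -> Result<CWideString, WideNulError> {
        let mut v = units.into();
        if let Some(pos) = v.iter().position(|&u| u == 0) {
            return Err(WideNulError(pos));
        }
        v.push(0);
        Ok(CWideString { inner: v })
    }

    /// Hypothetical convenience constructor from a Rust string slice.
    pub fn from_str(s: &str) -> Result<CWideString, WideNulError> {
        CWideString::new(s.encode_utf16().collect::<Vec<u16>>())
    }

    /// Code units without the trailing null (hypothetical name).
    pub fn as_units(&self) -> &[u16] {
        &self.inner[..self.inner.len() - 1]
    }

    /// Code units including the trailing null (hypothetical name).
    pub fn as_units_with_nul(&self) -> &[u16] {
        &self.inner
    }

    /// Pointer to the null-terminated data, valid while `self` is alive.
    pub fn as_ptr(&self) -> *const u16 {
        self.inner.as_ptr()
    }
}
```

The design mirrors `CString::new`: the interior-null check happens once at construction, so `as_ptr` can hand the buffer to a C API without further validation.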

# Drawbacks

These types are not platform-specific in general, but they are mostly useful on Windows.

The name `CWideString` can also be confusing, because its element size can differ from that of
`std::wstring` in the C++ world (`std::wstring` is defined as `std::basic_string<wchar_t>`,
but `sizeof(wchar_t)` is platform-dependent: 4 bytes on Linux, 2 bytes on Windows).

# Alternatives

Keep everything as is.

# Unresolved questions

While implementing `CWideStr`/`CWideString` I ran into some issues and questions:

* `NulError` is copied as `WideNulError`, but this conflicts with current code or naming conventions.
* The `into_string` method is removed because, unlike for `CString`, it gives no performance
  benefit. `IntoStringError` is also not copied from `c_str`.
* I have not found a good name for the `u16` counterpart of methods like `to_bytes`.
* `memchr` and `strlen` are replaced by fallback implementations of `wmemchr` and `wstrlen`.
* A better implementation of `fmt::Debug` for `CWideStr` may be needed.
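
For the `wmemchr`/`wstrlen` point above, fallback implementations could be as simple as the following sketch. The signatures are assumptions chosen for this illustration and may differ from the preliminary implementation.

```rust
/// Fallback for `wmemchr`: index of the first occurrence of `needle`
/// in `haystack`, or `None` if it does not occur.
fn wmemchr(needle: u16, haystack: &[u16]) -> Option<usize> {
    haystack.iter().position(|&u| u == needle)
}

/// Fallback for `wstrlen`: number of code units before the terminating null.
///
/// Safety: `ptr` must be non-null and point to a sequence of `u16`
/// that is terminated by a 0 code unit.
unsafe fn wstrlen(ptr: *const u16) -> usize {
    let mut len = 0;
    while unsafe { *ptr.add(len) } != 0 {
        len += 1;
    }
    len
}
```

Both are plain linear scans, so they give up the optimizations that the platform `memchr` and `strlen` typically provide for byte strings.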
