Commit 6ceb9b4

auto merge of #16824 : steveklabnik/rust/string_guide_improvements, r=alexcrichton
A few steps toward #15994
2 parents a1f4973 + 8ddb9c7 commit 6ceb9b4

1 file changed: +86 -3 lines changed
src/doc/guide-strings.md

Lines changed: 86 additions & 3 deletions
@@ -92,9 +92,33 @@ fn foo(s: String) {
 ```

 If you have good reason. It's not polite to hold on to ownership you don't
-need, and it can make your lifetimes more complex. Furthermore, you can pass
-either kind of string into `foo` by using `.as_slice()` on any `String` you
-need to pass in, so the `&str` version is more flexible.
+need, and it can make your lifetimes more complex.
+
+## Generic functions
+
+To write a function that's generic over types of strings, use [the `Str`
+trait](http://doc.rust-lang.org/std/str/trait.Str.html):
+
+```{rust}
+fn some_string_length<T: Str>(x: T) -> uint {
+    x.as_slice().len()
+}
+
+fn main() {
+    let s = "Hello, world";
+
+    println!("{}", some_string_length(s));
+
+    let s = "Hello, world".to_string();
+
+    println!("{}", some_string_length(s));
+}
+```
+
+Both of these lines will print `12`.
+
+The only method that the `Str` trait has is `as_slice()`, which gives you
+access to a `&str` value from the underlying string.

 ## Comparisons

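Note on the "Generic functions" hunk above: the example is written against the pre-1.0 standard library (`Str`, `as_slice()`, `uint`). As a rough sketch of the same idea on a current toolchain, assuming `AsRef<str>` as the stand-in for the `Str` bound (this substitution is an assumption of this note, not part of the commit), the function could look like this:

```rust
// Sketch only: a modern-Rust analogue of the `Str`-bounded function in the diff.
// `AsRef<str>` is implemented for `str`, `String`, and references to them.
fn some_string_length<T: AsRef<str>>(x: T) -> usize {
    // `as_ref()` plays the role that `as_slice()` plays in the diff:
    // it borrows a `&str` view of whatever string type was passed in.
    x.as_ref().len()
}

fn main() {
    let s = "Hello, world";
    println!("{}", some_string_length(s));

    let s = "Hello, world".to_string();
    println!("{}", some_string_length(s));
}
```

As in the guide text, the length is a byte count, so both calls report the same value for the same contents regardless of whether the argument is a `&str` or a `String`.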
@@ -121,6 +145,65 @@ fn compare(string: String) {
 Converting a `String` to a `&str` is cheap, but converting the `&str` to a
 `String` involves an allocation.

+## Indexing strings
+
+You may be tempted to try to access a certain character of a `String`, like
+this:
+
+```{rust,ignore}
+let s = "hello".to_string();
+
+println!("{}", s[0]);
+```
+
+This does not compile. This is on purpose. In the world of UTF-8, direct
+indexing is basically never what you want to do. The reason is that each
+character can be a variable number of bytes. This means that you have to iterate
+through the characters anyway, which is an O(n) operation.
+
+To iterate over a string, use the `graphemes()` method on `&str`:
+
+```{rust}
+let s = "αἰθήρ";
+
+for l in s.graphemes(true) {
+    println!("{}", l);
+}
+```
+
+Note that `l` has the type `&str` here, since a single grapheme can consist of
+multiple codepoints, so a `char` wouldn't be appropriate.
+
+This will print out each character in turn, as you'd expect: first "α", then
+"ἰ", etc. You can see that this is different than just the individual bytes.
+Here's a version that prints out each byte:
+
+```{rust}
+let s = "αἰθήρ";
+
+for l in s.bytes() {
+    println!("{}", l);
+}
+```
+
+This will print:
+
+```{notrust,ignore}
+206
+177
+225
+188
+176
+206
+184
+206
+174
+207
+129
+```
+
+Many more bytes than graphemes!
+
 # Other Documentation

 * [the `&str` API documentation](/std/str/index.html)

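Note on the "Indexing strings" hunk above: in current Rust, `graphemes()` is no longer part of the standard library (it is provided by the external `unicode-segmentation` crate). The following std-only sketch illustrates the same point using `chars()`, which yields Unicode scalar values rather than grapheme clusters. For this particular word the two coincide, because every letter is a single precomposed code point. The helper names `nth_char` and `byte_and_char_counts` are hypothetical and exist only for this illustration.

```rust
// Sketch only, std-only, current stable Rust. The word from the diff is
// spelled with escapes so the exact code points are unambiguous.
const WORD: &str = "\u{3b1}\u{1f30}\u{3b8}\u{3ae}\u{3c1}";

/// Returns the n-th `char` by walking from the start of the string.
/// This is the O(n) scan that the guide says indexing would have to hide.
fn nth_char(s: &str, n: usize) -> Option<char> {
    s.chars().nth(n)
}

/// Returns (number of bytes, number of `char`s) for the string.
fn byte_and_char_counts(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // `WORD[1]` does not compile; asking for a char explicitly does.
    println!("{:?}", nth_char(WORD, 1));

    // Each char, the byte offset where it starts, and its encoded width.
    for (offset, c) in WORD.char_indices() {
        println!("{} starts at byte {} ({} byte(s))", c, offset, c.len_utf8());
    }

    let (bytes, chars) = byte_and_char_counts(WORD);
    println!("{} bytes, {} chars", bytes, chars);
}
```

The per-character widths vary (two or three bytes each here), which is why a byte offset cannot be turned into a character position without scanning.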