@@ -92,9 +92,33 @@ fn foo(s: String) {
9292```
9393
9494If you have good reason. It's not polite to hold on to ownership you don't
95- need, and it can make your lifetimes more complex. Furthermore, you can pass
96- either kind of string into ` foo ` by using ` .as_slice() ` on any ` String ` you
97- need to pass in, so the ` &str ` version is more flexible.
95+ need, and it can make your lifetimes more complex.
96+
97+ ## Generic functions
98+
99+ To write a function that's generic over types of strings, use [the `Str`
100+ trait](http://doc.rust-lang.org/std/str/trait.Str.html):
101+
102+ ``` {rust}
103+ fn some_string_length<T: Str>(x: T) -> uint {
104+ x.as_slice().len()
105+ }
106+
107+ fn main() {
108+ let s = "Hello, world";
109+
110+ println!("{}", some_string_length(s));
111+
112+ let s = "Hello, world".to_string();
113+
114+ println!("{}", some_string_length(s));
115+ }
116+ ```
117+
118+ Both of these lines will print `12`.
119+
120+ The only method that the `Str` trait has is `as_slice()`, which gives you
121+ access to a ` &str ` value from the underlying string.
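
For readers on a current Rust toolchain, where `Str`, `uint`, and `as_slice()` are no longer available, roughly the same idea can be sketched with the standard `AsRef<str>` trait. This is a sketch that assumes a newer compiler than the one this change targets. It is not part of the commit itself.

```rust
// Sketch for current Rust: `AsRef<str>` plays the role that `Str` plays above.
fn some_string_length<T: AsRef<str>>(x: T) -> usize {
    // `as_ref()` borrows a `&str` from either a `&str` or a `String`.
    x.as_ref().len()
}

fn main() {
    let s = "Hello, world";
    println!("{}", some_string_length(s));

    let s = "Hello, world".to_string();
    println!("{}", some_string_length(s));
}
```

As in the original example, `len()` counts bytes, so both calls report the same value for this ASCII-only string.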
98122
99123## Comparisons
100124
@@ -121,6 +145,65 @@ fn compare(string: String) {
121145Converting a ` String ` to a ` &str ` is cheap, but converting the ` &str ` to a
122146` String ` involves an allocation.
123147
148+ ## Indexing strings
149+
150+ You may be tempted to try to access a certain character of a `String`, like
151+ this:
152+
153+ ``` {rust,ignore}
154+ let s = "hello".to_string();
155+
156+ println!("{}", s[0]);
157+ ```
158+
159+ This does not compile, and that is on purpose. In the world of UTF-8, direct
160+ indexing is basically never what you want to do. The reason is that each
161+ character can take a variable number of bytes, so finding the nth character
162+ means iterating through the characters before it, which is an O(n) operation.
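
To make the variable-width point concrete, here is a minimal sketch for a current Rust toolchain (an assumption, since this change targets an older compiler). The helper names `nth_char` and `byte_widths` are hypothetical and exist only for illustration. Note that `chars()` walks Unicode scalar values, which are not always the same thing as user-visible characters.

```rust
// Sketch for current Rust. `nth_char` and `byte_widths` are hypothetical helpers.

/// Returns the nth character by walking the string from the start: O(n).
fn nth_char(s: &str, n: usize) -> Option<char> {
    s.chars().nth(n)
}

/// Returns how many UTF-8 bytes each character occupies.
fn byte_widths(s: &str) -> Vec<usize> {
    s.chars().map(|c| c.len_utf8()).collect()
}

fn main() {
    let s = "αἰθήρ";

    // There is no `s[1]`; the second character has to be found by iteration.
    println!("{:?}", nth_char(s, 1));

    // Per-character byte widths, then total bytes versus character count.
    println!("{:?}", byte_widths(s));
    println!("{} bytes, {} chars", s.len(), s.chars().count());
}
```

Because the widths differ from character to character, a byte offset cannot be computed from a character position without scanning, which is why the lookup is linear.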
163+
164+ To iterate over a string, use the ` graphemes() ` method on ` &str ` :
165+
166+ ``` {rust}
167+ let s = "αἰθήρ";
168+
169+ for l in s.graphemes(true) {
170+ println!("{}", l);
171+ }
172+ ```
173+
174+ Note that ` l ` has the type ` &str ` here, since a single grapheme can consist of
175+ multiple codepoints, so a ` char ` wouldn't be appropriate.
176+
177+ This will print out each character in turn, as you'd expect: first "α", then
178+ "ἰ", etc. You can see that this is different from just the individual bytes.
179+ Here's a version that prints out each byte:
180+
181+ ``` {rust}
182+ let s = "αἰθήρ";
183+
184+ for l in s.bytes() {
185+ println!("{}", l);
186+ }
187+ ```
188+
189+ This will print:
190+
191+ ``` {notrust,ignore}
192+ 206
193+ 177
194+ 225
195+ 188
196+ 176
197+ 206
198+ 184
199+ 206
200+ 174
201+ 207
202+ 129
203+ ```
204+
205+ Many more bytes than graphemes!
206+
124207# Other Documentation
125208
126209* [the `&str` API documentation](/std/str/index.html)