Commit b8e80e1

committed
"improved" it
1 parent 6d76008 commit b8e80e1

1 file changed

+52
-0
lines changed

user-guide/type-mapping/characters.qmd

@@ -34,6 +34,34 @@ let bytes = bytes.as_bytes().to_owned();
 bytes
 ```

+
+Let us investigate the addresses of these two identical snippets of data:
+
+```{extendrsrc}
+#[extendr]
+fn hello_world() -> &'static str {
+    let hello_world = "Hello World!";
+    rprintln!("Address of the Rust `hello_world`: {:p}", hello_world.as_ptr());
+    hello_world
+}
+```
+
+
+```{r}
+hello_world()
+```
+
+And the address of `hello_world`, once it is part of the R runtime:
+
+```{r}
+.Internal(inspect(hello_world()))
+```
+
+::::: {.callout-note}
+The return type of `hello_world` need not be `&'static str`. The lifetime can be made
+arbitrary, such as `fn hello_world<'a>() -> &'a str`.
+:::
+
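The lifetime remark in the callout can be checked in plain Rust, outside of extendr. The following is a minimal sketch that assumes only the standard library; the names `hello_static` and `hello_generic` are hypothetical and the `#[extendr]` attribute is deliberately left out, so it says nothing about how the macro itself treats generic lifetimes.

```rust
// Hypothetical helper names; plain Rust, no `#[extendr]` attribute involved.

/// Returns a string literal with an explicit `'static` lifetime.
fn hello_static() -> &'static str {
    "Hello World!"
}

/// Returns the same kind of literal with a caller-chosen lifetime `'a`.
/// This compiles because a `&'static str` can be used wherever a
/// shorter-lived `&'a str` is expected.
fn hello_generic<'a>() -> &'a str {
    "Hello World!"
}

fn main() {
    let a = hello_static();
    let b = hello_generic();
    // Print each value together with the address of its first byte.
    println!("{a} at {:p}", a.as_ptr());
    println!("{b} at {:p}", b.as_ptr());
    // The contents are equal regardless of which signature was used.
    assert_eq!(a, b);
}
```

Whether the two printed addresses coincide depends on how the compiler lays out string literals, so the sketch only asserts that the contents are equal.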
 A `character`-vector in R could be compared to a `Vec<String>` in Rust. However, there is an important distinction that we'll illustrate with an example.

 ```{extendr}
@@ -82,10 +110,34 @@ Thus, `[&str]` and `character` behave similarly. Let's investigate `&[String]`:
     .collect::<Vec<_>>()
 ```

+<!-- @co-authors: the snippet below is an alternative to the above snippet -->
+
+```{extendr, echo=FALSE, eval=FALSE}
+let sample_states = [
+    "Texas",
+    "Maine",
+    "Maine",
+    "Idaho",
+    "Maine",
+    "Maine",
+];
+let mut state_ptrs = Vec::with_capacity(sample_states.len());
+let mut state_strings = Vec::with_capacity(sample_states.len());
+for state in sample_states {
+    let mut x_string = String::with_capacity(5);
+    x_string.push_str(state);
+    state_ptrs.push(format!("{:p}", x_string.as_ptr()));
+    state_strings.push(x_string);
+}
+state_ptrs
+```
+
 The memory addresses of all the items are different, even for those entries that have the same value.

 Thus, R's `character` more closely resembles `[&str]` than a container of `String`.
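The contrast between borrowed and owned strings can be reproduced in plain Rust. This is a sketch under stated assumptions rather than the guide's own code: the helper `addresses` is a hypothetical name, and it records each element's data pointer as a `usize` so the values can be compared.

```rust
/// Hypothetical helper: records the address of each element's string bytes.
fn addresses<S: AsRef<str>>(items: &[S]) -> Vec<usize> {
    items
        .iter()
        .map(|s| {
            let text: &str = s.as_ref();
            text.as_ptr() as usize
        })
        .collect()
}

fn main() {
    let maine: &str = "Maine";

    // Copying a `&str` copies only the pointer and length, so both
    // elements refer to the very same bytes.
    let borrowed = [maine, maine];
    let borrowed_addrs = addresses(&borrowed);
    assert_eq!(borrowed_addrs[0], borrowed_addrs[1]);

    // Each `String` owns a separate heap buffer, so equal contents
    // still live at different addresses while both are alive.
    let owned = [maine.to_string(), maine.to_string()];
    let owned_addrs = addresses(&owned);
    assert_ne!(owned_addrs[0], owned_addrs[1]);

    println!("borrowed: {borrowed_addrs:x?}");
    println!("owned:    {owned_addrs:x?}");
}
```

The borrowed case reuses one binding on purpose: two separately written `"Maine"` literals may or may not share an address, whereas copies of the same `&str` always do.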

+<!-- TODO: mention that direct indexing in UTF-8 is difficult... -->
+
 The R runtime performs [string interning](https://en.wikipedia.org/wiki/String_interning) on
 all of its string elements. This means that whenever R encounters a new string,
 it adds it to its internal string intern pool. Therefore, it is unsound to
