forked from Tazinho/Advanced-R-Solutions
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path05-Environments.Rmd
397 lines (331 loc) · 17.7 KB
/
05-Environments.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
# Environments
## Environment basics
1. __<span style="color:red">Q</span>__: List three ways in which an environment differs from a list.
__<span style="color:green">A</span>__: The most important differences are:
* environments have reference semantics
* environments have parents
* environments are not ordered
* elements of environments need to be (uniquely) named
2. __<span style="color:red">Q</span>__: If you don't supply an explicit environment, where do `ls()` and `rm()`
look? Where does `<-` make bindings? The
__<span style="color:green">A</span>__: `ls()` and `rm` look in their calling environments which they find by `as.environment(-1)`.
From the book:
> Assignment is the act of binding (or rebinding) a name to a value in an environment.
From `` ?`<-` ``:
> The operators `<-` and `=` assign into the environment in which they are evaluated. The operator `<-` can be used anywhere, whereas the operator `=` is only allowed at the top level (e.g., in the complete expression typed at the command prompt) or as one of the subexpressions in a braced list of expressions.
3. __<span style="color:red">Q</span>__: Using `parent.env()` and a loop (or a recursive function), verify that the
ancestors of `globalenv()` include `baseenv()` and `emptyenv()`. Use the
same basic idea to implement your own version of `search()`.
__<span style="color:green">A</span>__: We can print the ancestors for example by using a recursive function:
```{r, eval = TRUE}
ancestors <- function(env = globalenv()){
if (identical(env, emptyenv())) {
print(environmentName(env))}
else {
print(environmentName(env))
ancestors(parent.env(env))
}
}
ancestors()
```
To implement a new version of `search()` we use a while statement:
```{r, eval = TRUE}
search2 <- function(env = globalenv()){
envs <- character()
while (!identical(env, emptyenv())) {
ename <- environmentName(env)
if (ename == "base") ename <- "package:base"
if (ename == "R_GlobalEnv") ename <- ".GlobalEnv"
envs <- c(envs, ename)
env <- parent.env(env)
}
return(envs)
}
search2()
# visual check that results are identical to the original search() function
search()
```
## Recursing over environments
1. __<span style="color:red">Q</span>__: Modify `where()` to find all environments that contain a binding for
`name`.
__<span style="color:green">A</span>__: We look at the source code of the original `pryr::where()`:
```{r, eval = FALSE}
pryr::where
function (name, env = parent.frame())
{
stopifnot(is.character(name), length(name) == 1)
env <- to_env(env)
if (identical(env, emptyenv())) { # "base case"
stop("Can't find ", name, call. = FALSE)
}
if (exists(name, env, inherits = FALSE)) { # "success case"
env
}
else { # "recursive case"
where(name, parent.env(env)) # we will copy this line in the success case
}
}
```
Since `where()` stops searching when a match appears, we copy the recursive call in the `else` block to the block of the matching ("success") case, so that our new function `where2` will look for a binding within the complete search path. We also need to pay attention to other details. We have to take care to save the bindings in an object, while not overriding it in our recursive calls. So we create a list object for that and define a new function within `where2()` that we call `where2.internal`. `where2.internal()` will do the recursive work and whenever it finds a binding it will write it via `<<-` to the especially created list in its enclosing environment:
```{r, eval = FALSE}
where2 <- function(name, env = parent.frame()){
# we need to collect all environments where name has a binding
env_list <- list()
# since our function will be recursive and env_list would be overwritten
# when it is inside the recursive function, we put it on the outside of
# the recursive function and concatenate every binding environment
# that we find via the `<<-` operator on its end.
# In the following we start by defining the recursive function:
where2.internal <- function(name, env = parent.frame()) {
stopifnot(is.character(name), length(name) == 1)
env <- pryr:::to_env(env) # note that we need to call to_env via pryr:::
# when we reach the empty environment, we return all binding environments
# (if we found some) if we found no bindings, we give the same error message
# as pryr::where does
if (identical(env, emptyenv())) {
if (length(env_list) != 0){
return(env_list)
}
stop("Can't find ", name, call. = FALSE)
}
if (exists(name, env, inherits = FALSE)) {
# this is a case where we find a binding. the main difference to
# pryr::where is that we don't return immediately. Instead we save
# the binding environment to env_list and call where2.internal again
env_list <<- c(env_list, env)
where2.internal(name, parent.env(env))
} else {
where2.internal(name, parent.env(env))
}
}
# as a last step we just call where2.internal() to start the recursion
where2.internal(name, env = parent.frame())
}
```
Note that `where2.internal()` still provides the same structure as `pryr::where` does and you can also divide it in "base case", "success case" and "recursive case".
2. __<span style="color:red">Q</span>__: Write your own version of `get()` using a function written in the style
of `where()`.
__<span style="color:green">A</span>__: Note that `get()` provides a bit more arguments than our following version, but it should be easy to build up on that. However, we can change `pryr::where` to `get2()` with just changing one line of code (and the function name for the recursive call):
```{r}
get2 <- function(name, env = parent.frame())
{
stopifnot(is.character(name), length(name) == 1)
env <- pryr:::to_env(env)
if (identical(env, emptyenv())) {
stop("Can't find ", name, call. = FALSE)
}
if (exists(name, env, inherits = FALSE)) {
# we cancel env and substitute it with the following line, where we subset
# the environment (like a list) by the name of our object and return it
return(env[[name]])
}
else {
get2(name, parent.env(env))
}
}
```
3. __<span style="color:red">Q</span>__: Write a function called `fget()` that finds only function objects. It
should have two arguments, `name` and `env`, and should obey the regular
scoping rules for functions: if there's an object with a matching name
that's not a function, look in the parent. For an added challenge, also
add an `inherits` argument which controls whether the function recurses up
the parents or only looks in one environment.
__<span style="color:green">A</span>__: We can build up our function on the implementation of `get2()` in the last exercise. We only need to add a check via `is.function()`, change the name (also in the recursive call) and the error message:
```{r}
fget2 <- function(name, env = parent.frame()){
stopifnot(is.character(name), length(name) == 1)
env <- pryr:::to_env(env)
if (identical(env, emptyenv())) {
stop("Could not find function called ", name, call. = FALSE) #
}
# here we add the is.function() check
if (exists(name, env, inherits = FALSE) && is.function(env[[name]])) {
return(env[[name]])
}
else {
fget2(name, parent.env(env))
}
}
```
Note that this function is almost the same as the implementation of `pryr::fget()`:
```{r}
pryr::fget
```
We add an `inherits` parameter as described in the exercise:
```{r}
fget3 <- function(name, env = parent.frame(), inherits = TRUE){
stopifnot(is.character(name), length(name) == 1)
env <- pryr:::to_env(env)
if (identical(env, emptyenv())) {
stop("Could not find function called ", name, call. = FALSE)
}
if (exists(name, env, inherits = FALSE) && is.function(env[[name]])) {
return(env[[name]])
}
# after the environment, which is specified in the env parameter, is checked
# we stop our function in case the new inherits parameter is set to FALSE
if(inherits == FALSE){
stop("Could not find function called ", name," within ",
environmentName(env),
call. = FALSE)
}
else {
fget3(name, parent.env(env))
}
}
```
4. __<span style="color:red">Q</span>__: Write your own version of `exists(inherits = FALSE)` (Hint: use `ls()`.)
Write a recursive version that behaves like `exists(inherits = TRUE)`.
__<span style="color:green">A</span>__: We write two versions. `exists2()` will be the case `inherits = FALSE` and `exists3()` `inherits = TRUE`:
```{r}
exists2 <- function(name, env = parent.frame()){
stopifnot(is.character(name), length(name) == 1)
env <- pryr:::to_env(env)
name %in% ls(env, sorted = FALSE) # set sorted to FALSE for a small speedup
}
exists3 <- function(name, env = parent.frame()){
stopifnot(is.character(name), length(name) == 1)
env <- pryr:::to_env(env)
if (identical(env, emptyenv())) {
return(FALSE)
}
if (name %in% ls(env, sorted = FALSE)){
TRUE
}
else {
exists3(name, parent.env(env))
}
}
```
## Function environments
1. __<span style="color:red">Q</span>__: List the four environments associated with a function. What does each one
do? Why is the distinction between enclosing and binding environments
particularly important?
__<span style="color:green">A</span>__:
* Enclosing: where the function is created
* Binding: where the function was assigned
* Execution: a temporary environment which is created when the function is executed
* Calling: the environment from where the function was called
The difference between binding and enclosing environment is important, because of R's lexical scoping rules. If R can't find an object in the current environment while executing a function, it will look for it in the enclosing environment.
2. __<span style="color:red">Q</span>__: Draw a diagram that shows the enclosing environments of this function:
```{r, eval = FALSE}
f1 <- function(x1) {
f2 <- function(x2) {
f3 <- function(x3) {
x1 + x2 + x3
}
f3(3)
}
f2(2)
}
f1(1)
```
__<span style="color:green">A</span>__:
```{r, echo = FALSE}
knitr::include_graphics("diagrams/enclosing.png", dpi = 96)
```
3. __<span style="color:red">Q</span>__: Expand your previous diagram to show function bindings.
__<span style="color:green">A</span>__:
```{r, echo = FALSE}
knitr::include_graphics("diagrams/binding.png", dpi = 96)
```
4. __<span style="color:red">Q</span>__: Expand it again to show the execution and calling environments.
__<span style="color:green">A</span>__:
```{r, echo = FALSE}
knitr::include_graphics("diagrams/calling_execution.png", dpi = 96)
```
5. __<span style="color:red">Q</span>__: Write an enhanced version of `str()` that provides more information
about functions. Show where the function was found and what environment
it was defined in.
__<span style="color:green">A</span>__: Additionally we provide the function type in the sense of `pryr::ftype`. We use functions from the `pryr` package, since it provides helpers for all requested features:
```{r}
fstr <- function(object){
if(!is.function(object)){stop("fstr works only for functions")}
object_str <- lazyeval::expr_text(object)
flist <- list(ftype = pryr::ftype(object),
where = pryr::where(object_str),
enclosing_env = pryr::enclosing_env(object),
args = pryr::fun_args(object)
)
return(flist)
}
```
Note that we wanted to have non standard evaluation like the original `str()` function. Since `pryr::where()` doesn't support non standard evaluation, we needed to catch the name of the supplied `object`. Therefore we used `expr_text()` from the [lazyeval](https://github.com/hadley/lazyeval) package. As a result, `fstr(object = packagename::functionname)` will result in an error in contrast to `str()`.
## Binding names to values
1. __<span style="color:red">Q</span>__: What does this function do? How does it differ from `<<-` and why
might you prefer it?
```{r, error = TRUE}
rebind <- function(name, value, env = parent.frame()) {
if (identical(env, emptyenv())) {
stop("Can't find ", name, call. = FALSE)
} else if (exists(name, envir = env, inherits = FALSE)) {
assign(name, value, envir = env)
} else {
rebind(name, value, parent.env(env))
}
}
rebind("a", 10)
a <- 5
rebind("a", 10)
a
```
__<span style="color:green">A</span>__: The function does "more or less" the same as `<<-`. Additionally to `<<-` it has an `env` argument, but this is not a big advantage, since also `assign()` provides this functionality. The main difference is that `rebind()` only does an assignment, when it finds a binding in one of the parent environments of `env`. Whereas:
> If `<<-` doesn't find an existing variable, it will create one in the global environment. This is usually undesirable, because global variables introduce non-obvious dependencies between functions.
2. __<span style="color:red">Q</span>__: Create a version of `assign()` that will only bind new names, never
re-bind old names. Some programming languages only do this, and are known
as [single assignment languages][single assignment].
__<span style="color:green">A</span>__: We take the formals from `assign()`'s source code and define our new function. If `x` already exists, we give a message and return `NULL` (since this is the same as `return()`). Otherwise we let the body of the `assign()` function do the work:
```{r}
assign_non_existant <- function(x, value, pos = -1, envir = as.environment(pos),
inherits = FALSE, immediate = TRUE) {
if(exists(x)){
message("No new assignment: '", x, "' already exists")
return(NULL)}
.Internal(assign(x, value, envir, inherits))
}
```
Note that `.Internal(assign(x, value, envir, inherits))`, is not inside an `else` block or any other function. This is important. Otherwise we would change more of `assign()` than we want (in case of the assignment of a new function, the enclosing environment for that function would differ).
3. __<span style="color:red">Q</span>__: Write an assignment function that can do active, delayed, and locked
bindings. What might you call it? What arguments should it take? Can you
guess which sort of assignment it should do based on the input?
__<span style="color:green">A</span>__: The following might be no optimal solution, but we can at least handle two of three cases via if statements. The problem already occured in the last exercise, were we had to do an assignment in an if statement and did a workaround. This workaround only works for one assignment (for logical reasons). We still use the workaround for the "delay case", but we found a solution for the other two cases. The main aspect in it is to unify the environment were `assign()`, `makeActiveBinding()` and `delayedAssign()` act. We also had to test that cases like this
```{r}
makeActiveBinding(sym = "test1",
fun = function() function(x, y = sample(1:3, 1)){x^y},
env = parent.frame())
```
work with our new function and our function creates bindings (and so enclosing environments) in the same places as `assign()` would do, also when used inside funceions.
The usage of `pryr:::to_env()` simplified this process a lot:
```{r, eval = FALSE}
# found at https://github.com/hadley/pryr/blob/master/R/utils.r
function(x, quiet = FALSE) {
if (is.environment(x)) {
x
} else if (is.list(x)) {
list2env(x)
} else if (is.function(x)) {
environment(x)
} else if (length(x) == 1 && is.character(x)) {
if (!quiet) message("Using environment ", x)
as.environment(x)
} else if (length(x) == 1 && is.numeric(x) && x > 0) {
if (!quiet) message("Using environment ", search()[x])
as.environment(x)
} else {
stop("Input can not be coerced to an environment", call. = FALSE)
}
}
```
We used all these thoughts to create the following function:
```{r}
special_assign <- function(x, value, atype, envir = pryr:::to_env(parent.frame())){
if(atype == "locked"){
assign(x, value, envir = envir, inherits)
lockBinding(sym = x, env = envir)
}
if(atype == "active"){makeActiveBinding(sym = x, fun = value, env = envir)}
if(atype != "delayed"){stop("atype must be `locked`, `active` or `delayed`")}
delayedAssign(x, value, eval.env = environment(), assign.env = envir)
}
```
At the moment we have no idea for a good default guess routine, so that a specific `atype` of assignment would be done based on the input.