Imagine the following scenario:
y_global <- 42
add_nums <- function(x, y, z) { x + y_global + z }
add_nums(1,2,3)
## [1] 46
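46?! What’s happening here is that R is using a variable from outside the scope of add_nums: y_global is not defined inside the function, so R looks it up in the global environment. R does not have a strict pragma like some other languages (e.g. use strict in Perl), and this can be a bit bothersome.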
As a consequence, this happens to me way more often than I’d like to admit,
and I think it can be really dangerous, because the error is silent.
I came across this StackOverflow thread that suggests a workaround: wrapping the function with local. It’s a
clever idea, but I think it’s a bit messy and more complex than it needs
to be. So I came up with a strategy to avoid these
situations, which is to encapsulate functions in their own environment,
detached from the global environment (.GlobalEnv).
Here, I have a helper function that I use in most of the analyses I do.
encapsulate <- function(fxn) {
  # new environment whose parent is base, so lookup skips the global environment
  nenv <- new.env(parent = baseenv())
  environment(fxn) <- nenv
  return(fxn)
}
Using it is really simple: just wrap encapsulate around the function
definition. As an example:
y_global <- 42
add_nums <- encapsulate(function(x, y, z) {x + y_global + z})
add_nums(1,2,3) # ERROR
## Error in add_nums(1, 2, 3): object 'y_global' not found
# fixing the issue
add_nums <- encapsulate(function(x, y, z) {x + y + z})
add_nums(1,2,3) # SUCCESS
## [1] 6
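To see why the lookup fails, it can help to inspect the environments involved. The following is a minimal sketch, assuming the encapsulate helper from above (repeated here so the snippet runs on its own); plain and wrapped are hypothetical names used only for illustration.

```r
# Same helper as above, repeated so this snippet is self-contained
encapsulate <- function(fxn) {
  environment(fxn) <- new.env(parent = baseenv())
  fxn
}

y_global <- 42

# Hypothetical example functions: one left as-is, one encapsulated
plain   <- function(x) x + y_global
wrapped <- encapsulate(function(x) x + y_global)

# The wrapped function's enclosure hangs directly off the base environment
parent_is_base <- identical(parent.env(environment(wrapped)), baseenv())

# exists() walks up the enclosing environments, just like variable lookup does
seen_by_plain   <- exists("y_global", envir = environment(plain))
seen_by_wrapped <- exists("y_global", envir = environment(wrapped))

c(parent_is_base, seen_by_plain, seen_by_wrapped)
## [1]  TRUE  TRUE FALSE
```

The variable is still there; the encapsulated function simply has no path to it.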
This approach has two benefits: [1] it promotes functional
programming by preventing global variables from being used inside the function
body, and [2] it forces package functions to be namespaced explicitly (e.g.
dplyr::select). The only downsides I can think of are the overhead
of an extra environment per function, which could potentially be a problem when loading large data in memory with little RAM,
and a few more keystrokes in your code.
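The explicit namespacing applies to every package other than base, including ones that are normally attached by default such as stats. Here is a small sketch of what that looks like in practice, again assuming the encapsulate helper from above; middle_bare and middle_ns are hypothetical names for illustration.

```r
# Same helper as above, repeated so this snippet is self-contained
encapsulate <- function(fxn) {
  environment(fxn) <- new.env(parent = baseenv())
  fxn
}

# median() lives in the stats package, which is not reachable from baseenv()
middle_bare <- encapsulate(function(v) median(v))

# `::` itself is a base function, so a namespaced call still resolves
middle_ns <- encapsulate(function(v) stats::median(v))

# Capture the error message instead of stopping the script
bare_result <- tryCatch(
  middle_bare(c(1, 5, 3)),
  error = function(e) conditionMessage(e)
)
bare_result  # an error message saying that median could not be found

middle_ns(c(1, 5, 3))
## [1] 3
```

The same holds for your own helper functions defined at the top level: an encapsulated function cannot see them either, so they have to be passed in as arguments.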
Now, one thing to keep in mind is that if you are writing R packages, the check process (R CMD check) will flag this kind of error without any shenanigans like this. Also note that automatically searching up the chain of enclosing environments is a key feature of the R programming language…
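As an illustration of that last point, closures rely on exactly this lookup, so encapsulating one breaks it. This is only a sketch, assuming the encapsulate helper from above; make_adder, add_two and broken are hypothetical names.

```r
# Same helper as above, repeated so this snippet is self-contained
encapsulate <- function(fxn) {
  environment(fxn) <- new.env(parent = baseenv())
  fxn
}

# Function factory: the inner function finds `n` in the environment
# created by the call to make_adder()
make_adder <- function(n) {
  function(x) x + n
}

add_two <- make_adder(2)
add_two(5)
## [1] 7

# Re-homing the closure discards the environment that holds `n`
broken <- encapsulate(add_two)
broken_result <- tryCatch(
  broken(5),
  error = function(e) conditionMessage(e)
)
broken_result
## [1] "object 'n' not found"
```

So the helper is best kept for standalone functions that are meant to depend only on their arguments.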
The takeaway here is to not expect static code checking tools to find all your mistakes. Check your code with tests! Test functions built with the testthat package have a lot of benefits, including highlighting issues like this so they don’t slip through unnoticed.
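A minimal sketch of such a test, assuming testthat is installed and using the corrected add_nums from earlier. The second expectation is the one that would catch the original bug, since a version that reads y_global ignores its y argument.

```r
library(testthat)

# Corrected version of the function from the example above
add_nums <- function(x, y, z) {
  x + y + z
}

test_that("add_nums depends on every argument", {
  expect_equal(add_nums(1, 2, 3), 6)
  # Changing only y must change the result
  expect_equal(add_nums(1, 10, 3), 14)
})
```

In a fresh test session the buggy version would typically fail even earlier, because y_global would not exist there at all.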