Encapsulating functions to promote stricter functional programming

The silent error…

“Encapsulation hides information not to facilitate fraud, but to prevent mistakes.”
-Bjarne Stroustrup, original implementor of C++

Imagine the following scenario:

y_global <- 42
add_nums <- function(x, y, z) { x + y_global + z }
add_nums(1,2,3)
## [1] 46

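46?! What’s happening here is that R is using a variable from outside add_nums’s scope. Because y_global is not defined inside the function, R’s lexical scoping looks it up in the enclosing environment (here, the global environment) and silently finds it, while the argument y is never used at all. R has no strict pragma like some other languages do (e.g. use strict in Perl), which can be a bit bothersome. This happens to me far more often than I’d like to admit, and I think it can be really dangerous because the error is silent.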
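To make the danger concrete, here is a minimal sketch that reuses the add_nums definition from above (the names first and second are only illustrative). The same call returns a different result once the global changes, even though the arguments are identical:

```r
y_global <- 42
add_nums <- function(x, y, z) { x + y_global + z }

first <- add_nums(1, 2, 3)   # 1 + 42 + 3
y_global <- 0                # someone reassigns the global later in the script
second <- add_nums(1, 2, 3)  # 1 + 0 + 3

first   # 46
second  # 4
```

Nothing in the call itself hints that the result depends on y_global, which is why this kind of bug is easy to miss.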

I came across a StackOverflow thread that suggests a workaround: wrapping the function with local. It’s a clever idea, but I find it a bit messy and more complex than it needs to be. So I came up with a strategy to avoid these situations: encapsulate each function in its own environment, detached from the global environment (globalenv()).

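Here is a helper function that I use in most of my analyses. It gives the function a fresh environment whose parent is baseenv(), so only the function’s own arguments, its local variables, and base R are visible from inside it.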

encapsulate <- function(fxn) {
  nenv <- new.env(parent = baseenv())
  environment(fxn) <- nenv
  return(fxn)
}

Using it is simple: just wrap encapsulate around the function definition. As an example:

y_global <- 42
add_nums <- encapsulate(function(x, y, z) {x + y_global + z})
add_nums(1,2,3) # ERROR
## Error in add_nums(1, 2, 3): object 'y_global' not found
# fixing the issue
add_nums <- encapsulate(function(x, y, z) {x + y + z})
add_nums(1,2,3) # SUCCESS
## [1] 6
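To see why the lookup fails, it can help to inspect the environment chain directly. The following is a small sketch using only base R; sealed is a hypothetical example function, and encapsulate is repeated so the snippet runs on its own:

```r
encapsulate <- function(fxn) {
  environment(fxn) <- new.env(parent = baseenv())
  fxn
}

sealed <- encapsulate(function(x) x + 1)

# The function's own environment is a fresh, empty one...
length(ls(environment(sealed)))                        # 0
# ...whose parent is the base environment, so the global
# environment is never searched when a name is looked up.
identical(parent.env(environment(sealed)), baseenv())  # TRUE
identical(environment(sealed), globalenv())            # FALSE
```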

This approach has two benefits: [1] it promotes functional programming by preventing global variables from being used inside the function body, and [2] it forces package functions to be namespaced explicitly (e.g. dplyr::select), since attached packages are no longer on the function’s search path. The only downsides I can think of are the overhead of one extra environment per function, which could potentially be a problem when loading large data into memory with little RAM, and a few more keystrokes in your code.
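The namespacing point applies even to packages that ship with R, such as stats. Here is a minimal sketch, with encapsulate repeated so it runs on its own; center_bare and center_ns are hypothetical names chosen for illustration:

```r
encapsulate <- function(fxn) {
  environment(fxn) <- new.env(parent = baseenv())
  fxn
}

# median() lives in the stats package, not in base, so a bare call fails.
center_bare <- encapsulate(function(v) median(v))
msg <- tryCatch(center_bare(c(1, 5, 9)),
                error = function(e) conditionMessage(e))
msg  # an error message saying that `median` could not be found

# With an explicit namespace the same function works.
center_ns <- encapsulate(function(v) stats::median(v))
center_ns(c(1, 5, 9))  # 5
```

The explicit prefix works because `::` itself is a base function, so it is still visible from the encapsulated environment.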

Now, one thing to keep in mind: if you are writing R packages, the check process (R CMD check) will flag this kind of error without any shenanigans like this, typically with a note such as “no visible binding for global variable”. Also note that automatically searching up the chain of enclosing environments is a key feature of the R programming language…
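Outside of a package, a similar static check can be approximated by hand. The sketch below assumes the codetools package is available (it ships with standard R installations) and uses its findGlobals function to list the free variables of the original, unencapsulated add_nums:

```r
add_nums <- function(x, y, z) { x + y_global + z }

# merge = FALSE separates global functions (such as `+`) from global variables.
globals <- codetools::findGlobals(add_nums, merge = FALSE)
globals$variables  # "y_global"
```

This only reports what the function refers to; it is up to you to decide whether each reference is intentional.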

The takeaway here is not to expect static code-checking tools to find all your mistakes. Check your code with tests! Test functions built with the testthat package have a lot of benefits, including highlighting issues like this one so they don’t slip past you unnoticed.
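As a rough illustration, a test along these lines would catch the silent error. It assumes the testthat package is installed; the test descriptions are made up for this sketch, and encapsulate is repeated so the snippet is self-contained:

```r
library(testthat)  # assumption: testthat is installed

encapsulate <- function(fxn) {
  environment(fxn) <- new.env(parent = baseenv())
  fxn
}

y_global <- 42
add_nums <- encapsulate(function(x, y, z) { x + y + z })

test_that("add_nums depends only on its arguments", {
  # The buggy version that used y_global would return 46 here and fail.
  expect_equal(add_nums(1, 2, 3), 6)
})

test_that("an encapsulated function cannot see globals", {
  leaky <- encapsulate(function(x, y, z) { x + y_global + z })
  expect_error(leaky(1, 2, 3), "y_global")
})
```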

Shan Sabri

Using novel high-throughput approaches to understand and engineer the immune system to fight cancer.
