Handling R errors the rlang way

Custom conditions, subclasses and more!

Every day we deal with errors, warnings and messages while writing, debugging or reviewing code. In R, all three are types of conditions. You might hope to see as few of them as possible, but they are very helpful when they describe the problem concisely and point to its source. So if you write functions or code for yourself or others, it is good practice to spend more time writing descriptive conditions. Personally, I did not always follow this advice, but as I got into the habit, I learnt more about the condition handling system in R and about the improvements rlang provides.

In this post I will highlight the basics of condition handling. Then we’ll see the benefits of custom conditions provided by rlang. By the end, we will be able to generate custom conditions and throw errors that carry more detail.

Conditions in base R

In R, conditions are regular objects. The main types are:

  • error: signaled by stop()
  • warning: generated by warning()
  • message: generated by message()
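
To make the "conditions are regular objects" point concrete, here is a minimal base R sketch that captures one condition of each type and inspects it. The helper `capture_cnd()` is a hypothetical name used only for this illustration; it is not part of base R.

```r
## `capture_cnd()` is a hypothetical helper: it evaluates `expr` and,
## if any condition is signaled, returns the condition object itself.
capture_cnd <- function(expr) {
  tryCatch(expr, condition = function(cnd) cnd)
}

err <- capture_cnd(stop("an error"))
wrn <- capture_cnd(warning("a warning"))
msg <- capture_cnd(message("a message"))

## each object carries its type in its class vector
class(err)
class(wrn)
class(msg)

## the text passed to stop() is stored in the object
conditionMessage(err)
```

All three objects share the base class condition, which is why a single `condition =` handler can catch any of them.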

Let’s see an example with a simple function, my_sqrt(), that raises an error when a negative number is passed to it.

## define my_sqrt(), which rejects negative numbers
my_sqrt <- function(x){
  if (x < 0) {
    stop("x must be positive")
  } else {
    sqrt(x)
  }
}

Now if you pass -1 to my_sqrt(), it will exit and show you the message which you specified inside stop().

## pass a negative number to my_sqrt()
my_sqrt(-1)
Error in my_sqrt(-1): x must be positive

But how can we handle conditions and decide what to do when they are generated?

Condition handling

tryCatch is one of the ways to inspect condition objects and control what happens when a condition is signaled. For instance, we can define an error handler to decide what happens when my_sqrt() fails. Here, function(cnd) cnd, the error handler passed to the error argument inside tryCatch() says “catch the error object and return it”.

If you inspect the returned value sqrt_cnd you can see a list with:

  • message: the error message you defined in my_sqrt.
  • call: the function call that raised this error.

## define an error handler to return the error object when an error is thrown
sqrt_cnd <- tryCatch(error = function(cnd) cnd, my_sqrt(-1))

str(sqrt_cnd)
List of 2
 $ message: chr "x must be positive"
 $ call   : language my_sqrt(-1)
 - attr(*, "class")= chr [1:3] "simpleError" "error" "condition"

Since the error handler is a normal function, you can decide what it returns. For instance, instead of returning the error object, you can return a fallback value; here 0.

## define an error handler to return 0 when an error is thrown
sqrt_cnd <- tryCatch(error = function(x) 0, my_sqrt(-1))

str(sqrt_cnd)
 num 0
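
A handler can also use the caught object before returning a fallback. The sketch below is only an illustration: `safe_sqrt()` is a hypothetical wrapper, and `my_sqrt()` is redefined here so the block runs on its own.

```r
## my_sqrt() repeated from above so this block is self-contained
my_sqrt <- function(x) {
  if (x < 0) stop("x must be positive")
  sqrt(x)
}

## `safe_sqrt()` is a hypothetical wrapper: on error it reports the
## original message and returns NA instead of stopping
safe_sqrt <- function(x) {
  tryCatch(
    my_sqrt(x),
    error = function(cnd) {
      message("Falling back to NA: ", conditionMessage(cnd))
      NA_real_
    }
  )
}

safe_sqrt(4)
safe_sqrt(-1)
```

The first call returns the square root as usual; the second prints the fallback message and returns NA, so the calling code keeps running.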

But what if we had a chain of functions, where one function calls another?

Conditions with a chain of functions

In practice, we usually write functions that call other functions and we might get lost if we don’t have an easy way to find the source of the error or decide what to do when it is thrown.

To see this case, let’s define two functions for demo purposes:

  • get_val(): which returns a random value if it is positive and raises an error if it is negative (this simulates random inputs or fetching data from users, a database, etc.).

## define get_val() to simulate random input values
get_val <- function(){
  val <- runif(1, -10, 10)
  if (val < 0){
    stop("Can't get val")
  } else {
    val
  }
}
  • mult_val(): which calls get_val() and multiplies the returned value by mult_by (2 by default).

## Note that `mult_val()` is not a very practical example,
## because the function doesn't do a single task related to its name,
## but I am just using it for demo purposes
mult_val <- function(mult_by = 2){
  x <- get_val()
  x*mult_by
}

In case val is negative in get_val(), an error will be thrown as follows:

## in case val is negative
get_val()
Error in get_val(): Can't get val

Similarly, when we call mult_val(), the error propagates up from get_val() and we see the same error message.

## in case val is negative
mult_val()
Error in get_val(): Can't get val

In both cases, we have the same error message and we have no info about the value of val that caused the error.

So is there a way to see more info about the error, like the exact value of val? Or could we write more detailed messages?

Conditions in rlang

In principle, it is possible to create custom condition objects to pass more meta-data about the error. But in base R, it is kind of confusing compared to what rlang provides. I had to look up the base R way and check some examples every time I wanted to handle such cases!
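
For reference, here is a minimal sketch of what the base R route can look like. The constructor `get_val_error()` and the function `check_val()` are hypothetical names chosen for this illustration, and `check_val()` takes the value as an argument (instead of drawing it randomly) so the result is reproducible.

```r
## Hypothetical constructor: build the condition object by hand.
## The class vector must end with "error" and "condition" so that
## ordinary error handlers can still catch it.
get_val_error <- function(val) {
  structure(
    class = c("get_val_error", "error", "condition"),
    list(message = "Can't get val", call = sys.call(-1), val = val)
  )
}

## Hypothetical stand-in for get_val() with a fixed input
check_val <- function(val) {
  if (val < 0) stop(get_val_error(val))
  val
}

cnd <- tryCatch(check_val(-3), error = function(cnd) cnd)
class(cnd)
cnd$val
```

It works, but the class vector and the list of fields have to be assembled manually every time, which is the part that is easy to get wrong.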

So rlang provides functions that correspond to base R ones as follows:

rlang       base R
abort()     stop()
warn()      warning()
inform()    message()

rlang functions are designed to deal with condition objects and create custom ones easily, unlike base R functions that are focused on messages.
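
As a small sketch of that idea for the other two condition types, the block below attaches a subclass and an extra field to a warning and a message. It assumes a recent rlang release in which the subclass argument is named `class`; the class names and the `val` field are made up for this illustration.

```r
## Assumption: a recent rlang where the subclass argument is `class`.
## "near_zero_warning", "val_info" and `val` are illustrative names.
wrn <- tryCatch(
  rlang::warn("val is close to zero", class = "near_zero_warning", val = 0.01),
  warning = function(cnd) cnd
)

msg <- tryCatch(
  rlang::inform("val was generated", class = "val_info", val = 3.2),
  message = function(cnd) cnd
)

## both objects keep their base type, the custom subclass and the extra field
inherits(wrn, "near_zero_warning")
wrn$val
inherits(msg, "val_info")
msg$val
```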

Custom conditions

abort() versus stop()

To clarify the difference, let’s modify get_val() and use abort() instead of stop(). Here you can see three arguments passed to abort():

  • message: the error message which is similar to the one passed to stop() in the previous example.
  • .subclass: a subclass of the condition to differentiate errors.
  • val: the particular value that caused the error.

You can pass more named values to abort(), and it will signal a custom error object that carries all of them as fields.

## define get_val() to simulate random input values
get_val <- function(){
  val <- runif(1, -10, 10)
  if (val < 0){
    rlang::abort(message = "Can't get val",
                 .subclass = "get_val_error",
                 val = val)
  } else {
    val
  }
}

To inspect the custom error object signaled by get_val(), you can use tryCatch() and assign the caught object to custom_cnd.

## define an error handler to return the custom error object 
custom_cnd <- tryCatch(error = function(cnd) cnd, get_val())

Notice that:

  • the error object has the main classes in addition to the defined subclass get_val_error.
  • the value val which caused the error is available and you can access it using custom_cnd$val.

## inspect custom_cnd
str(custom_cnd, max.level = 1)
List of 5
 $ message: chr "Can't get val"
 $ call   : NULL
 $ trace  :List of 3
  ..- attr(*, "class")= chr "rlang_trace"
 $ parent : NULL
 $ val    : num -2.86
 - attr(*, "class")= chr [1:4] "get_val_error" "rlang_error" "error" "condition"

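So here’s a quick comparison between the error object returned by rlang::abort() in this example and the one returned by stop() in the previous section: the stop() object holds only message and call and has the classes simpleError, error and condition, while the abort() object also holds trace, parent and our custom val field, and has the classes get_val_error, rlang_error, error and condition.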
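
One practical consequence of the subclass is that tryCatch() can register a handler for that class only. The sketch below assumes a recent rlang release where the subclass argument is named `class` (the post itself uses the older `.subclass`), and `check_val()` is a hypothetical, deterministic stand-in for get_val().

```r
## Assumption: recent rlang, where `.subclass` is spelled `class`.
## `check_val()` is a hypothetical stand-in for get_val() with a fixed input.
check_val <- function(val) {
  if (val < 0) {
    rlang::abort("Can't get val", class = "get_val_error", val = val)
  }
  val
}

result <- tryCatch(
  check_val(-2.5),
  ## runs only for errors that carry the custom subclass
  get_val_error = function(cnd) abs(cnd$val),
  ## any other error ends up here
  error = function(cnd) NA_real_
)

result
```

Handlers are matched in the order they are listed, so the more specific class goes first and the generic error handler acts as a catch-all.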

So now we have more meta-data about the error and a specific subclass. How can we use this with chained functions?

Error messages (Catch, modify, rethrow)

Let’s say, we want to get a more precise message when we call mult_val() that calls get_val(). For instance, a message like:

“Can’t calculate value because get_val() raised an error as val was negative (-1.5648)”

We can define an error handler get_val_handler() to access the values returned in the custom error object thrown by get_val() then return a message based on these values.

What get_val_handler() basically does is to:

  • define a basic error message “Can’t calculate value”, which will always be shown.
  • check the class of the error object signaled by get_val(). If it inherits from the subclass get_val_error, the message is modified to include the value of val.
  • throw a new error with the final message and the subclass mult_val_error.

## define an error handler to modify the message
get_val_handler <- function(cnd) {
  msg <- "Can't calculate value"
  
  if (inherits(cnd, "get_val_error")) {
    msg <- paste0(msg, " as `val` passed to `get_val()` equals (", cnd$val,")")
  }
  
  rlang::abort(msg, "mult_val_error")
}

So now if you use get_val_handler with get_val() inside mult_val(), you basically say:

“If you catch an error from get_val(), get the value of val that caused the error and add it to the error message that will be thrown by mult_val().”

## use get_val_handler() with mult_val()
mult_val <- function(mult_by = 2){
  x <- tryCatch(error = get_val_handler, get_val())
  x*mult_by
}

And here you can see an example of the modified error message including the value of val that caused the error, which you couldn’t have access to earlier with the default stop() function.

mult_val()
Error: Can't calculate value as `val` passed to `get_val()` equals (-2.8569)
Call `rlang::last_error()` to see a backtrace

If you want to inspect the error object returned by mult_val(), you can see the details including the new subclass mult_val_error, to which this error object belongs.

## define an error handler to return the error object
modified_cnd <- tryCatch(error = function(cnd) cnd, mult_val())

str(modified_cnd, max.level = 1)
List of 4
 $ message: chr "Can't calculate value as `val` passed to `get_val()` equals (-2.8569)"
 $ call   : NULL
 $ trace  :List of 3
  ..- attr(*, "class")= chr "rlang_trace"
 $ parent : NULL
 - attr(*, "class")= chr [1:4] "mult_val_error" "rlang_error" "error" "condition"
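
Notice that parent is NULL above: the original get_val_error object is discarded once its value has been pasted into the message. As a possible variation, the sketch below passes the caught condition as `parent` when rethrowing, so the original object stays attached. It assumes a recent rlang release (with the `class` argument); `check_val()` and `mult_checked()` are hypothetical, deterministic stand-ins for get_val() and mult_val().

```r
## Assumption: recent rlang with the `class` and `parent` arguments.
## `check_val()` and `mult_checked()` are hypothetical stand-ins.
check_val <- function(val) {
  if (val < 0) {
    rlang::abort("Can't get val", class = "get_val_error", val = val)
  }
  val
}

mult_checked <- function(val, mult_by = 2) {
  x <- tryCatch(
    check_val(val),
    get_val_error = function(cnd) {
      ## rethrow with a new subclass, keeping the original error as parent
      rlang::abort("Can't calculate value",
                   class = "mult_val_error",
                   parent = cnd)
    }
  )
  x * mult_by
}

err <- tryCatch(mult_checked(-1), error = function(cnd) cnd)

class(err)
## the original condition, including its `val` field, is still reachable
err$parent$val
```

With this approach the handler does not need to copy fields into the message by hand; callers further up can inspect err$parent directly.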

Conclusion

Conditions can be our friends and guides through debugging and code review. How useful they are depends on how clear and concise their information is. rlang provides an easy way to deal with custom conditions. It allows us to pass meta-data about the conditions, which helps in better reporting and handling. The previous examples showed the differences between rlang and base R conditions. They also demonstrated how to deal with custom conditions. Most importantly, we saw how to catch, modify and rethrow an error in chained functions. So what remains is to make use of this flexibility to handle conditions and write more informative messages.

Notes

  • The version of rlang used here is rlang_0.2.2.9001. I am not sure if everything works in the same way in earlier versions. Note that later rlang releases renamed the .subclass argument to class.

  • The functions used in the examples are not perfect since they are not pure, they don’t perform a single clear task and their names do not reflect their purpose. However, they were just used for demo purposes.