Chapter 3 Debugging

Under the default settings, conventional debugging tools such as traceback(), debug(), and browser() may not provide useful information on why a given target is failing. Not even .Last.error or .Last.error.trace from callr are automatically informative. However, targets provides its own extensive support for debugging and troubleshooting errors. This chapter demonstrates the techniques.

3.1 Error messages

The metadata in _targets/meta/meta contains error messages and warning messages from when each target last ran. tar_meta() can retrieve these clues.

tar_meta(fields = error, complete_only = TRUE)
tar_meta(fields = warnings, complete_only = TRUE)
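When a pipeline has many targets, it can help to reduce the metadata to just the rows that recorded an error. The sketch below uses a hypothetical base R helper, failed_targets(), which is not part of targets. It assumes its input has the name and error columns returned by tar_meta(fields = error). The meta data frame is mock data standing in for real metadata so that the example runs on its own.

```r
# Hypothetical helper (not part of targets): keep only the rows whose
# error field is non-missing.
failed_targets <- function(meta) {
  meta[!is.na(meta$error), c("name", "error"), drop = FALSE]
}

# Mock metadata standing in for the output of tar_meta(fields = error).
meta <- data.frame(
  name = c("x", "y_29239c8a", "y_05f206d7"),
  error = c(NA, NA, "x < 4 is not TRUE"),
  stringsAsFactors = FALSE
)

failed <- failed_targets(meta)
failed$name
```

In a real project, the mock data frame would be replaced by a call to tar_meta(fields = error).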

3.2 Environment browser

By default, tar_make() runs in a reproducible background process, so debug() and browser() do not interrupt the pipeline. To use the environment browser in your main session, restart R and supply callr_function = NULL to tar_make(). Because callr_function = NULL risks invalidating your hard-earned results, use it only for debugging and only immediately after restarting R.

# Restart R first...
tar_make(names = target_to_debug, callr_function = NULL)
#> debugging in: custom_function_called_from_a_target()
#> Browse[1]>
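One way to reach the browser at a specific line is to place a browser() call inside the function that the target runs. The _targets.R file below is a hypothetical sketch: the function and target names are borrowed from the console example above, and the function body is invented purely for illustration.

```r
# _targets.R (hypothetical sketch; the function body is made up for illustration)
library(targets)

custom_function_called_from_a_target <- function(x) {
  browser() # Meant to pause here when tar_make() runs with callr_function = NULL.
  x + 1
}

list(
  tar_target(target_to_debug, custom_function_called_from_a_target(1))
)
```

Remove the browser() call once the problem is understood so that ordinary pipeline runs are not affected.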

3.3 The debug option

targets has a more convenient way to launch the environment browser from inside a target:

  1. In _targets.R, write a call to tar_option_set() with debug equal to the target name. Consider also setting cue equal to tar_cue(mode = "never") so tar_make() reaches the target you want to debug more quickly.
  2. Launch a fresh interactive R session with the _targets.R script in your working directory.
  3. Run targets::tar_make() (or targets::tar_make_clustermq(), or targets::tar_make_future()) with callr_function = NULL.
  4. When targets reaches the target you selected to debug, your R session will start an interactive debugger, and you should see Browse[1]> in your console. Run targets::tar_name() to verify that you are debugging the correct target.
  5. Interactively run any R code that helps you troubleshoot the problem. For example, if the target invokes a function f(), enter debug(f) and then c to immediately enter the function’s calling environment where all its arguments are defined.

To try it out yourself, write the following _targets.R file.

# _targets.R
library(targets)
tar_option_set(debug = "b")
f <- function(x, another_arg = 123) x + another_arg
list(
  tar_target(a, 1),
  tar_target(b, f(a))
)

Then, call tar_make(callr_function = NULL) to drop into a debugger at the command of b.

# R console
tar_make(callr_function = NULL)
#> ● run target a
#> ● run target b
#> Called from: eval(expr, envir)

When the debugger launches, run targets::tar_name() to confirm you are running the correct target.

Browse[1]> targets::tar_name()
#> [1] "b"

In the debugger, the dependency targets of b are available in the current environment, and the global objects and functions are available in the parent environment.

Browse[1]> ls()
#> [1] "a"
Browse[1]> a
#> [1] 1
Browse[1]> ls(parent.env(environment()))
#> [1] "f"
Browse[1]> f(1)
#> [1] 124

Enter debug(f) to debug the function f(), and press c to enter the function’s calling environment where another_arg is defined.

Browse[1]> debug(f)
Browse[1]> c
#> debugging in: f(a)
#> debug at _targets.R#3: x + another_arg
Browse[2]> ls()
#> [1] "another_arg" "x"   
Browse[2]> another_arg
#> [1] 123
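The flag set by debug(f) stays on until it is cleared, so f() would keep opening the browser on later calls in the same session. The following is a minimal base R sketch of checking and clearing that flag with isdebugged() and undebug(), using the same f() as in the example above. The variable names flag_while_set and flag_after_clear are introduced here only for illustration.

```r
f <- function(x, another_arg = 123) x + another_arg

debug(f)                          # Flag f() for debugging.
flag_while_set <- isdebugged(f)   # TRUE while the flag is set.
undebug(f)                        # Clear the flag when finished.
flag_after_clear <- isdebugged(f) # FALSE once the flag is cleared.

f(1) # Runs normally again, without opening the browser.
```

As an alternative, debugonce(f) sets a flag that clears itself after the next call to f().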

3.4 Workspaces

Workspaces are a persistent alternative to the environment browser. A workspace is a special lightweight reference file that lists the elements of a target’s runtime environment. Using tar_workspace(), you can recover a target’s workspace and locally debug it even if the pipeline is not running. If you tell targets to record workspaces in advance, you can preempt errors and debug later at your convenience. There are two ways to record workspaces:

  1. Set error = "workspace" in tar_option_set() or tar_target(). Then, tar_make() and friends will save a workspace file for every target that errors out.
  2. In the workspaces argument of tar_option_set(), specify the targets for which you want to save workspaces. Then, run tar_make() or similar. A workspace file will be saved for each existing target, regardless of whether the target runs or gets skipped in the pipeline.

Here is an example of (1).

# _targets.R file:
options(tidyverse.quiet = TRUE)
library(targets)
library(tidyverse)
options(crayon.enabled = FALSE)
tar_option_set(error = "workspace")
f <- function(arg, value, ...) {
  stopifnot(arg < 4)
  value
}
list(
  tar_target(x, seq_len(4)),
  tar_target(
    y,
    f(arg = x, value = "succeeded", a = 1, b = 2, key = "my_api_key"),
    pattern = map(x) # The branching chapter describes patterns.
  )
)

# R console:
tar_make()
#> ● run target x
#> ● run branch y_29239c8a
#> ● run branch y_7cc32924
#> ● run branch y_bd602d50
#> ● run branch y_05f206d7
#> x error branch y_05f206d7
#> ● save workspace y_05f206d7
#> Error : x < 4 is not TRUE .
#> Error: callr subprocess failed: x < 4 is not TRUE .

One of the y_******* targets errored out.

failed <- tar_meta(fields = error) %>%
  na.omit() %>%
  pull(name)

print(failed)
#> [1] "y_05f206d7"

tar_workspace() reads the special metadata in the workspace file and then loads the target’s dependencies from various locations in _targets/objects and/or the cloud. It also sets the random number generator seed to the seed of the target, loads the required packages, and runs _targets.R to load other global object dependencies such as functions.

tar_workspace(y_05f206d7)

The dependencies of y_05f206d7 are now in memory, so you can try out any failed function calls in your local R session (see notes 1 and 2 at the end of this chapter).

print(x)
#> [1] 4
f(arg = 0, value = "my_value", a = 1, b = 2, key = "my_api_key")
#> [1] "my_value"
f(arg = x, value = "my_value", a = 1, b = 2, key = "my_api_key")
#> Error in f(x) : x < 4 is not TRUE

Keep in mind that although the dependencies of y_05f206d7 are in memory, the arguments of f() are not.

arg
#> Error: object 'arg' not found
value
#> Error: object 'value' not found
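To step through the body of f() at the top level, one option is to assign its arguments by hand from the call recorded in the pipeline. The snippet below is a sketch of that idea. Here x is set manually so that the code runs on its own, whereas after tar_workspace() it would already be in memory. The variable condition_holds is a name introduced only for illustration.

```r
x <- 4 # Stand-in for the dependency that tar_workspace() would load.

# Recreate the arguments of the failing call
# f(arg = x, value = "succeeded", ...).
arg <- x
value <- "succeeded"

# Evaluate the condition that stopifnot() checks inside f().
condition_holds <- arg < 4
condition_holds
```

With the arguments defined this way, each line of the function body can be run and inspected one at a time.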

The workspace also has a useful traceback, and you can retrieve it with tar_traceback(). The last couple of lines of the traceback are unavoidably cryptic, but they do sometimes contain useful information.

tar_traceback(y_05f206d7, characters = 77)
#> [1] "f(arg = x, value = \"succeeded\", a = 1, b = 2, key = \"my_api_key\")"           
#> [2] "stopifnot(arg < 4)"             
#> [3] "stop(simpleError(msg, call = if (p <- sys.parent(1))"              
#> [4] "(function (condition) \n{\n    state$error <- build_message(condition)\n    stat"
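Approach (2) from the start of this section, saving workspaces for chosen targets whether or not they fail, might look like the sketch below. It reuses the pipeline from the earlier example. The choice of y_29239c8a, a branch that ran without error in the log above, is an assumption made for illustration.

```r
# _targets.R (sketch of the workspaces argument of tar_option_set())
library(targets)
# Save a workspace for this branch even though it does not error out.
tar_option_set(workspaces = "y_29239c8a")
f <- function(arg, value, ...) {
  stopifnot(arg < 4)
  value
}
list(
  tar_target(x, seq_len(4)),
  tar_target(
    y,
    f(arg = x, value = "succeeded", a = 1, b = 2, key = "my_api_key"),
    pattern = map(x)
  )
)
```

After running the pipeline, tar_workspace(y_29239c8a) should then load that branch's dependencies in the same way as for a failed target.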

3.5 Tradeoffs

For small to medium-sized workloads, the environment browser and the debug option are usually the best choices. These techniques immediately direct control to prewritten function calls and get you as close to the error as possible. However, this may not always be feasible in large distributed workloads, e.g. tar_make_clustermq(), where most of your targets are not even running on the same computer as your main R process. For those complicated situations where it is not possible to access the R interpreter, workspaces are ideal because they store a persistent reproducible runtime state that you can recover locally.

  1. In addition, the current random number generator seed (.Random.seed) is also the value y_05f206d7 started with.

  2. When you are finished debugging, you can remove all workspace files with tar_destroy(destroy = "workspaces").

Copyright Eli Lilly and Company