Chapter 4 Debugging

Under the default settings, conventional debugging tools such as traceback(), debug(), and browser() may not provide useful information on why a given target is failing. Not even .Last.error or .Last.error.trace from callr are automatically informative. However, targets provides its own extensive support for debugging and troubleshooting errors. This chapter demonstrates the techniques.

4.1 Error messages

The metadata in _targets/meta/meta contains error messages and warning messages from when each target last ran. tar_meta() can retrieve these clues.

tar_meta(fields = error, complete_only = TRUE)
tar_meta(fields = warnings, complete_only = TRUE)
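
As a rough sketch of how these clues might be filtered by hand, the snippet below keeps only the rows whose error field is populated. The helper failed_targets() and the toy data frame are hypothetical and exist only for illustration. The guarded call at the end assumes targets is installed and a pipeline has already run in the working directory.

```r
# Hypothetical helper (not part of targets): keep the rows of a metadata
# data frame whose `error` column is not missing.
failed_targets <- function(meta) {
  meta[!is.na(meta$error), c("name", "error"), drop = FALSE]
}

# Toy metadata standing in for the output of tar_meta(fields = error).
# The names and messages here are made up for illustration.
toy_meta <- data.frame(
  name = c("x", "y_29239c8a", "y_05f206d7"),
  error = c(NA, NA, "arg < 4 is not TRUE"),
  stringsAsFactors = FALSE
)
failed <- failed_targets(toy_meta)
print(failed$name)

# With a real pipeline (assumes targets is installed and has already run):
if (requireNamespace("targets", quietly = TRUE) &&
    file.exists("_targets/meta/meta")) {
  print(failed_targets(targets::tar_meta(fields = error)))
}
```

In practice, complete_only = TRUE already performs this kind of filtering, so the helper is mainly useful for seeing what the metadata looks like.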

4.2 Environment browser

By default, tar_make() runs in a reproducible background process, so debug() and browser() do not interrupt the pipeline. To use the environment browser in your main session, restart R and supply callr_function = NULL to tar_make(). Because callr_function = NULL runs the pipeline in your current session, it risks invalidating your hard-earned results, so use it only for debugging and only right after restarting R.

# Restart R first...
debug(custom_function_called_from_a_target)
tar_make(names = target_to_debug, callr_function = NULL)
#> debugging in: custom_function_called_from_a_target()
#> Browse[1]>
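
One way to keep callr_function = NULL confined to debugging is to wrap it in a small helper that refuses to run outside an interactive session. The wrapper below is a hypothetical convenience, not part of targets. It assumes the targets and tidyselect packages are installed and that the target name arrives as a character string.

```r
# Hypothetical wrapper (not part of targets): run one target in the
# current session, but only when a human is at the console.
debug_make <- function(target_name) {
  if (!interactive()) {
    stop("debug_make() is for interactive debugging only.")
  }
  # Assumption: `names` accepts a tidyselect expression such as any_of().
  targets::tar_make(
    names = tidyselect::any_of(target_name),
    callr_function = NULL
  )
}

# Usage sketch: restart R, flag the function, then run the target.
if (interactive()) {
  # debug(custom_function_called_from_a_target)
  # debug_make("target_to_debug")
}
```

The guard is a design choice: scripted or scheduled runs keep the default reproducible background process, and only a deliberate interactive call reaches the in-session mode.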

4.3 The debug option

targets has a more convenient way to launch the environment browser from inside a target:

  1. In the target script file (default: _targets.R) write a call to tar_option_set() with debug equal to the target name.
  2. Launch a fresh interactive R session with the target script file (default: _targets.R) in your working directory.
  3. Run targets::tar_make() (or targets::tar_make_clustermq(), or targets::tar_make_future()) with callr_function = NULL. If you are using targets version 0.5.0.9000 or above, consider also setting shortcut to TRUE and supplying the target name to names (see footnote 1). This allows tar_make() to reach the desired target more quickly.
  4. When targets reaches the target you selected to debug, your R session will start an interactive debugger, and you should see Browse[1]> in your console. Run targets::tar_name() to verify that you are debugging the correct target.
  5. Interactively run any R code that helps you troubleshoot the problem. For example, if the target invokes a function f(), enter debug(f) and then c to immediately enter the function’s calling environment where all its arguments are defined.

To try it out yourself, write the following target script file.

# _targets.R file
library(targets)
tar_option_set(debug = "b")
f <- function(x, another_arg = 123) x + another_arg
list(
  tar_target(a, 1),
  tar_target(b, f(a))
)

Then, call tar_make(callr_function = NULL) to drop into a debugger at the command of b.

# R console
tar_make(callr_function = NULL, names = any_of("b"), shortcut = TRUE)
#> ● run target b
#> Called from: eval(expr, envir)
Browse[1]>

When the debugger launches, run targets::tar_name() to confirm you are running the correct target.

Browse[1]> targets::tar_name()
#> [1] "b"

In the debugger, the dependency targets of b are available in the current environment, and the global objects and functions are available in the parent environment.

Browse[1]> ls()
#> [1] "a"
Browse[1]> a
#> [1] 1
Browse[1]> ls(parent.env(environment()))
#> [1] "f"
Browse[1]> f(1)
#> [1] 124
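
The two-level layout can be imitated in plain R to see why both a and f are visible from the debugger. This is only an illustrative sketch of nested environments, not how targets builds them internally. The names globals and target_env are made up for the example.

```r
# Parent environment: stands in for the globals defined in _targets.R.
globals <- new.env(parent = baseenv())
globals$f <- function(x, another_arg = 123) x + another_arg

# Child environment: stands in for the target's dependencies.
target_env <- new.env(parent = globals)
target_env$a <- 1

print(ls(target_env))              # dependencies only
print(ls(parent.env(target_env)))  # global functions only

# Evaluating the command in the child environment finds `a` locally
# and `f` one level up, in the parent environment.
result <- eval(quote(f(a)), envir = target_env)
print(result)
```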

Enter debug(f) to debug the function f(), and press c to enter the function’s calling environment where another_arg is defined.

Browse[1]> debug(f)
Browse[1]> c
#> debugging in: f(a)
#> debug at _targets.R#3: x + another_arg
Browse[2]> ls()
#> [1] "another_arg" "x"   
Browse[2]> another_arg
#> [1] 123

4.4 Workspaces

Workspaces are a persistent alternative to the environment browser. A workspace is a special lightweight reference file that lists the elements of a target’s runtime environment. Using tar_workspace(), you can recover a target’s workspace and locally debug it even if the pipeline is not running. If you tell targets to record workspaces in advance, you can preempt errors and debug later at your convenience.

To enable workspaces, use the workspace_on_error and workspaces arguments of tar_option_set(). These arguments set the conditions under which workspace files are saved. For example, tar_option_set(workspace_on_error = TRUE, workspaces = c("x", "y")) tells tar_make() and friends to save a workspace for a target named x, a target named y, and every target that throws an error. Example in a pipeline:

# _targets.R file:
options(tidyverse.quiet = TRUE)
library(targets)
options(crayon.enabled = FALSE)
tar_option_set(workspace_on_error = TRUE, packages = "tidyverse")
f <- function(arg, value, ...) {
  stopifnot(arg < 4) # Fail for any input of 4 or more.
  value # Return value so that successful calls produce output.
}
list(
  tar_target(x, seq_len(4)),
  tar_target(
    y,
    f(arg = x, value = "succeeded", a = 1, b = 2, key = "my_api_key"),
    pattern = map(x) # The branching chapter describes patterns.
  )
)

# R console:
tar_make()
#> ● run target x
#> ● run branch y_29239c8a
#> ● run branch y_7cc32924
#> ● run branch y_bd602d50
#> ● run branch y_05f206d7
#> x error branch y_05f206d7
#> ● save workspace y_05f206d7
#> Error : arg < 4 is not TRUE .
#> Error: callr subprocess failed: arg < 4 is not TRUE .

One of the y_******* targets errored out.

library(dplyr) # Provides %>% and pull().
failed <- tar_meta(fields = error) %>%
  na.omit() %>%
  pull(name)

print(failed)
#> [1] "y_05f206d7"

tar_workspace() reads the special metadata in the workspace file and then loads the target’s dependencies from various locations in _targets/objects and/or the cloud. It also sets the random number generator seed to the seed of the target, loads the required packages, and runs the target script file (default: _targets.R) to load other global object dependencies such as functions.

tar_workspace(y_05f206d7)

You now have the dependencies of y_05f206d7 in memory, which allows you to try out any failed function calls in your local R session (see footnotes 2 and 3).

print(x)
#> [1] 4
f(arg = 0, value = "my_value", a = 1, b = 2, key = "my_api_key")
#> [1] "my_value"
f(arg = x, value = "my_value", a = 1, b = 2, key = "my_api_key")
#> Error in f(arg = x, value = "my_value", a = 1, b = 2, key = "my_api_key") :
#>   arg < 4 is not TRUE

Keep in mind that although the dependencies of y_05f206d7 are in memory, the arguments of f() are not.

arg
#> Error: object 'arg' not found
value
#> Error: object 'value' not found
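
The distinction between restored dependencies and function arguments can be mimicked in base R. The sketch below is a conceptual stand-in, not the implementation of tar_workspace(). The workspace list, its seed value, and restore_workspace() are all hypothetical, and a real workspace file stores references to the dependencies rather than the values themselves.

```r
# Hypothetical stand-in for a workspace: a seed plus dependency values.
workspace <- list(seed = 42L, dependencies = list(x = 4L))

# Hypothetical restore step: load the dependencies into an environment
# and reset the random number generator to the recorded seed.
restore_workspace <- function(workspace, envir) {
  list2env(workspace$dependencies, envir = envir)
  set.seed(workspace$seed)
  invisible(envir)
}

env <- new.env()
restore_workspace(workspace, env)
first_draw <- runif(1)

restore_workspace(workspace, env)
second_draw <- runif(1)

# Same seed, same random draw after each restore.
print(identical(first_draw, second_draw))

# The dependency x is restored, but arg exists only inside a call to f().
print(exists("x", envir = env, inherits = FALSE))
print(exists("arg", envir = env, inherits = FALSE))
```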

The workspace also has a useful traceback, and you can retrieve it with tar_traceback(). The last couple lines of the traceback are unavoidably cryptic, but they do sometimes contain useful information.

tar_traceback(y_05f206d7, characters = 77)
#> [1] "f(arg = x, value = \"succeeded\", a = 1, b = 2, key = \"my_api_key\")"           
#> [2] "stopifnot(arg < 4)"             
#> [3] "stop(simpleError(msg, call = if (p <- sys.parent(1)) sys.call(p)))"              
#> [4] "(function (condition) \n{\n    state$error <- build_message(condition)\n    stat"

4.5 Tradeoffs

For small to medium-sized workloads, the environment browser and the debug option are usually the best choices. These techniques immediately direct control to prewritten function calls and get you as close to the error as possible. However, this may not always be feasible in large distributed workloads, e.g. tar_make_clustermq(), where most of your targets are not even running on the same computer as your main R process. For those complicated situations where it is not possible to access the R interpreter, workspaces are ideal because they store a persistent reproducible runtime state that you can recover locally.


  1. In the case of dynamic branching, names does not accept individual branch names, but you can still supply the name of the overarching dynamic target.

  2. In addition, the current random number generator seed (.Random.seed) is also the value y_05f206d7 started with.

  3. When you are finished debugging, you can remove all workspace files with tar_destroy(destroy = "workspaces").

Copyright Eli Lilly and Company