Chapter 3 Debugging
Under the default settings, conventional debugging tools such as traceback(), debug(), browser(), and other popular debugging techniques may not provide useful information on why a given target is failing. Not even .Last.error or .Last.error.trace from callr are automatically informative. However, targets provides its own extensive support for debugging and troubleshooting errors. This chapter demonstrates the techniques.
3.1 Error messages
The metadata in _targets/meta/meta contains error messages and warning messages from when each target last ran. tar_meta() can retrieve these clues.
tar_meta(fields = error, complete_only = TRUE)
tar_meta(fields = warnings, complete_only = TRUE)
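To see what this metadata looks like after a failure, here is a minimal sketch. It assumes the targets package is installed, and the two-target pipeline (ok and bad) is a hypothetical stand-in, not part of this chapter's examples. It uses tar_dir() and tar_script() so that everything happens in a temporary directory.

```r
library(targets)

# tar_dir() runs the code in a temporary directory so this sketch
# does not touch an existing project.
tar_dir({
  # Hypothetical two-target pipeline: `ok` succeeds, `bad` always errors.
  tar_script({
    list(
      tar_target(ok, 1),
      tar_target(bad, stop("something went wrong"))
    )
  })
  # The pipeline fails on `bad`; try() keeps the session going.
  try(tar_make(), silent = TRUE)
  # Keep only the rows that have a recorded error message.
  meta <- tar_meta(fields = error, complete_only = TRUE)
  print(meta)
})
```

The resulting data frame should contain one row per errored target, with the target name next to its error message.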
3.2 Environment browser
By default, tar_make() runs in a reproducible background process, so debug() and browser() do not interrupt the pipeline. To use the environment browser in your main session, restart R and supply callr_function = NULL to tar_make(). callr_function = NULL risks invalidating your hard-earned results, so only use it right after restarting R, and only for debugging.
# Restart R first...
debug(custom_function_called_from_a_target)
tar_make(names = target_to_debug, callr_function = NULL)
#> debugging in: custom_function_called_from_a_target()
#> Browse[1]>
3.3 The debug option
targets has a more convenient way to launch the environment browser from inside a target:
1. In _targets.R, write a call to tar_option_set() with debug equal to the target name. Consider also setting cue equal to tar_cue(mode = "never") so tar_make() reaches the target you want to debug more quickly.
2. Launch a fresh interactive R session with the _targets.R script in your working directory.
3. Run targets::tar_make() (or targets::tar_make_clustermq(), or targets::tar_make_future()) with callr_function = NULL.
4. When targets reaches the target you selected to debug, your R session will start an interactive debugger, and you should see Browse[1]> in your console. Run targets::tar_name() to verify that you are debugging the correct target.
5. Interactively run any R code that helps you troubleshoot the problem. For example, if the target invokes a function f(), enter debug(f) and then c to immediately enter the function's calling environment where all its arguments are defined.
To try it out yourself, write the following _targets.R file.
# _targets.R
library(targets)
tar_option_set(debug = "b")
f <- function(x, another_arg = 123) x + another_arg
list(
  tar_target(a, 1),
  tar_target(b, f(a))
)
Then, call tar_make(callr_function = NULL) to drop into a debugger at the command of target b.
# R console
tar_make(callr_function = NULL)
#> ● run target a
#> ● run target b
#> Called from: eval(expr, envir)
Browse[1]>
When the debugger launches, run targets::tar_name() to confirm you are running the correct target.
Browse[1]> targets::tar_name()
#> [1] "b"
In the debugger, the dependency targets of b are available in the current environment, and the global objects and functions are available in the parent environment.
Browse[1]> ls()
#> [1] "a"
Browse[1]> a
#> [1] 1
Browse[1]> ls(parent.env(environment()))
#> [1] "f"
Browse[1]> f(1)
#> [1] 124
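The two-level lookup seen in this session can be imitated in plain R. The sketch below is only an illustration of nested environments, not how targets builds them internally. The names globals and target_env are hypothetical.

```r
# Hypothetical stand-in for the environment holding global objects and functions.
globals <- new.env(parent = baseenv())
globals$f <- function(x, another_arg = 123) x + another_arg

# Hypothetical stand-in for the target's environment, which holds
# only the dependency `a` and inherits from `globals`.
target_env <- new.env(parent = globals)
target_env$a <- 1

ls(target_env)              # the dependency
ls(parent.env(target_env))  # the global function

# Evaluating the command finds `a` locally and `f` in the parent.
result <- eval(quote(f(a)), envir = target_env)
print(result)
```

Because f lives in the parent environment rather than alongside a, ls() in the debugger shows only the target's dependencies, while the command f(a) still evaluates.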
Enter debug(f) to debug the function f(), and press c to enter the function's calling environment where another_arg is defined.
Browse[1]> debug(f)
Browse[1]> c
#> debugging in: f(a)
#> debug at _targets.R#3: x + another_arg
Browse[2]> ls()
#> [1] "another_arg" "x"
Browse[2]> another_arg
#> [1] 123
3.4 Workspaces
Workspaces are a persistent alternative to the environment browser. A workspace is a special lightweight reference file that lists the elements of a target's runtime environment. Using tar_workspace(), you can recover a target's workspace and locally debug it even if the pipeline is not running. If you tell targets to record workspaces in advance, you can preempt errors and debug later at your convenience. Here is how:
1. Set error = "workspace" in tar_option_set() or tar_target(). Then, tar_make() and friends will save a workspace file for every target that errors out.
2. In the workspaces argument of tar_option_set(), specify the targets for which you want to save workspaces. Then, run tar_make() or similar. A workspace file will be saved for each listed target that exists, regardless of whether the target runs or gets skipped in the pipeline.
Here is an example of (1).
# _targets.R file:
options(tidyverse.quiet = TRUE)
library(targets)
library(tidyverse)
options(crayon.enabled = FALSE)
tar_option_set(error = "workspace")
# `...` absorbs the extra arguments (a, b, key) passed in the target below.
f <- function(arg, value, ...) {
  stopifnot(arg < 4)
  value
}
list(
  tar_target(x, seq_len(4)),
  tar_target(
    y,
    f(arg = x, value = "succeeded", a = 1, b = 2, key = "my_api_key"),
    pattern = map(x) # The branching chapter describes patterns.
  )
)
# R console:
tar_make()
#> ● run target x
#> ● run branch y_29239c8a
#> ● run branch y_7cc32924
#> ● run branch y_bd602d50
#> ● run branch y_05f206d7
#> x error branch y_05f206d7
#> ● save workspace y_05f206d7
#> Error : x < 4 is not TRUE .
#> Error: callr subprocess failed: x < 4 is not TRUE .
One of the y_******* targets errored out.
failed <- tar_meta(fields = error) %>%
  na.omit() %>%
  pull(name)
print(failed)
#> [1] "y_05f206d7"
tar_workspace() reads the special metadata in the workspace file and then loads the target's dependencies from various locations in _targets/objects and/or the cloud. It also sets the random number generator seed to the seed of the target, loads the required packages, and runs _targets.R to load other global object dependencies such as functions.
tar_workspace(y_05f206d7)
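Restoring the seed matters when the failing command involves random numbers, because the same seed reproduces the same draws. The base R sketch below illustrates only that principle. The seed value is a hypothetical placeholder, not one taken from a real pipeline.

```r
# Hypothetical seed standing in for the one recorded for a target.
seed <- 2021L

# First run: draw random numbers after setting the seed.
set.seed(seed)
first_run <- rnorm(3)

# Later session: restoring the same seed reproduces the same draws.
set.seed(seed)
second_run <- rnorm(3)

print(identical(first_run, second_run))
```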
We now have the dependencies of y_05f206d7 in memory, which allows us to try out any failed function calls in the local R session.
print(x)
#> [1] 4
f(arg = 0, value = "my_value", a = 1, b = 2, key = "my_api_key")
#> [1] "my_value"
f(arg = x, value = "my_value", a = 1, b = 2, key = "my_api_key")
#> Error in f(x) : x < 4 is not TRUE
Keep in mind that although the dependencies of y_05f206d7 are in memory, the arguments of f() are not.
arg
#> Error: object 'arg' not found
value
#> Error: object 'value' not found
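The second approach, naming targets in the workspaces argument of tar_option_set(), can be sketched in a similar way. This is a minimal illustration, assuming the targets package is installed. The small pipeline with targets a and b is hypothetical, and tar_dir() keeps everything in a temporary directory.

```r
library(targets)

# Run in a temporary directory so no existing project is modified.
tar_dir({
  tar_script({
    # Request a workspace for target `b` even though it does not error.
    tar_option_set(workspaces = "b")
    f <- function(x, another_arg = 123) x + another_arg
    list(
      tar_target(a, 1),
      tar_target(b, f(a))
    )
  })
  tar_make()
  # List the names of the workspaces that were saved.
  saved <- tar_workspaces()
  print(saved)
})
```

In a real project, any name returned by tar_workspaces() can then be passed to tar_workspace() to load that target's dependencies.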
3.5 Tradeoffs
For small to medium-sized workloads, the environment browser and the debug option are usually the best choices. These techniques immediately direct control to prewritten function calls and get you as close to the error as possible. However, this may not always be feasible in large distributed workloads, e.g. tar_make_clustermq(), where most of your targets are not even running on the same computer as your main R process. For those complicated situations where it is not possible to access the R interpreter, workspaces are ideal because they store a persistent reproducible runtime state that you can recover locally.