The targets R Package User Manual
Chapter 1 Introduction
The targets package is a Make-like pipeline toolkit for statistics and data science in R. With targets, you can maintain a reproducible workflow without repeating yourself. targets learns how your pipeline fits together, skips costly runtime for tasks that are already up to date, runs only the necessary computation, supports implicit parallel computing, abstracts files as R objects, and shows tangible evidence that the results match the underlying code and data.
Data analysis can be slow. A round of scientific computation can take several minutes, hours, or even days to complete. After it finishes, if you update your code or data, your hard-earned results may no longer be valid. Unchecked, this invalidation creates a chronic Sisyphean loop:
- Launch the code.
- Wait while it runs.
- Discover an issue.
- Restart from scratch.
1.2 Pipeline toolkits
Pipeline toolkits like GNU Make break the cycle. They watch the dependency graph of the whole workflow and skip steps, or “targets”, whose code, data, and upstream dependencies have not changed since the last run of the pipeline. When all targets are up to date, this is evidence that the results match the underlying code and data, which helps us trust the results and confirm the computation is reproducible.
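The skipping behavior described above can be sketched with targets itself. This is a minimal illustration, not a recommended project layout: the target names raw and total and the temporary directory are hypothetical, and the pipeline is written with tar_script() only to keep the example in one place.

```r
library(targets)

# Work in a throwaway directory so the sketch leaves the current project alone.
pipeline_dir <- tempfile("targets_demo_")
dir.create(pipeline_dir)
old_dir <- setwd(pipeline_dir)

# Write a hypothetical two-target pipeline to _targets.R.
tar_script({
  list(
    tar_target(raw, c(1, 2, 3, 4)),
    tar_target(total, sum(raw))
  )
}, ask = FALSE)

# First run: neither target has been built yet, so both are computed.
tar_make()

# Nothing has changed since that run, so no target should be listed here.
outdated <- tar_outdated()

# Second run: both targets are up to date, so targets skips them.
tar_make()

# Stored results are read back as ordinary R objects.
total_value <- tar_read(total)

setwd(old_dir)
```

Editing the command of raw in _targets.R and calling tar_make() again would rebuild raw, and total as well if the value of raw changed.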
Unlike most pipeline toolkits, which are language agnostic or Python-focused, the targets package allows data scientists and researchers to work entirely within R. targets implicitly nudges users toward a clean, function-oriented programming style that fits the intent of the R language and helps practitioners maintain their data analysis projects.
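As a rough sketch of what a function-oriented pipeline can look like, the _targets.R file below keeps the analysis logic in small functions and uses targets only to connect them. The function names (drop_missing, fit_model, extract_estimates), the target names, and the choice of the built-in airquality dataset are assumptions made for illustration.

```r
# _targets.R (hypothetical example project)
library(targets)

# Hypothetical user-defined functions. In a real project these would
# typically live in separate scripts that _targets.R loads with source().
drop_missing <- function(data) {
  data[!is.na(data$Ozone), ]
}

fit_model <- function(data) {
  lm(Ozone ~ Temp, data = data)
}

extract_estimates <- function(model) {
  coef(model)
}

# The pipeline is a list of targets. Each command is a function call, and
# targets works out the dependency graph from the names each command mentions.
list(
  tar_target(data_raw, datasets::airquality),
  tar_target(data_clean, drop_missing(data_raw)),
  tar_target(fit, fit_model(data_clean)),
  tar_target(estimates, extract_estimates(fit))
)
```

With this file at the project root, tar_make() runs the pipeline and tar_read(estimates) returns the stored coefficients. Because each step is a plain function, it can also be tested and debugged outside the pipeline.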
1.4 About this manual
This manual is a step-by-step user guide to targets. It walks through basic usage, outlines general best practices, dives deep into advanced features like high-performance computing, and helps drake users transition to targets. See the documentation website for most other major resources, including installation instructions, links to example projects, and a reference page with all user-side functions.
1.5 What about drake?
drake is an older R-focused pipeline toolkit, and targets is drake’s long-term successor. There is a special chapter to explain why targets was created, what this means for drake’s future, advice for drake users transitioning to targets, and the main technical advantages of