4 The composition of target objects
Most of the package’s conceptual challenges and intricacies are expressed in the `target` class, and this decentralization helps `targets` effectively reason about entire pipelines. This chapter describes the classes that form the building blocks of a target.
4.1 Overall structure
To maximize performance, classes with many instances per workflow are simple environments. Most of these objects lack explicit S3 class attributes, but all of them have formal constructors, helpers, and validators.
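As a rough illustration of the constructor, helper, and validator convention on a bare environment, consider the sketch below. All function names (`settings_new()`, `settings_init()`, `settings_validate()`, `rename()`) and fields are hypothetical stand-ins and are not taken from the package's source.

```r
# Hypothetical names throughout; this is not the package's internal code.

# Constructor: builds the object as a bare environment with no class attribute.
settings_new <- function(name = NULL, format = NULL) {
  out <- new.env(parent = emptyenv(), hash = FALSE)
  out$name <- name
  out$format <- format
  out
}

# Helper: the friendlier entry point that fills in defaults.
settings_init <- function(name, format = "rds") {
  settings_new(name = name, format = format)
}

# Validator: checks fields explicitly, since there is no class to dispatch on.
settings_validate <- function(settings) {
  stopifnot(
    is.environment(settings),
    is.character(settings$name), length(settings$name) == 1L,
    is.character(settings$format), length(settings$format) == 1L
  )
  invisible(settings)
}

x <- settings_init("data")
settings_validate(x)

# Environments have reference semantics, so this update happens in place
# and the caller sees it without any copying.
rename <- function(settings, name) {
  settings$name <- name
  invisible(NULL)
}
rename(x, "data2")
x$name
```

The in-place update at the end is one plausible reason environments suit objects that exist in large numbers: modifying a field does not copy the object.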
The following classes define specialized objects for the fields of targets.
- Command
- Settings
- Value
- Metrics
- Store
- File
- Subpipeline
- Junction
- Cue
Some types of targets need only some of these objects as fields.
Field | Builder | Stem | Branch | Bud | Pattern |
---|---|---|---|---|---|
Command | ✓ | ✓ | ✓ | ✓ | ✓ |
Settings | ✓ | ✓ | ✓ | ✓ | ✓ |
Value | ✓ | ✓ | ✓ | ✓ | ✓ |
Metrics | ✓ | ✓ | ✓ | | |
Store | ✓ | ✓ | ✓ | ✓ | |
File | ✓ | ✓ | ✓ | | |
Subpipeline | ✓ | ✓ | ✓ | | |
Junction | | ✓ | | | ✓ |
Cue | ✓ | ✓ | ✓ | ✓ | |
Patternview | | | | | ✓ |
The class inheritance hierarchy of targets is below, and the orchestration chapter explains why the package is designed this way.
- Target
  - Bud
  - Builder
    - Stem
    - Branch
  - Pattern
4.2 Classes
4.2.1 Command class
A `command` object is an abstraction around an R code chunk. It contains an R expression, the names of the packages and object dependencies that the expression needs in order to run, the random seed to run it with, and a string and hash of the expression. The hash is used to help determine if the target is already up to date.
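A minimal sketch of such an object follows, using base R only. The names `hash_string()`, `command_init()`, and `command_run()` are hypothetical, `all.vars()` is a simplified stand-in for real dependency detection, and the MD5-via-temporary-file hash is only an assumption made to keep the example self-contained.

```r
# Stand-in hash: MD5 of the text, computed through a temporary file.
hash_string <- function(x) {
  path <- tempfile()
  on.exit(unlink(path))
  writeLines(x, path)
  unname(tools::md5sum(path))
}

# Hypothetical constructor that collects the fields described above.
command_init <- function(expr, packages = character(0), seed = 0L) {
  string <- paste(deparse(expr), collapse = "\n")
  list(
    expr = expr,
    packages = packages,
    deps = all.vars(expr),     # symbols the expression refers to
    seed = seed,
    string = string,
    hash = hash_string(string)
  )
}

# Run the expression reproducibly in a given environment.
command_run <- function(command, envir) {
  set.seed(command$seed)
  eval(command$expr, envir = envir)
}

envir <- new.env()
envir$raw_data <- c(1, 2, 3)
command <- command_init(quote(mean(raw_data) + 1))
result <- command_run(command, envir)

# The same expression yields the same hash; an edited expression does not.
same <- command_init(quote(mean(raw_data) + 1))
changed <- command_init(quote(mean(raw_data) + 2))
identical(command$hash, same$hash)
identical(command$hash, changed$hash)
```

Comparing the stored hash with the hash of the current expression is what lets a change in the code invalidate the target.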
4.2.2 Settings class
A `settings` object keeps track of the user-defined configuration settings of a specific target, such as the target name, storage format, failure mode, memory management behavior, and branching pattern specification (if applicable).
4.2.3 Value class
The `value` class is a layer around a target’s return value. Having a special `value` object allows us to easily distinguish between two situations:
1. The target did not run or load data from storage yet.
2. The target did run, but its expression returned `NULL`.
Without a special `value` class, both (1) and (2) would result in `NULL` values. But for (1), we have an empty `value` object instead of `NULL`.
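The distinction can be sketched as follows. The functions `value_new()`, `value_set()`, and `value_exists()` and the field name `object` are assumptions made for illustration; the package may represent the empty state differently.

```r
# An empty value object: the target has not run or loaded data yet.
value_new <- function() {
  new.env(parent = emptyenv())
}

# Record a return value, which may legitimately be NULL.
value_set <- function(value, object) {
  assign("object", object, envir = value)
  invisible(value)
}

# TRUE only if a return value has been recorded, even a NULL one.
value_exists <- function(value) {
  exists("object", envir = value, inherits = FALSE)
}

not_run <- value_new()
ran_null <- value_set(value_new(), NULL)

value_exists(not_run)
value_exists(ran_null)

# Plain field access cannot tell the two cases apart: both give NULL.
is.null(not_run$object)
is.null(ran_null$object)
```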
In addition, the `value` class has sub-classes for different data iteration/aggregation methods. Users can choose either list-like aggregation and slicing or `vctrs`-powered aggregation and slicing. This functionality comes in handy for branching.
4.3 Metrics class
A `metrics` object stores metadata about a particular build of a target, including runtime, as well as warnings, error messages, and tracebacks if applicable. Initially, the `metrics` object is created as part of a `build` object, which is returned by a `command` object when it is run. Very soon after, the metrics and return value are separated out from the `build` object and placed directly in the `target` object.
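The flow from build to target might look roughly like the sketch below. `build_run()` and the field names are hypothetical, and traceback capture is omitted for brevity.

```r
# Run an expression and return a "build": the return value plus its metrics.
build_run <- function(expr, envir = parent.frame()) {
  warnings <- character(0)
  error <- NULL
  start <- proc.time()[["elapsed"]]
  object <- tryCatch(
    withCallingHandlers(
      eval(expr, envir = envir),
      warning = function(w) {
        # Record the warning and keep running.
        warnings <<- c(warnings, conditionMessage(w))
        invokeRestart("muffleWarning")
      }
    ),
    error = function(e) {
      # Record the error and return NULL in place of a value.
      error <<- conditionMessage(e)
      NULL
    }
  )
  seconds <- proc.time()[["elapsed"]] - start
  list(
    object = object,
    metrics = list(seconds = seconds, warnings = warnings, error = error)
  )
}

build <- build_run(quote({
  warning("check the input")
  1 + 1
}))

# Soon after the run, the pieces of the build move into the target itself.
target <- new.env()
target$value <- build$object
target$metrics <- build$metrics

failed <- build_run(quote(stop("boom")))
failed$metrics$error
```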
4.3.1 Store class
A `store` object describes how a `target` stores and queries its return value in file system storage. It contains a `file` object, as well as methods for managing the file, such as reading, writing, and decisions that involve hashes. The user-selected format of the target in `settings` determines the sub-class of the `store`.
4.3.2 File class
A `file` object is an abstraction of a collection of files and directories. It contains the paths, as well as the hash, maximum time stamp, and total storage size of the aggregate. The latter two metrics help decide whether to recompute a computationally expensive hash or trust that the hash is already up to date.
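One way to picture the trade-off is the sketch below: the cheap metadata (time stamp and size) is checked first, and the expensive hash is recomputed only if the metadata changed. The functions `file_info()`, `file_hash()`, `file_init()`, and `file_update()` are hypothetical, and the exact conditions the package uses to trust an existing hash are not stated here, so this rule is an assumption.

```r
# Cheap metadata for a set of paths (directories are expanded to their files).
file_info <- function(paths) {
  files <- unlist(lapply(paths, function(path) {
    if (dir.exists(path)) {
      list.files(path, recursive = TRUE, full.names = TRUE)
    } else {
      path
    }
  }))
  info <- file.info(files)
  list(files = files, time = max(info$mtime), bytes = sum(info$size))
}

# Expensive step: a stand-in aggregate hash built from per-file MD5 sums.
file_hash <- function(files) {
  paste(unname(tools::md5sum(sort(files))), collapse = "")
}

file_init <- function(paths) {
  out <- new.env(parent = emptyenv())
  out$paths <- paths
  out$hash <- NA_character_
  out$time <- NA
  out$bytes <- NA_real_
  out
}

# Rehash only if the time stamp or size differs from the stored metadata.
# Returns TRUE if the hash was recomputed.
file_update <- function(file) {
  current <- file_info(file$paths)
  unchanged <- identical(current$time, file$time) &&
    identical(current$bytes, file$bytes)
  if (!unchanged) {
    file$hash <- file_hash(current$files)
    file$time <- current$time
    file$bytes <- current$bytes
  }
  invisible(!unchanged)
}

path <- tempfile()
writeLines("a", path)
file <- file_init(path)
first <- file_update(file)   # no stored metadata yet, so the hash is computed
second <- file_update(file)  # metadata unchanged, so the hash is trusted
writeLines("abc", path)
third <- file_update(file)   # size changed, so the hash is recomputed
```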
4.3.3 Subpipeline
A `subpipeline` is not actually a class of its own; it is just a `pipeline` object with only the direct dependencies of a particular target and no `value` objects in those dependencies. Its only purpose is to efficiently assist with the mechanics of worker-side dependency retrieval.
4.3.4 Junction class
A `junction` serves as a branching specification for patterns and a budding specification for stems. It contains the name of the parent pattern or stem, the names of the children (buds or branches), and the names of the dependencies of each bud or branch. The junction is the explicit representation of the user-defined `pattern` argument of `tar_target()` combined with the hashes of the available dependencies.
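The data a junction carries can be pictured with the toy structure below. `junction_init()`, `junction_deps()`, and the field names are hypothetical, and children are named by index here purely for readability, which is a simplification of naming based on dependency hashes.

```r
# Hypothetical constructor: one row of `deps` per child (bud or branch).
junction_init <- function(name, deps) {
  splits <- sprintf("%s_%d", name, seq_len(nrow(deps)))
  list(name = name, splits = splits, deps = deps)
}

# Dependencies of one child, looked up by the child's name.
junction_deps <- function(junction, split) {
  row <- match(split, junction$splits)
  unname(unlist(junction$deps[row, ]))
}

# A pattern that maps over two upstream targets with two branches each.
deps <- data.frame(
  data = c("data_1", "data_2"),
  model = c("model_1", "model_2"),
  stringsAsFactors = FALSE
)
junction <- junction_init("fit", deps)

junction$splits
junction_deps(junction, "fit_2")
```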
4.3.5 Cue class
A `cue` object is a collection of rules for deciding whether a target is up to date. `targets` allows the user to activate or suppress some of these rules to change the conditions under which targets rerun.
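A toy version of rule-based invalidation is sketched below. The functions `cue_init()` and `cue_check()`, the three rule names, and the shape of the `current` and `previous` records are illustrative assumptions, not the package's actual rule set.

```r
# Each rule can be switched on or off by the user.
cue_init <- function(command = TRUE, depend = TRUE, format = TRUE) {
  list(command = command, depend = depend, format = format)
}

# Compare the current record of a target with the record from its last build.
# Returns the names of the active rules that fire; none means "up to date".
cue_check <- function(cue, current, previous) {
  rules <- c("command", "depend", "format")
  active <- rules[vapply(rules, function(rule) isTRUE(cue[[rule]]), logical(1))]
  changed <- vapply(
    active,
    function(rule) !identical(current[[rule]], previous[[rule]]),
    logical(1)
  )
  active[changed]
}

previous <- list(command = "hash_a", depend = "hash_x", format = "rds")
current <- list(command = "hash_b", depend = "hash_x", format = "rds")

# With every rule active, the changed command makes the target outdated.
fired_default <- cue_check(cue_init(), current, previous)

# Suppressing the command rule leaves the target up to date.
fired_suppressed <- cue_check(cue_init(command = FALSE), current, previous)
```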
4.3.6 Patternview class
A `patternview` object keeps track of the overall status of all of a pattern’s branches as a group. It makes it more efficient to keep track of the progress, runtime, and storage size of an entire pattern.