Chapter 14 Cloud integration

targets has built-in cloud capabilities to help scale pipelines up and out. Cloud storage solutions are already available, and cloud computing solutions are in the works.

Before getting started, please familiarize yourself with the pricing model and cost management and monitoring tools of Amazon Web Services. Everything has a cost, from virtual instances to web API calls. Free tier accounts give you a modest monthly budget for some services for the first year, but it is easy to exceed the limits. Developers of reusable software should consider applying for promotional credit using this application form.

14.1 Compute

Right now, targets does not have built-in cloud-based distributed computing support. However, future development plans include seamless integration with AWS Batch. As a temporary workaround, it is possible to deploy a burstable SLURM cluster using AWS ParallelCluster and leverage targets' existing support for traditional schedulers.
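For instance, targets can dispatch workers to a SLURM cluster through its clustermq integration. The snippet below is a hypothetical sketch, not part of the original text: it assumes a SLURM cluster already provisioned (e.g. with AWS ParallelCluster), a clustermq submission template file named `slurm.tmpl` in the working directory, and a pipeline defined in `_targets.R`.

```r
# Hypothetical sketch: run an existing pipeline on a SLURM cluster.
# Assumes _targets.R exists and slurm.tmpl is a valid clustermq
# submission template for your cluster.
options(
  clustermq.scheduler = "slurm",     # use SLURM as the scheduler
  clustermq.template  = "slurm.tmpl" # job submission template file
)
targets::tar_make_clustermq(workers = 2) # two persistent SLURM workers
```

The worker count is illustrative; tune it to your cluster's capacity and your AWS budget.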

14.2 Storage

targets supports cloud storage on a target-by-target basis using Amazon Simple Storage Service, or S3. After a target completes, the return value is uploaded to a user-defined S3 bucket. Follow these steps to get started.

14.2.1 A disclaimer

Please be aware of your resource and monetary constraints when using AWS. The AWS-backed storage capabilities in targets send individual data files to S3 buckets on a target-by-target basis and periodically query the HTTP header data. For a large number of S3-backed targets, the pipeline may slow down, and it may incur monetary costs due to a large number of API requests, both for uploading new targets and for checking whether existing targets are up to date. It may be best to select a small number of important or large targets to send to S3 buckets, while putting the majority of small targets under source code version control, e.g. Git/GitHub. To put the whole _targets/ data store in a single S3 bucket, including metadata, it is best to run the pipeline locally first and then invoke aws.s3::s3sync() or similar.
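A hedged sketch of that last step, assuming a finished local run and a placeholder bucket name `my-bucket` (substitute your own):

```r
# Hypothetical sketch: after running the pipeline locally with
# tar_make(), mirror the entire _targets/ data store (objects and
# metadata) to a single S3 bucket in one batch of uploads.
library(aws.s3)
s3sync(
  path = "_targets",    # local data store to mirror
  bucket = "my-bucket", # placeholder destination bucket name
  direction = "upload"  # push local files up to S3
)
```

Because this is one bulk sync rather than per-target uploads and metadata queries, it avoids the request overhead described above.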

14.2.2 Communicate with your collaborators

An S3 bucket can be a shared file space. If you have access to someone else’s bucket, running their pipeline could accidentally overwrite their cloud data. It is best if you and your colleagues decide in advance who will write to the bucket at any given time.

14.2.3 Get started with the Amazon S3 web console

If you do not already have an Amazon Web Services account, sign up for the free tier on the AWS website. Then, follow the step-by-step instructions in the official AWS documentation to practice using Amazon S3 through the web console.

14.2.4 Configure your local machine

targets uses the aws.s3 package behind the scenes. It is not a strict dependency of targets, so you will need to install it yourself.
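For example, a standard CRAN installation (this command is an assumption of mine, not shown in the original text):

```r
# Install aws.s3 from CRAN so targets can communicate with S3.
install.packages("aws.s3")
```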


Next, aws.s3 needs an access key ID, a secret access key, and a default region. Follow these steps to generate the keys, and choose a region from this table of endpoints. Then, open the .Renviron file in your home directory with usethis::edit_r_environ() and store this information in special environment variables. Here is an example .Renviron file.

# Example .Renviron file
AWS_ACCESS_KEY_ID="my_access_key_id"
AWS_SECRET_ACCESS_KEY="my_secret_access_key"
AWS_DEFAULT_REGION="us-east-1"

Restart your R session so the changes take effect. Your keys are sensitive personal information. You can print them in your private console to verify correctness, but otherwise please avoid saving them to any persistent documents other than .Renviron.

Sys.getenv("AWS_DEFAULT_REGION")

#> [1] "us-east-1"

14.2.5 Create S3 buckets

Now, you are ready to create one or more S3 buckets for your targets pipeline. Each pipeline should have its own set of buckets. Create one through the web console or with aws.s3::put_bucket().

aws.s3::put_bucket("my-test-bucket-25edb4956460647d")

#> [1] TRUE

Sign in to the Amazon S3 web console to verify that the bucket exists.
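Alternatively, you can check from R. This hypothetical snippet assumes the bucket name created above and valid credentials in .Renviron:

```r
# Confirm the new bucket is visible from R, using the AWS
# credentials configured earlier in .Renviron.
aws.s3::bucket_exists("my-test-bucket-25edb4956460647d")
```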

14.2.6 Configure the pipeline

To connect your pipeline with S3,

  1. Supply your bucket name to resources in tar_option_set(). To use different buckets for different targets, set resources directly in tar_target(). In older versions of targets, resources was a named list. In newer versions, please create the resources argument with the helpers tar_resources() and tar_resources_aws().
  2. Supply AWS-powered storage formats to tar_option_set() and/or tar_target(). See the tar_target() help file for the full list of formats.

Your target script file will look something like this.

# Example _targets.R
library(targets)
tar_option_set(
  # With older versions of targets:
  # resources = list(bucket = "my-test-bucket-25edb4956460647d"),
  # With newer versions:
  resources = tar_resources(
    aws = tar_resources_aws(bucket = "my-test-bucket-25edb4956460647d")
  )
)
write_mean <- function(data) {
  tmp <- tempfile()
  writeLines(as.character(mean(data)), tmp)
  tmp # Return the file path because the target uses format = "aws_file".
}
list(
  tar_target(data, rnorm(5), format = "aws_qs"),
  tar_target(mean_file, write_mean(data), format = "aws_file")
)

14.2.7 Run the pipeline

When you run the pipeline above with tar_make(), your local R session computes rnorm(5), saves it to a temporary qs file on disk, and then uploads it to a file called _targets/objects/data on your S3 bucket. Likewise for mean_file, but because the format is "aws_file", you are responsible for supplying the path to the file that gets uploaded to _targets/objects/mean_file.

#> ● run target data
#> ● run target mean_file
#> ● end pipeline

And of course, your targets stay up to date if you make no changes.

#> ✓ skip target data
#> ✓ skip target mean_file
#> ✓ skip pipeline

14.2.8 Manage the data

Log into the Amazon S3 web console. You should see objects _targets/objects/data and _targets/objects/mean_file in your bucket. To download this data locally, use tar_read() and tar_load() as before. These functions download the data from the bucket and load it into R.

tar_read(data)

#> [1] -0.74654607 -0.59593497 -1.57229983  0.40915323  0.02579023

The "aws_file" format is different from the other AWS-powered formats. tar_read() and tar_load() download the object to a local path (where the target saved it locally before it was uploaded) and return the path so you can process it yourself.18

path <- tar_read(mean_file)
path

#> [1] "_targets/scratch/mean_fileff086e70876d"

readLines(path)

#> [1] "-0.495967480886693"

When you are done with these temporary files and the pipeline is no longer running, you can safely remove everything in _targets/scratch/.

unlink("_targets/scratch/", recursive = TRUE)

Lastly, if you want to erase the whole project or start over from scratch, consider removing the S3 bucket to avoid incurring storage fees. The easiest way to do this is through the S3 console. You can alternatively call aws.s3::delete_bucket(), but you have to make sure the bucket is empty first.
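A hedged sketch of the programmatic route, assuming the example bucket name from earlier. delete_object() and delete_bucket() are aws.s3 functions, and the bucket must be empty before deletion succeeds:

```r
# Hypothetical sketch: empty the bucket, then delete it.
# Requires the AWS credentials configured earlier.
library(aws.s3)
bucket <- "my-test-bucket-25edb4956460647d"
objects <- get_bucket(bucket)                  # list objects in the bucket
for (object in objects) {
  delete_object(object$Key, bucket = bucket)   # remove each object
}
delete_bucket(bucket)                          # now the empty bucket can go
```

Double-check the bucket name before running anything like this: deletion is irreversible, and emptying a collaborator's bucket cannot be undone from the console.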


  1. Non-“file” AWS formats also download files, but they are temporary and immediately discarded after the data is read into memory.↩︎

Copyright Eli Lilly and Company