6 Use vcr (& webmockr)

In this chapter we aim at adding HTTP testing infrastructure to exemplighratia using vcr (& webmockr).

Corresponding pull request to exemplighratia. Feel free to fork the repository to experiment yourself!

6.1 Setup

Before working on all this, we need to install vcr.

First, we need to run vcr::use_vcr() (in the exemplighratia directory) which has a few effects:

  • Adding vcr as a dependency to DESCRIPTION, under Suggests just like testthat.
  • Creating an example test file for us to look at. This is useful the first few times you setup vcr in another package, after a while you might even delete it without reading it.
  • Adding a .gitattributes file with the line tests/fixtures/**/* -diff which will hide the changes in your cassettes from the git diff. It makes your git diff easier to deal with. 3
  • Creating a setup file under tests/testthat/helper-vcr.R,
library("vcr")
invisible(vcr::vcr_configure(
  dir = vcr::vcr_test_path("fixtures")
))
vcr::check_cassette_names()

When testthat runs tests, files whose name start with “helper” are always run first. They are also loaded by devtools::load_all(), so the vcr setup is loaded when developing and testing interactively. See the table in the R-hub blog post “Helper code and files for your testthat tests”.

The helper file created by vcr

  • loads vcr,
  • indicates where mocked responses are saved ("../fixtures" which translates, from the root of the package, to tests/fixtures),
  • and checks that you are not using the same name twice for cassettes (mock files).

We have to tweak the vcr setup a bit for our needs.

  • We do not want our API token to appear in the mock responses, and we know it’s used in the Authorization header of the requests, so we use the filter_request_headers argument of vcr::vcr_configure(). For other secret filtering one can use filter_response_headers and filter_sensitive_data (for a regular expression purging, on the whole saved interactions).

  • We need to ensure that we set up a fake API key when there is no API token around. Why? Because if you remember well, the code of our function gh_organizations() checks for the presence of a token. With mock responses around, we don’t need a token but we still need to fool our own package in contexts where there is no token (e.g. in continuous integration checks for a fork of a GitHub repository).

Below is the updated setup file saved under tests/testthat/helper-vcr.R.

library("vcr")

vcr_dir <- vcr::vcr_test_path("fixtures")

if (!nzchar(Sys.getenv("GITHUB_PAT"))) {
  if (dir.exists(vcr_dir)) {
    # Fake API token to fool our package
    Sys.setenv("GITHUB_PAT" = "foobar")
  } else {
    # If there's no mock files nor API token, impossible to run tests
    stop("No API key nor cassettes, tests cannot be run.",
         call. = FALSE)
  }
}

invisible(vcr::vcr_configure(
  dir = vcr_dir,
  # Filter the request header where the token is sent, make sure you know
  # how authentication works in your case and read the Security chapter :-)
  filter_request_headers = list(Authorization = "My bearer token is safe")
))

So this was just setup, now on to adapting our tests!

6.2 Actual testing

The most important function will be vcr::use_cassette("cassette-informative-and-unique-name", {code-block}) which tells vcr to create a mock file to store all API responses for API calls occurring in the code block.

Let’s tweak the test for gh_api_status, it now becomes

test_that("gh_api_status() works", {
  vcr::use_cassette("gh_api_status", {
    status <- gh_api_status()
  })
  testthat::expect_type(status, "character")
})

We only had to wrap the code involving interactions with the API, status <- gh_api_status(), in vcr::use_cassette().

If we run this test (in RStudio clicking on “Run test”),

  • the first time, vcr creates a cassette (mock file) under tests/testthat/fixtures/gh_api_status.yml where it stores the API response. It contains all the information related to requests and responses, headers included.
http_interactions:
- request:
    method: get
    uri: https://kctbh9vrtdwd.statuspage.io/api/v2/components.json
    body:
      encoding: ''
      string: ''
    headers:
      Accept: application/json, text/xml, application/xml, */*
  response:
    status:
      status_code: 200
      category: Success
      reason: OK
      message: 'Success: (200) OK'
    headers:
      vary: Accept,Accept-Encoding,Fastly-SSL
      cache-control: max-age=0, private, must-revalidate
      x-cache: MISS
      content-type: application/json; charset=utf-8
      content-encoding: gzip
      strict-transport-security: max-age=259200
      date: Thu, 15 Oct 2020 11:59:23 GMT
      x-request-id: d9888435-3f04-4401-be5c-b9d1bfdfa015
      x-download-options: noopen
      x-xss-protection: 1; mode=block
      x-runtime: '0.037254'
      x-permitted-cross-domain-policies: none
      access-control-allow-origin: '*'
      accept-ranges: bytes
      x-content-type-options: nosniff
      etag: W/"gz[a479c9894f51b7db286dc31cd922e7bf]"
      x-statuspage-skip-logging: 'true'
      x-statuspage-version: fd137a4bb14c20ce721393e5b6540ea6eebff3a3
      referrer-policy: strict-origin-when-cross-origin
      age: '0'
    body:
      encoding: UTF-8
      file: no
      string: '{"page":{"id":"kctbh9vrtdwd","name":"GitHub","url":"https://www.githubstatus.com","time_zone":"Etc/UTC","updated_at":"2020-10-15T08:57:35.302Z"},"components":[{"id":"8l4ygp009s5s","name":"Git
        Operations","status":"operational","created_at":"2017-01-31T20:05:05.370Z","updated_at":"2020-09-24T02:32:00.916Z","position":1,"description":"Performance
        of git clones, pulls, pushes, and associated operations","showcase":false,"start_date":null,"group_id":null,"page_id":"kctbh9vrtdwd","group":false,"only_show_if_degraded":false},{"id":"brv1bkgrwx7q","name":"API
        Requests","status":"operational","created_at":"2017-01-31T20:01:46.621Z","updated_at":"2020-09-30T19:00:29.476Z","position":2,"description":"Requests
        for GitHub APIs","showcase":false,"start_date":null,"group_id":null,"page_id":"kctbh9vrtdwd","group":false,"only_show_if_degraded":false},{"id":"4230lsnqdsld","name":"Webhooks","status":"operational","created_at":"2019-11-13T18:00:24.256Z","updated_at":"2020-10-13T14:51:17.928Z","position":3,"description":"Real
        time HTTP callbacks of user-generated and system events","showcase":false,"start_date":null,"group_id":null,"page_id":"kctbh9vrtdwd","group":false,"only_show_if_degraded":false},{"id":"0l2p9nhqnxpd","name":"Visit
        www.githubstatus.com for more information","status":"operational","created_at":"2018-12-05T19:39:40.838Z","updated_at":"2020-04-02T21:56:21.954Z","position":4,"description":null,"showcase":false,"start_date":null,"group_id":null,"page_id":"kctbh9vrtdwd","group":false,"only_show_if_degraded":false},{"id":"kr09ddfgbfsf","name":"Issues","status":"operational","created_at":"2017-01-31T20:01:46.638Z","updated_at":"2020-10-10T00:02:16.199Z","position":5,"description":"Requests
        for Issues on GitHub.com","showcase":false,"start_date":null,"group_id":null,"page_id":"kctbh9vrtdwd","group":false,"only_show_if_degraded":false},{"id":"hhtssxt0f5v2","name":"Pull
        Requests","status":"operational","created_at":"2020-09-02T15:39:06.329Z","updated_at":"2020-10-10T00:02:49.033Z","position":6,"description":"Requests
        for Pull Requests on GitHub.com","showcase":false,"start_date":null,"group_id":null,"page_id":"kctbh9vrtdwd","group":false,"only_show_if_degraded":false},{"id":"br0l2tvcx85d","name":"GitHub
        Actions","status":"operational","created_at":"2019-11-13T18:02:19.432Z","updated_at":"2020-10-13T20:23:36.040Z","position":7,"description":"Workflows,
        Compute and Orchestration for GitHub Actions","showcase":false,"start_date":null,"group_id":null,"page_id":"kctbh9vrtdwd","group":false,"only_show_if_degraded":false},{"id":"st3j38cctv9l","name":"GitHub
        Packages","status":"operational","created_at":"2019-11-13T18:02:40.064Z","updated_at":"2020-09-08T15:50:32.845Z","position":8,"description":"API
        requests and webhook delivery for GitHub Packages","showcase":false,"start_date":null,"group_id":null,"page_id":"kctbh9vrtdwd","group":false,"only_show_if_degraded":false},{"id":"vg70hn9s2tyj","name":"GitHub
        Pages","status":"operational","created_at":"2017-01-31T20:04:33.923Z","updated_at":"2020-10-10T00:02:38.220Z","position":9,"description":"Frontend
        application and API servers for Pages builds","showcase":false,"start_date":null,"group_id":null,"page_id":"kctbh9vrtdwd","group":false,"only_show_if_degraded":false}]}'
  recorded_at: 2020-10-15 11:59:23 GMT
  recorded_with: vcr/0.5.4, webmockr/0.7.0
  • all the times after that, unless we delete the mock file, vcr simply uses the mock files instead of actually calling the API.

Let’s tweak our other test, of gh_organizations(). Here things get more exciting or complicated, as we also set out to adding a test of the error behavior. This inspired us to change error behavior a bit with a slightly more specific error message i.e. httr::stop_for_status(response) became httr::stop_for_status(response, task = "get data from the API, oops").

The test file tests/testthat/test-organizations.R is now:

test_that("gh_organizations works", {
  vcr::use_cassette("gh_organizations", {
    orgs <- gh_organizations()
  })
  testthat::expect_type(orgs, "character")
})

test_that("gh_organizations errors when the API doesn't behave", {
  webmockr::enable()
  stub <- webmockr::stub_request("get", "https://api.github.com/organizations?since=1")
  webmockr::to_return(stub, status = 502)
  expect_error(gh_organizations(), "oops")
  webmockr::disable()
})

The first test is similar to what we did for gh_api_status(). In the second test there is more to unpack.

  • We enable the use of webmockr at the beginning with webmockr::enable(). Why webmockr? Because it can help mock a failure scenario.
  • We explicitly write that a request to https://api.github.com/organizations?since=1 should return a status of 502.
  stub <- webmockr::stub_request("get", "https://api.github.com/organizations?since=1")
  webmockr::to_return(stub, status = 502)
  • We then test for the error message with expect_error(gh_organizations(), "oops").
  • We disable webmockr with webmockr::disable().

Instead of using webmockr for creating a fake API eror, we could have

  • recorded a normal cassette;
  • edited it to replace the status code.

Read pros and cons of this approach in the vcr vignette Why and how edit your vcr cassettes?, especially if you don’t find the webmockr approach enjoyable.

Without the HTTP testing infrastructure, testing for behavior of the package in case of API errors would be more difficult.

Regarding our secret API token, the first time we run the test file, vcr creates a cassette where we notice these lines

http_interactions:
- request:
    method: get
    uri: https://api.github.com/organizations?since=1
    body:
      encoding: ''
      string: ''
    headers:
      Accept: application/json, text/xml, application/xml, */*
      Content-Type: ''
      Authorization: My bearer token is safe

Our API token has been replaced with the string we indicated in vcr::vcr_configure(), My bearer token is safe.

6.3 Also testing for real interactions

What if the API responses change? Hopefully we’d notice that thanks to following API news. However, sometimes web APIs change without any notice. Therefore it is important to run tests against the real web service once in a while.

The vcr package provides various methods to turn vcr use on and off to allow real requests i.e. ignoring mock files. See ?vcr::lightswitch.

In the case of exemplighratia, we added a GitHub Actions workflow that will run on schedule once a week, for which one of the build has vcr turned off via the VCR_TURN_OFF environment variable. We chose to have one build with vcr turned on and otherwise the same configuration to make it easier to assess what broke in case of failure (if both builds fail, the web API is probably not the culprit). Compared to continuous integration builds where vcr is turned on, this one build needs to have access to a GITHUB_PAT secret environment variable. Furthermore, it is slower.

One could imagine other strategies:

  • Always having one continuous integration build with vcr turned off but skipping it in contexts where there isn’t any token (pull requests from forks for instance?);
  • Only running tests with vcr turned off locally once in a while.

6.4 Summary

  • We set up vcr usage in our package exemplighratia by running use_vcr() and tweaking the setup file to protect our secret API key and to fool our own package that needs an API token.
  • Inside test_that() blocks, we wrapped parts of the code into vcr::use_cassette() and ran the tests a first time to generate mock files that hold all information about the API interactions.
  • In one of the tests, we used webmockr to create an environment where only fake requests are allowed. We defined that the request that gh_organizations() makes should get a 502 status. We were therefore able to test for the error message gh_organizations() returns in such cases.

Now, how do we make sure this works?

  • Turn off wifi, run the tests again. It works! Turn on wifi again.
  • Open .Renviron (usethis::edit_r_environ()), edit “GITHUB_PAT” into “byeGITHUB_PAT”, re-start R, run the tests again. It works! Fix your “GITHUB_PAT” token in .Renviron.

So we now have tests that no longer rely on an internet connection nor on having API credentials.

We also added a continuous integration workflow for having a build using real interactions once every week, as it is important to regularly make sure the package still works against the latest API responses.

For the full list of changes applied to exemplighratia in this chapter, see the pull request diff on GitHub.

How do we get there with other packages? Let’s try httptest in the next chapter!

6.5 PS: Where to put use_cassette()

Where do we put the vcr::use_cassette() call? Well, as written in the manual page of that function, There’s a few ways to get correct line numbers for failed tests and one way to not get correct line numbers: What’s correct?

  • Wrapping the whole testthat::test_that() call (do not do that if your test contains for instance `skip_on_cran()``);
vcr::use_cassette("thing", {
  testthat::test_that("thing", {
    lala <- get_foo()
    expect_true(lala)
  })
})
testthat::test_that("thing", {
  vcr::use_cassette("thing", {
    lala <- get_foo()
  })
    expect_true(lala)
})

What’s incorrect?

testthat::test_that("thing", {
  vcr::use_cassette("thing", {
    lala <- get_foo()
    expect_true(lala)
  })
})

We used the solution of only wrapping the lines containing API calls in vcr::use_cassette(), but it is up to you to choose what you prefer.