Chapter 30 Managing cassettes | HTTP testing in R

30.1 Why edit cassettes?

By design, vcr is very good at recording HTTP interactions that actually took place. Sometimes, however, when testing or demoing your package you will want to use fake HTTP interactions. For instance:

What happens if the web API returns a 503 code? Is there an informative error?
What happens if it returns a 503 and then a 200 code? Does the retry work?
What if the API returns too much data for even simple queries and you want to make your cassettes smaller?

In all these cases, you can edit your cassettes as long as you are aware of the risks!

30.2 Risks related to cassette editing

If you use a vcr cassette where you replaced a 200 code with a 503 code, and vcr is turned off, the test will fail because the real API will probably not return an error. Use vcr::skip_if_vcr_off() to skip such tests when vcr is off.
If you edit cassettes by hand, you can’t re-record them easily: you’d need to re-record them and then re-apply your edits.

Therefore you’ll need to develop a good workflow.

30.3 Example 1: test using an edited cassette with a 503

First, write your test e.g.

vcr::use_cassette("api-error", {
  test_that("Errors are handled well", {
    vcr::skip_if_vcr_off()
    expect_error(call_my_api(), "error message")
  })
})

Then run your tests the first time.

It will fail
It will have created a cassette under tests/fixtures/api-error.yml that looks something like

http_interactions:
- request:
    method: get
    uri: https://eu.httpbin.org/get
    body:
      encoding: ''
      string: ''
    headers:
      User-Agent: libcurl/7.54.0 r-curl/3.2 crul/0.5.2
  response:
    status:
      status_code: '200'
      message: OK
      explanation: Request fulfilled, document follows
    headers:
      status: HTTP/1.1 200 OK
      connection: keep-alive
    body:
      encoding: UTF-8
      string: "{\n  \"args\": {}, \n  \"headers\": {\n    \"Accept\": \"application/json,
        text/xml, application/xml, */*\", \n    \"Accept-Encoding\": \"gzip, deflate\",
        \n    \"Connection\": \"close\", \n    \"Host\": \"httpbin.org\", \n    \"User-Agent\":
        \"libcurl/7.54.0 r-curl/3.2 crul/0.5.2\"\n  }, \n  \"origin\": \"111.222.333.444\",
        \n  \"url\": \"https://eu.httpbin.org/get\"\n}\n"
  recorded_at: 2018-04-03 22:55:02 GMT
  recorded_with: vcr/0.1.0, webmockr/0.2.4, crul/0.5.2

You can edit the cassette so that the response has a new status code:

http_interactions:
- request:
    method: get
    uri: https://eu.httpbin.org/get
    body:
      encoding: ''
      string: ''
    headers:
      User-Agent: libcurl/7.54.0 r-curl/3.2 crul/0.5.2
  response:
    status:
      status_code: '503'

Then run your test again: it should pass! Note the use of vcr::skip_if_vcr_off(): if vcr is turned off, a real API request is made, and that request will most probably not get a 503 status code.
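Since hand edits are lost whenever the cassette is re-recorded, it can help to script them. The following is a minimal sketch in base R, not part of vcr: set_cassette_status() is a hypothetical helper that assumes the cassette stores the code as status_code: '200', as in the cassette above, and rewrites only the first match. Unlike the hand-edited cassette, it leaves the rest of the response (message, headers, body) untouched.

```r
# Hypothetical helper (not part of vcr): re-apply a status code edit
# to a freshly re-recorded cassette.
set_cassette_status <- function(path, from = "200", to = "503") {
  lines <- readLines(path)
  old <- paste0("status_code: '", from, "'")
  new <- paste0("status_code: '", to, "'")
  hits <- grep(old, lines, fixed = TRUE)
  if (length(hits) == 0) {
    stop("No line with ", old, " found in ", path)
  }
  # Only the first recorded interaction with that status code is changed
  lines[hits[1]] <- sub(old, new, lines[hits[1]], fixed = TRUE)
  writeLines(lines, path)
  invisible(path)
}

# Usage, after re-recording the cassette:
# set_cassette_status("tests/fixtures/api-error.yml")
```

Keeping such a script next to the tests means the edit can be replayed after each re-recording instead of being redone by hand.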

30.3.1 The same thing with webmockr

The advantage of the approach involving editing cassettes is that you only need to learn one thing, which is vcr. However, by using webmockr directly in your tests, you can also test the behavior of your package in case of errors. Below we assume api_url() returns the URL that call_my_api() calls.

test_that("Errors are handled well", {
  webmockr::enable()
  # Stub GET requests to the API URL so that they return a 503
  stub <- webmockr::stub_request("get", api_url())
  webmockr::to_return(stub, status = 503)
  expect_error(call_my_api(), "error message")
  webmockr::disable()
})

A big pro of this approach is that it works even when vcr is turned off. A con is that it’s quite different from the vcr syntax.

30.4 Example 2: test using an edited cassette with a 503 then a 200

Here we assume your package contains some sort of retry.

First, write your test e.g.

vcr::use_cassette("api-error", {
  test_that("Errors are handled well", {
    vcr::skip_if_vcr_off()
    expect_message(thing <- call_my_api(), "retry message")
    expect_s3_class(thing, "data.frame")
  })
})

Then run your tests the first time.

It will fail
It will have created a cassette under tests/fixtures/api-error.yml that looks something like

http_interactions:
- request:
    method: get
    uri: https://eu.httpbin.org/get
    body:
      encoding: ''
      string: ''
    headers:
      User-Agent: libcurl/7.54.0 r-curl/3.2 crul/0.5.2
  response:
    status:
      status_code: '200'
      message: OK
      explanation: Request fulfilled, document follows
    headers:
      status: HTTP/1.1 200 OK
      connection: keep-alive
    body:
      encoding: UTF-8
      string: "{\n  \"args\": {}, \n  \"headers\": {\n    \"Accept\": \"application/json,
        text/xml, application/xml, */*\", \n    \"Accept-Encoding\": \"gzip, deflate\",
        \n    \"Connection\": \"close\", \n    \"Host\": \"httpbin.org\", \n    \"User-Agent\":
        \"libcurl/7.54.0 r-curl/3.2 crul/0.5.2\"\n  }, \n  \"origin\": \"111.222.333.444\",
        \n  \"url\": \"https://eu.httpbin.org/get\"\n}\n"
  recorded_at: 2018-04-03 22:55:02 GMT
  recorded_with: vcr/0.1.0, webmockr/0.2.4, crul/0.5.2

You can duplicate the HTTP interaction, and make the first one return a 503 status code. vcr will first use the first interaction, then the second one, when making the same request.

http_interactions:
- request:
    method: get
    uri: https://eu.httpbin.org/get
    body:
      encoding: ''
      string: ''
    headers:
      User-Agent: libcurl/7.54.0 r-curl/3.2 crul/0.5.2
  response:
    status:
      status_code: '503'
- request:
    method: get
    uri: https://eu.httpbin.org/get
    body:
      encoding: ''
      string: ''
    headers:
      User-Agent: libcurl/7.54.0 r-curl/3.2 crul/0.5.2
  response:
    status:
      status_code: '200'
      message: OK
      explanation: Request fulfilled, document follows
    headers:
      status: HTTP/1.1 200 OK
      connection: keep-alive
    body:
      encoding: UTF-8
      string: "{\n  \"args\": {}, \n  \"headers\": {\n    \"Accept\": \"application/json,
        text/xml, application/xml, */*\", \n    \"Accept-Encoding\": \"gzip, deflate\",
        \n    \"Connection\": \"close\", \n    \"Host\": \"httpbin.org\", \n    \"User-Agent\":
        \"libcurl/7.54.0 r-curl/3.2 crul/0.5.2\"\n  }, \n  \"origin\": \"111.222.333.444\",
        \n  \"url\": \"https://eu.httpbin.org/get\"\n}\n"
  recorded_at: 2018-04-03 22:55:02 GMT
  recorded_with: vcr/0.1.0, webmockr/0.2.4, crul/0.5.2

Then run your test again: it should pass! Note the use of vcr::skip_if_vcr_off(): if vcr is turned off, a real API request is made, and that request will most probably not get a 503 status code.

30.4.1 The same thing with webmockr

The advantage of the approach involving editing cassettes is that you only need to learn one thing, which is vcr. However, by using webmockr directly in your tests, you can also test the behavior of your package in case of errors. Below we assume api_url() returns the URL that call_my_api() calls.

test_that("Errors are handled well", {
  webmockr::enable()
  # Response body copied from the recorded cassette
  body <- paste0(
    "{\n  \"args\": {}, \n  \"headers\": {\n    \"Accept\": \"application/json, ",
    "text/xml, application/xml, */*\", \n    \"Accept-Encoding\": \"gzip, deflate\", ",
    "\n    \"Connection\": \"close\", \n    \"Host\": \"httpbin.org\", \n    \"User-Agent\": ",
    "\"libcurl/7.54.0 r-curl/3.2 crul/0.5.2\"\n  }, \n  \"origin\": \"111.222.333.444\", ",
    "\n  \"url\": \"https://eu.httpbin.org/get\"\n}\n"
  )
  # The first request gets a 503, the second one a 200 with a body
  # (the native pipe |> requires R >= 4.1)
  webmockr::stub_request("get", api_url()) |>
    webmockr::to_return(status = 503) |>
    webmockr::to_return(status = 200, body = body, headers = list(b = 6))
  expect_message(thing <- call_my_api(), "retry message")
  expect_s3_class(thing, "data.frame")
  webmockr::disable()
})

The pro of this approach is the elegance of the stubbing, with the two different responses. Each webmockr function like to_return() even has an argument times indicating the number of times the given response should be returned.
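As a sketch of the times argument, assuming a webmockr version whose to_return() supports it and a retry that should only succeed on the third attempt, the stub below returns a 503 twice and then a 200. The URL is the httpbin one from the cassette above, and the body is a placeholder rather than a real API response.

```r
webmockr::enable()
stub <- webmockr::stub_request("get", "https://eu.httpbin.org/get")
# The first two matching requests get a 503...
webmockr::to_return(stub, status = 503, times = 2)
# ...and the next one gets a 200 with a placeholder JSON body
webmockr::to_return(stub, status = 200, body = "{\"args\": {}}")

# ... the code under test that makes the requests goes here ...

# Clean up so that later tests are not affected by this stub
webmockr::stub_registry_clear()
webmockr::disable()
```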

The con is that on top of being different from vcr, in this case where we also needed a good response in the end (the one with a 200 code, and an actual body), writing the mock is much more cumbersome than just recording a vcr cassette.

Be careful when you add your cassettes to .gitignore and/or .Rbuildignore.

30.5 gitignore cassettes

The .gitignore file lets you tell git what files to ignore. Those files are not tracked by git, and if you share the git repository on the public web, the files listed in .gitignore won’t be part of the public version.

When using vcr you may want to include your cassettes in the .gitignore file, for instance when they contain sensitive data that you don’t want to have on the internet and don’t want to hide with filter_sensitive_data.

On the other hand, you may want to keep your cassettes in your GitHub repo, so that they are available both when tests run on CI and when others run your tests.

There’s no correct answer on whether to gitignore your cassettes. Think about security implications and whether you want CI and human contributors to use previously created cassettes or to create/use their own.
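If you do decide to gitignore your cassettes, one way to add the entry from R is sketched below. It assumes the usethis package is installed, that you run it from the package root, and that the cassettes live under tests/fixtures/ as in the examples above.

```r
# Run once from the package root; adds the cassette directory to .gitignore
# (assumes cassettes are stored under tests/fixtures/)
usethis::use_git_ignore("tests/fixtures/")
```

Cassettes that were already committed stay in the git history, so this only helps for cassettes that have not been pushed yet.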

30.6 Rbuildignore cassettes

The .Rbuildignore file is used to tell R to ignore certain files/directories.

There’s not a clear use case for adding vcr cassettes to your .Rbuildignore file, but if you do, be aware that it will affect your vcr-enabled tests.

30.7 sharing cassettes

Sometimes you may want to share or re-use cassettes across tests, for example to reduce the size for package sources or to test different functionality of your package functions that make the same query under the hood.

To do so, you can use the same cassette name for multiple vcr::use_cassette() calls. By default, however, vcr::check_cassette_names() complains about duplicate cassette names, to prevent you from accidentally re-using cassettes. To allow duplicates, provide a character vector of the cassette names you want to re-use to the allowed_duplicates argument of vcr::check_cassette_names(). That way you can use the same cassette across multiple tests.
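For instance, assuming a vcr version that provides check_cassette_names() as described here, and that the check is called from a test helper file (the file location is an assumption), allowing the api-error cassette to be shared could look like this:

```r
# In a file such as tests/testthat/helper-vcr.R (location is an assumption).
# "api-error" may be used by several vcr::use_cassette() calls;
# any other duplicated cassette name is still reported.
vcr::check_cassette_names(allowed_duplicates = c("api-error"))
```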

30.8 deleting cassettes

Removing a cassette is as easy as deleting its file in your file finder, from the command line, or from within a text editor or RStudio.

If you delete a cassette, on the next test run the cassette will be recorded again.

If you do want to re-record a test to a cassette, instead of deleting the file you can toggle record modes.
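A minimal sketch of toggling the record mode for a single cassette, using the record argument of vcr::use_cassette(). The cassette name api-results is hypothetical, and call_my_api() is the placeholder function used throughout this chapter.

```r
# record = "all" re-records the interactions on every run instead of
# replaying them ("api-results" is a hypothetical cassette name)
vcr::use_cassette("api-results", record = "all", {
  test_that("Results are returned as a data.frame", {
    expect_s3_class(call_my_api(), "data.frame")
  })
})
```

Once the cassette has been re-recorded, removing record = "all" goes back to the default mode, in which existing cassettes are replayed.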

30.9 cassette file types

For now, the only persistence option is YAML, so all cassette files have a .yml extension.

When other persister options are added, additional file types may appear. The next persister type is likely to be JSON, so if you use that option, you’d have .json files instead of .yml files.