13 Security

When developing a package that uses secrets (API keys, OAuth tokens) and produces them (OAuth tokens, sensitive data),

  • You want the secrets to be usable by you, collaborators and CI services, without being readable by anyone else;
  • You want tests and checks (e.g. vignette building) that use the secrets to be turned off in environments where secrets won’t be available (CRAN, forks of your development repository).

Your general attitude should be to think about:

  • what are my secrets (an API key, an OAuth2.0 access token and the refresh token, etc.) and where/how exactly are there used (in the query part of an URL? as a header? which header, Authentication or something else?) – packages like httr might abstract some of the complexity for you but you need to really know where secrets are used and could be leaked,
  • what could go wrong (e.g. your token ending up being published),
  • how to prevent that (save your unedited token outside of your package, make sure it is not printed in logs or present in package check artefacts),
  • how to fix mistakes (how do you deactivate a token and how do you check no one used it in the meantime).

13.1 Managing secrets securely

13.1.1 Follow best practice when developing your package

This book is about testing but security starts with how you develop your package. To better protect your users’ secret,

  • It might be best not to let users pass API keys as parameters. It’s best to have them save them in .Renviron or e.g. using the keyring package. This way, API keys are not in scripts. The opencage package might provide some inspiration.

  • If the API you are working with lets you pass keys either in the request headers or query string, prefer to use request headers.

13.1.2 Share secrets with continuous integration services

You need to share secrets with continuous integration services… for real requests only! For tests using vcr, httptest or webfakes, you at most need a fake secret, e.g. “foobar” as API key – except for recording cassettes and mock files, but that is something you do locally.

In GitHub repositories, when storing a new secret, do not save it with quotes. I.e. if your secret is “blabla”, the field should contain blabla, not "blabla" nor 'blabla'.

knitr::include_graphics("secret.png")
Screenshot of the interface for adding secrets in a GitHub repository, showing how the secret is stored without any quote.

13.1.2.1 API keys

For API keys, you can use something like GitHub repo secrets if you use GitHub Actions. Then for the secret to be accessible as environment variable from your workflow in GitHub Actions as explained in gargle docs you need to add a line like

env:
  PACKAGE_PASSWORD: ${{ secrets.PACKAGE_PASSWORD }}

13.1.2.2 More complex objects

If your secret is an OAuth token, you might be able to re-create it from pieces, where the pieces are strings you can store as repo secrets much like what you’d do for an API key. E.g. if your secret is an OAuth token, the actual secrets are the access token and refresh token.

env:
  ACCESS_TOKEN: ${{ secrets.ACCESS_TOKEN }}
  REFRESH_TOKEN: ${{ secrets.REFRESH_TOKEN }}

Therefore you could re-create it using e.g. the credentials argument of httr::oauth2.0_token(). The re-creation using environment variables Sys.getenv("ACCESS_TOKEN") and Sys.getenv("REFRESH_TOKEN") would happen in a testthat helper file.

13.1.3 Secret files

For files, you will need to use encryption and to store a text-version of the encryption key/passwords as GitHub repo secret if you use GitHub Actions. Read the documentation of the continuous integration service your are using to find out how secrets are protected and how you can use them in your builds.

See gargle vignette about securely managing tokens.

The approach is:

  • Create your OAuth token locally, either outside of your package folder, or inside of it if you really want to, but gitignored and Rbuildignored.
  • Encrypt it using e.g. the user-friendly cyphr package. Save the code for this and for the step before in a file e.g. inst/secrets.R for when you need to re-create a token as even refresh tokens expire.
  • For encrypting you need some sort of password. You will want to save it securely as text in your user-level .Renviron and in your GitHub repo secrets (or equivalent secret place for other CI services). E.g. create a key via sodium_key <- sodium::keygen() and get its text equivalent via sodium::bin2hex(sodium_key). E.g. the latter command might give me e46b7faf296e3f0624e6240a6efafe3dfb17b92ae0089c7e51952934b60749f2 and I will save this in .Renviron
MEETUPR_PWD="e46b7faf296e3f0624e6240a6efafe3dfb17b92ae0089c7e51952934b60749f2"

Example of a script creating and encrypting an OAuth token (for the Meetup API).

# thanks Jenny Bryan https://github.com/r-lib/gargle/blob/4fcf142fde43d107c6a20f905052f24859133c30/R/secret.R

token_path <- testthat::test_path(".meetup_token.rds")
use_build_ignore(token_path)
use_git_ignore(token_path)

meetupr::meetup_auth(
  token = NULL,
  cache = TRUE,
  set_renv = FALSE,
  token_path = token_path
)

# sodium_key <- sodium::keygen()
# set_renv("MEETUPR_PWD" = sodium::bin2hex(sodium_key))
# set_renv being an internal function taken from rtweet
# that saves something to .Renviron

# get key from environment variable
key <- cyphr::key_sodium(sodium::hex2bin(Sys.getenv("MEETUPR_PWD")))

cyphr::encrypt_file(
  token_path,
  key = key,
  dest = testthat::test_path("secret.rds")
)
  • In tests you have a setup file with code like below.
key <- cyphr::key_sodium(sodium::hex2bin(Sys.getenv("MEETUPR_PWD")))

temptoken <- tempfile(fileext = ".rds")

cyphr::decrypt_file(
  testthat::test_path("secret.rds"),
  key = key,
  dest = temptoken
)

Now what happens in contexts where MEETUPR_PWD is not available? Well there should be no tests using it! See our chapter about making real requests.

13.1.4 Do not store secrets in the cassettes, mock files, recorded responses

  • With vcr make sure to configure vcr correctly.
  • With httptest only the response body (and headers, but not by default) are recorded. If those contains secrets, refer to the documentation about redacting sensitive information.
  • With webfakes you will be creating recorded responses yourself, make sure this process does not leak secrets. If you test something related to authentication, use fake secrets.

If the API you are interacting with uses OAuth for instance, make sure you are not leaking access tokens nor refresh tokens.

13.1.5 Escape tests that require secrets

This all depends on your setup for testing real requests. You have to be sure no test requiring secrets will be run on CRAN for instance.

13.2 Sensitive recorded responses?

In that case you might want to gitignore the cassettes / mock files / recorded responses, and skip the tests using them on continuous integration (e.g. testthat::skip_on_ci() or something more involved). You’d also Rbuildignore the cassettes / mock files / recorded responses, as you do not want to release them to CRAN.

13.3 Further resources

Some tools might help you detect leaks or prevent them.

  • shhgit’s goal is “Find secrets in your code. Secrets detection for your GitHub, GitLab and Bitbucket repositories”.
  • Yelp’s detect-secret is “An enterprise friendly way of detecting and preventing secrets in code.”.
  • git-secret is a “bash tool to store your private data inside a git repo”.