14 Security
When developing a package that uses secrets (API keys, OAuth tokens) and produces them (OAuth tokens, sensitive data),
- You want the secrets to be usable by you, collaborators and CI services, without being readable by anyone else;
- You want tests and checks (e.g. vignette building) that use the secrets to be turned off in environments where secrets won’t be available (CRAN, forks of your development repository).
Your general attitude should be to think about:
- what are my secrets (an API key, an OAuth2.0 access token and the refresh token, etc.) and where/how exactly are there used (in the query part of an URL? as a header? which header, Authentication or something else?) – packages like httr or httr2 might abstract some of the complexity for you but you need to really know where secrets are used and could be leaked,
- what could go wrong (e.g. your token ending up being published),
- how to prevent that (save your unedited token outside of your package, make sure it is not printed in logs or present in package check artefacts),
- how to fix mistakes (how do you deactivate a token and how do you check no one used it in the meantime).
14.1 Managing secrets securely
14.1.1 Follow best practice when developing your package
This book is about testing but security starts with how you develop your package. To better protect your users’ secret,
It might be best not to let users pass API keys as parameters. It’s best to have them save them in
.Renviron
or e.g. using the keyring package. This way, API keys are not in scripts. The opencage package might provide some inspiration.If the API you are working with lets you pass keys either in the request headers or query string, prefer to use request headers.
14.1.3 Secret files
For files, you will need to use encryption and to store a text-version of the encryption key/passwords as GitHub repo secret if you use GitHub Actions. Read the documentation of the continuous integration service your are using to find out how secrets are protected and how you can use them in your builds.
See gargle vignette about securely managing tokens.
The approach is:
- Create your OAuth token locally, either outside of your package folder, or inside of it if you really want to, but gitignored and Rbuildignored.
- Encrypt it using e.g. the user-friendly cyphr package. Save the code for this and for the step before in a file e.g. inst/secrets.R for when you need to re-create a token as even refresh tokens expire.
- For encrypting you need some sort of password. You will want to save it securely as text in your user-level .Renviron and in your GitHub repo secrets (or equivalent secret place for other CI services). E.g. create a key via
sodium_key <- sodium::keygen()
and get its text equivalent viasodium::bin2hex(sodium_key)
. E.g. the latter command might give mee46b7faf296e3f0624e6240a6efafe3dfb17b92ae0089c7e51952934b60749f2
and I will save this in .Renviron
MEETUPR_PWD="e46b7faf296e3f0624e6240a6efafe3dfb17b92ae0089c7e51952934b60749f2"
Example of a script creating and encrypting an OAuth token (for the Meetup API).
# thanks Jenny Bryan https://github.com/r-lib/gargle/blob/4fcf142fde43d107c6a20f905052f24859133c30/R/secret.R
token_path <- testthat::test_path(".meetup_token.rds")
use_build_ignore(token_path)
use_git_ignore(token_path)
meetupr::meetup_auth(
token = NULL,
cache = TRUE,
set_renv = FALSE,
token_path = token_path
)
# sodium_key <- sodium::keygen()
# set_renv("MEETUPR_PWD" = sodium::bin2hex(sodium_key))
# set_renv being an internal function taken from rtweet
# that saves something to .Renviron
# get key from environment variable
key <- cyphr::key_sodium(sodium::hex2bin(Sys.getenv("MEETUPR_PWD")))
cyphr::encrypt_file(
token_path,
key = key,
dest = testthat::test_path("secret.rds")
)
- In tests you have a setup / helper file with code like below.
key <- cyphr::key_sodium(sodium::hex2bin(Sys.getenv("MEETUPR_PWD")))
temptoken <- tempfile(fileext = ".rds")
cyphr::decrypt_file(
testthat::test_path("secret.rds"),
key = key,
dest = temptoken
)
Now what happens in contexts where MEETUPR_PWD
is not available?
Well there should be no tests using it!
See our chapter about making real requests.
14.1.4 Do not store secrets in the cassettes, mock files, recorded responses
- With vcr make sure to configure vcr correctly.
- With httptest and httptest2 only the response body (and headers, but not by default) are recorded. If those contains secrets, refer to the documentation about redacting sensitive information (for httptest2).
- With webfakes you will be creating recorded responses yourself, make sure this process does not leak secrets. If you test something related to authentication, use fake secrets.
If the API you are interacting with uses OAuth for instance, make sure you are not leaking access tokens nor refresh tokens.
14.1.5 Escape tests that require secrets
This all depends on your setup for testing real requests. You have to be sure no test requiring secrets will be run on CRAN for instance.
14.2 Sensitive recorded responses?
In that case you might want to gitignore the cassettes / mock files / recorded responses,
and skip the tests using them on continuous integration (e.g. testthat::skip_on_ci()
or something more involved).
You’d also Rbuildignore the cassettes / mock files / recorded responses, as you do not want to release them to CRAN.
14.3 Further resources
Some tools might help you detect leaks or prevent them.
- shhgit’s goal is “Find secrets in your code. Secrets detection for your GitHub, GitLab and Bitbucket repositories”.
- Yelp’s detect-secret is “An enterprise friendly way of detecting and preventing secrets in code.”.
- git-secret is a “bash tool to store your private data inside a git repo”.