29 Aug 2017

restic cryptography

tl;dr: this is not an audit and I take no responsibility for your backups, but I had a quick look at the crypto and I think I’m going to use restic for my personal backups.

I keep hearing good things about restic. I am redoing my storage solution, and restic seems to tick all the boxes for my personal backups:

Open Source
written in Go
runs on OpenBSD
B2 backend
good docs
verifiable backups
reversable format
encryption

But it does not look like the encryption has ever been audited.

Today I have to wait a couple hours to get a passport (I’m Italian, this involved rolling dice for Charm Person) so I figured I would have a look at it.

Important: this does NOT qualify as a professional audit, nor am I endorsing restic’s encryption beyond “I looked at it in a noisy waiting room for an hour I guess”.

This post also does not attempt to fully explain all the cryptography it mentions, so if you find something particularly curious, confusing, or fascinating do let me know and I’ll try to write properly about it.

Data model

Let’s see what we are encrypting. The full reference is here.

restic backs up to repositories. A repository is stored at a given remote. There’s deduplication across the repository.

Repository contents are content-addressed by SHA-256 at the encrypted file level, not at the backed up file level. That’s good not to leak hashes of files, but I wonder how deduplication works.

Everything else is abstractions built out of JSON inside those files. This is very reminiscent of Camlistore. I wonder if there are caches/indexes or if it knows how to walk the tree from scratch. (Turns out, both! See below.)

Threat model

They have an explicit threat model, which is great!

It does speak more about implementation that adversaries, but now I’m asking too much.

Essentially: confidentiality and authentication for stored files and metadata against an adversarial repository. Rollbacks and snapshot correlation are possible. Checks out.

I imagine file/backup sizes are not protected, but it’s not mentioned.

Encryption

Every in-repo file (the ones of which the SHA-256 is taken) is encrypted with AES-256-CTR-Poly1305-AES. Format is “16 bytes random IV + ciphertext + MAC”: encrypt-then-MAC, good.

Unusual choice, not using AES-GCM. To be clear: GCM is awful, but for TLS reasons it’s the AEAD with the fastest implementations. Case in point, in Go AES-CTR is much slower with no good reason. Also unusual is not using the TLS pairing ChaCha20-Poly1305. Choosing AES makes sense to use hardware acceleration, but then in practice ChaCha20-Poly1305 might be faster than AES-CTR for the reason above. All in all, it’s a rational choice in theory, but unusual and a bit sub-optimal in practice. While still secure, a “self-rolled” AEAD is a bit of a yellow flag (but not a red flag).

I admit I haven’t studied Poly1305-AES before, but it’s the original design of Poly1305. The Poly1305 authenticator as I know it and as implemented by Go uses one-time keys; Poly1305-AES uses a fixed-key AES operation to encrypt a nonce to make one-time keys. Poly1305 keys are 32 bytes, used in two halves as r (which is fixed in Poly1305-AES and gets masked) and s (which is the only part derived with AES in Poly1305-AES). Might be important to check that this Poly1305-AES implementation is keeping the right half fixed, since golang.org/x/crypto/poly1305 simply takes an undocumented *[32]byte “one-time key” and I suspect switching derived and fixed half can be catastrophic. The Go package does not document which half is which.

Good news is that even if the paper says…

There are safe ways to reuse k for encryption, but those ways are not analyzed in this paper.

… restic took the high road and uses separate keys for encryption and Poly1305-AES.

Password to key derivation is classic double-wrapping: plaintext key metadata files (which include hostname/username) contain scrypt parameters, salt and a data blob. scrypt with the supplied password is used as a KDF to generate the keys to decrypt (AES256-Poly1305-AES) the blob, which contains the master key. An attacker can prevent password revocation, so changing a password after it got compromised doesn’t protect future backups to that repository. This could be improved by making a new master key for subsequently generated blobs.

I wonder if it’s possible, manipulating N/r/p/salt, to make an unknown password generate a predictable key. If it is, an attacker can force a client to make a backup with keys they control. It probably isn’t, but I don’t have time to figure out which scrypt property it boils down to. (Exercise for the reader!)

Deduplication

Data and tree blobs are encrypted individually and packed together in pack files with an encrypted header. A full index of all plaintext blobs is kept cached by the client, and encrypted in the repository.

The data from each file is split into variable length Blobs cut at offsets defined by a sliding window of 64 byte. The implementation uses Rabin Fingerprints for implementing this Content Defined Chunking (CDC). An irreducible polynomial is selected at random and saved in the file config when a repository is initialized, so that watermark attacks are much harder.

Hmm. So this is how deduplication happens. Like Camlistore.

Not a fan of the sentence “attacks are much harder”. I know little about Rabin Fingerprints, but I can imagine the attack relies on leaks by the chunker algorithm through blob sizes. And packs don’t help because I bet you can spot the Poly1305 authenticators in them, allowing you to split the blobs up without reading the header.

I’ll add to the TODO to learn more about CDC. (There’s a post about them on the restic blog!) In the meantime I’ll trust this irreducible polynomial to make leaks not too obvious, and remember not to backup potentially attacker-supplied data in a reliable manner.

That’s important also because attacker-supplied data can lead to straightforward fingerprinting attacks on any kind of deduplicated system. (This should have been in the threat model.)

Implementation

I started from github.com/restic/restic/internal/crypto, at commit 3559f9c7760ffadd32888c531d0d08f0c1aa98e3.

Random bytes are generated with crypto/rand with panic() on error. Good.

scrypt parameters are checked with (*github.com/elithrar/simple-scrypt.Params).Check which is reassuring since they are attacker controlled.

There are explicit Key/EncryptionKey/MACKey structures with separate K and R for Poly1305-AES, which is good. The amount of indirection in filling them with random data makes me uncomfortable—one missing * and pass-by-value would result in an empty key—but it seems implemented right. Also, hey, that’s what (*Key).Valid() is checking for, neat!

The Poly1305-AES implementation seems to put r and AES(k)(s) in the right order for golang.org/x/crypto/poly1305 (see above), with r in the first 16 bytes of the key.

However, the application code insists on masking r as the paper says before calling poly1305, even if that package does not require it. And that got me worried, because the masking done by poly1305 looks different. What if poly1305 decided to use a different representation (since it can, since the key parts and format are unspecified) and now restic is masking off meaningful bits? 😱

paper keys section

And now I’m reading 130-bit arithmetic mapped to 26-bit integer registries to figure out what DJB’s magic mask actually does to the numbers (deja-vu… coughclampingcough). This is what happens when you roll your own crypto. You make cryptographers hurt. Think about your cryptographer.

Turns out that it’s the same masking in 26-bit little-endian windows. Here’s the ASCII art sketch I had to make to figure it out. (It compares the paper masks, above, and the poly1305 ones, below, after reversing the bit shifts and swapping endianness).

Anyway applying the mask is pointless, dangerous, and took 45+ minutes to audit. I’m going to submit a PR to remove it when I recover.

I simplified the rest of Encrypt/Decrypt while reviewing it. The Decrypt API (as opposed to the Encrypt one), does not return a new slice, but only an int for the caller to slice the plaintext with. That felt like an easy and bad thing to forget, so I inspected the callers with Sourcegraph. Found no issues.

(FWIW, I’m a fan of append()-like interfaces, where you can still reuse buffers by slicing to [:0] but can’t forget to reassign the resulting slice.)

There seems to be a problem with the Decrypt overlap rules, though. To save memory you might want to use the same buffer for plaintext and ciphertext. That is allowed with some constraints. The cipher.Stream interface implemented by CTR mode says this at the method XORKeyStream:

Dst and src may point to the same memory.

That ain’t helpful. There is already a discussion about it in a Go issue but I think it means that dst and src may overlap entirely (without misalignment) or not at all.

However, calling Decrypt with the same buffer for plaintext and ciphertext (as done throughout the code and allowed by the docs) means that buf[16:] is decrypted into buf[:] because of the IV. That’s theoretically(?) not ok.

I opened an issue about it, but I don’t think it’s a problem right now (but it might become one as the CTR implementation is optimized). I suggested adopting the standard cipher.AEAD interface, which is append-style and makes it easy to get exact overlap easy.

I finally had a quick look at internal/repository/key.go. I made sure scrypt is the only KDF, and clarified that the data field in a key object is just encrypted like any other blob. The design docs had the MAC size wrong and got me confused.

Conclusion

The design might not be perfect, but it’s good. Encryption is a first-class feature, the implementation looks sane and I guess the deduplication trade-off is worth it.

So… I’m going to use restic for my personal backups.

Again, this is not a professional assessment of the cryptosystem, nor an audit of the implementation. Thanks to Alexander Neumann for writing restic and for providing useful feedback for this post.

I had a quick look at the AES256-Poly1305-AES encryption scheme and code of "restic" before using it for my backups. https://t.co/peWE7WdLpZ
— Filippo Valsorda (@FiloSottile) 29 August 2017