The other day I set up a new OpenBSD instance with a nice RAID array, encrypted with Full Disk Encryption. And promptly proceeded to forget part of the passphrase.
We know things get interesting when I lose a password.
I made a weak attempt at finding a public bruteforce tool, and found nothing. I say weak because somewhere in the back of my brain, I already wanted to take a peek at the OpenBSD FDE implementation.
Very little is documented, and while I do trust OpenBSD, I want to know how my data is encrypted. So this was the "perfect" occasion.
Hold on, because it will be a bumpy ride, straight into the OpenBSD core sources, following notes I took during the ~3-hour process.
Goals
We need to extract enough info from the encrypted disk and rebuild enough of the decryption algorithm to be able to rapidly try many passphrases.
What this usually means in FDE is finding the details of the Key Derivation Function, and whatever mechanism is used to detect if the passphrase is correct or not.
Starting points
A prompt. A damn prompt.
# bioctl -c C -l sd3a softraid0
Passphrase:
softraid0: incorrect key or passphrase
We start chasing by looking at the bioctl and softraid_crypto implementations, Cmd-F'ing "Passphrase:" and "incorrect key or passphrase".
https://github.com/openbsd/src/blob/master/sys/dev/softraid_crypto.c
https://github.com/openbsd/src/blob/master/sbin/bioctl/bioctl.c
The first hit is promising.
bio_kdf_derive(&kdfinfo, &kdfhint, "Passphrase: ", 0);
void
bio_kdf_derive(struct sr_crypto_kdfinfo *kdfinfo, struct sr_crypto_kdf_pbkdf2
    *kdfhint, char* prompt, int verify)
// [...]
	derive_key_pkcs(kdfhint->rounds,
	    kdfinfo->maskkey, sizeof(kdfinfo->maskkey),
	    kdfhint->salt, sizeof(kdfhint->salt), prompt, verify);
derive_key_pkcs is a banal checking wrapper for pkcs5_pbkdf2, so we now know how the passphrase is derived into a key:
kdfinfo->maskkey = pbkdf2(password, kdfhint->salt, kdfhint->rounds)
Let's chase kdfhint.
Pass the salt
The salt is certainly stored on the encrypted disk. The object must be populated by the lines just above the bio_kdf_derive call, because before that its memory is zeroed:
create.bc_opaque = &kdfhint;
create.bc_opaque_size = sizeof(kdfhint);
create.bc_opaque_flags = BIOC_SOOUT;
/* try to get KDF hint */
if (ioctl(devh, BIOCCREATERAID, &create))
	err(1, "ioctl");
I tried a few leads here, including following the BIOCCREATERAID ioctl, but what got me somewhere was a code search for "bc_opaque".
if (copyout(sd->mds.mdd_crypto.scr_meta->scm_kdfhint,
    bc->bc_opaque, bc->bc_opaque_size))
	goto done;
It's copied from some deeper metadata object. This seems complex. Hmmm.
Let's try a new angle: what is the type of the kdfhint?
/*
* sr_crypto_genkdf is a generic hint for the KDF performed in userland and
* is not interpreted by the kernel.
*/
struct sr_crypto_genkdf {
	u_int32_t	len;
	u_int32_t	type;
#define SR_CRYPTOKDFT_INVALID	0
#define SR_CRYPTOKDFT_PBKDF2	1
#define SR_CRYPTOKDFT_KEYDISK	2
};
/*
* sr_crypto_genkdf_pbkdf2 is a hint for the PKCS#5 KDF performed in userland
* and is not interpreted by the kernel.
*/
struct sr_crypto_kdf_pbkdf2 {
	u_int32_t	len;
	u_int32_t	type;
	u_int32_t	rounds;
	u_int8_t	salt[128];
};
Aha! If it's "not interpreted by the kernel", then it must be verbatim in the disk metadata. We need to look at one.
A simple example
To reproduce a case where we will know if we got it right, we make a small encrypted image, with passphrase "password".
# dd if=/dev/zero of=file.img bs=1 count=1M
# vnconfig vnd0 file.img
# disklabel -E /dev/rvnd0c
Label editor (enter '?' for help at any prompt)
> a a
offset: [0]
size: [2048]
FS type: [4.2BSD] RAID
> w
> q
No label changes.
# bioctl -c C -l /dev/vnd0a softraid0
New passphrase: password
Re-type passphrase: password
softraid0: CRYPTO volume attached as sd4
Here is the hexdump: https://gist.github.com/FiloSottile/8294e708396396d6b6d49c7c839b72ec
We are looking for a sr_crypto_kdf_pbkdf2 structure, which we can recognize because it starts with a u_int32_t length, followed by a u_int32_t type of value 1, followed by a u_int32_t number of rounds. There are many 01 00 00 00 (little endian!) around, but only one seems surrounded by two other u_int32_t:
00002960 -- -- -- -- -- -- -- -- -- -- -- -- 8c 00 00 00 |..U...(zU.......|
00002970 01 00 00 00 00 20 00 00 50 1f db 08 97 6d 2c 40 |..... ..P....m,@|
00002980 63 fb ff 91 5e 6c 75 fc b9 44 86 16 77 1f 6d 65 |c...^lu..D..w.me|
00002990 4d 64 f8 56 ab 11 83 c7 7b 01 ac a0 f2 69 51 83 |Md.V....{....iQ.|
000029a0 b3 41 df c4 83 21 7a ce 75 37 3d f8 80 4f 6d 36 |.A...!z.u7=..Om6|
000029b0 06 63 55 15 ff de 7d 7a b1 ac dd 0c f8 41 63 bb |.cU...}z.....Ac.|
000029c0 42 cc a6 85 4a b5 52 f4 50 ec 9f 05 3f 9d 8b 8d |B...J.R.P...?...|
000029d0 64 fe 85 ba 8f ce 08 87 97 e2 8d 35 2c 9d 6a 2d |d..........5,.j-|
000029e0 cb 8c e2 7e 72 65 7d 7e 56 76 87 89 e6 ba cc 49 |...~re}~Vv.....I|
000029f0 bd 84 43 ef e6 3e 07 d6 00 00 00 00 00 00 00 00 |..C..>..........|
Indeed, the length field is 0x8c = 140 = 4 + 4 + 4 + 128, and the rounds number 0x2000 is reasonable. We have our salt!
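The same recognition can be sketched in Go. This is a minimal example that assumes little-endian metadata, as in the hexdump above. kdfHint and parseKDFHint are hypothetical names that mirror struct sr_crypto_kdf_pbkdf2. They are not part of any OpenBSD tool.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// kdfHint mirrors struct sr_crypto_kdf_pbkdf2 from the source above.
type kdfHint struct {
	Len    uint32
	Type   uint32
	Rounds uint32
	Salt   [128]byte
}

const (
	kdfHintSize   = 4 + 4 + 4 + 128 // 140 bytes, the 0x8c seen in the dump
	kdfTypePBKDF2 = 1               // SR_CRYPTOKDFT_PBKDF2
)

// parseKDFHint decodes a hint from raw metadata bytes, assuming
// little-endian fields, and rejects anything that is not a PBKDF2 hint.
func parseKDFHint(b []byte) (kdfHint, error) {
	var h kdfHint
	if len(b) < kdfHintSize {
		return h, errors.New("buffer too short for a KDF hint")
	}
	h.Len = binary.LittleEndian.Uint32(b[0:4])
	h.Type = binary.LittleEndian.Uint32(b[4:8])
	h.Rounds = binary.LittleEndian.Uint32(b[8:12])
	copy(h.Salt[:], b[12:kdfHintSize])
	if h.Len != kdfHintSize || h.Type != kdfTypePBKDF2 {
		return h, errors.New("not a PBKDF2 hint")
	}
	return h, nil
}

func main() {
	// The 12 header bytes from the hexdump; the salt is left zeroed for brevity.
	raw := make([]byte, kdfHintSize)
	copy(raw, []byte{0x8c, 0, 0, 0, 0x01, 0, 0, 0, 0x00, 0x20, 0, 0})
	h, err := parseKDFHint(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("len=%d type=%d rounds=%d\n", h.Len, h.Type, h.Rounds)
}
```

Run on the header bytes above, it prints len=140 type=1 rounds=8192.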
A checksum to check your key
While I was lurking in the source, this comment caught my eye:
/* Check that the key decrypted properly. */
sr_crypto_calculate_check_hmac_sha1(sd->mds.mdd_crypto.scr_maskkey,
    sizeof(sd->mds.mdd_crypto.scr_maskkey),
    (u_int8_t *)sd->mds.mdd_crypto.scr_key,
    sizeof(sd->mds.mdd_crypto.scr_key),
    check_digest);
if (memcmp(sd->mds.mdd_crypto.scr_meta->chk_hmac_sha1.sch_mac,
    check_digest, sizeof(check_digest)) != 0) {
	...
}
Apparently the correctness of the passphrase is checked by computing an HMAC of something and comparing it with an expected value.
Let's see what this chk_hmac_sha1 structure is.
/*
* Check that HMAC-SHA1_k(decrypted scm_key) == sch_mac, where
* k = SHA1(masking key)
*/
struct sr_crypto_chk_hmac_sha1 {
	u_int8_t	sch_mac[20];
} __packed;
Oh, thanks, that makes things much easier. What the comment calls "decrypted scm_key" is called scr_key in the snippet above.
We have our check algorithm:
HMAC-SHA1(k=SHA1(maskkey), scr_key) == sch_mac
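As a minimal sketch, that check could look like this in Go. checkMAC and keyOK are hypothetical helper names, and the inputs in main are zeroed placeholders, not data from a real disk.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha1"
	"fmt"
)

// checkMAC computes HMAC-SHA1 over the decrypted keys (scr_key),
// keyed with SHA1(maskkey) rather than with the mask key itself.
func checkMAC(maskKey, scrKey []byte) []byte {
	k := sha1.Sum(maskKey)
	mac := hmac.New(sha1.New, k[:])
	mac.Write(scrKey)
	return mac.Sum(nil)
}

// keyOK reports whether the computed MAC matches the stored sch_mac.
func keyOK(maskKey, scrKey, schMAC []byte) bool {
	return hmac.Equal(checkMAC(maskKey, scrKey), schMAC)
}

func main() {
	// Placeholder inputs with the sizes used by softraid crypto.
	maskKey := make([]byte, 32)
	scrKey := make([]byte, 2048)
	stored := checkMAC(maskKey, scrKey)

	fmt.Println(keyOK(maskKey, scrKey, stored)) // same key material
	scrKey[0] ^= 1
	fmt.Println(keyOK(maskKey, scrKey, stored)) // one flipped bit
}
```

The first call reports true and the second false, since a single flipped bit in the decrypted keys changes the MAC.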
Keys, keys that encrypt keys
Let's see how this scr_key is decrypted. Just above.
if (sr_crypto_decrypt((u_char *)sd->mds.mdd_crypto.scr_meta->scm_key,
    (u_char *)sd->mds.mdd_crypto.scr_key,
    sd->mds.mdd_crypto.scr_maskkey, sizeof(sd->mds.mdd_crypto.scr_key),
    sd->mds.mdd_crypto.scr_meta->scm_mask_alg) == -1)
	goto out;
sr_crypto_decrypt is just AES-ECB-256. So here is the last piece of the algorithm:
scr_key = AES-ECB-256_decrypt(k=maskkey, scm_key)
Hexdump spelunking
Now, it's a matter of finding scm_key and sch_mac in the disk image. Again, let's look at the data structures, starting with chk_hmac_sha1.
u_int32_t	scm_check_alg;	/* key chksum algorithm */
#define SR_CRYPTOC_HMAC_SHA1	1
u_int32_t	scm_pad2;
union {
	struct sr_crypto_chk_hmac_sha1	chk_hmac_sha1;
	u_int8_t			chk_reserved2[64];
} _scm_chk;
Sweet. We are looking for 01 00 00 00 (scm_check_alg), followed by 00 00 00 00 (scm_pad2), followed by 20 random bytes (SHA1). Sure enough, just after the salt, there's our check HMAC:
00002a60 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 |................|
00002a70 00 00 00 00 26 e8 25 6f 86 8f cd 33 88 1c d4 f1 |....&.%o...3....|
00002a80 1e 9d 2a 98 ca 21 2d 9c 00 00 00 00 00 00 00 00 |..*..!-.........|
00002a90 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
Finally, we need to find the encrypted key, scm_key. This took me a while, until I realized the size of this encrypted blob:
#define SR_CRYPTO_MAXKEYS 32 /* max keys per volume */
#define SR_CRYPTO_KEYBITS 512 /* AES-XTS with 2 * 256 bit keys */
#define SR_CRYPTO_KEYBYTES (SR_CRYPTO_KEYBITS >> 3)
u_int8_t scr_key[SR_CRYPTO_MAXKEYS][SR_CRYPTO_KEYBYTES];
/* symmetric keys used for disk encryption */
u_int8_t scm_key[SR_CRYPTO_MAXKEYS][SR_CRYPTO_KEYBYTES];
32 * 512/8 = 2048 = 0x800, 0x800 bytes of random stuff. You can't really miss it in the hexdump. But where are the boundaries? Well, if we are lucky, the line where the big random blob starts (00002160) and the one where the salt starts (00002960) will be approximately... Yes! Exactly 0x800 bytes apart :)
That random blob is all key material, followed by the PBKDF2 rounds and salt, and by the check HMAC.
Wrapping it up
So now we found all the pieces to write some code and find out if our assumptions were correct:
func main() {
	// scmKey, salt, rounds and decode are defined elsewhere in the package.
	scmKey := decode(scmKey)
	salt := decode(salt)
	maskkey := pbkdf2.Key([]byte("password"), salt, rounds, 32, sha1.New)
	// AES-ECB-256_decrypt(k=maskkey, scm_key) = scr_key
	a, err := aes.NewCipher(maskkey)
	if err != nil {
		log.Fatal(err)
	}
	// Decrypt in place: after this loop scmKey holds the decrypted scr_key.
	for i := 0; i < len(scmKey); i += a.BlockSize() {
		a.Decrypt(scmKey[i:i+a.BlockSize()], scmKey[i:i+a.BlockSize()])
	}
	// HMAC-SHA1(k=SHA1(maskkey), scr_key) == sch_mac
	h := sha1.Sum(maskkey)
	mac := hmac.New(sha1.New, h[:])
	mac.Write(scmKey)
	expectedMAC := mac.Sum(nil)
	fmt.Print(hex.Dump(expectedMAC))
}
If we are right, this will output the same HMAC as in the last hexdump snippet. The first time I forgot to hash the maskkey, and almost tore my hair out. But then...
$ go build -i . && ./openbsd-fde-crack
00000000 26 e8 25 6f 86 8f cd 33 88 1c d4 f1 1e 9d 2a 98 |&.%o...3......*.|
00000010 ca 21 2d 9c |.!-.|
Voilà!
Now that we know how to extract the data and how to try passphrases against it, it will be trivial to write a bruteforce tool to recover the part of the passphrase I forgot.
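The shape of such a tool can be sketched as follows. This is not the code from the repository linked below. candidates and search are hypothetical helpers, the template and word list are made up, and verify is a toy stand-in for the PBKDF2, AES-ECB and HMAC check described above.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
	"sync"
)

// candidates fills the forgotten slot ("{}") of a passphrase template
// with each guess for the missing word.
func candidates(template string, words []string) []string {
	out := make([]string, 0, len(words))
	for _, w := range words {
		out = append(out, strings.Replace(template, "{}", w, 1))
	}
	return out
}

// search runs verify on every candidate using one worker per CPU and
// returns a matching passphrase, if any. Each verify call is expensive
// (thousands of PBKDF2 rounds), so the work parallelizes well.
func search(cands []string, verify func(string) bool) (string, bool) {
	jobs := make(chan string)
	found := make(chan string, 1)
	var wg sync.WaitGroup
	for w := 0; w < runtime.NumCPU(); w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for c := range jobs {
				if verify(c) {
					select {
					case found <- c: // keep the first hit
					default: // a hit is already recorded
					}
				}
			}
		}()
	}
	for _, c := range cands {
		jobs <- c
	}
	close(jobs)
	wg.Wait()
	select {
	case c := <-found:
		return c, true
	default:
		return "", false
	}
}

func main() {
	// Made-up template and guesses, including misspellings.
	cands := candidates("correct {} battery staple", []string{"horse", "hourse", "orse"})
	// Toy verifier so the sketch runs on its own.
	verify := func(p string) bool { return p == "correct hourse battery staple" }
	p, ok := search(cands, verify)
	fmt.Println(p, ok)
}
```

This version keeps testing the remaining candidates after a hit, which keeps the sketch simple. A real tool would probably stop early and report progress.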
There's some code on GitHub, but don't expect a fire-and-forget tool; this post gives you enough information to figure things out on your own: https://github.com/FiloSottile/openbsd-fde-crack
To know what happens the next time I lose a password (sigh), follow me on Twitter.
UPDATE: I found it! After fixing a bug or two in the brute force tool and almost losing hope, it found the right combination of forgotten word and (Italian) misspelling.
UPDATE: I later found a nice article documenting the entire system. It also includes references to JohnTheRipper having a module for this. Well, this was more fun.