Sunday, August 21, 2022

How To Recover the Password for an Encrypted Veracrypt Container

Veracrypt, the successor of Truecrypt, is free and open-source encryption software. Think of it like Microsoft's Bitlocker -- except it is free and completely open source.[1]

In this blog post, I will briefly show you how to recover the password for a (non-system) Veracrypt container if you still remember a substantial part of it. I've managed to recover my own partially forgotten passwords using the methods I detail below. Essentially, we will be conducting a hybrid attack on the PBKDF2-derived key for the Veracrypt container -- i.e., a hybrid between a dictionary attack and a brute-force attack. The tools that we will make use of are any Linux distribution (Debian, Red Hat, etc.), Python, and Hashcat, a powerful and open-source password cracker.[2]

Let us begin. The first thing to bear in mind is that the first 512 bytes of a Veracrypt container are a header that contains all the ingredients needed to calculate the key to your container (e.g., the salt). Below is a detailed table of (at least) what is contained in the first 512 bytes.


 

Note that the header is the first 512 bytes only in standard volumes or containers, not in system volumes. In the latter, Veracrypt documentation specifies that the header is the last 512 bytes of the first logical track. This is what the documentation states:

The first 512 bytes of the volume (i.e., the standard volume header) are read into RAM, out of which the first 64 bytes are the salt (see VeraCrypt Volume Format Specification). For system encryption (see the chapter System Encryption), the last 512 bytes of the first logical drive track are read into RAM (the VeraCrypt Boot Loader is stored in the first track of the system drive and/or on the VeraCrypt Rescue Disk).
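The layout described in the quoted passage can be sketched in a few lines of Python. This is only an illustration for a standard (non-system) volume: the function name split_header is hypothetical, and the 448 bytes that follow the salt are treated as one opaque blob (per the VeraCrypt Volume Format Specification, that remainder is the encrypted part of the header).

```python
from typing import Tuple

HEADER_SIZE = 512  # standard volume header length, in bytes
SALT_SIZE = 64     # the first 64 bytes of the header are the salt


def split_header(header: bytes) -> Tuple[bytes, bytes]:
    """Split a 512-byte standard volume header into (salt, remainder)."""
    if len(header) != HEADER_SIZE:
        raise ValueError(f"expected {HEADER_SIZE} bytes, got {len(header)}")
    return header[:SALT_SIZE], header[SALT_SIZE:]


if __name__ == "__main__":
    # Stand-in data so the sketch runs without a real container.
    fake_header = bytes(range(256)) * 2
    salt, remainder = split_header(fake_header)
    print(len(salt), len(remainder))  # prints: 64 448
```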
In any case, we want to extract this header and then test every candidate password in a word list against it: for each candidate, a key is derived from the password and the header's salt, and that key is used to try to decrypt the rest of the header. If the decryption succeeds, then we will know that we have recovered our password! So the first thing we want to do is to extract these initial 512 bytes. This can be easily done in the Linux command line; just type the following in your terminal:

dd if=[container name] of=[anything] bs=512 count=1



"if" stands for "input file," and "of" stands for "output file." "bs" is the block size in bytes, and "count" is the number of blocks to copy. So the above terminal command (one 512-byte block) will produce the same output as the following (512 one-byte blocks):

dd if=[container name] of=[anything] bs=1 count=512



Alternatively, you can just run the following Python code to the same effect (just be sure to change the file names accordingly):

# Copy the first 512 bytes (the volume header) into a new file
with open("in-file", "rb") as in_file, open("out-file", "wb") as out_file:
    out_file.write(in_file.read(512))

 

We have now extracted the header from the Veracrypt container. As we shall soon see, Hashcat has an option that will automatically recognize a Veracrypt header and compute the derived key from the information contained therein (e.g., from the salt and key-derivation function used). But first, let's generate the wordlist we will pass to Hashcat.
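To give a feel for the work done per candidate password, here is a rough sketch of just the key-derivation step, using only Python's standard library. It rests on several assumptions: a standard container with default settings (no custom PIM) and HMAC-SHA-512, for which the VeraCrypt documentation lists 500,000 PBKDF2 iterations, and a 64-byte key as used by AES in XTS mode. The name derive_header_key is hypothetical, and the follow-up step (decrypting the header with this key and checking the result) is not shown.

```python
import hashlib

SALT_SIZE = 64                # first 64 bytes of the header are the salt
DEFAULT_ITERATIONS = 500_000  # assumption: default PIM, SHA-512, non-system volume


def derive_header_key(password: str, header: bytes,
                      iterations: int = DEFAULT_ITERATIONS) -> bytes:
    """Derive a 64-byte key from a candidate password and the header's salt."""
    salt = header[:SALT_SIZE]
    return hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"), salt,
                               iterations, dklen=64)


if __name__ == "__main__":
    fake_header = bytes(512)  # stand-in for the extracted "out-file"
    # A low iteration count keeps the demo fast; real settings are far slower.
    key = derive_header_key("ACHILLESheel2385", fake_header, iterations=1000)
    print(len(key))  # prints: 64
```

The high iteration count is what makes every single guess expensive, which is why keeping the wordlist small matters so much.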

Now, for the sake of the demonstration here, we'll keep it simple. So let's say that the password we forgot was ACHILLESheel2385. Furthermore, let's say that we remember ACHILLESheel, and that there was a one- to five-digit number at the end. So, taking a step back, it should be clear what we need to do: produce a word list containing every string that matches the following RegEx pattern:

/ACHILLESheel\d{1,5}/
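To make the size of this search space concrete, here is a minimal standard-library sketch that enumerates exactly the strings this pattern describes. It is an illustration only: the function name candidates is hypothetical, and \d is taken to mean the ASCII digits 0-9.

```python
from typing import Iterator


def candidates(prefix: str, min_digits: int = 1, max_digits: int = 5) -> Iterator[str]:
    """Yield prefix + every digit string of length min_digits..max_digits."""
    for width in range(min_digits, max_digits + 1):
        for n in range(10 ** width):
            # Zero-pad so that e.g. "007" is produced for width 3.
            yield f"{prefix}{n:0{width}d}"


if __name__ == "__main__":
    words = list(candidates("ACHILLESheel"))
    print(len(words))                   # prints: 111110
    print("ACHILLESheel2385" in words)  # prints: True
```

The total is 10 + 100 + 1,000 + 10,000 + 100,000 = 111,110 candidates, one for each digit string of length one through five.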

Thankfully, there's a great Python module that will construct all the strings that match a specified RegEx pattern -- exrex. In your terminal, type the following command (but just make sure that you have the exrex module installed):

python exrex.py 'ACHILLESheel\d{1,5}' -o wordlist.txt

 

This will generate a wordlist of ACHILLESheel followed by every combination of one to five digits, saved as wordlist.txt in your current working directory. We are now ready to use Hashcat. If you type "hashcat -h" in your terminal, you should see that Hashcat associates a numeric mode with each of the many hashing algorithms and programs it supports, including Veracrypt. Hashcat basically ships with prebuilt routines for computing the hashes produced by a wide variety of algorithms and programs. The mode that we are looking for is 13721, as this is Hashcat's reference for Veracrypt's default encryption settings. Hashcat has modes for other, non-default settings as well, but 13721 is the one we are looking for.



So all the preparation is done: we are now ready to run Hashcat and crack the password! Type in the following command:

hashcat -m 13721 --status out-file wordlist.txt



Note that if the 111110 passwords are taking too long to crack on your system, you can just "cheat" and use a different regex pattern that will drastically decrease the computing time (since this is just an example). For example, you can use /ACHILLESheel23\d{2}/. This will cut the wordlist down from 111110 passwords to 100! After you run the command, your output should be something like the following:

So there you have it -- this is the process for successfully recovering a Veracrypt password that you partially remember! It's important to underscore that the more of your password you remember, the easier it is to crack. If you don't have any idea of what your password is, and just remember that it is very long and complex, it's very likely that all the computers in the world wouldn't be able to crack the password for aeons!

----------------------------------------------------------------------------

[1] In general, I believe that FOSS that is widely used is more secure than its closed-source counterparts. This is for the simple reason that its code is open, and so, if it is quite prominent (like Veracrypt, Signal, DD-WRT, Ansible, Ubuntu, Libre Office, Android, etc.), it should have undergone more security-expert scrutiny than its closed-source counterparts. Indeed, it should be continuously undergoing more scrutiny than its commercial closed-source counterparts. Now, one may argue that the sword cuts both ways here: the open-source nature of FOSS means that there are more malicious developers with access to the code base and actively trying to develop exploits against it. Thus, for example, there is more malware written against (closed-source) Windows operating systems than there is written against (open-source) Linux operating systems. While this is true, I still think that the sword cuts more in the way of security than insecurity here, as I doubt that critical vulnerabilities discovered by an APT in open-source software will go undetected by the rest of the world's security experts for much longer than similar vulnerabilities in closed-source software. The latter are simply black boxes. There's no way for the world's security community to tell whether such software is safe and free of backdoors. So my view is that when it comes to security, FOSS is king. And I believe most infosec experts would agree.

[2] We could also use Crunch, a program that generates password wordlists based on user-defined parameters. But if one knows RegEx it should be easier to use the Python module exrex to generate wordlists based on specified RegEx patterns. One can use a tool like RegExPal to check if one's desired output text(s) matches one's RegEx pattern.