toplip - "the best place to hide something is right under your nose." toplip is our command line, very strong encryption and decryption utility with optional plausible deniability, image embedding, and multiple/variable passphrase complexity.
- Very strong encryption (XTS-AES256 based, possibly cascaded)
- Optional "plausible deniability"
- Optional image embedding/extraction (PNG/JPG)
- Optional multiple passphrase protection
- Simplified brute force recovery protection
- No identifiable output markers
- Open source/GPLv3
- Commercial support/training
A standalone executable binary of toplip can be downloaded from our Products page. Please note that you will have to chmod +x it after download.
As with all our products, toplip is bundled with the HeavyThing library itself. The download link for our library is in the top right of every page on our site, along with the SHA256 sum of the download itself. If you have downloaded the same version and your SHA256 does not match, it has been modified by parties other than ourselves.
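Checking a download against the published sum takes only a few lines. The following is a minimal sketch in Python, not part of toplip or the HeavyThing library; the function names are illustrative, and the archive name and digest in the usage comment are placeholders to substitute with the real values.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_sum(path: str, published_hex: str) -> bool:
    """Compare the computed digest with a published one (case-insensitive)."""
    return hmac.compare_digest(sha256_of_file(path), published_hex.strip().lower())


# Hypothetical usage (both arguments are placeholders):
# matches_published_sum("downloaded-archive.tar.gz", "<SHA256 sum shown on the site>")
```

If the function returns False for the same version, treat the download as modified.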
NOTE: Compiling from sources is only necessary if you wish to modify toplip itself or make configuration changes to the HeavyThing library. The toplip executable binary is released as part of the HeavyThing library itself.
AUD$250,000 Crypto Throwdown
Our bug and design-flaw finding challenge has ended uneventfully. Please see the $250k Throwdown page for our published keys and commentary.
Table of Contents
- Feature Highlights
- Feature Overview
- Source Code
- Usage and options
- -b - I/O in base64
- -d - decryption
- -r - one-time pass generation
- -m mediafile - encryption output inside images
- -1 - disable cascaded AES256
- -c COUNT - multiple passphrases
- -i ITER - specify key derivation iterations
- -drbg - mix HMAC_DRBG with scrypt key material
- -nomix - do not mix TLSv1.2 PRF with scrypt key material
- -alt inputfile - plausible deniability/alternate inputs
- -noalt - do not add extra random data to output
- Example usage
- Technical Information
- Cryptographic design motivations
- Underlying crypto methods
- Effective Security
- Encryption output format
- Decryption discovery/HEADER validity pseudocode
- Hacking and brute force recovery notes
Our toplip utility has unique and simple to use features that we were not able to find in other command line utilities.
Very strong encryption
At the core of toplip's chosen encryption method is AES256, along with an XTS-AES design to protect confidential data. To construct our passphrase-based key materials, we use scrypt as a building block, and optionally mix other key derivation functions into it. See the Technical Information section if you are interested in the nitty-gritty.
Optional "plausible deniability"
Each toplip encrypted output may contain up to four wholly independent files, each created with its own separate (and unique) passphrase material. Due to the way the encrypted output is put together, there is no way to easily determine whether or not multiple files actually exist in the first place. By default, even if only one file is encrypted using toplip, random data is added automatically. If however more than one file is specified, each with its own passphrase material, then you can selectively extract each file independently and thus deny the existence of the other files altogether. This effectively allows a user to open an encrypted bundle with controlled exposure risk, and no computationally inexpensive way for an adversary to conclusively identify that additional confidential data exists. See the Technical Information section if you are interested in the nitty-gritty.
Optional image embedding/extraction (PNG/JPG)
Unless you are working on crypto or pseudorandom number generators, it is unlikely that you have a reason to have a slew of files on your computer that look like random data. Placing important or otherwise private encrypted materials inside images means that a casual observer will not discover there is any extraneous data inside the image to begin with. The images that contain toplip encrypted data can be opened/viewed without error or revealing the toplip contents.
Optional multiple passphrase protection
The ability to specify more than one passphrase per encrypted file allows for a much greater difficulty level in extraction and decryption operations. Each supplied passphrase must undergo its own expensive key material generation stage, and allowing multiple passphrases provides simple command line methods to greatly improve brute force resistance. The end result of our multiple passphrase protection is cascaded AES256. See the Technical Information section if you are interested in the nitty-gritty.
Simplified brute force recovery protection
The original design intention of scrypt was to "make it costly to perform large-scale custom hardware attacks..." Building upon this theme, we provide command line options to increase the PBKDF2 iteration counts of scrypt's initial and final output stages. In addition to the aforementioned multiple passphrase protection, we chose to mix TLSv1.2 PRF or HMAC_DRBG output into the scrypt output material, and finally we perform a substantial amount of "one-way" AES256 key material operations. The end result of our efforts in this regard means that even with the default settings, fast brute-force passphrase attacks are not possible with today's hardware. See the Technical Information section if you are interested in the nitty-gritty.
Usage and options
    # ./toplip
    Usage: toplip [-b] [-d] [-r] [-m mediafile] [[-nomix|-drbg][-1][-c COUNT][-i ITER ]-alt inputfile] [-nomix|-drbg][-1][-c COUNT][-i ITER ]inputfile
      -b == input/output in base64 (see below notes)
      -d == decrypt the inputfile
      -r == generate (and display of course) one time 48 byte each pass phrases as base64
      -m mediafile == for encrypting only, merge the output into the specified mediafile.
          Valid media types: PNG, JPG (plain JFIF or EXIF). (Note that decrypting will
          auto-detect and attempt to extract if the inputfile for decryption is given a
          media file).
      -1 == for each input file (-alt or main), this option disables the use of cascaded
          AES256, and instead uses a single AES256 context (two for the XTS-AES stage).
      -c COUNT == for each input file (-alt or main), this option overrides the default
          count of one (1) passphrase. Specifying a higher count here will ask for this
          many actual passphrases, and generate this number of separate key material and
          crypto contexts that are then used over-top of each other (cascaded).
      -i ITER == for each input file (-alt or main), specify an alternate iteration count
          for scrypt's internal use of PBKDF2-SHA512 (default is 1). For the initial 8192
          bytes of key material, and before one-way AES key grinding of same, we use
          scrypt and this option overrides how many iterations of PBKDF2-SHA512 it will
          perform for each passphrase. (NOTE: this can _dramatically_ increase the calc
          times). Hex values or decimal values permitted (e.g. 10, 0xfff, etc).
      -drbg == for each input file, by default the 8192 bytes of key material is xor'd
          with TLSv1.2 PRF(SHA256) of the supplied passphrase(s). This option will mix
          the key material with HMAC_DRBG(SHA256) instead.
      -nomix == for each input file (see -drbg), this option specifies no additional
          mixing of the scrypt generated 8192 byte key material.
      -alt inputfile == generate one or more "Plausible Deniability" file (encrypting only)
          This will ask for another set of passphrases, which MUST NOT be the same.
          Without this option, three alternate contents are randomly generated such that
          it is impossible to tell by examining the encrypted output whether there is
          or is not anything other than pure random. See the -noalt option for what
          happens without this option. This option can be specified up to 3 times (for a
          max of 4 files).
      -noalt == Do not generate additional random data. By default, extra random data is
          inserted into the encrypted output such that forensic analysis (with a valid
          set of passphrases) on a given encrypted output does not cover all of the
          ciphertext present. See further commentary below about why the default setting
          is a good thing. Specifying this option means that no extra random data is
          inserted into the output (and this might be useful if you do not need plausible
          deniability, or you are dealing with very large files).
-b - I/O in base64
If encrypting, this will output the encrypted data in base64. If decrypting, this will decode the base64 encoded data prior to decryption. Note that this option is mutually exclusive with the use of image files (image embedding/extraction requires binary form only).
-d - decryption
Use this option to decrypt the given inputfile. Note that only one input file can be supplied, and that if the input file is a supported image type, it will automatically be detected for suitable extraction.
-r - one-time pass generation
For encryption only, if this option is supplied, no passphrase(s) will be asked for. Instead, passphrases of 48 random bytes each are generated and output as base64 strings. This is useful for one-time strong passphrase generation of encrypted outputs.
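To give a feel for what such a generated passphrase looks like, here is a small Python sketch. It is only an illustration of "48 random bytes shown as base64": it uses the operating system's randomness via the `secrets` module rather than toplip's own generator, and the function name is hypothetical.

```python
import base64
import secrets


def one_time_passphrase(num_bytes: int = 48) -> str:
    """Return `num_bytes` of OS-provided randomness encoded as a base64 string."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


if __name__ == "__main__":
    # 48 bytes encode to exactly 64 base64 characters with no '=' padding.
    print(one_time_passphrase())
```

Because 48 is a multiple of 3, the base64 form is always 64 characters long and carries no padding characters.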
-m mediafile - encryption output inside images
For encryption only, use this option to merge the encrypted output into the supplied image (PNG or JPEG, either JFIF or EXIF formats).
-1 - disable cascaded AES256
By default, toplip makes use of cascaded AES256. Use this option to disable cascading and use a single AES256 context (two for the XTS-AES stage as the XTS specification requires).
-c COUNT - multiple passphrases
Specified separately for each inputfile (encryption, or for the inputfile on decryption), this option overrides the default count of 1 passphrase. Specifying a higher count here will ask for this many actual passphrases (or generate them for you if -r was specified). Each supplied passphrase generates separate 8192 byte key material blocks that are then used in cascaded form.
-i ITER - specify key derivation iterations
Specified separately for each inputfile (encryption, or for the inputfile on decryption), this option overrides the default iteration count of 1 for scrypt's initial and final PBKDF2 stages. Hexadecimal or decimal values permitted, e.g. 0xf00d, 10, etc. NOTE: this can dramatically increase the calculation times.
-drbg - mix HMAC_DRBG with scrypt key material
Specified separately for each inputfile (encryption, or for the inputfile on decryption), this option overrides the default key material mixing of TLSv1.2 PRF for each passphrase's 8192 byte key material block, and instead mixes in HMAC_DRBG output.
-nomix - do not mix TLSv1.2 PRF with scrypt key material
Specified separately for each inputfile (encryption, or for the inputfile on decryption), this option disables TLSv1.2 PRF mixing of the final 8192 bytes of key material per passphrase. Note that this option is mutually exclusive with the -drbg option.
-alt inputfile - plausible deniability/alternate inputs
For encryption only, this option generates one or more "plausible deniability" alternate embedded files inside the final output. For each inputfile specified, separate set(s) of passphrases will be acquired or generated, and for each input file specified, different passphrase generation parameters (-c COUNT, -i ITER, -drbg, -nomix, -1) may be specified.
-noalt - do not add extra random data to output
For encryption only, this option disables the generation of random "bogus" files in the final encrypted output. By default, if fewer than four input files are specified, toplip adds random data, sized relative to the main inputfile (random size with its upper limit set there and its lower limit set to the internal minimum size per file). Using this option, no additional random data will be added, and only the input files specified with the -alt option and the main inputfile will be included in the final output.
The examples contained herein are by no means an exhaustive collection of use cases. How toplip is used should depend largely on your individual security requirements for what you are protecting, and common sense is obviously a prerequisite. NOTE: we did not include stdout output for the below examples. If you run these commands as listed, they will work as expected.
No plausible deniability, defaults
    user@dev:/tmp> echo "toplip encryption example" > example1.plaintext
    user@dev:/tmp> toplip example1.plaintext > example1.encrypted
    user@dev:/tmp> toplip -d example1.encrypted
    # To restore the file instead of writing to stdout:
    user@dev:/tmp> toplip -d example1.encrypted > example1.decrypted
Plausible deniability, defaults
    # this example is the same as above, only with 2 input files
    # and two separate passphrases
    user@dev:/tmp> echo "toplip encryption primary" > example2.primary
    user@dev:/tmp> echo "toplip encryption secondary" > example2.secondary
    user@dev:/tmp> toplip -alt example2.secondary example2.primary > example2.encrypted
    # That will have asked for two separate passphrases
    # To extract either, supply the respective passphrase to toplip:
    user@dev:/tmp> toplip -d example2.encrypted
    # To restore the file instead of writing to stdout:
    user@dev:/tmp> toplip -d example2.encrypted > example2.decrypted
Image embedding, defaults
    # this example assumes we have an image named: familyphoto.jpg in our directory
    user@dev:/tmp> echo "toplip encryption example" > example3.plaintext
    user@dev:/tmp> toplip -m familyphoto.jpg example3.plaintext > newfamilyphoto.jpg
    # newfamilyphoto.jpg now contains our hidden encrypted data, to extract:
    user@dev:/tmp> toplip -d newfamilyphoto.jpg
    # To restore the file instead of writing to stdout:
    user@dev:/tmp> toplip -d newfamilyphoto.jpg > example3.decrypted
Image embedding, plausible deniability, defaults
    # this example is the same as above, only with 2 input files
    # and two separate passphrases
    user@dev:/tmp> echo "toplip encryption primary" > example4.primary
    user@dev:/tmp> echo "toplip encryption secondary" > example4.secondary
    user@dev:/tmp> toplip -m familyphoto.jpg -alt example4.secondary example4.primary > newfamilyphoto.jpg
    # That will have asked for two separate passphrases
    # To extract either, supply the respective passphrase to toplip:
    user@dev:/tmp> toplip -d newfamilyphoto.jpg
    # To restore the file instead of writing to stdout:
    user@dev:/tmp> toplip -d newfamilyphoto.jpg > example4.decrypted
    # Same as the second example, but with increased passphrase complexity
    # NOTE: if these settings are used as-is, it will take a fair while
    user@dev:/tmp> echo "toplip encryption primary" > example5.primary
    user@dev:/tmp> echo "toplip encryption secondary" > example5.secondary
    user@dev:/tmp> toplip -c 2 -i 0x8000 -alt example5.secondary -c 6 -i 0x10000 example5.primary > example5.encrypted
    # That will have asked for 6 passphrases for the primary
    # and 2 passphrases for the secondary
    # To extract either, supply the respective set of passphrases to toplip:
    # primary extraction:
    user@dev:/tmp> toplip -c 6 -i 0x10000 -d example5.encrypted
    # secondary extraction:
    user@dev:/tmp> toplip -c 2 -i 0x8000 -d example5.encrypted
All of what follows is from the commentary inside the toplip source file itself, which can also be browsed in HTML: toplip source in HTML.
Cryptographic design motivations
- Passphrase-based key derivation
Computationally expensive key derivation, combined with optional cascading of multiple keys was the primary objective. This was specifically to render high-speed passphrase brute forcing difficult, and in a user-specified way (with command line options that obviously need to be remembered along with passphrases themselves).
Thanks to cryptocurrency proliferation, there are now a great many hash accelerators. Things like hashcat.net (hats off to that) and others mean that "simple" or straightforward hash-based KDFs don't provide the same level of protection for passphrases. By modifying scrypt to make use of HMAC-SHA512 in its initial and final stages of PBKDF2, and further by allowing arbitrary iteration counts for same, we can effectively render "purist" hash brute forcers ineffective. Further, because we do not store any part of the "computed hash" (key material) in the output, all brute force attempts require a minimum number of calculations, much as with traditional disk encryption. By combining the output from scrypt-SHA512 with other PRFs and performing one-way AES256 key "grinding", the task of brute forcing is considerably more expensive than with any single PBKDF method.
- Plausible deniability payload discovery
The ability to optionally embed multiple payloads inside the same contiguous output without disclosing the existence of same, each with their own separate passphrases/key materials.
- Payload confidentiality
Making use of XTS-AES, optionally in combination with our cascaded AES256, as well as via multiple-passphrase-based cascaded XTS-AES.
Underlying crypto methods
No "new" or "homebrew" crypto has been used here. AES256, HMAC-SHA512, scrypt (and thus
PBKDF2), TLSv1.2 PRF(SHA256), HMAC-DRBG(SHA256), and XTS-AES (via htxts) have been put together
in carefully considered ways to achieve the aforementioned design goals.
As explained briefly above, each passphrase supplied is passed to scrypt-sha512 to derive 8192 bytes of key material. Command line options allow for specifying a >1 PBKDF2 iteration count that the underlying scrypt function uses. Note that the SALT used for scrypt (and other mixing functions) is 32 bytes of PRNG output.
Depending on command line options, this 8192 bytes is then further manipulated (listed per opt):
-nomix: Nothing further is done.
-drbg: HMAC_DRBG(SHA256) is used, seeded with SALT || SHA512(passphrase), to generate an additional 8192 bytes of key material, which is then xor'd with the scrypt output.
default: TLSv1.2 PRF(SHA256) is used, secret = passphrase, label = 'key derivation', seed = SALT to generate an additional 8192 bytes of key material, which is then xor'd with the scrypt output.
Note that our use of scrypt, TLSv1.2 PRF, and HMAC_DRBG for key derivation are not NIST/FIPS approved methods of key derivation.
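The default mixing step can be sketched in Python as follows. This is an illustration under stated assumptions, not toplip's code: the PRF follows RFC 5246 section 5, but the scrypt call is Python's stock `hashlib.scrypt` (PBKDF2-HMAC-SHA256 internally, iteration count 1) standing in for our SHA512-modified scrypt, so the bytes produced will not match toplip's. The function names are hypothetical, and the -drbg variant is not shown.

```python
import hashlib
import hmac

KEY_MATERIAL_LEN = 8192  # bytes of key material per passphrase


def tls12_prf_sha256(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """TLSv1.2 PRF with SHA256 (RFC 5246, section 5): P_SHA256(secret, label + seed)."""
    full_seed = label + seed
    a = full_seed  # A(0)
    out = b""
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i) = HMAC(secret, A(i-1))
        out += hmac.new(secret, a + full_seed, hashlib.sha256).digest()
    return out[:length]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def derive_key_material(passphrase: bytes, salt: bytes, nomix: bool = False) -> bytes:
    """Sketch of 'scrypt output xor TLSv1.2 PRF output' for one passphrase."""
    # ASSUMPTION: stock scrypt as a stand-in for the SHA512-modified scrypt.
    base = hashlib.scrypt(passphrase, salt=salt, n=1024, r=1, p=1,
                          dklen=KEY_MATERIAL_LEN)
    if nomix:  # corresponds to -nomix: nothing further is done
        return base
    mix = tls12_prf_sha256(passphrase, b"key derivation", salt, KEY_MATERIAL_LEN)
    return xor_bytes(base, mix)
```

An attacker validating a passphrase guess has to run both derivations for every candidate, since the xor leaves neither output usable on its own.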
Once the above has derived 8192 bytes of key material, that is passed to the "htcrypt" set of functions, which is an encapsulation of 256 separate AES256 encryption and decryption contexts. As explained in the htcrypt.inc commentary, htcrypt's init then takes the first 64 bytes of the scrypt-generated key material, and uses it as its "main sequencer." It then initializes 256 AES256 encryption contexts using 32 bytes each of the original 8192 bytes (32x256 == 8192). Then, using the last 64 bytes of the original 8192 bytes of key material as a "temporary sequencer", it encrypts the full 8192 bytes of key material 64 times, using the temporary sequencer bytes as indexes to the AES256 encryption contexts initialized before. It performs this "reencrypting the 8192 bytes of key material 64 times" a full 1024 times. This is effectively using AES256 as a CSPRNG to further arrive at a computationally difficult final set of 8192 bytes key material.
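For a rough sense of scale, the grinding step can be counted out in Python. The sketch assumes that each pass over the 8192 bytes costs one AES256 block operation per 16-byte block, and it ignores the cost of initializing the 256 key schedules; the constant names are ours for illustration.

```python
KEY_MATERIAL_BYTES = 8192
AES256_KEY_BYTES = 32
AES_BLOCK_BYTES = 16
PASSES_PER_ROUND = 64   # "encrypts the full 8192 bytes of key material 64 times"
ROUNDS = 1024           # "... a full 1024 times"

# 8192 / 32 = 256 separate AES256 contexts
contexts = KEY_MATERIAL_BYTES // AES256_KEY_BYTES

# 8192 / 16 = 512 AES blocks touched on every pass
blocks_per_pass = KEY_MATERIAL_BYTES // AES_BLOCK_BYTES

# 512 * 64 * 1024 = 33,554,432 (2**25) block encryptions per passphrase
grinding_block_encryptions = blocks_per_pass * PASSES_PER_ROUND * ROUNDS

if __name__ == "__main__":
    print(contexts, blocks_per_pass, grinding_block_encryptions)
```

Under that reading, every passphrase guess costs on the order of 2**25 AES256 block encryptions on top of the scrypt and mixing work.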
Once the final 8192 bytes of key material is arrived at, htcrypt then initializes all 256 separate AES256 encryption and decryption contexts using 32 bytes each in succession. Unless the -1 option is specified, for later-done encrypt operations, each call to htcrypt$encrypt actually results in 64 separate AES256 block encryptions, and the contexts used are determined by the initial "main sequencer" noted earlier in forward order. For decrypt operations, the "main sequencer" bytes are used in reverse for the underlying AES256 decrypt calls. If the -1 option is set, only one AES256 context is used (at the end of the 8192 bytes of generated key material), and thus does not use cascaded AES256 like the default.
For the 128 bytes of HEADER_OR_IV described below, htcrypt encrypt/decrypt is then used in a CBC manner. The 16 byte IV is randomly generated, and then for each set of htcrypt contexts (one per passphrase set), the HEADER = HEADER xor IV, htcrypt$encrypt HEADER, IV = IV xor HEADER, htcrypt$encrypt IV. (see below for more detail)
For the actual payload encryption, we make use of the "htxts" set of functions, which is identical to AES-XTS except for the use of htcrypt$encrypt and htcrypt$decrypt instead of a single AES256 encrypt/decrypt. The initial XTS 16 byte tweak value is the unencrypted first 16 byte IV (randomly generated). The first block of htxts plaintext is prepended with 64 bytes of additional random IV.
For integrity verification, an HMAC-SHA512 is appended to the output, along with a partial randomly sized garbage block (to prevent all outputs from being 16-byte aligned).
The point of cascading AES256 was not necessarily to increase the security of AES256 in and of itself, but to mandate the use and generation of a full 8192 bytes of key material. At best this dramatically increases its security, and at worst it falls back to a single-key AES256. If -1 is specified, thereby disabling cascaded AES256, then all of the encrypt/decrypt ops are "by the book" and the key material used for the single (and double for XTS) AES256 is taken from the end of the 8192 bytes of key material, still enforcing the use of 8192 bytes. If -1 is not specified (the default), then the effective security at the block level for the HEADER_OR_IV portion is somewhere above a single 32 byte AES256 key, and for the XTS portion somewhere above two 32 byte keys, and for our requirements this is more than sufficient. Noting here that there is ongoing debate about what the actual effective security gains are from performing cascaded AES256, but for the purposes herein, regardless of how that debate turns out, the design goals are satisfied.
At the time of this writing:
Since related key attacks do not apply to this design, at worst our HEADER_OR_IV (see below) security sits at 2**254, and if cascaded AES256 is enabled, possibly much higher still. For the XTS payload encryption, since it uses two full sets of AES256 keys, at worst we are at the "non-key-halved" full security of XTS-AES. If cascaded AES256 does indeed lend a security increase, it could further be argued that the 64 bytes of sequencing material also contribute to the overall security. Either way, our baseline effective security is more than sufficient for the design goals herein.
The command line options for constructing key material complexity provide for a much higher than "typical" level of passphrase-based security, and at the end of the day, the effective security is most likely tied to the passphrase(s) as it should be. (Noting of course you could provide extremely high entropy passphrases and shift it the other way.)
Encryption output format
Each output starts with a 32 byte SALT, followed by eight 16 byte blocks, each of which can be
used as an IV or a HEADER block. These eight blocks are randomly ordered and selected at encrypt
time, and a HEADER consists of the start and end offsets in the overall crypto stream. Which 2
blocks get used by a given stream is chosen at random. If plausible deniability is enabled (and
thus more than one encryption stream exists), each one chooses its own 2 random blocks. For
unused blocks, they are simply PRNG initialized.
The order of input files is randomized before encryption begins (which may include "bogus" files, and/or alternates).
Following the aforementioned 32+128 bytes, up to four separate htxts encrypted payloads follow. Each plaintext payload is prepended with 64 bytes of PRNG, appended with a block sized padding, followed by an HMAC_SHA512 of the plaintext, followed by a random length (1..15) of garbage data.
Note that the 64 byte PRNG "preamble" does not increase security in any way, because we are not making use of block chaining (XTS-AES method is used). We do however include it in the HMAC calculation, as well as use the trailing 4 bits as our padding indicator.
    [0..31] == SALT
    [32..47] == HEADER_OR_IV
    [48..63] == HEADER_OR_IV
    [64..79] == HEADER_OR_IV
    [80..95] == HEADER_OR_IV
    [96..111] == HEADER_OR_IV
    [112..127] == HEADER_OR_IV
    [128..143] == HEADER_OR_IV
    [144..159] == HEADER_OR_IV
    [160...] == crypted materials and PRNG mix (one or more)

    crypted material
    HTXTS(
      [0..63] == PRNG output
      [64..X] == plaintext
      [X..block-size-padded] == PRNG
    HMAC-SHA512(plaintext)
    garbage (random length 1..15)
HEADER and IV indexes are randomly selected (the list of 8 is scrambled initially).
All of the HEADER_OR_IV entries are initialized with PRNG output, and then for each input file specified (depending on -alt, -noalt, etc), an IV and HEADER block is selected.
If -alt is specified once or more (thus plausible deniability, aka multiple sets of valid keys/files), an additional HEADER and IV is chosen for each file specified. The list of input files is also randomized. The number of "bogus" files added is four minus the number of input files, and each is PRNG-only output.
If -alt is not specified, and -noalt is also not specified (thus, one set of keys and one file), then three additional "bogus" files are added to the list, each of which is PRNG-only output.
If -noalt is specified, then a single header/IV is chosen, but the actual encrypted contents begin at offset 160 and no extra random data (other than the garbage/padding above) is added.
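The fixed 160-byte prefix described above is easy to pick apart. The Python sketch below splits raw (binary, non-base64, non-image) output into the SALT, the eight HEADER_OR_IV blocks, and everything that follows; the type and function names are illustrative only.

```python
from typing import List, NamedTuple

SALT_LEN = 32
SLOT_LEN = 16
SLOT_COUNT = 8
PAYLOAD_OFFSET = SALT_LEN + SLOT_LEN * SLOT_COUNT  # 32 + 128 = 160


class OutputPrefix(NamedTuple):
    salt: bytes                # bytes [0..31]
    header_or_iv: List[bytes]  # eight 16-byte blocks, bytes [32..159]
    body: bytes                # encrypted payload(s) and PRNG mix, bytes [160...]


def split_output(data: bytes) -> OutputPrefix:
    """Split raw toplip output into SALT, HEADER_OR_IV blocks, and body."""
    if len(data) < PAYLOAD_OFFSET:
        raise ValueError("output is shorter than the 160-byte prefix")
    slots = [
        data[SALT_LEN + i * SLOT_LEN: SALT_LEN + (i + 1) * SLOT_LEN]
        for i in range(SLOT_COUNT)
    ]
    return OutputPrefix(data[:SALT_LEN], slots, data[PAYLOAD_OFFSET:])
```

Nothing in the split tells you which blocks are IVs, which are HEADERs, and which are unused PRNG output; that only becomes known through the discovery process with valid key material.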
Decryption discovery/HEADER validity pseudocode
    For x in 0..7
      For y in 0..7
        if y <> x
          copy HEADER_OR_IV[x] to tempbuf
          copy HEADER_OR_IV[y] to tempbuf
          foreach key in reverse-order keys (keys == passphrase-derived htcrypt contexts)
            key.decrypt(tempbuf)
            xor tempbuf with tempbuf
            key.decrypt(tempbuf)
            xor tempbuf with tempbuf
          if (both qwords of tempbuf are within bounds of filesize, and low qword < high qword)
            initial XTS tweak = tempbuf
            start_ofs = low qword of tempbuf
            end_ofs = high qword of tempbuf
            goto success
          end if
        end if
    if we made it through the loop and arrived here, fail.

    success:
    previous_key = null
    foreach key in forward-order keys (keys == passphrase-derived htcrypt contexts)
      key.tweak = initial XTS tweak (from above discovery loop result)
      if previous_key <> null
        previous_key.encrypt(key.tweak)
      end if
      previous_key = key
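The discovery loop can be exercised end to end with a runnable Python sketch. Several things here are assumptions for illustration: `toy_encrypt`/`toy_decrypt` are an insecure stand-in for the htcrypt contexts (the Python standard library has no AES), the offsets are read as little-endian qwords, block x is treated as the IV and block y as the HEADER, and the bounds check is one plausible reading of "within bounds of filesize".

```python
import hashlib
import os
import struct
from itertools import permutations


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _pad(key: bytes) -> bytes:
    return hashlib.sha256(key).digest()[:16]


def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """TOY stand-in for htcrypt encrypt (xor + byte rotation). NOT secure."""
    x = _xor(block, _pad(key))
    return x[1:] + x[:1]


def toy_decrypt(key: bytes, block: bytes) -> bytes:
    """Inverse of toy_encrypt."""
    x = block[-1:] + block[:-1]
    return _xor(x, _pad(key))


def seal(keys, iv: bytes, header: bytes):
    """CBC-style encryption of HEADER and IV, per key in forward order."""
    for key in keys:
        header = toy_encrypt(key, _xor(header, iv))
        iv = toy_encrypt(key, _xor(iv, header))
    return iv, header


def unseal(keys, iv: bytes, header: bytes):
    """Undo seal(): per key in reverse order, decrypt IV then HEADER."""
    for key in reversed(keys):
        iv = _xor(toy_decrypt(key, iv), header)
        header = _xor(toy_decrypt(key, header), iv)
    return iv, header


def discover(slots, keys, filesize: int):
    """Try every ordered pair of blocks; return (iv, start_ofs, end_ofs) or None."""
    for x, y in permutations(range(len(slots)), 2):
        iv, header = unseal(keys, slots[x], slots[y])
        start_ofs, end_ofs = struct.unpack("<QQ", header)
        if start_ofs < end_ofs <= filesize:
            return iv, start_ofs, end_ofs
    return None


if __name__ == "__main__":
    keys = [b"first passphrase", b"second passphrase"]
    plain_iv = os.urandom(16)
    slots = [os.urandom(16) for _ in range(8)]
    slots[5], slots[2] = seal(keys, plain_iv, struct.pack("<QQ", 160, 1000))
    print(discover(slots, keys, filesize=4096))
```

With the right keys the offsets decrypt to sane values; with wrong keys, or for the unused PRNG blocks, both qwords are effectively random 64-bit numbers and the bounds check fails.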
Hacking and brute force recovery notes
What follows are further notes about conducting brute force passphrase attempts and the encryption methods utilized.
Key derivation is intentionally expensive, and with command line options to increase its
complexity manifold. At a minimum, for each set of derived key material, the discovery process
outlined in section #4 above would need to be completed in order to validate a given set
of passphrase inputs. (Of course, you could skip the discovery process and attempt the
XTS-AES sections instead, but then you would be forced to guess the location and extents of
the actual payload, versus the HEADER/IV discovery process in section #4 which would yield
those values anyway and is simpler compared to the XTS-AES process itself.)
The defaults contained herein, like many other key derivation settings, are a compromise between acceptable user-experience delays and the associated computational difficulty of brute forcing. At these levels on a local development machine, approximately 1.7 attempts per second per CPU core are possible.
For non-default settings, especially where multiple passphrases and high iteration counts are specified, things quickly become intractable (at least on consumer-grade hardware). By increasing the iteration count to 100,000 on the same local development machine the time per attempt goes up to 23 seconds per CPU core per passphrase.
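To put those rates in perspective, here is a small back-of-the-envelope calculation in Python. The rates are the ones quoted above from our development machine; the candidate space (eight lowercase letters) is an arbitrary example, and the figures are single-core worst-case times to exhaust the space.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def core_years_to_exhaust(candidates: int, attempts_per_second: float) -> float:
    """Single-core years needed to try every candidate passphrase once."""
    return candidates / attempts_per_second / SECONDS_PER_YEAR


DEFAULT_RATE = 1.7            # attempts per second per core, default settings
HIGH_ITER_RATE = 1.0 / 23.0   # 23 seconds per attempt with -i 100000

lowercase_8 = 26 ** 8         # every 8-letter lowercase passphrase

if __name__ == "__main__":
    # roughly 3,900 core-years at the default settings
    print(round(core_years_to_exhaust(lowercase_8, DEFAULT_RATE)))
    # roughly 152,000 core-years at 100,000 iterations
    print(round(core_years_to_exhaust(lowercase_8, HIGH_ITER_RATE)))
```

On average an attacker succeeds after half the space, and the work divides across cores, but each additional passphrase specified with -c COUNT multiplies the per-attempt cost again.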
Encryption and XTS description
The SALT is only used for key material generation, and is not used at all for any subsequent
encryption operations. Inside htcrypt.inc, if you enable htcrypt_debug_keymaterial, it will
output the sequence and 256 AES256 keys that it uses to stderr.
Per the discovery method outlined in #4 above, the IV and HEADER are "CBC-style" encrypted, and the PRNG-generated plaintext IV is used as the initial XTS tweak block for actual payload recovery. Further to this, 64 bytes of random data is prepended to the first XTS block's plaintext (though as mentioned above, these 64 bytes do not affect security).
What this effectively means is that the XTS enc/decrypt operations are linked to the CBC-style IV and HEADER portion due to the use of the decrypted IV as the initial XTS tweak value.
At the time of encryption, the plaintext IV (which is then used as the initial XTS tweak) is PRNG output. The "CBC-style" means that the IV + HEADER are encrypted in the following way (per key in forward order):
    xor HEADER with IV (which is initial random plaintext IV, or last ciphertext IV)
    encrypt HEADER (set last ciphertext to result)
    xor IV with HEADER (which is last ciphertext)
    encrypt IV (set last ciphertext to result)
then if there are more keys in use, the IV is still the "last ciphertext", per normal CBC-mode
operation, despite reiterating over the same two blocks. (See section #4 for more details.)
For an attacker who has nothing but output from this code to work with, and is attempting to extract plaintext payload(s), there is no way to discern which of the 8 initial 16 byte blocks are IV or HEADER, so the same discovery process outlined in #4 would have to apply to any attempts on the HEADER_OR_IV blocks themselves.
In order to recover any plaintext payload(s), all sets of key material (and their respective AES256 contexts along with their sequencing), the decrypted IV, and the decrypted file offsets (start/end) are required.
If -1 was specified for a given inputfile, then the "htxts" operations here do not use cascaded AES256, and instead do "normal" single AES256 encryption. The tweak encryption key and the data encryption key in this mode are chosen from the end of the final 8192 bytes of key material. If -1 was not specified, thus cascaded AES256 is enabled, then the "htxts" methods used here are still done per the XTS-AES standard, but instead of using a single AES256, we use our cascaded AES256 htcrypt operations described above, and the tweak key is the first unused (in the main sequence) AES256 context, thus providing at most 65 full AES256 contexts for a normal htxts operation. "At most" being that the htcrypt sequencing may actually contain duplicate indexes.
Note that if multiple passphrases (and thus key material) were specified for a given inputfile, then even if -1 is specified, these do end up in a cascaded manner for both the HEADER_OR_IV and the XTS portion as outlined above.
Unlike traditional disk-based XTS-AES, we use the initial 16 byte plaintext (PRNG-generated) IV for the initial tweak value. This has the pleasant side effect that two plaintexts encrypted with the same set of key materials will not result in identical outputs.
htxts encrypt does (per block, which is 2048 bytes each):
    AES256 encrypt tweak with htcrypt.aeskeys[tweak key index] (not cascaded)
    for i in 0..127
      xor subblock[i] with tweak
      htcrypt.encrypt(subblock[i])
      xor subblock[i] with tweak
      LSFRshift(tweak)
so to encrypt a block, the above is run in forward order for each set of keys, modifying the tweak (per key, each key has its own unique tweak) in-place through all blocks.
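The tweak update in that loop can be sketched in Python. This assumes the shift is the standard XTS-AES tweak step from IEEE 1619 (multiplication by alpha in GF(2^128), with the tweak read as a little-endian integer), which is consistent with htxts being identical to AES-XTS apart from the cipher used; the function names are illustrative.

```python
from typing import List


def xts_next_tweak(tweak: bytes) -> bytes:
    """Multiply a 16-byte XTS tweak by alpha in GF(2^128) (IEEE 1619 convention)."""
    if len(tweak) != 16:
        raise ValueError("tweak must be 16 bytes")
    value = int.from_bytes(tweak, "little") << 1
    if value >> 128:  # a bit was shifted out: reduce by x^128 + x^7 + x^2 + x + 1
        value = (value & ((1 << 128) - 1)) ^ 0x87
    return value.to_bytes(16, "little")


def tweak_sequence(encrypted_tweak: bytes, subblocks: int = 128) -> List[bytes]:
    """Tweak values applied to each 16-byte subblock of one 2048-byte block."""
    tweaks = []
    tweak = encrypted_tweak
    for _ in range(subblocks):
        tweaks.append(tweak)
        tweak = xts_next_tweak(tweak)
    return tweaks
```

Each subblock is xor'd with its tweak before and after the htcrypt encrypt call, so identical 16-byte plaintext subblocks at different positions produce different ciphertext.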
scrypt key derivation and mixing
The HeavyThing scrypt implementation is a reference one, with N=1024, r=1, p=1 except for the
fact that instead of using SHA256 as scrypt-proper does, we use SHA512 to initialize the state,
and then for the final output stage, again we use SHA512. Note that we allow the PBKDF2 iteration
counts that scrypt uses for its init and final output stages to be overridden from the default of 1.
The reason that mixing strategies of the final scrypt key material are employed is twofold: First, that there may be some issue with the scrypt output itself, and secondly that it forces any attacker to implement multiple key generation techniques as we have done here.
The underlying HeavyThing library uses a modified version of Agner Fog's SFMT and Mother-of-all generators. He specifically states that they are safe to use _provided_ that only the combined output is accessible to an attacker, and that a complete subsequence of the output (in our case 1408 bytes) is never revealed. Care is taken to satisfy these requirements.
HMAC-SHA512 integrity verification
The primary purpose for appending HMAC-SHA512 is to verify the integrity of the encrypted payload to ensure that no tampering or other bit errors have occurred. Since the payload encryption is done with XTS-AES, single bit errors do not corrupt the entirety that follows, and as such may not necessarily be self-evident without employing integrity verification.
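The effect is easy to demonstrate with Python's standard `hmac` module. The sketch below is generic HMAC-SHA512 usage; the key and message are placeholders, and how toplip keys its HMAC is not covered here.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Return the 64-byte HMAC-SHA512 tag of `message` under `key`."""
    return hmac.new(key, message, hashlib.sha512).digest()


def verify_tag(key: bytes, message: bytes, expected: bytes) -> bool:
    """Constant-time comparison of the recomputed tag against `expected`."""
    return hmac.compare_digest(make_tag(key, message), expected)


if __name__ == "__main__":
    key = b"placeholder key material"          # hypothetical key
    message = b"recovered plaintext payload"   # hypothetical payload
    tag = make_tag(key, message)

    damaged = bytearray(message)
    damaged[3] ^= 0x01                         # flip a single bit

    print(verify_tag(key, message, tag))          # True
    print(verify_tag(key, bytes(damaged), tag))   # False
```

A single flipped bit anywhere in the message changes the tag, which is exactly the kind of localized damage that XTS decryption alone would not surface.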