Apple Archive

From The Apple Wiki

Apple Archive is a proprietary archive format that can compress and archive files. It extends the yaa format but uses the magic "AA01" rather than "YAA1"; the "aa" CLI tool that ships with macOS Big Sur handles both.

Apple Archives can be handled with the public Swift API and the private C API in libAppleArchive.dylib. There is also an open-source library, libNeoAppleArchive, that can handle them and is compatible with Darwin platforms as well as Linux.

Apple Archives can be LZFSE, RAW, LZ4, LZMA, or ZLIB compressed. As of iOS 16 and macOS Ventura, they can also be LZBITMAP compressed.

Header

Example:

41413031 46005459 50314450 41545000 00554944 31004749 4431504D 4F4432ED 01464C47 31004D54 4D54BDAD 61660000 00006376 910C4354 4D54A6A4 61660000 0000D4BA 2E03


The first 4 bytes are the magic, “AA01”. Legacy .yaa files may use the magic “YAA1” instead, but the format is otherwise the same, and libAppleArchive may replace the YAA1 magic with AA01.

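The next two bytes give the size of the header itself, including the magic and these two size bytes, so the header size is always 6 bytes or more. In the example they are 46 00, which read as a little-endian value is 0x46 (70), matching the 70 bytes shown.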
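The magic and size prefix can be checked against the example above with a few lines of Python. This is a minimal sketch, not libAppleArchive's code: `read_prefix` is a hypothetical helper name, and little-endian byte order is an assumption inferred from the example (46 00 matching its 70-byte length).

```python
import struct

# The example header from this page, as raw bytes.
EXAMPLE = bytes.fromhex(
    "41413031 46005459 50314450 41545000 00554944 31004749 4431504D 4F4432ED"
    " 01464C47 31004D54 4D54BDAD 61660000 00006376 910C4354 4D54A6A4 61660000"
    " 0000D4BA 2E03"
)

def read_prefix(buf):
    """Return (magic, header_size) from the first 6 bytes of an entry."""
    if len(buf) < 6:
        raise ValueError("need at least 6 bytes for magic + header size")
    magic = buf[:4]
    if magic not in (b"AA01", b"YAA1"):
        raise ValueError("bad magic")
    # Assumption: the 2-byte size is little-endian, as the example suggests.
    (header_size,) = struct.unpack_from("<H", buf, 4)
    return magic, header_size

magic, header_size = read_prefix(EXAMPLE)
print(magic, header_size, len(EXAMPLE))   # b'AA01' 70 70
```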

This is followed by the fields, each introduced by a field key. The first key here is TYP; for the list of field keys, check the AADefs.h header. A key is three characters, followed by one character giving the subtype. TYP is a UINT with a subtype of 1. While the headers explain the types as well, they don’t explain how they are represented in binary form: the UINT subtypes are 1, 2, 4 and 8, which indicate both that the field is a UINT and how many bytes its value occupies. The subtypes are documented at https://github.com/0xilis/libNeoAppleArchive/blob/main/docs/NeoAAFieldType.md. The subtype is followed by the value for that field key. Here the bytes read TYP1D, and the D represents a directory.
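As an illustration of the UINT layout, the sketch below decodes two fields taken from the example header. `read_uint_field` is a hypothetical helper, not part of any Apple or libNeoAppleArchive API, and little-endian values are an assumption that happens to fit the example (MOD2 ED 01 reads as mode 0755).

```python
# Width in bytes of the value for each UINT subtype character.
UINT_WIDTHS = {ord("1"): 1, ord("2"): 2, ord("4"): 4, ord("8"): 8}

def read_uint_field(buf, offset=0):
    """Decode one UINT field at `offset`; return (key, value, next_offset)."""
    if len(buf) < offset + 4:
        raise ValueError("truncated field key")
    key = buf[offset:offset + 3].decode("ascii")
    subtype = buf[offset + 3]
    width = UINT_WIDTHS.get(subtype)
    if width is None:
        raise ValueError(f"{key}: subtype {chr(subtype)!r} is not a UINT")
    start = offset + 4
    if len(buf) < start + width:
        raise ValueError(f"{key}: truncated value")
    # Assumption: values are stored little-endian.
    value = int.from_bytes(buf[start:start + width], "little")
    return key, value, start + width

# TYP1 'D' from the example: a 1-byte UINT whose value is ASCII 'D'.
key, value, _ = read_uint_field(b"TYP1D")
print(key, chr(value))    # TYP D

# MOD2 ED 01 from the same header: a 2-byte UINT.
key, value, _ = read_uint_field(bytes.fromhex("4D4F4432ED01"))
print(key, oct(value))    # MOD 0o755
```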

The next field key, PAT, is a string. Strings have 2 bytes for the string size, followed by the string itself, whose length counts towards the header size. Here the size is 0, meaning the string is empty; a directory with an empty path is used to represent the entry point in Apple Archive.
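A string field can be decoded in the same spirit. In this sketch the `P` subtype byte is taken from the PATP bytes in the example; the helper name `read_string_field`, the little-endian length, and UTF-8 decoding are assumptions for illustration.

```python
import struct

def read_string_field(buf, offset=0):
    """Decode one string field at `offset`; return (key, text, next_offset)."""
    key = buf[offset:offset + 3].decode("ascii")
    if buf[offset + 3:offset + 4] != b"P":
        raise ValueError(f"{key}: not a string field")
    # 2-byte length (assumed little-endian), then the string bytes.
    (length,) = struct.unpack_from("<H", buf, offset + 4)
    start = offset + 6
    end = start + length
    if end > len(buf):
        raise ValueError(f"{key}: string runs past the end of the buffer")
    return key, buf[start:end].decode("utf-8"), end

# PATP 00 00 from the example: an empty path.
print(read_string_field(bytes.fromhex("504154500000")))    # ('PAT', '', 6)

# A made-up non-empty path for comparison.
print(read_string_field(b"PATP" + b"\x05\x00" + b"a.txt"))  # ('PAT', 'a.txt', 11)
```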

While not present in this header, BLOBs also exist. DAT holds the actual data for a file. Its value in the header is the size of the blob: with DATA the size is 2 bytes, with DATB it is 4 bytes, and with DATC it is 8 bytes. The blob itself is not in the header; it follows the header and is not counted towards the header size. So header size + blob sizes is the total size a single item takes in an AAR. This example is a directory, so it has no DAT blob. Directories can still have blobs, such as XAT for the xattrs, but the entry point should not.
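To make the header/blob split concrete, the following sketch builds a made-up regular-file entry containing only TYP, PAT and DAT. It is for illustration only: `blob_field` and `build_file_entry` are hypothetical names, choosing the smallest subtype that fits is a choice of this sketch, and whether libAppleArchive accepts an entry with so few fields is not something the text above establishes.

```python
import struct

# Width of the size value for each blob subtype, as described above.
BLOB_SIZE_WIDTHS = {"A": 2, "B": 4, "C": 8}

def blob_field(key, payload):
    """Build the header part of a blob field: key, subtype, payload size."""
    for subtype, width in BLOB_SIZE_WIDTHS.items():
        if len(payload) < 1 << (8 * width):
            size = len(payload).to_bytes(width, "little")
            return key.encode("ascii") + subtype.encode("ascii") + size
    raise ValueError("payload too large for any blob subtype")

def build_file_entry(path, data):
    """Return (header, entry) for a hypothetical minimal regular-file entry."""
    name = path.encode("utf-8")
    fields = (b"TYP1F"
              + b"PATP" + struct.pack("<H", len(name)) + name
              + blob_field("DAT", data))
    header_size = 6 + len(fields)       # magic (4) + size (2) + fields
    header = b"AA01" + struct.pack("<H", header_size) + fields
    return header, header + data        # the blob follows the header

header, entry = build_file_entry("a.txt", b"hello\n")
print(len(header), len(entry))          # 28 34
```

The header size stored in the entry is 28, while the entry occupies 34 bytes: the 6-byte blob sits after the header and is not counted in it.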

To reach the next item, libAppleArchive advances by headerSize + blobSize. If that offset equals the file size, the end of the archive has been reached and extraction stops. If there are more bytes, there is another item to extract, so it checks whether the magic is AA01 or YAA1. If it is not, the magic is bad, something has gone wrong, and extraction stops; otherwise extraction continues and the same process repeats.
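The walk described above can be sketched as a small Python generator. This is not libAppleArchive's implementation: it assumes `buf` holds an uncompressed stream of entries, handles only the UINT, string and blob subtypes covered on this page plus a 12-byte `T` timestamp (a width inferred from the MTMT and CTMT fields in the example), and rejects anything else.

```python
import struct

MAGICS = (b"AA01", b"YAA1")
UINT_WIDTHS = {"1": 1, "2": 2, "4": 4, "8": 8}
BLOB_WIDTHS = {"A": 2, "B": 4, "C": 8}
TIME_WIDTHS = {"T": 12}   # assumption inferred from the example header

def declared_blob_bytes(header):
    """Sum the blob sizes declared by the fields of one header."""
    pos, total = 6, 0                       # skip magic + header size
    while pos < len(header):
        if pos + 4 > len(header):
            raise ValueError("truncated field key")
        subtype = chr(header[pos + 3])
        pos += 4
        if subtype in UINT_WIDTHS:
            pos += UINT_WIDTHS[subtype]
        elif subtype in TIME_WIDTHS:
            pos += TIME_WIDTHS[subtype]
        elif subtype == "P":                # string: 2-byte length + bytes
            (length,) = struct.unpack_from("<H", header, pos)
            pos += 2 + length
        elif subtype in BLOB_WIDTHS:        # blob: only its size is in the header
            width = BLOB_WIDTHS[subtype]
            total += int.from_bytes(header[pos:pos + width], "little")
            pos += width
        else:
            raise ValueError(f"unhandled subtype {subtype!r}")
    return total

def iter_entries(buf):
    """Yield (header, blobs) for each entry until the end of `buf`."""
    pos = 0
    while pos < len(buf):
        if buf[pos:pos + 4] not in MAGICS:
            raise ValueError(f"bad magic at offset {pos}")
        (header_size,) = struct.unpack_from("<H", buf, pos + 4)
        if header_size < 6 or pos + header_size > len(buf):
            raise ValueError("bad header size")
        header = buf[pos:pos + header_size]
        end = pos + header_size + declared_blob_bytes(header)
        if end > len(buf):
            raise ValueError("blob runs past the end of the archive")
        yield header, buf[pos + header_size:end]
        pos = end                           # headerSize + blobSize

DIR_ENTRY = bytes.fromhex(
    "41413031 46005459 50314450 41545000 00554944 31004749 4431504D 4F4432ED"
    " 01464C47 31004D54 4D54BDAD 61660000 00006376 910C4354 4D54A6A4 61660000"
    " 0000D4BA 2E03"
)
# Made-up file entry: 28-byte header followed by a 6-byte DAT blob.
FILE_ENTRY = (b"AA01" + b"\x1c\x00" + b"TYP1F" + b"PATP" + b"\x05\x00" + b"a.txt"
              + b"DATA" + b"\x06\x00" + b"hello\n")

for header, blobs in iter_entries(DIR_ENTRY + FILE_ENTRY):
    print(len(header), len(blobs))          # 70 0, then 28 6
```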

AAEntryTypes

These are the values that the TYP field key can take, i.e. the different types of filesystem objects that Apple Archive can represent.

Definition              Value  Function
AA_ENTRY_TYPE_REG       F      Regular file
AA_ENTRY_TYPE_DIR       D      Directory
AA_ENTRY_TYPE_LNK       L      Symbolic link
AA_ENTRY_TYPE_FIFO      P      FIFO
AA_ENTRY_TYPE_CHR       C      Character special
AA_ENTRY_TYPE_BLK       B      Block special
AA_ENTRY_TYPE_SOCK      S      Socket
AA_ENTRY_TYPE_WHT       W      Whiteout
AA_ENTRY_TYPE_DOOR      R      Door
AA_ENTRY_TYPE_PORT      T      Port
AA_ENTRY_TYPE_METADATA  M      Metadata. Header specifies that this is not a filesystem object.
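
For quick inspection of a TYP value, the table can be expressed as a lookup. This is a convenience sketch only; `ENTRY_TYPES` and `describe_entry_type` are hypothetical names, and the descriptions are copied from the table above.

```python
# TYP value character -> description, from the table above.
ENTRY_TYPES = {
    "F": "regular file",
    "D": "directory",
    "L": "symbolic link",
    "P": "FIFO",
    "C": "character special",
    "B": "block special",
    "S": "socket",
    "W": "whiteout",
    "R": "door",
    "T": "port",
    "M": "metadata",
}

def describe_entry_type(value):
    """Map the 1-byte TYP value (an int) to a readable description."""
    return ENTRY_TYPES.get(chr(value), "unknown")

print(describe_entry_type(0x44))   # directory
```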