The AFA file format

AFA is an archive format that was introduced in Rance Quest (?). There are three versions of the format (v1, v2 and v3). The v1 and v2 formats are largely the same. AFA v3, however, is quite different and is treated separately below.

AFA v1 and v2 Header

Offset	Size (Bytes)	Name	Notes
0x0	4	magic_1	‘AFAH’
0x4	4	unknown_1	0xC
0x8	8	magic_2	‘AlicArch’
0x10	4	afa_version	1 or 2
0x14	4	unknown_2	?
0x18	4	data_start	Start of file data
0x1C	4	magic_3	‘INFO’
0x20	4	compressed_size	Subtract 16 to get the real compressed size
0x24	4	uncompressed_size	Size of the uncompressed file index
0x28	4	file_count	Number of files in the archive

File Index

Immediately following the header is the compressed (zlib DEFLATE) file index. The uncompressed data is a list of file descriptors, each having the following structure.

Size (Bytes)	Name	Notes
4	name_size	Actual number of bytes in file_name
4	padded_size	The number of bytes to read for file_name
padded_size	file_name	The name of the file
4	unknown_1	Likely Windows timestamp
4	unknown_2	Likely Windows timestamp
4	unknown_3	Only present in AFA v1
4	file_offset	Offset of the file
4	file_size	Size of the file

File Data

The file_offset values given in the file index are relative to the value given in the data_start field of the AFA header. In other words, the offset to the start of a file within the archive is afa.data_start + descriptor.file_offset.
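As a rough illustration of the v1/v2 layout, the following sketch parses the fixed-size header from a memory buffer and computes the absolute position of a file. It assumes that all integers are little-endian, which this document does not state explicitly. The struct afa_header type and the functions read_u32_le, afa_parse_header and afa_file_position are hypothetical names made up for this example; they are not taken from libsys4.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory view of the v1/v2 header fields used below. */
struct afa_header {
    uint32_t version;
    uint32_t data_start;
    uint32_t compressed_size;   /* already adjusted by -16 */
    uint32_t uncompressed_size;
    uint32_t file_count;
};

/* Read a 32-bit integer, assuming little-endian byte order. */
static uint32_t read_u32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse the 0x2C-byte header. Returns 0 on success, -1 on bad input. */
static int afa_parse_header(const uint8_t *buf, size_t len, struct afa_header *out)
{
    if (len < 0x2C)
        return -1;
    if (memcmp(buf, "AFAH", 4) != 0 ||
        memcmp(buf + 0x8, "AlicArch", 8) != 0 ||
        memcmp(buf + 0x1C, "INFO", 4) != 0)
        return -1;

    out->version = read_u32_le(buf + 0x10);
    if (out->version != 1 && out->version != 2)
        return -1;

    uint32_t raw_size = read_u32_le(buf + 0x20);
    if (raw_size < 16)
        return -1;

    out->data_start = read_u32_le(buf + 0x18);
    out->compressed_size = raw_size - 16; /* see the header table */
    out->uncompressed_size = read_u32_le(buf + 0x24);
    out->file_count = read_u32_le(buf + 0x28);
    return 0;
}

/* Absolute position of a file: data_start + file_offset. */
static uint64_t afa_file_position(const struct afa_header *h, uint32_t file_offset)
{
    return (uint64_t)h->data_start + file_offset;
}
```

For example, with data_start equal to 0x1000 and a descriptor whose file_offset is 0x20, the file data begins at byte 0x1020 of the archive.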

AFA v3

AFA v3 changes the header structure in an incompatible way. The version identifier is not in the same place so it cannot be relied on to identify an AFA v3 file. Additionally, the file index in v3 is obfuscated and encrypted (but not the file contents).

Since the algorithms are quite complicated, I will mostly defer to the libsys4 source code to describe them.

AFA v3 Header

Offset	Size (Bytes)	Description
0x0	4	Magic: ‘AFAH’
0x4	4	Index size (add 8 for data_start)
0x8	4	0x3 (version?)
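One possible way to tell the versions apart, going only by the two header tables in this document, is to look at offset 0x8: v1 and v2 have the ‘AlicArch’ magic there, while v3 appears to have the value 3. The sketch below does that, assuming little-endian integers and assuming that the 0x3 field really is a version marker (the table itself marks this as uncertain). The function names are hypothetical and not taken from libsys4.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Read a 32-bit integer, assuming little-endian byte order. */
static uint32_t read_u32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1, 2 or 3 for a recognised header, 0 otherwise. */
static int afa_guess_version(const uint8_t *buf, size_t len)
{
    if (len < 12 || memcmp(buf, "AFAH", 4) != 0)
        return 0;
    /* v1/v2: 'AlicArch' at 0x8, version number at 0x10. */
    if (len >= 0x14 && memcmp(buf + 0x8, "AlicArch", 8) == 0) {
        uint32_t v = read_u32_le(buf + 0x10);
        return (v == 1 || v == 2) ? (int)v : 0;
    }
    /* v3 (assumed): the value 3 at 0x8. */
    return read_u32_le(buf + 0x8) == 3 ? 3 : 0;
}

/* v3 only: data_start is the index size plus 8, per the table above. */
static uint32_t afa3_data_start(const uint8_t *buf)
{
    return read_u32_le(buf + 0x4) + 8;
}
```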

Following the header is a dictionary for decrypting filenames, and then the compressed file index. However, this data is obfuscated by inserting a single bit of padding immediately following the header. In other words, the data is not byte-aligned.
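To make the idea of byte-unaligned data concrete, here is a minimal sketch of a bit reader. It is not the libsys4 struct bitstream: the names struct bit_reader and br_read_bits are hypothetical, and the bit order (most significant bit of each byte first) is an assumption that should be checked against the libsys4 source.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical bit reader; not the libsys4 struct bitstream. */
struct bit_reader {
    const uint8_t *data;
    size_t size; /* size of data in bytes */
    size_t pos;  /* current position in bits */
};

/*
 * Read `count` bits (at most 24), most significant bit first (assumed order).
 * Returns the value, or -1 if the input runs out.
 */
static int br_read_bits(struct bit_reader *br, int count)
{
    int v = 0;
    for (int i = 0; i < count; i++) {
        if (br->pos >= br->size * 8)
            return -1;
        int bit = (br->data[br->pos >> 3] >> (7 - (br->pos & 7))) & 1;
        v = (v << 1) | bit;
        br->pos++;
    }
    return v;
}
```

With such a reader, skipping the single padding bit is one call to br_read_bits(&br, 1). Every 8-bit read after that straddles two bytes of the underlying file.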

AFA v3 Dictionary

PRNG

The following code initializes a pseudo-random number generator which is used to create the dictionary.

static struct {
    uint32_t state[521];
    int current;
} rnd;

/*
 * Munge PRNG state.
 */
static void rnd_shuffle(void)
{
    for (int i = 0; i < 32; i += 4) {
        rnd.state[i  ] ^= rnd.state[i+489];
        rnd.state[i+1] ^= rnd.state[i+490];
        rnd.state[i+2] ^= rnd.state[i+491];
        rnd.state[i+3] ^= rnd.state[i+492];
    }
    for (int i = 32; i < 521; i += 3) {
        rnd.state[i  ] ^= rnd.state[i - 32];
        rnd.state[i+1] ^= rnd.state[i - 31];
        rnd.state[i+2] ^= rnd.state[i - 30];
    }
}

/*
 * Initialize the PRNG.
 */
static void rnd_init(uint32_t seed)
{
    uint32_t val = 0;
    for (int i = 0; i < 17; i++) {
        for (int j = 0; j < 32; j++) {
            seed = 1566083941u * seed + 1;
            val = (seed & 0x80000000) | (val >> 1);
        }
        rnd.state[i] = val;
    }
    rnd.state[16] = rnd.state[15] ^ (rnd.state[0] >> 9) ^ (rnd.state[16] << 23);
    for (int i = 17; i < 521; i++) {
        rnd.state[i] = rnd.state[i-1] ^ (rnd.state[i-16] >> 9) ^ (rnd.state[i-17] << 23);
    }
    rnd_shuffle();
    rnd_shuffle();
    rnd_shuffle();
    rnd_shuffle();
    rnd.current = -1;
}

/*
 * Get the next pseudo-random number.
 */
static uint32_t rnd_get_next(void)
{
    rnd.current++;
    if (rnd.current >= 521) {
        rnd_shuffle();
        rnd.current = 0;
    }
    return rnd.state[rnd.current];
}
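The seed expansion at the start of rnd_init may be easier to follow in isolation. The sketch below pulls out its inner loop: each of the first 17 state words is assembled from the top bits of 32 successive outputs of a linear congruential generator, with the earliest output ending up in bit 0 and the latest in bit 31. The helper names lcg_next and gather_word are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* One step of the linear congruential generator used by rnd_init. */
static uint32_t lcg_next(uint32_t seed)
{
    return 1566083941u * seed + 1;
}

/*
 * Build one state word from the top bits of 32 successive LCG outputs.
 * The seed is advanced in place so the next call continues the sequence.
 */
static uint32_t gather_word(uint32_t *seed)
{
    uint32_t val = 0;
    for (int j = 0; j < 32; j++) {
        *seed = lcg_next(*seed);
        val = (*seed & 0x80000000u) | (val >> 1);
    }
    return val;
}
```

Because val is shifted right 32 times per word, nothing from the previous word survives. This is why rnd_init can reuse the same val variable across iterations without resetting it.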

Initializing the Dictionary

The following code initializes the dictionary. Note that the struct bitstream object is an abstraction for reading byte-unaligned data from the AFA file. See the libsys4 source code for more details.

/*
 * Dictionary for string decoding.
 */
struct {
    uint8_t *bytes;
    size_t size;
} dict;

/*
 * Read the string decoding dictionary.
 * Encrypted via a PRNG seeded by the dictionary size.
 */
static void read_dict(struct bitstream *bs)
{
    dict.size = bs_read_int32(bs);
    dict.bytes = xmalloc(dict.size);

    rnd_init(dict.size);
    for (unsigned dst = 0; dst < dict.size; dst++) {
        int count = (int)rnd_get_next() & 3;
        int skipped = bs_read_bits(bs, count+1);
        if (skipped == -1) {
            goto err;
        }
        rnd_get_next();

        int v = bs_read_bits(bs, 8);
        if (v == -1) {
            goto err;
        }
        dict.bytes[dst] = (uint8_t)v;
    }
    return;
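err:
    /* Reset the dictionary so later lookups fail instead of reading freed memory. */
    free(dict.bytes);
    dict.bytes = NULL;
    dict.size = 0;
}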

AFA v3 File Index

Immediately following the dictionary data is the compressed file index (also byte-unaligned).

Size (Bytes)	Description
4	Compressed size
4	Uncompressed size
Varies	Compressed data (DEFLATE)

The uncompressed data is once again obfuscated with a single bit of padding at the start. Then the following:

Size (Bytes)	Description
4	File count
Varies	File descriptors

AFA v3 File Descriptors

Yet again we have obfuscation via bit-sized padding. Each file descriptor begins with 2 bits of padding. Then the following:

Size (Bytes)	Name	Notes
Varies	file_name	Encrypted
4	unknown_1	Likely Windows timestamp
4	unknown_2	Likely Windows timestamp
4	file_offset	Offset of the file
4	file_size	Size of the file

Name Encryption

The file names are double encrypted. The first layer of encryption is removed by the following function, which uses the PRNG described above.

/*
 * Read an encrypted string. The data is double-encrypted: once via a PRNG
 * seeded by the string length, and again via the encrypted dictionary (above).
 * This function decrypts the first layer.
 */
static uint16_t *afa3_read_encrypted_chars(struct bitstream *bs, size_t *size)
{
    uint32_t buf_size = bs_read_int32(bs);
    uint16_t *buf = xmalloc(buf_size * 2);

    rnd_init(buf_size);
    for (unsigned dst = 0; dst < buf_size; dst++) {
        int count = (int)rnd_get_next() & 3;
        int skipped = bs_read_bits(bs, count+1);
        if (skipped == -1) {
            goto err;
        }
        rnd_get_next();

        int lo = bs_read_bits(bs, 8);
        int hi = bs_read_bits(bs, 8);
        if (lo == -1 || hi == -1) {
            goto err;
        }
        buf[dst] = (uint16_t)(lo | (hi << 8));
    }

    *size = buf_size;
    return buf;
err:
    free(buf);
    return NULL;
}

This yields an array of 16-bit integers, which serve as indices into the dictionary described above. The following code converts this array into a (SJIS-encoded) string.

/*
 * Decrypt an encrypted string via the dictionary.
 */
static char *afa3_decrypt_string(uint16_t *chars, size_t size)
{
    char *buf = xmalloc(size+1);
    for (unsigned i = 0; i < size; i++) {
        if (chars[i] >= dict.size) {
            free(buf);
            return NULL;
        }
        buf[i] = (char)(dict.bytes[chars[i]] ^ 0xa4);
    }
    buf[size] = '\0';
    return buf;
}
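A small worked example may help to show what the dictionary lookup does. The sketch below is a self-contained variant of the function above that takes the dictionary as parameters instead of using the global dict, and uses malloc instead of xmalloc. The name decode_name is hypothetical and is not part of libsys4.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Map 16-bit dictionary indices to bytes and undo the 0xa4 XOR.
 * Returns a NUL-terminated string that the caller must free,
 * or NULL on an out-of-range index or allocation failure.
 */
static char *decode_name(const uint8_t *dict_bytes, size_t dict_size,
                         const uint16_t *chars, size_t size)
{
    char *buf = malloc(size + 1);
    if (!buf)
        return NULL;
    for (size_t i = 0; i < size; i++) {
        if (chars[i] >= dict_size) {
            free(buf);
            return NULL;
        }
        buf[i] = (char)(dict_bytes[chars[i]] ^ 0xa4);
    }
    buf[size] = '\0';
    return buf;
}
```

With a three-byte dictionary holding 'a', 'b' and 'c', each XORed with 0xa4, the index array {2, 0, 1} decodes to the string "cab". An index of 3 or more is rejected because it falls outside the dictionary.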

AFA v3 File Data

Thankfully, the programmer who designed this horror was feeling merciful and left the file data itself byte-aligned and unencrypted. The offset and file size in the file descriptors can be used in the same manner as in AFA v1 and v2 to locate file data within the archive.