AFA is an archive format that was introduced in Rance Quest (?). There are three versions of the format (v1, v2 and v3). The v1 and v2 formats are largely the same. AFA v3, however, is quite different and is treated separately below.
Offset | Size (Bytes) | Name | Notes |
---|---|---|---|
0x0 | 4 | magic_1 | ‘AFAH’ |
0x4 | 4 | unknown_1 | 0xC |
0x8 | 8 | magic_2 | ‘AlicArch’ |
0x10 | 4 | afa_version | 1 or 2 |
0x14 | 4 | unknown_2 | ? |
0x18 | 4 | data_start | Start of file data |
0x1C | 4 | magic_3 | ‘INFO’ |
0x20 | 4 | compressed_size | Subtract 16 to get the real compressed size |
0x24 | 4 | uncompressed_size | Size of the uncompressed file index |
0x28 | 4 | file_count | Number of files in the archive |
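For illustration, the header table above might be parsed roughly as follows. This is only a sketch, not the libsys4 implementation: `struct afa_header`, `read_u32le` and `afa_parse_header` are hypothetical names, and the integer fields are assumed to be little-endian, which the table does not state.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define AFA_HEADER_SIZE 0x2C

/* Hypothetical in-memory form of the v1/v2 header fields a reader needs. */
struct afa_header {
    uint32_t version;           /* afa_version: 1 or 2 */
    uint32_t data_start;        /* start of file data */
    uint32_t compressed_size;   /* raw field value minus 16 */
    uint32_t uncompressed_size; /* size of the uncompressed file index */
    uint32_t file_count;        /* number of files in the archive */
};

/* Assumption: all integer fields are stored little-endian. */
static uint32_t read_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
        | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if buf does not look like an AFA v1/v2 header. */
static int afa_parse_header(const uint8_t *buf, size_t len, struct afa_header *out)
{
    if (len < AFA_HEADER_SIZE)
        return -1;
    if (memcmp(buf, "AFAH", 4) != 0)
        return -1;
    if (memcmp(buf + 0x08, "AlicArch", 8) != 0)
        return -1;
    if (memcmp(buf + 0x1C, "INFO", 4) != 0)
        return -1;
    out->version = read_u32le(buf + 0x10);
    if (out->version != 1 && out->version != 2)
        return -1;
    uint32_t raw_size = read_u32le(buf + 0x20);
    if (raw_size < 16)
        return -1;
    out->data_start = read_u32le(buf + 0x18);
    out->compressed_size = raw_size - 16;
    out->uncompressed_size = read_u32le(buf + 0x24);
    out->file_count = read_u32le(buf + 0x28);
    return 0;
}
```

The function only validates the three magic values and the version number; unknown_1 and unknown_2 are ignored because their meaning is not known.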
Immediately following the header is the compressed (zlib DEFLATE) file index. The uncompressed data is a list of file descriptors, each having the following structure.
Size (Bytes) | Name | Notes |
---|---|---|
4 | name_size | Actual number of bytes in file_name |
4 | padded_size | The number of bytes to read for file_name |
padded_size | file_name | The name of the file |
4 | unknown_1 | Likely Windows timestamp |
4 | unknown_2 | Likely Windows timestamp |
4 | unknown_3 | Only present in AFA v1 |
4 | file_offset | Offset of the file |
4 | file_size | Size of the file |
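As a rough sketch of how the descriptor layout above could be walked once the index has been decompressed: `struct afa_entry`, `read_u32le` and `afa_parse_entry` are hypothetical names, little-endian integers are assumed, and the check that name_size does not exceed padded_size is my own sanity check rather than something the format is known to guarantee.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parsed form of one v1/v2 file descriptor. */
struct afa_entry {
    const uint8_t *name;  /* points into the index buffer, not NUL-terminated */
    uint32_t name_size;   /* actual number of bytes in the name */
    uint32_t file_offset; /* relative to data_start */
    uint32_t file_size;
};

/* Assumption: all integer fields are stored little-endian. */
static uint32_t read_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
        | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Parse one descriptor starting at *pos and advance *pos past it.
 * Returns 0 on success, -1 if the descriptor runs past the end of the index.
 */
static int afa_parse_entry(const uint8_t *idx, size_t len, size_t *pos,
        int version, struct afa_entry *out)
{
    size_t p = *pos;
    if (p > len || len - p < 8)
        return -1;
    uint32_t name_size = read_u32le(idx + p);
    uint32_t padded_size = read_u32le(idx + p + 4);
    p += 8;
    /* Fixed-size tail: unknown_1, unknown_2, unknown_3 (v1 only), offset, size. */
    size_t tail = (version == 1) ? 20 : 16;
    if (name_size > padded_size || len - p < padded_size
            || len - p - padded_size < tail)
        return -1;
    out->name = idx + p;
    out->name_size = name_size;
    p += padded_size;
    p += tail - 8; /* skip the unknown fields */
    out->file_offset = read_u32le(idx + p);
    out->file_size = read_u32le(idx + p + 4);
    *pos = p + 8;
    return 0;
}
```

Calling `afa_parse_entry` file_count times, starting with `*pos` set to 0, would visit every descriptor in the index.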
The file_offsets given in the file index are relative to the value given in the data_start field of the AFA header. In other words, the offset to the start of a file within the archive is afa.data_start + descriptor.file_offset.
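The offset calculation can be sketched as a small helper for an archive that has been read fully into memory. The function name `afa_file_data` is hypothetical, and the bounds check is a defensive addition of mine rather than part of the format.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return a pointer to a file's data inside an in-memory archive,
 * or NULL if the requested range falls outside the buffer.
 */
static const uint8_t *afa_file_data(const uint8_t *archive, size_t archive_len,
        uint32_t data_start, uint32_t file_offset, uint32_t file_size)
{
    /* Use 64-bit arithmetic so the sums cannot wrap around. */
    uint64_t start = (uint64_t)data_start + file_offset;
    uint64_t end = start + file_size;
    if (end > archive_len)
        return NULL;
    return archive + start;
}
```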
AFA v3 changes the header structure in an incompatible way. The version identifier is not in the same place so it cannot be relied on to identify an AFA v3 file. Additionally, the file index in v3 is obfuscated and encrypted (but not the file contents).
Since the algorithms are quite complicated, I will mostly defer to the libsys4 source code to describe them.
Offset | Size (Bytes) | Description |
---|---|---|
0x0 | 4 | Magic: ‘AFAH’ |
0x4 | 4 | Index size (add 8 for data_start) |
0x8 | 4 | 0x3 (version?) |
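Since the version field moves between formats, a reader has to guess the version from the leading bytes. The following is one possible heuristic based only on the two header tables in this document; `afa_detect_version` and `read_u32le` are hypothetical names, little-endian integers are assumed, and I have not checked this against how libsys4 tells the versions apart.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: all integer fields are stored little-endian. */
static uint32_t read_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
        | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Guess the AFA version from the first bytes of a file.
 * Returns 1, 2 or 3, or 0 if the data does not look like an AFA archive.
 */
static int afa_detect_version(const uint8_t *buf, size_t len)
{
    if (len < 12 || memcmp(buf, "AFAH", 4) != 0)
        return 0;
    /* v1/v2: second magic at 0x8, version number at 0x10. */
    if (len >= 0x14 && memcmp(buf + 0x08, "AlicArch", 8) == 0) {
        uint32_t v = read_u32le(buf + 0x10);
        return (v == 1 || v == 2) ? (int)v : 0;
    }
    /* v3: the word at 0x8 holds the value 3. */
    if (read_u32le(buf + 0x08) == 3)
        return 3;
    return 0;
}
```

The two checks cannot both match, because the first four bytes of 'AlicArch' never decode to the value 3.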
Following the header is a dictionary for decrypting filenames, and then the compressed file index. However, this data is obfuscated by inserting a single bit of padding immediately following the header. In other words, the data is not byte-aligned.
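To give an idea of what reading byte-unaligned data involves, here is a minimal bit reader. It is a sketch and not the libsys4 code: `struct bitreader` and `bitreader_read_bits` are hypothetical names, and the bit order (most significant bit of each byte first) is an assumption that should be checked against libsys4 before relying on it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader for data that is not aligned to byte boundaries. */
struct bitreader {
    const uint8_t *data;
    size_t size;    /* length of data in bytes */
    size_t bit_pos; /* index of the next unread bit */
};

/*
 * Read `count` bits (0 to 24) and return them as an integer, or -1 if the
 * data runs out. Nothing is consumed on failure.
 * Assumption: bits are taken starting from the most significant bit of
 * each byte.
 */
static int bitreader_read_bits(struct bitreader *br, int count)
{
    if (count < 0 || count > 24)
        return -1;
    if (br->size * 8 - br->bit_pos < (size_t)count)
        return -1;
    int value = 0;
    for (int i = 0; i < count; i++) {
        uint8_t byte = br->data[br->bit_pos / 8];
        int bit = (byte >> (7 - br->bit_pos % 8)) & 1;
        value = (value << 1) | bit;
        br->bit_pos++;
    }
    return value;
}
```

With a reader like this, skipping the single bit of padding is just a one-bit read whose result is thrown away; every later byte-sized read then straddles two bytes of the underlying file.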
The following code initializes a pseudo-random number generator which is used to create the dictionary.
#include <stdint.h>

static struct {
    uint32_t state[521];
    int current;
} rnd;

/*
 * Munge PRNG state.
 */
static void rnd_shuffle(void)
{
    /* Mix the last 32 words into the first 32. */
    for (int i = 0; i < 32; i += 4) {
        rnd.state[i  ] ^= rnd.state[i+489];
        rnd.state[i+1] ^= rnd.state[i+490];
        rnd.state[i+2] ^= rnd.state[i+491];
        rnd.state[i+3] ^= rnd.state[i+492];
    }
    /* XOR each remaining word with the word 32 places before it. */
    for (int i = 32; i < 521; i += 3) {
        rnd.state[i  ] ^= rnd.state[i - 32];
        rnd.state[i+1] ^= rnd.state[i - 31];
        rnd.state[i+2] ^= rnd.state[i - 30];
    }
}

/*
 * Initialize the PRNG.
 */
static void rnd_init(uint32_t seed)
{
    uint32_t val = 0;
    /* Fill the first 17 words, each built from the top bits of 32 LCG steps. */
    for (int i = 0; i < 17; i++) {
        for (int j = 0; j < 32; j++) {
            seed = 1566083941u * seed + 1;
            val = (seed & 0x80000000) | (val >> 1);
        }
        rnd.state[i] = val;
    }
    rnd.state[16] = rnd.state[15] ^ (rnd.state[0] >> 9) ^ (rnd.state[16] << 23);
    /* Extend the state to all 521 words with a linear recurrence. */
    for (int i = 17; i < 521; i++) {
        rnd.state[i] = rnd.state[i-1] ^ (rnd.state[i-16] >> 9) ^ (rnd.state[i-17] << 23);
    }
    rnd_shuffle();
    rnd_shuffle();
    rnd_shuffle();
    rnd_shuffle();
    /* Start at -1 so that the first rnd_get_next() call returns state[0]. */
    rnd.current = -1;
}

/*
 * Get the next pseudo-random number.
 */
static uint32_t rnd_get_next(void)
{
    rnd.current++;
    if (rnd.current >= 521) {
        rnd_shuffle();
        rnd.current = 0;
    }
    return rnd.state[rnd.current];
}
The following code initializes the dictionary. Note that the struct bitstream object is an abstraction for reading byte-unaligned data from the AFA file. See the libsys4 source code for more details.
/*
 * Dictionary for string decoding.
 */
struct {
    uint8_t *bytes;
    size_t size;
} dict;

/*
 * Read the string decoding dictionary.
 * Encrypted via a PRNG seeded by the dictionary size.
 * On failure the dictionary is left empty (bytes == NULL, size == 0).
 */
static void read_dict(struct bitstream *bs)
{
    dict.size = bs_read_int32(bs);
    dict.bytes = xmalloc(dict.size);
    rnd_init(dict.size);
    for (unsigned dst = 0; dst < dict.size; dst++) {
        /* Skip 1 to 4 padding bits, as chosen by the PRNG. */
        int count = (int)rnd_get_next() & 3;
        int skipped = bs_read_bits(bs, count+1);
        if (skipped == -1) {
            goto err;
        }
        /* Advance the PRNG once more; the value is discarded. */
        rnd_get_next();
        int v = bs_read_bits(bs, 8);
        if (v == -1) {
            goto err;
        }
        dict.bytes[dst] = (uint8_t)v;
    }
    return;
err:
    /* Do not leave dict pointing at freed memory. */
    free(dict.bytes);
    dict.bytes = NULL;
    dict.size = 0;
}
Immediately following the dictionary data is the compressed file index (also byte-unaligned).
Size (Bytes) | Description |
---|---|
4 | Compressed size |
4 | Uncompressed size |
Varies | Compressed data (DEFLATE) |
The uncompressed data is once again obfuscated with a single bit of padding at the start. Then the following:
Size (Bytes) | Description |
---|---|
4 | File count |
Varies | File descriptors |
Yet again we have obfuscation via bit-sized padding. Each file descriptor begins with 2 bits of padding. Then the following:
Size (Bytes) | Name | Notes |
---|---|---|
Varies | file_name | Encrypted |
4 | unknown_1 | Likely Windows timestamp |
4 | unknown_2 | Likely Windows timestamp |
4 | file_offset | Offset of the file |
4 | file_size | Size of the file |
The file names are double encrypted. The first layer of encryption is removed by the following function, which uses the PRNG described above.
/*
 * Read an encrypted string. The data is double-encrypted: once via a PRNG
 * seeded by the string length, and again via the encrypted dictionary (above).
 * This function decrypts the first layer.
 */
static uint16_t *afa3_read_encrypted_chars(struct bitstream *bs, size_t *size)
{
    uint32_t buf_size = bs_read_int32(bs);
    uint16_t *buf = xmalloc(buf_size * 2);
    rnd_init(buf_size);
    for (unsigned dst = 0; dst < buf_size; dst++) {
        int count = (int)rnd_get_next() & 3;
        int skipped = bs_read_bits(bs, count+1);
        if (skipped == -1) {
            goto err;
        }
        rnd_get_next();
        int lo = bs_read_bits(bs, 8);
        int hi = bs_read_bits(bs, 8);
        if (lo == -1 || hi == -1) {
            goto err;
        }
        buf[dst] = (uint16_t)(lo | (hi << 8));
    }
    *size = buf_size;
    return buf;
err:
    free(buf);
    return NULL;
}
This yields an array of 16-bit integers, which serve as indices into the dictionary described above. The following code converts this array into a (SJIS-encoded) string.
/*
 * Decrypt an encrypted string via the dictionary.
 */
static char *afa3_decrypt_string(uint16_t *chars, size_t size)
{
    char *buf = xmalloc(size+1);
    for (unsigned i = 0; i < size; i++) {
        if (chars[i] >= dict.size) {
            free(buf);
            return NULL;
        }
        buf[i] = (char)(dict.bytes[chars[i]] ^ 0xa4);
    }
    buf[size] = '\0';
    return buf;
}
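To make the second layer concrete, here is a small standalone variant of the lookup that takes the dictionary as a parameter instead of using the global, so it can be tried on hand-made data. `decode_with_dict` is a hypothetical name; the lookup and the XOR with 0xa4 are the same as in the function above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode `count` dictionary indices into `out`, which must have room for
 * count + 1 bytes. Returns 0 on success, -1 if an index is out of range.
 */
static int decode_with_dict(const uint8_t *dict_bytes, size_t dict_size,
        const uint16_t *indices, size_t count, char *out)
{
    for (size_t i = 0; i < count; i++) {
        if (indices[i] >= dict_size)
            return -1;
        /* Each dictionary byte is the real byte XORed with 0xa4. */
        out[i] = (char)(dict_bytes[indices[i]] ^ 0xa4);
    }
    out[count] = '\0';
    return 0;
}
```

For example, with a five-entry dictionary holding the bytes for `t`, `e`, `s`, `.` and `x`, each XORed with 0xa4, the index array {0, 1, 2, 0, 3, 0, 4, 0} decodes to `test.txt`.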
Thankfully, the programmer who designed this horror was feeling merciful and left the file data itself byte-aligned and unencrypted. The offset and file size in the file descriptors can be used in the same manner as in AFA v1 and v2 to locate file data within the archive.
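Putting the v3 header note together with the descriptor fields, the position of a file could be computed as in the sketch below. `afa3_file_position` is a hypothetical name; the only facts used are that data_start is the index size plus 8 and that file_offset is relative to data_start.

```c
#include <stdint.h>

/*
 * Absolute position of a file's data in an AFA v3 archive.
 * data_start is the header's index size plus 8; the descriptor's
 * file_offset is then added to it, as in v1 and v2.
 */
static uint64_t afa3_file_position(uint32_t index_size, uint32_t file_offset)
{
    /* 64-bit arithmetic avoids wrap-around for very large values. */
    uint64_t data_start = (uint64_t)index_size + 8;
    return data_start + file_offset;
}
```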