The Cultures CIF / TAB / SAL File Format

Version 1.1.1
by Molt
Last changed: 3. March 2015

Description

.cif, .tab and .sal files share almost the same file format; the only difference is the entry type, described under ContentEntry below.
All files store text strings in an encrypted form.
.cif files represent .ini files and are the most common of the three.
.tab and .sal files store plain text data and are only used for languages.

Format

C1_EncryptedFile
{
    u4 magic;                               // 65601
    array[2]
    {
        u4 numberOfEntries;                 // 2 times the exact same value
    }
    u4 constant;                            // 10
    u4 indexLength;
    encryptedIndexTable[indexLength]
    {
        u1 byte;
    }
    u2 one;                                 // 1
    u4 constant;                            // 10
    u4 contentLength;
    encryptedContentTable[contentLength]
    {
        u1 byte;
    }
}

C2_EncryptedFile
{
    u4 magic;                               // 1021
    u4 zero;                                // 0
    u4 one;                                 // 1
    array[3]
    {
        u4 numberOfEntries;                 // 3 times the exact same value
    }
    u4 contentLength;
    u4 constant;                            // 1001
    u4 zero;                                // 0
    u4 indexLength;
    encryptedIndexTable[indexLength]
    {
        u1 byte;
    }
    u1 byte;                                // 1
    u4 constant;                            // 1001
    u4 zero;                                // 0
    u4 contentLength;
    encryptedContentTable[contentLength]
    {
        u1 byte;
    }
}
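
As a rough illustration of the C1_EncryptedFile layout above, the following Java sketch reads the header fields and the two encrypted tables through a little-endian ByteBuffer. The class name C1File and its field names are made up for this example. Only the magic and the duplicated entry count are checked; the other constants are skipped without validation, and a truncated file will simply raise a buffer exception.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical container for a parsed C1_EncryptedFile; the names are illustrative.
final class C1File
{
    final int numberOfEntries;
    final byte[] encryptedIndexTable;
    final byte[] encryptedContentTable;

    private C1File(int numberOfEntries, byte[] index, byte[] content)
    {
        this.numberOfEntries = numberOfEntries;
        this.encryptedIndexTable = index;
        this.encryptedContentTable = content;
    }

    static C1File parse(byte[] file)
    {
        // All integers in the format are little-endian.
        ByteBuffer in = ByteBuffer.wrap(file).order(ByteOrder.LITTLE_ENDIAN);
        if(in.getInt() != 65601)
            throw new IllegalArgumentException("not a C1_EncryptedFile");
        int numberOfEntries = in.getInt();
        if(in.getInt() != numberOfEntries)
            throw new IllegalArgumentException("entry counts differ");
        in.getInt();                            // constant (10), not validated here
        byte[] index = new byte[in.getInt()];   // indexLength
        in.get(index);
        in.getShort();                          // u2 one (1), not validated here
        in.getInt();                            // constant (10), not validated here
        byte[] content = new byte[in.getInt()]; // contentLength
        in.get(content);
        return new C1File(numberOfEntries, index, content);
    }
}
```

A C2_EncryptedFile could be read the same way by following its own field order.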

Reading it

The first step to reading a .cif/.tab/.sal file after parsing the header is to decrypt encryptedIndexTable and encryptedContentTable. Treating both as encrypted byte arrays, they can both be converted to an unencrypted byte array of the same size.

c = 71;
d = 126;
for(i = 0; i < length; i++)
{
    plainTable[i] = (encryptedTable[i] - 1) ^ c; // ^ = bitwise XOR; only the low 8 bits of the result are kept
    c += d;
    d += 33;
}

Implemented in Java, this procedure could look like:

public static byte[] decryptTable(byte[] encryptedTable)
{
    byte[] plainTable = new byte[encryptedTable.length];
    int c = 71;
    int d = 126;
    for(int i = 0; i < encryptedTable.length; i++)
    {
        plainTable[i] = (byte)(((encryptedTable[i] & 0xFF) - 1) ^ c); // Just make sure you don't lose the highest
                                                                      // bit when dealing with signed types.
        c += d;
        d += 33;
    }
    return plainTable;
}
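
Only decryption is described here. For writing files, the transformation can be inverted algebraically: since plain = (encrypted - 1) ^ c on the low 8 bits, it follows that encrypted = (plain ^ c) + 1 on the low 8 bits. The sketch below is derived purely from that formula and has not been checked against files produced by the games; the names TableCipher and encryptTable are hypothetical.

```java
// Hypothetical helper class pairing the documented decryption with a derived inverse.
final class TableCipher
{
    // Same procedure as decryptTable above.
    static byte[] decryptTable(byte[] encryptedTable)
    {
        byte[] plainTable = new byte[encryptedTable.length];
        int c = 71;
        int d = 126;
        for(int i = 0; i < encryptedTable.length; i++)
        {
            plainTable[i] = (byte)(((encryptedTable[i] & 0xFF) - 1) ^ c);
            c += d;
            d += 33;
        }
        return plainTable;
    }

    // Assumed inverse: XOR with the same key stream first, then add 1.
    static byte[] encryptTable(byte[] plainTable)
    {
        byte[] encryptedTable = new byte[plainTable.length];
        int c = 71;
        int d = 126;
        for(int i = 0; i < plainTable.length; i++)
        {
            encryptedTable[i] = (byte)(((plainTable[i] ^ c) & 0xFF) + 1);
            c += d;
            d += 33;
        }
        return encryptedTable;
    }
}
```

With these two methods, decryptTable(encryptTable(x)) should return a copy of x for any byte array x.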

Now let:

plainIndexTable = decryptTable(encryptedIndexTable);
plainContentTable = decryptTable(encryptedContentTable);

plainIndexTable can now be used as an array of u4 (with length indexLength / 4), of which every value is an offset in plainContentTable.

plainIndexTable[indexLength / 4]
{
    u4 index;
}

plainContentTable[indexLength / 4]
{
    ContentEntry entry;
}

What ContentEntry is depends on the file type.

For .cif files:

ContentEntry
{
    u1 meta;            // Determines the role of the following string.
                        // The only known values for this are:
                        // 1: Section name
                        // 2: Plain content
    char[] string;      // A C-Style string (with a terminating '\0' character)
                        // Note that you DO NOT KNOW its length in advance.
}

For .tab and .sal files:

ContentEntry
{
    char[] string;      // Same as above
}
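
Since the string lengths are not stored, each entry has to be read up to its terminating '\0'. The following Java sketch shows one way to turn the decrypted index table into offsets and to read a string at a given offset. The class and method names are invented for this example, and decoding the bytes as ISO-8859-1 is an assumption; the character encoding used by the games is not specified here.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; names and the ISO-8859-1 charset are assumptions.
final class ContentReader
{
    // Interprets the decrypted index table as little-endian u4 offsets.
    static int[] readOffsets(byte[] plainIndexTable)
    {
        ByteBuffer in = ByteBuffer.wrap(plainIndexTable).order(ByteOrder.LITTLE_ENDIAN);
        int[] offsets = new int[plainIndexTable.length / 4];
        for(int i = 0; i < offsets.length; i++)
            offsets[i] = in.getInt();
        return offsets;
    }

    // Reads the string starting at offset, stopping at '\0' or at the end of the table.
    static String readCString(byte[] plainContentTable, int offset)
    {
        int end = offset;
        while(end < plainContentTable.length && plainContentTable[end] != 0)
            end++;
        return new String(plainContentTable, offset, end - offset, StandardCharsets.ISO_8859_1);
    }
}
```

For a .tab or .sal entry the string starts at the offset itself. For a .cif entry, plainContentTable[offset] is the meta byte and the string starts at offset + 1.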

At this point, .tab and .sal files are trivial to reconstruct: Each ContentEntry equals one line of text.

For .cif files, any entry with a meta value of 2 can likewise be appended to the file as a new line, but entries with a meta value of 1 need to be enclosed in [ ] to create a new ini section.

Hence, a plainIndexTable containing the values 0, 5, 8, 11, 14, 19, 22, 25 and

// \x?? = character with hex value ??
plainContentTable = "\x01aaa\x00\x02a\x00\x02b\x00\x02c\x00\x01bbb\x00\x02d\x00\x02e\x00\x02f\x00"

should be converted to the following ini file:

[aaa]
a
b
c
[bbb]
d
e
f

There could be an empty line before [bbb], and I build my tools to produce one, but that's optional.
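
The conversion in this example can be sketched in Java as follows. The class name CifToIni is invented for illustration, the ISO-8859-1 decoding is an assumption, and rejecting meta values other than 1 and 2 is a choice made for this sketch rather than known game behaviour. This version does not emit the optional empty line before a section.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical converter from decrypted .cif tables to ini text.
final class CifToIni
{
    static String toIni(int[] offsets, byte[] plainContentTable)
    {
        StringBuilder ini = new StringBuilder();
        for(int offset : offsets)
        {
            int meta = plainContentTable[offset] & 0xFF;
            // The string follows the meta byte and runs up to the next '\0'.
            int start = offset + 1;
            int end = start;
            while(end < plainContentTable.length && plainContentTable[end] != 0)
                end++;
            String text = new String(plainContentTable, start, end - start, StandardCharsets.ISO_8859_1);
            if(meta == 1)
                ini.append('[').append(text).append("]\n"); // section name
            else if(meta == 2)
                ini.append(text).append('\n');              // plain content
            else
                throw new IllegalArgumentException("unknown meta value " + meta);
        }
        return ini.toString();
    }
}
```

Called with the offsets and content table from the example, toIni returns the eight lines shown above, each terminated by a newline.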

Guessing the file type

Without knowing the file suffix, your options to determine the file type are very limited, but they exist.

First, you can take advantage of the fact that there are no .tab or .sal files in any game other than Cultures 1. If your file is a C2_EncryptedFile, it's safe to assume it's a .cif file.

If your file is a C1_EncryptedFile however, the only remaining clue is the first byte of each entry, which would be ContentEntry.meta in a .cif file.

If all of them are \x01 or \x02, it's most likely a .cif file.
If all of them are visible characters (character value >= 32), it's most likely a .tab or .sal file - there is no way of telling those two apart however.
If none of the above apply, please contact me as I have never seen such a file and would like to investigate it.
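
The heuristic above could be sketched in Java like this. The class, enum and parameter names are hypothetical, and returning UNKNOWN for a file without entries is a choice made for this sketch. Note that an empty line in a .tab or .sal file would start with '\0' and fail the second check; whether such entries occur is not known.

```java
// Hypothetical file type guesser based on the magic and the first byte of each entry.
final class FileTypeGuess
{
    enum Kind { CIF, TAB_OR_SAL, UNKNOWN }

    static Kind guess(int magic, int[] offsets, byte[] plainContentTable)
    {
        if(magic == 1021)
            return Kind.CIF;        // C2_EncryptedFile: only .cif files are known
        if(offsets.length == 0)
            return Kind.UNKNOWN;    // nothing to inspect
        boolean allMeta = true;
        boolean allVisible = true;
        for(int offset : offsets)
        {
            int first = plainContentTable[offset] & 0xFF;
            if(first != 1 && first != 2)
                allMeta = false;
            if(first < 32)
                allVisible = false;
        }
        if(allMeta)
            return Kind.CIF;
        if(allVisible)
            return Kind.TAB_OR_SAL;
        return Kind.UNKNOWN;
    }
}
```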

Notes

All integers are little-endian.
When looking for a file, all Cultures games accept either an unencrypted .ini file or an encrypted .cif file with the same name.
It is unknown whether this works for .tab and .sal files too, since their "unencrypted" suffixes are unknown.
It is also unknown which file takes precedence when both are present. TODO: Test this
It is not known whether #-macros and comments are accepted in .cif files. TODO: Test this

Trivia

No .tab or .sal files appear in any game other than Cultures 1.
"cif" most likely stands for "Cultures ini file" or "Cultures information file".
"tab" most likely stands for "table".
In the Cultures 1 library data_l/data.lib, there is a program called "Text Table Converter" (data/gui/texttableconverter.exe) apparently used to create and convert language files. It produces .ini, .cif and .hpp (C++ header) files as well as files in the .tab/.sal format.