Chip's Challenge (DOS / Spectrum / CPC) File Format

By John Elliott. Based mainly on reading the source of c4 by Brian Raiter, and peering at various versions of CC with a hex editor.


Ports of Chip's Challenge prior to the Microsoft version tend to use the same format to describe levels. This format also appears to be used by the original Lynx ROM.

Unless otherwise stated, all multi-byte values are little-endian.


Level list

In CC for DOS, each level is stored in a separate .PAK file in the 'levels' directory. The names of the .PAK files are hardcoded in the file BF1.EXE, at offset 0x94B9. Each level filename is a fixed 13-character ASCIIZ string; the name element is packed to 8 characters with spaces. For example, 'DIGGER.PAK' is stored as

000095A3            64  69 67 67 65  72 20 20 2E  70 61 6B 00     digger  .pak.


There is no master table of passwords. Instead, 256 passwords are generated by the game engine at startup using a pseudo-random number generator, although only the first 149 are used:


After the 256 passwords have been generated, the program then generates three more characters and uses them to overwrite the passwords of levels 98, 107 and 114 (which would otherwise be duplicates):



Each level (.pak) consists of one or more compressed blocks. A block begins with a 4-byte header:
ByteLength of compression dictionary, 0-255. If this is zero the level is not compressed.
ByteNonzero if another compressed block follows this one; zero if this is the last compressed block in the level. In the levels supplied with CC for DOS, this byte is always zero, and I don't know how well various implementations support nonzero values for it.
WordNumber of bytes following the dictionary.

The dictionary then follows. Each entry is three bytes:

ByteToken to define.
ByteToken definition: first byte.
ByteToken definition: second byte.

Each 'definition' byte will be treated either as a reference to another token (if it matches a token already in the dictionary) or a literal byte (otherwise). Note that tokens can be redefined. For example, level 106 (KABLAM) starts with:

	1E		; Dictionary has 30 entries
	00		; This is the last block
	70 00		; 112 bytes follow the dictionary
	02 2C 2C	; Define dictionary entry 02 as { 2C 2C }. Dictionary
			; entry 2C is not defined, so both '2C' are treated
			; as literal values.
	03 00 00	; Define dictionary entry 03 as { 00 00 }.
	04 02 02	; Define dictionary entry 04 as { 02 02 }. Dictionary
			; entry 02 is defined, so both are expanded; entry
			; 4 becomes { 2C 2C 2C 2C }
	02 03 03	; Redefine dictionary entry 02 as { 03 03 }. Dictionary
			; entry 03 is defined, so both are expanded; entry
			; 2 becomes { 00 00 00 00 }

Here are some C code fragments for reading the dictionary, and expanding an entry.

typedef struct dictent
	struct dictent *pLeft;
	struct dictent *pRight;
	unsigned char chLeft;
	unsigned char chRight;

DICTENT	dict[256] = { 0 };
DICTENT *lookup[256] = { 0 };

	/* Load dictionary */
	int dentry = 0;
	for (n = 0; n < entries; n++) 
		unsigned char key   = read_byte();
		unsigned char left  = read_byte();
		unsigned char right = read_byte();

		dict[dentry].pLeft  = lookup[left];
		dict[dentry].pRight = lookup[right];
		dict[dentry].chLeft = left;
		dict[dentry].chRight = right;
		lookup[key] = &dict[dentry];

/* Expand an entry */
int expand(DICTENT *de, unsigned char *dest)
	int len = 0;
	if (de->pLeft) 
		len = expand(de->pLeft, dest);
		dest += len;
		dest[0] = de->chLeft;
		len = 1;
	if (de->pRight) 
		len += expand(de->pRight, dest);
		dest[0] = de->chRight;
	return len;

Following the dictionary is a sequence of bytes, which are expanded to the level data. Again, these are treated as dictionary tokens (if defined in the dictionary) or literal values (otherwise).

unsigned char *dest = buffer;
	unsigned char c = read_byte();

	if (lookup[c])
		dest += expand(lookup[c], dest);
		*dest++ = c;

Expanded data


Once the level has been expanded, the first 1024 bytes will be the map (32 rows of 32 tiles). Tiles are:

00Empty 10Fire boots 20Yellow key 30Red button
01Wall 11Force boots 21Green key 31Chip (uncounted)
02Ice 12Water boots 22Blue button 32Blue block wall
03Dirt 13Fire 23Chip (counted) 33Hint button
04Blue block floor 14Water 24Socket
05Force floor north 15Thief 25Exit
06Force floor east 16Popup wall 26Invisible wall temporary
07Force floor south 17Toggle open 27Invisible wall permanent
08Force floor west 18Toggle closed 28Gravel
09Force floor random 19Green button 29Wall east
0AIce corner SE 1ARed door 2AWall south
0BIce corner SW 1BBlue door 2BWall southeast
0CIce corner NW 1CYellow door 2CBomb
0DIce corner NE 1DGreen door 2DBear trap
0ETeleport 1ERed key 2EBrown button
0FIce boots 1FBlue key 2FClone machine

Other tile types tend to crash the game engine.

Creature list

Following the map is a 384-byte creature list.

000h:	DB	creature0, creature1, ...	; Up to 128 creature IDs
080h:	DB	x0, x1, ...			; X-coordinates
100h:	DB	y0, y1, ...			; Y-coordinates

Creature types are:

04Chip north 05Chip east 06Chip south 07Chip west
08Bug north 09Bug east 0ABug south 0BBug west
0CCentipede north 0DCentipede east 0ECentipede south 0FCentipede west
0CFireball north 0DFireball east 0EFireball south 0FFireball west
10Glider north 11Glider east 12Glider south 13Glider west
14Ball north 15Ball east 16Ball south 17Ball west
18Block north 19Block east 1ABlock south 1BBlock west
1CTank north 1DTank east 1ETank south 1FTank west
20Walker north 21Walker east 22Walker south 23Walker west
24Blob north 25Blob east 26Blob south 27Blob west
28Teeth north 29Teeth east 2ATeeth south 2BTeeth west

Other creature types tend to crash the game engine.

Level time, title and hint

The level time is a little-endian word:

	DW	time

The title is a 0-terminated string, with this encoding:

x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x Newline Space 0 1 2 3 4 5 6 7 8 9 A B C D
1x E F G H I J K L M N O P Q R S T
2x U V W X Y Z ! " ' ( ) , - . : ;
3x ?

(Note: The Spectrum version port only supports characters 00h-25h, plus 2Dh).

The hint, if present, uses the same encoding as the title except it is terminated with two 00 bytes in succession (a single 00 byte is treated as a newline but does not end the hint).


On the Spectrum version, levels are stored on cassette tape. World of Spectrum has three tape images:

In all versions, each file on the tape contains eight consecutive levels. The files are distinguished by their level set ID: ID zero holds levels 1-8, ID 1 holds levels 9-16, etc.

TAP format

The TAP format is vastly easier to understand: both as a container, and the way the levels are stored in it. A .TAP file consists of one or more blocks:

	DW	length	;Number of bytes in the following block
	DB	block	;Data block. The first byte is the type, usually
			;00h for a file header, 0FFh for file data.

Each level file is stored in the .TAP as two consecutive blocks, header followed by data:

	DW	13h		;Number of bytes in the following block
	DB	00h		;Block type 0 : header
	DB	03h		;Spectrum file type (CODE)
	DB	id		;Level set ID, 0-17.
	DB	"*CHIP'S* "	;Remainder of Spectrum filename
	DW	datalen		;Length of level data
	DW	8000h		;Notional load address (ignored)
	DW	0		;Unused field
	DB	check		;XOR of all bytes from block type to end.

	DW	blklen		;Number of bytes in the following block,
				;should be datalen + 2
	DB	0FFh		;Block type 0FFh : data
	DB	... level data ...
	DB	check		;XOR of all bytes from block type to end.

For example, this is how the beginning of the first level file looks. I have highlighted the two separate TAP blocks:

00011938                             13 00 00 03  00 2A 43 48          .....*CH
00011940   49 50 27 53  2A 20 97 08  00 80 00 00  5A 99 08 FF  IP'S* ......Z...
00011950   10 A8 12 A9  F4 A9 27 AB  7E AC 83 AD  7B AE B2 AF  ......'.~...{...
00011960   11 00 CB 00  02 00 00 04  02 02 05 04  04 07 05 05  ................

The level file begins with a 16-byte header, giving the addresses of the eight levels it contains (on the assumption that the file has loaded at 0A800h). In the above example, the first level pointer is 0A810h, so Lesson 1 can be found 16 bytes after the start of the file.

The actual levels themelves, once you reach them, are in the same compressed format as the DOS .PAK files.

TZX format

TZX, as a container format, is described at World of Spectrum. It's a nuisance to work with, because there's no consistent way of specifying the length of a chunk.

Of the two TZX releases, the Erbe rerelease contains the same data as the .TAP download. The original (turbloader) TZX stores the levels in an obfuscated form, which the turboloader decrypts on read. For example, the file containing the first levels has the header

E9 CB 5C F7 98 41 83 C3 25 FE CB 76 2B

After decryption by the loader, this becomes

00 89 08 FF 5F 3D 3D C0 97 08

Of these bytes: 00 is the level ID.
89 08 is the number (0889h) of bytes that will be read from the file.
97 08 is the number (0897h) of bytes that will end up in memory.

The other bytes are used in decoding the following data block.

Amstrad CPC

The version of the game downloadable from is in the form of a disk image (in the CPC data format). The filesystem is CP/M, with groups of eight levels held in CP/M files:

CHIPS.CHA: Levels 1-8
CHIPS.CHB: Levels 9-16
CHIPS.CHC: Levels 17-24

Each file begins with a 128-byte AMSDOS header. This is followed by the same data as in the Spectrum tape file: eight words giving the level addresses (assuming a load at 0A800h) and then the eight levels.