LocoScript 1 file format
This is my best guess at the LocoScript 1 file format, based on examining a variety of documents.
LocoScript 1 saves its files on CP/M-formatted discs, and therefore uses CP/M conventions such as 8.3 filenames and 128-byte records.
A byte is 8 bits. A word is 16 bits, in little-endian format.
Header
Nearly all LocoScript files start with a 128-byte header, and the LocoScript 1 document is no exception.
Offset | Size | Description |
---|---|---|
0x00 | 3 | Magic number: 'JOY' |
0x03 | word | File format version number. For LocoScript 1 this is 0x101, i.e. 1.1 |
0x05 | 90 | File identity. Three lines of 30 bytes each, PCW extended ASCII. |
0x5F | byte | Maximum number of layouts. |
0x60 | byte | Maximum number of tabs per layout. |
0x61 | byte | Decimal point character (usually . or , ) |
0x62 | byte | Zero character (0x30 for slashed zero, 0x7F for unslashed) |
0x63 | byte | 0xFF if widows and orphans are allowed, 0 if not. |
0x64 | byte | 0xFF if paragraphs can be broken across a page, 0 if not. |
0x65 | byte | Bitmask of which word-processing codes to show. Bit 0: Show codes; Bit 1: Show rulers; Bit 2: Show blanks; Bit 3: Show spaces; Bit 4: Do not show effectors (paragraph / tab symbols). |
0x66 | 3 | Unknown |
0x69 | word | Page length, in half-lines |
0x6B | word | Number of first page in the file. |
0x6D | word | Number of last page in the file. |
0x6F | byte | Page numbering scheme. 0xFF: All the same; 0x00: Odd / even pages differ; 0x01: First page differs; 0x02: Last page differs. |
0x70 | byte | Header height, in half-lines |
0x71 | byte | Header start row, in half-lines |
0x72 | byte | Header omit flags. Bit 7 set: omit first page header; Bit 0 set: omit last page header. |
0x73 | 4 | Unknown |
0x77 | byte | Footer height, in half-lines |
0x78 | byte | Footer start row, in half-lines |
0x79 | byte | Footer omit flags. Bit 7 set: omit first page footer; Bit 0 set: omit last page footer. |
0x7A | 4 | Unknown |
0x7E | byte | Number of the 128-byte record containing the first chunk (see below). |
0x7F | byte |
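
As a rough illustration of the table above, the following Python sketch pulls a handful of the documented fields out of the header. The function name `parse_header` and the dictionary keys are invented for this example; only the offsets come from the table, and the bytes marked unknown are simply skipped.

```python
import struct

def parse_header(data: bytes) -> dict:
    """Decode some documented fields of the 128-byte LocoScript 1 header."""
    if len(data) < 128:
        raise ValueError("a LocoScript header is 128 bytes long")
    if data[0:3] != b"JOY":
        raise ValueError("missing 'JOY' magic number")
    # Words are 16-bit little-endian, hence the "<H" format.
    (version,) = struct.unpack_from("<H", data, 0x03)
    page_length, first_page, last_page = struct.unpack_from("<HHH", data, 0x69)
    return {
        "version": version,                      # 0x101 for LocoScript 1
        # Three 30-byte identity lines, left as raw PCW-charset bytes.
        "identity": [data[0x05 + 30 * i:0x05 + 30 * (i + 1)] for i in range(3)],
        "max_layouts": data[0x5F],
        "max_tabs": data[0x60],
        "decimal_point": data[0x61],
        "zero_char": data[0x62],
        "show_mask": data[0x65],
        "page_length_half_lines": page_length,
        "first_page": first_page,
        "last_page": last_page,
        "first_chunk_record": data[0x7E],
    }
```

The identity lines are returned as bytes rather than text because the PCW character set does not map directly onto any standard Python codec.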
Layouts
The layouts are stored immediately following the header (i.e. at offset 128). The length of each layout is 10 bytes plus the maximum number of tabs per layout (header byte 0x60).
The format of a layout is:
Offset | Size | Description |
---|---|---|
0x00 | byte | Character pitch. Bits 0-5 give character width in 240ths of an inch: 0x18 => 10cpi; 0x14 => 12cpi; 0x10 => 15cpi; 0x0E => 17cpi; 0x00 => proportional. Bit 6 set for double width. |
0x01 | byte | Line pitch, in 432ths of an inch, so 0x48 for 6lpi, 0x36 for 8lpi. |
0x02 | byte | Line spacing, in half-lines; so 1 for half-spaced, 2 for normal-spaced, etc. |
0x03 | byte | Default character style. Bit 1: Word underline (underline words but not spaces); Bit 2: Underline; Bit 3: Reverse video; Bit 4: Doublestrike; Bit 5: Italic; Bit 6: ?; Bit 7: Justified? |
0x04 | byte | Left margin position. |
0x05 | byte | Right margin position. |
0x06 | byte | Count of left tabs. |
0x07 | byte | Count of right tabs. |
0x08 | byte | Count of centre tabs. |
0x09 | byte | Count of decimal tabs. |
0x0A | variable | Tab stops. First the positions of all the left tabs, then all the right tabs, then centre, then decimal. |
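
The layout record can be unpacked along the following lines. This is only a sketch: `parse_layout`, `layout_offset` and the key names are hypothetical, and it assumes that layout number n (0-based) sits at offset 128 + n * (10 + header byte 0x60), which is my reading of "stored immediately following the header" combined with the fixed record length.

```python
# Pitch codes exactly as listed in the layout table (bits 0-5 of byte 0x00).
PITCH_NAMES = {0x18: "10cpi", 0x14: "12cpi", 0x10: "15cpi",
               0x0E: "17cpi", 0x00: "proportional"}
TAB_KINDS = ("left", "right", "centre", "decimal")

def layout_offset(index: int, max_tabs: int) -> int:
    """File offset of layout `index`, assuming layouts are packed after the header."""
    return 128 + index * (10 + max_tabs)

def parse_layout(data: bytes, index: int, max_tabs: int) -> dict:
    start = layout_offset(index, max_tabs)
    rec = data[start:start + 10 + max_tabs]
    if len(rec) != 10 + max_tabs:
        raise ValueError("file too short for this layout")
    counts = rec[0x06:0x0A]            # left, right, centre, decimal counts
    if sum(counts) > max_tabs:
        raise ValueError("tab counts exceed the tab area")
    tabs, pos = {}, 0x0A
    for kind, count in zip(TAB_KINDS, counts):
        # Tab positions are stored grouped by kind, in this order.
        tabs[kind] = list(rec[pos:pos + count])
        pos += count
    pitch = rec[0x00]
    return {
        "pitch": PITCH_NAMES.get(pitch & 0x3F, hex(pitch & 0x3F)),
        "double_width": bool(pitch & 0x40),
        "line_pitch": rec[0x01],
        "line_spacing_half_lines": rec[0x02],
        "style_bits": rec[0x03],
        "left_margin": rec[0x04],
        "right_margin": rec[0x05],
        "tabs": tabs,
    }
```

Pitch values that are not in the table are reported as a hex string rather than guessed at.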
The Document content
The document content starts at byte offset 128 * n, where n is the value of byte 0x7E of the header. It consists of a number of chunks, each of which is a multiple of 128 bytes long. The format of a chunk is:
Offset | Size | Description |
---|---|---|
0x00 | byte | Length, in 128-byte records. |
0x01 | byte | 0 (high byte of length?) |
0x02 | byte | Flags. Bit 0 set => Last chunk of this page; Bit 7 set => First chunk of this page. |
0x03 | 4 | Unknown |
0x07 | byte | Number of currently-selected layout (0-based) |
0x08 | byte | Line alignment (details not known) |
0x09 | byte | Current character pitch (cf byte 0x00 of the layout). Bit 7 set if this is the default pitch. |
0x0A | byte | Current line pitch (cf byte 0x01 of the layout). Bit 7 set if this is the default pitch. |
0x0B | byte | Current line spacing (cf byte 0x02 of the layout). Bit 7 set if this is the default spacing. |
0x0C | byte | Current character style (cf byte 0x03 of the layout). |
0x0D | variable | Text and markup |
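
Walking the chunks might look something like the sketch below. The name `iter_chunks` and the dictionary keys are invented for illustration. How the end of the chunk list is marked is not documented here, so the loop simply stops at the end of the data or at a chunk whose length byte is zero; that stopping rule is an assumption.

```python
def iter_chunks(data: bytes):
    """Yield the documented header fields and body of each chunk in turn."""
    offset = 128 * data[0x7E]          # header byte 0x7E: first chunk's record
    while offset + 0x0D <= len(data):
        length = data[offset] * 128    # byte 0x00: length in 128-byte records
        if length == 0:
            break                      # assumption: nothing useful follows
        chunk = data[offset:offset + length]
        flags = chunk[0x02]
        yield {
            "offset": offset,
            "first_of_page": bool(flags & 0x80),
            "last_of_page": bool(flags & 0x01),
            "layout": chunk[0x07],
            # Bit 7 of the next three bytes marks "this is the default".
            "char_pitch": chunk[0x09] & 0x7F,
            "char_pitch_is_default": bool(chunk[0x09] & 0x80),
            "line_pitch": chunk[0x0A] & 0x7F,
            "line_pitch_is_default": bool(chunk[0x0A] & 0x80),
            "line_spacing": chunk[0x0B] & 0x7F,
            "line_spacing_is_default": bool(chunk[0x0B] & 0x80),
            "style_bits": chunk[0x0C],
            "body": chunk[0x0D:],      # text and markup
        }
        offset += length
```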
In the text, characters 0x00-0x7F and 0xA0-0xFF are printable, using the PCW character set. This is the same character set used by CP/M on the Spectrum +3. Characters 0x80-0x9F are markup codes:
Code | Description |
---|---|
0x80 | End of chunk. |
0x81 | Space (0x20 is used for hard spaces, i.e. those which do not permit the line to be broken). |
0x82 0x00 | (LastLine) code - break page after this line. |
0x82 0x01 | Form feed - break page now. |
0x82 0x02 | Hyphen (0x2D is used for hard hyphens). |
0x82 0x03 | Soft space. |
0x82 0x04 | Insert current page number. |
0x82 0x05 | Insert last page number. |
0x82 0x06 | Appears to be a combined carriage return and (-ReV). Does not appear in live documents, but is decoded as that if inserted using a binary editor. |
0x82 0x07 | Appears to be a combined carriage return and (+ReV). Does not appear in live documents, but is decoded as that if inserted using a binary editor. |
0x82 0x08 | (SiC) - the word containing this code is spelt correctly. Added some time between LocoScript 1.20 and 1.40. |
0x83 0x00 | (+Bold) Bold on |
0x83 0x01 | (+Wordul) Word underline on |
0x83 0x02 | (+UL) Underline on |
0x83 0x03 | (+ReV) Reverse video on |
0x83 0x04 | (+Double) Doublestrike on |
0x83 0x05 | (+Italic) Italic on |
0x83 0x06 | (+SupeR) Superscript on |
0x83 0x07 | (+SuB) Subscript on |
0x83 0x08 | (+Mail) Begin LocoMail macro. Added some time between LocoScript 1.20 and 1.40. |
0x84 0x00 | (-Bold) Bold off |
0x84 0x01 | (-Wordul) Word underline off |
0x84 0x02 | (-UL) Underline off |
0x84 0x03 | (-ReV) Reverse video off |
0x84 0x04 | (-Double) Doublestrike off |
0x84 0x05 | (-Italic) Italic off |
0x84 0x06 | (-SupeR) Superscript off |
0x84 0x07 | (-SuB) Subscript off |
0x84 0x08 | (-Mail) End LocoMail macro. |
0x85 0x00 0x00 | Soft hyphen. |
0x85 0x00 0xnn | (0xnn > 0) Soft linebreak. |
0x85 0x02 0xnn | (+LayouT) Select layout 0xnn |
0x85 0x03 0xnn | (+LPitch) Set line pitch 0xnn. As in the layout, this is specified in 432ths of an inch. If bit 7 of the pitch is set, this is a (-LPitch) command; the value being selected is the default. |
0x85 0x04 0xnn | (+LSpace) Set line spacing 0xnn. As in the layout, this is specified in half-lines. If bit 7 of the value is set, this is a (-LSpace) command; the value being selected is the default. |
0x85 0x05 0xnn | (+Pitch) Set character pitch 0xnn, encoded as in byte 0x00 of the layout. If bit 7 of the value is set, this is a (-Pitch) command; the value being selected is the default. |
0x85 0x06 0xnn | (+Keep) Keep the following 0xnn lines together. |
0x85 0x07 0xnn | (-Keep) Keep the previous 0xnn lines together. |
0x86 0xnn 0xmm | Start of line. The two following bytes are a little-endian Z80 word, giving the distance of the right-hand end of the line from the right margin, in 240ths (?) of an inch. |
0x88 0xnn | End of line. nn is: 0x01 => soft end-of-line (normal wrapping); 0x02 => hard end-of-line (paragraph); 0x03 => Unit; 0x04 => Tab; 0x05 => Indent tab; 0x06 => Centre; 0x07 => Right align; 0x08 => Carriage return? Seems to be used at the start of some pages. Values 0x04-0x07 are not found in live documents, but LS1 interprets them thus if they are inserted manually. |
0x89 0xaa 0xbb 0xcc 0xdd | Change horizontal position. aa is the code that caused realignment: 0x00 => Layout change causes left margin to move; 0x01 => soft end-of-line (normal wrapping); 0x02 => hard end-of-line (paragraph); 0x03 => Unit; 0x04 => Tab; 0x05 => Indent tab; 0x06 => Centre; 0x07 => Right align. bb is the new X position in characters. cc and dd give the new position in 240ths of an inch (not sure where it's measured from). |
0x8A 0xaa 0xbb 0xcc 0xdd 0xee | Justified text. The first byte is: 0x01 => End-of-line (caused by text wrapping); if there are no packing spaces to be inserted, the standard 0x88 0x01 end-of-line is used. 0x02 => Start justified paragraph. 0x08 => A justified version of the 0x88 0x08 carriage return code? Other bytes unknown. |
0x8B 0xaa 0xbb 0xcc 0xdd 0xee 0xff 0xgg | Justified version of the 0x89 code. The first 5 bytes are the same as for that code. Other bytes unknown. |
0x8C-0x9F | Do not appear to be used. |
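
To show how the markup table translates into a parser, here is a minimal tokenizer sketch. The name `tokenize` and the `(code, payload)` tuple shape are my own choices, not part of the format. It covers only codes 0x80-0x86 and 0x88-0x8A, with parameter counts taken from the table; any other markup byte raises an error rather than guessing a length.

```python
# Number of parameter bytes that follow each markup code, per the table.
PARAM_BYTES = {0x80: 0, 0x81: 0, 0x82: 1, 0x83: 1, 0x84: 1,
               0x85: 2, 0x86: 2, 0x88: 1, 0x89: 4, 0x8A: 5}

def is_markup(byte: int) -> bool:
    return 0x80 <= byte <= 0x9F

def tokenize(body: bytes) -> list:
    """Split a chunk body into (None, text_bytes) and (code, params) tuples."""
    tokens, i = [], 0
    while i < len(body):
        byte = body[i]
        if not is_markup(byte):
            # Gather a run of printable PCW characters (0x00-0x7F, 0xA0-0xFF).
            j = i
            while j < len(body) and not is_markup(body[j]):
                j += 1
            tokens.append((None, body[i:j]))
            i = j
            continue
        if byte not in PARAM_BYTES:
            raise ValueError(f"markup code 0x{byte:02X} is not handled by this sketch")
        count = PARAM_BYTES[byte]
        params = body[i + 1:i + 1 + count]
        if len(params) != count:
            raise ValueError("chunk ends in the middle of a markup code")
        tokens.append((byte, params))
        if byte == 0x80:               # end of chunk: ignore the padding after it
            break
        i += 1 + count
    return tokens
```

Text runs are returned as raw bytes; converting them to Unicode would need a PCW character set table, which is outside the scope of this sketch.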