ROM String Files

From WebTV Wiki

All first-generation WebTV/MSN TV clients store text strings, along with the HTML fragments used to compose some panels, in proprietary file pairs inside a ROMFS container. Each pair consists of a .dat file that holds the actual string data and a companion .dir file with the same base name, which holds pointers to each string in the .dat file. These pairs are usually stored under the ROMFS directory /ROM/Local/, and both files in a pair are named either USNNN or WWNNN (the latter seen only in WebTV for Dreamcast), where NNN is a padded number. For simplicity, this article refers to the format as a whole as ROM string files.

A .dat file stores its strings sequentially as raw data, with a NUL byte terminating each string. The corresponding .dir file stores a pointer to the start of each string in the .dat file, as a sequence of packed 32-bit integers in big-endian format. To read a string, a reader takes a pointer from the .dir file, seeks to that position in the .dat file, and reads until it reaches a NUL byte.
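
The reading procedure described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name read_rom_strings is hypothetical, and the sketch assumes that the .dir file has no header or trailer, that each pointer is an unsigned byte offset from the start of the .dat file, and that a missing final NUL byte should be tolerated.

```python
import struct


def read_rom_strings(dir_bytes: bytes, dat_bytes: bytes) -> list[bytes]:
    """Return the raw byte strings referenced by a .dir/.dat pair.

    Assumptions (not confirmed by the format description): the .dir file is
    nothing but offsets, and each offset is unsigned and counted from the
    start of the .dat file.
    """
    # Each pointer is a packed 32-bit big-endian integer.
    count = len(dir_bytes) // 4
    offsets = struct.unpack(f">{count}I", dir_bytes[: count * 4])

    strings = []
    for offset in offsets:
        # Read from the pointer up to (not including) the next NUL byte.
        end = dat_bytes.find(b"\x00", offset)
        if end == -1:
            # Tolerate a final string with no terminator.
            end = len(dat_bytes)
        strings.append(dat_bytes[offset:end])
    return strings
```

The strings are returned as bytes rather than text, because the files themselves do not declare an encoding or a newline convention.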

Caveats

ROM string files define no standard newline and provide no way to declare which newline format a set of strings uses. As observed in several WebTV/MSN TV builds, the newline format varies between builds and releases and is not guaranteed to be consistent. For example, the WebTV Classic builds and WebTV for Dreamcast usually terminate lines in ROM string files with a line feed (LF), while the 2.9 MSN TV builds use a carriage return (CR) in their ROM string files.
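
Because the newline format cannot be known in advance, a tool that reads these strings may want to normalize them. The sketch below is one possible approach, with a hypothetical function name; it also handles CR LF pairs as a precaution, although this article does not report that sequence being used in any build.

```python
def normalize_newlines(raw: bytes) -> bytes:
    """Convert CR and CR LF line endings in a ROM string to LF."""
    # Replace CR LF first so that a pair does not become two line feeds.
    return raw.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```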

Text encoding is likewise not declared anywhere in ROM string files; the WebTV/MSN TV client is expected to apply its own fixed encoding when rendering the text. English WebTV/MSN TV box builds appear to use standard ASCII for ROM strings, while WebTV for Dreamcast uses Shift-JIS for all of its ROM strings.
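
Since the encoding has to be supplied by whoever reads the file, a decoding helper might take it as a parameter. The following is a minimal sketch with a hypothetical function name; the choice of "ascii" as the default and of replacing undecodable bytes rather than raising an error are assumptions made for illustration.

```python
def decode_rom_string(raw: bytes, encoding: str = "ascii") -> str:
    """Decode one raw ROM string using a caller-chosen encoding.

    Pass "shift_jis" for WebTV for Dreamcast strings. Bytes that are not
    valid in the chosen encoding are replaced with U+FFFD instead of
    raising, so that a wrong guess is visible but not fatal.
    """
    return raw.decode(encoding, errors="replace")
```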