I've finally completed my write-up about my exploration of Electronic Arts' Make Your Own Murder Party data disk, as well as an explanation of how the data is decoded by the software. The complete decoded text, my explanation of how it all works, as well as the program I wrote to decode the data can be downloaded here:
(By the way, I'll update the other thread about the Murder Party printing issue sometime soon.)
Here's a brief synopsis of my observations and decoding tutorial:
After a week of digging through Murder Party's program code and comparing several different disk images for the Commodore, Apple and PC versions, I have deciphered the elements that are used to create every Murder Party! (The text data is the same for all three computer systems, by the way. Also it isn't particularly readable... It's sort of a random jumble of sentences and phrases which are chosen by the program as needed.)
In deciding how to begin, I knew that the screen text had to be compressed on the same disk along with the story text, so it would be a good place to start. I fired up the Commodore emulator (Sorry! It's easier for me to use the C64 monitor than AppleWin's debugger) and loaded up a disk image of Murder Party. When the program started printing the descriptive text on the screen, I broke into the monitor and stepped through the code, saving it to a file as I went.
After a few iterations I could see that there was bit isolation and bit-shifting going on, so I could tell that the text was likely compressed using a short bit pattern system. By using the Monitor output file I was able to follow the values printed and trace them backward to see where they originated. In this way I was able to home in on the actual bit manipulation routines, and figure out how the compression was decoded. Murder Party manages to cram nearly 85 pages of raw text (when printed as Courier 11 point) on one side of an 80s floppy disk with plenty of room to spare!
Murder Party uses five bits to represent each character, which allows values from 0 to 31. Codes 6-31 cover the 26 letters of the alphabet in lowercase. Murder Party uses code 3 to turn capitalization on and off, so all the uppercase letters can use the same codes they used when they were little. Code 5 is used for a “space” character, and code 4 represents a space followed by a single uppercase letter. Code 2 represents shifting to an alternate character set, which means another 31 characters to play with! These alternate codes give Murder Party all the punctuation and numbers it needs.
The remaining two codes are 0 — which is the standard “end of line” indicator for the end of a block of text, such as a sentence, paragraph or short phrase depending on the program’s needs — and 1, a very special code which represents to Murder Party that it must make a choice between multiple pieces of text, or customize the text with a player’s name or a characteristic about them.
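To make the code assignments above concrete, here is a minimal Python sketch of a decoder for a list of 5-bit codes. It is not the program from the zip archive, and it rests on a few assumptions: that code 2 shifts only the single code that follows it, that code 1 can be shown as a plain marker (its real behavior is covered in the write-up), and that the alternate character set is supplied by the caller, since the table is not listed in this post. The names decode_codes and alt_table are made up for illustration.

```python
def decode_codes(codes, alt_table=None):
    """Turn a sequence of 5-bit codes (0-31) into text.

    alt_table is a hypothetical {code: character} mapping for the
    alternate set; the real table is not given in this post.
    """
    alt_table = alt_table or {}
    out = []
    caps = False      # toggled on and off by code 3
    cap_next = False  # set by code 4: capitalize one letter only
    alt_next = False  # set by code 2 (assumed to cover one code only)

    for code in codes:
        if alt_next:
            # Look the code up in the alternate set; "?" if unknown.
            out.append(alt_table.get(code, "?"))
            alt_next = False
        elif code == 0:
            break                      # end of this block of text
        elif code == 1:
            out.append("<1>")          # placeholder for the special code
        elif code == 2:
            alt_next = True
        elif code == 3:
            caps = not caps
        elif code == 4:
            out.append(" ")
            cap_next = True
        elif code == 5:
            out.append(" ")
        else:
            # Codes 6-31 map to 'a'-'z'.
            letter = chr(ord("a") + code - 6)
            if caps or cap_next:
                letter = letter.upper()
            cap_next = False
            out.append(letter)
    return "".join(out)


if __name__ == "__main__":
    # h=13, i=14, then code 4 (space + capital), t=25, h=13, e=10, r=23, e=10
    print(decode_codes([13, 14, 4, 25, 13, 10, 23, 10, 0]))
```

The letter mapping and the roles of codes 0 and 3-5 come straight from the description above; everything about codes 1 and 2 in this sketch is a simplification.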
Murder Party packs eight characters as 5-bit codes into five 8-bit bytes:
11111222 22333334 44445555 56666677 77788888
saving 3 whole bytes! It may not seem like much, but after the game’s roughly 200,000 characters are condensed, it really adds up!
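The packing shown in the diagram can be sketched in a few lines of Python. This is only an illustration, not the 6502 routine from the game: it assumes the codes are laid out most-significant-bit first, exactly as the diagram reads from left to right, and the function names pack_5bit and unpack_5bit are my own.

```python
def pack_5bit(codes):
    """Pack 5-bit codes into bytes, most significant bit first."""
    acc = 0      # bit accumulator
    nbits = 0    # number of bits currently held in acc
    out = bytearray()
    for code in codes:
        acc = (acc << 5) | (code & 0x1F)
        nbits += 5
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1        # keep only the leftover bits
    if nbits:
        # Pad the final partial byte with zero bits on the right.
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)


def unpack_5bit(data):
    """Split bytes back into 5-bit codes, most significant bit first."""
    acc = 0
    nbits = 0
    codes = []
    for byte in data:
        acc = (acc << 8) | byte
        nbits += 8
        while nbits >= 5:
            nbits -= 5
            codes.append((acc >> nbits) & 0x1F)
        acc &= (1 << nbits) - 1
    return codes


if __name__ == "__main__":
    packed = pack_5bit([1, 2, 3, 4, 5, 6, 7, 8])
    print(len(packed), packed.hex())
    print(unpack_5bit(packed))
```

Eight codes go in and five bytes come out, which matches the 8-into-5 layout in the diagram. If the on-disk bit order turns out to differ, only the shifts would need to change.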
In the zip archive above, I explain how the text is decoded in ASM, provide an overview of the special codes used to insert personalized text into each game, and present my observations of some interesting aspects and potential bugs in the text data. I realize that this will only be of interest to a select few, but I've always wanted to know what was on that disk, and I may as well share my findings!
Nice work. So I take it you were able to get the Apple version of the game to run?
Thanks! I haven't gotten it printing yet, but I didn't need that part working to decode the text. I just needed the data disk (side B), which iatkos posted. My eBay disk arrives on Monday and I should be able to look at it sometime midweek. Hopefully I'll update the other thread next weekend.
Thanks for the write-up! You've certainly invested quite a bit of time into analyzing and decoding all this.
My only suggestion for improvement would be: why don't you upload the ZIP file with your PDF and disassembly to this site, using the "Media browser" to attach it directly to a forum post? These external links tend to break someday, when mediafire changes owner or is shut down again. In 10 or 20 years or so, it will be lost. And then someone will find your forum message, but the PDF with your detailed write-up will be gone. This site operates on a 100-Year Plan. So, uploading it here may make it last a bit longer...
That was a very cool read. Nice one. Love the 5 bit data.
Amazing all the stuff they did back then to fit something on one [less] disk. Great dedication too, bravo on your efforts so far.
It reminds me of a story from when my friend took over an engineering spot at a local dial-up ISP. He was looking through the database structure and, as I remember it, the previous developer was storing the customer IPv4 addresses as four hex bytes in the database instead of (up to) 12 decimal digits with periods. Back then it must have actually saved enough disk space to be worth it.
Thanks everyone for the kind words of enthusiasm! It was fun to figure this out, and I'm glad to share it.
Done! Perhaps a moderator could edit my original post to replace the mediafire link?
Decoding a Murder.zip