Thursday, August 7, 2014

Hacking Clocktower - The First Fear



This one... this one should be fun - let's take a great Japanese survival-horror game, figure out how it works, make it English, and do some other stuff to it :)

Introduction
If you've never heard of ClockTower (or Clocktower - The First Fear as it's known in Japan), it was a game originally for the Super Nintendo by Human Entertainment.

It's one of those point and click adventure games (if you're old like me and remember those), but it's also one of the earlier games that would help create the survival horror genre. Basically, these 4 orphans end up at this mansion after the orphanage closes and they start disappearing; you take over the role of Jennifer in the group and have to find a way out, hopefully find your friends, and figure out what's going on in the mansion.

The atmosphere is gritty, things happen in the house for no apparent reason at random (it's pretty haunted; let's just say), sometimes things just come out and kill you, and all the while, this psychopath with a pair of gardening shears is chasing you around the house - stalking you (the nemesis thing from Resident Evil 3 was not some cute idea Capcom came up with):


The game is filled with gritty situations that involve thinking quickly on your feet to survive. Running and hiding are the only options, but saving what sanity Jennifer has left is also important, as she can trip over her own feet and die of a heart attack if things get too intense (a la Illbleed).

Anyway, this game was released in 1994 for the SNES, re-released with more content on the PSX in 1997, released on the Wonderswan (but nobody gives a damn about the Wonderswan), and finally released on the PC (and then later re-released on the PC in 1999 as a value edition).

Yes - there's an English translation of the SNES rom by Aeon Genesis (I think it was him, anyway), but the SNES version is kinda shitty compared to the PC version which has much better sounds, an FMV intro, etc. Probably the only version with a bit more content is the PSX version, but we'll get into that later.

First, let's focus on getting the game running without the CD!

Part 1 - The Landscape

Basically, we have a game directory that looks like this:

We have a BG directory containing what looks to be the room backgrounds in a carved up format along with an accompanying .MAP file for each one which I can only assume maps the various slices to the screen:


We have a DATA directory containing various PTN and PYX files... not sure what these are at the moment... probably mapping the level screens to logic or something.

We have a FACE directory containing bmps of the faces when people speak:

We have an item directory containing all the items Jennifer can use:


We have a PATTERN directory with ABM files. I believe these are used for sprite animations, but I could be wrong.

We have a SCE directory with two files: CT.ADO and CT.ADT... we'll get to these - basically, it's a logical script for the actions that happen in the entire game using a custom binary scripting language... this is going to be where we spend most of our time.

We have a SNDDATA directory with MIDI and WAV files for BGM and SFX

We have a VISUAL directory with all the HQ renders shown, menu options, etc in BMP format. This is also where the intro video is held.

Finally, we have a DATA.BIN file (I have no idea what it does at the moment) and ct32.exe, which is our game executable.

Running the game without the CD present will give us this:

Running it in a debugger to find this string, you'll find that it's basically saying it doesn't know where the game data is (ie CD is not inserted).

Part 2 - Let's NOCD this bitch
Looking in IDA, we already see RegQueryValueEx used - basically pulling from a registry key... I wonder what it's getting?

Well then... I guess that's the answer.

BUT WAIT.
Let's check first to see what calls this... maybe we can hardcode it:

It seems a bit more digging has paid off! The game looks for an arg called NOREG. If it doesn't find it, it checks the registry value; if it does find it, it skips the check and starts the game assuming the data is in the parent directory. A simple patch that makes the game think it's always being executed with NOREG as an arg, by changing the jz to 0xEB (an unconditional short JMP), should do the trick here.
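If you'd rather script the patch than poke at it in a hex editor, here's a minimal Python sketch. It assumes the branch is a short jz (opcode 0x74), which is what a one-byte swap to 0xEB implies. The `force_jump` helper and the toy buffer are made up for illustration; the real offset has to come from your own disassembly.

```python
JZ_SHORT = 0x74   # x86 "jz rel8"
JMP_SHORT = 0xEB  # x86 "jmp rel8"

def force_jump(image: bytearray, offset: int) -> None:
    """Turn the short jz at `offset` into an unconditional short jmp, in place."""
    if image[offset] != JZ_SHORT:
        # Refuse to patch if the byte is not what we expect (wrong offset or wrong build)
        raise ValueError(f"expected 0x74 at {offset:#x}, found {image[offset]:#04x}")
    image[offset] = JMP_SHORT

# Toy buffer standing in for ct32.exe: test eax,eax / jz +5 / nop
demo = bytearray(b"\x85\xc0\x74\x05\x90")
force_jump(demo, 2)
print(demo.hex())  # 85c0eb0590
```

Checking the original byte first is cheap insurance: if the offset is wrong, you get an error instead of a silently broken executable.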

Running it now produces:


Oh YES :D

Part 3 - Finding the in-game text 
Stepping back from a logical standpoint, should we expect all the text to be in the executable? Well, it's possible - many games do it. There are even some strings in the binaries (like the NOCD error we found) that are static, but I have a feeling they're going to be elsewhere (maybe in the DATA files?) - let's see:

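Japanese text in the SHIFT-JIS format (extremely popular) is mostly double-byte, and each double-byte character starts with a lead byte in the range 0x81-0x9F or 0xE0-0xEF (kana, for example, lead with 0x82 or 0x83)... simply looking for runs of bytes like this in the executable is a good place to start.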
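A rough way to automate that search, assuming the standard Shift-JIS double-byte ranges (lead byte 0x81-0x9F or 0xE0-0xEF, trail byte 0x40-0x7E or 0x80-0xFC). The function name and the `min_chars` threshold are my own choices for this sketch, and it will happily report random binary data that falls in range, so treat the output as candidates only.

```python
def is_lead(b: int) -> bool:
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xEF

def is_trail(b: int) -> bool:
    return 0x40 <= b <= 0x7E or 0x80 <= b <= 0xFC

def find_sjis_runs(blob: bytes, min_chars: int = 4):
    """Return (offset, text) for each run of at least min_chars double-byte characters."""
    runs = []
    i = 0
    while i + 1 < len(blob):
        start = i
        while i + 1 < len(blob) and is_lead(blob[i]) and is_trail(blob[i + 1]):
            i += 2
        if i == start:
            i += 1  # not a double-byte pair, slide forward one byte
            continue
        if (i - start) // 2 >= min_chars:
            # errors="replace" keeps going if a pair is in range but unassigned
            runs.append((start, blob[start:i].decode("shift_jis", errors="replace")))
    return runs

sample = b"\x00ABC" + "こんにちは".encode("shift_jis") + b"\x00\x00"
print(find_sjis_runs(sample))
```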

We then copy these strings out to a new file, open it in Notepad++, and paste it into Google Translate:
Well, these look like some errors, but no real game dialogue... time to look in other files!!!

Hrmm - well the PTN and PYX files don't really have anything text-related... let's check the SCE folder :)

Well, the CT.ADT file looks to have 4 byte offsets (they count up from 0x100 all the way to the end of the file, 4 bytes at a time).

The CT.ADO file, however...


Now we're getting somewhere... not only are there ASCII strings that look like file paths, but also SHIFT-JIS text strings. The data in this file is kind of odd... let's see if we can understand it, better. We're gonna have to if we need to parse it.

Part 4 - Diving into ADO/ADT
We've already looked a bit into how ADT works, let's take a look at what ADO looks like:

Basically, we have a magic (ADC Object File) across the top, then a bunch of 0xFFs to get the offset up to 256, then it looks like data begins. Time to think like a Computer Scientist!!!
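As a quick sanity check on that layout, here's a small sketch that verifies the magic and counts the 0xFF filler before offset 256. It only assumes what's described above (the text "ADC Object File" at the very start, data beginning at 0x100); the helper name and the fake file are hypothetical.

```python
MAGIC = b"ADC Object File"
DATA_START = 0x100  # 256: where the script data appears to begin

def check_ado_header(blob: bytes) -> int:
    """Raise if the magic is missing; return how many 0xFF pad bytes precede the data."""
    if not blob.startswith(MAGIC):
        raise ValueError("missing 'ADC Object File' magic")
    return blob[len(MAGIC):DATA_START].count(0xFF)

# Fake header: magic, one byte of unknown purpose, then 0xFF up to offset 0x100
fake = MAGIC + b"\x00" + b"\xff" * (DATA_START - len(MAGIC) - 1) + b"\x00\xff"
print(check_ado_header(fake))  # 240
```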

We have all these strings, a binary format with no apparent lookup besides potentially that ADT file, there has to be some kind of control code in here mixed in with the data to let the game know what it's looking at as it's being parsed... everything has a structure, somehow.

Looking at the strings with .BMP, we notice that they follow a similar pattern:

39FF followed by a null 2-byte value (most of the time, sometimes it's 0x0100 which makes me think this is a WORD value), then a null-terminated ASCII string and some padding... In fact, EVERY time a BMP is loaded for FACE, 0xFF39 as a value is at the front! 
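To check that pattern across the whole file, something like the sketch below works. It assumes exactly the layout described above: bytes 39 FF, a little-endian WORD, then a null-terminated ASCII path. The function name and the sample paths are invented for the demo.

```python
import struct

OP_BMP = b"\x39\xff"  # 0xFF39 as it sits in the file (little-endian)

def find_bmp_loads(blob: bytes):
    """Return (offset, word, path) for every 0xFF39 record that looks like a path."""
    found = []
    pos = blob.find(OP_BMP)
    while pos != -1:
        end = blob.find(b"\x00", pos + 4)  # terminator of the ASCII string
        if end != -1:
            (word,) = struct.unpack_from("<H", blob, pos + 2)
            raw = blob[pos + 4:end]
            # Only keep hits whose payload is printable ASCII
            if raw and all(0x20 <= c < 0x7F for c in raw):
                found.append((pos, word, raw.decode("ascii")))
        pos = blob.find(OP_BMP, pos + 2)
    return found

# Two made-up records; the file names are placeholders, not real game assets
demo = (b"\xff\xff"
        + b"\x39\xff\x00\x00" + b"FACE\\AAA.BMP\x00\x00"
        + b"\x39\xff\x01\x00" + b"B.BMP\x00")
for row in find_bmp_loads(demo):
    print(row)
```

The printable-ASCII filter matters because the byte pair 39 FF can also turn up by chance inside other data.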

Let's check the executable:

Nice! Not only do we have this value, but we have other values as well - let's check a few Shift-JIS strings to see if we can find a pattern:

Awesome! All strings start with 0xFF33, have two 16-bit values (0x0F00), and then a Shift-JIS string.

Note: One thing you'll notice about SHIFT-JIS is that it is NOT null terminated; it can't be. You see, certain programs can operate on a mix of both single and multi-byte character values, but older ones had a more difficult time with this. As a result, you'll notice that all linebreaks (0x0a) have an 0x00 after them. In fact, ANY ASCII character in this has an 0x00 after it (and numbers or english letters). This is a way that the text rendering can support multibyte interpretation with ASCII characters (0-127) and not accidentally fuck up and read a data byte as a control byte and vice-versa.
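Whatever the reason for it, the padding rule is easy to express in code. This is a sketch of the observed byte layout only (every character occupies two bytes: either a Shift-JIS pair or a single byte followed by 0x00); the function names are mine, not the game's.

```python
def encode_game_text(text: str) -> bytes:
    """Encode text so every character occupies exactly two bytes."""
    out = bytearray()
    for ch in text:
        raw = ch.encode("shift_jis")
        # Double-byte characters go in as-is; single-byte ones get a 0x00 after them
        out += raw if len(raw) == 2 else raw + b"\x00"
    return bytes(out)

def decode_game_text(raw: bytes) -> str:
    """Inverse of encode_game_text: walk the data two bytes at a time."""
    chars = []
    for i in range(0, len(raw) - 1, 2):
        unit = raw[i:i + 2]
        chars.append((unit[:1] if unit[1] == 0 else unit).decode("shift_jis"))
    return "".join(chars)

print(encode_game_text("Hi\n").hex())  # 480069000a00
```

A trail byte in Shift-JIS is never 0x00, so checking the second byte is enough to tell the two cases apart.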

As a result, we can conclude that some logic in the game must parse the string until it finds the end somehow (probably by looking for a new opcode, normally 0x28FF).

So now we know that an ADO file is basically a shit-ton of scripting 'opcodes' and subsequent data. We can theorize that the game reads this and knows what data to expect based on the preceding opcode. Now, we can look at the switch statement in the executable that we found earlier (with all the cases) to mark down every opcode the game supports (to better understand the format)

We do this by noting the values (0xFF20, 0xFF87) and looking at them in the ADO file, determining if they have the same number of data bytes before the next opcode, attempt to figure out if they're 2 byte values, strings, etc, and so on.
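A quick histogram helps decide which opcodes to look at first. This sketch assumes the script data is made of 16-bit little-endian units starting at 0x100 and that any word whose high byte is 0xFF is an opcode; both are guesses based on the observations so far, and filler such as 0xFFFF would be counted too.

```python
import struct
from collections import Counter

def opcode_histogram(blob: bytes, start: int = 0x100) -> Counter:
    """Count every 16-bit word of the form 0xFFxx from `start` onward."""
    counts = Counter()
    for off in range(start, len(blob) - 1, 2):
        (word,) = struct.unpack_from("<H", blob, off)
        if word >> 8 == 0xFF:
            counts[word] += 1
    return counts

# Tiny fake script: FF33, a parameter, one Shift-JIS char, FF28, FF33
fake = bytes(0x100) + b"\x33\xff\x0f\x00\x82\xa0\x28\xff\x33\xff"
for op, n in opcode_histogram(fake).most_common():
    print(f"{op:#06x}: {n}")
```

Shift-JIS text can't cause false hits here, since a trail byte never reaches 0xFF.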

In addition, you'll notice that the executable has some interesting text:


In fact, this is a list and looks suspiciously like opcode names - lucky for us, they ARE! We now know what the opcode names are :)
From this point, we can run the game with a debugger, breaking at the various switch statements to see operations of various opcodes - one that might be interesting to us is JMP...

In fact, the first JMP (0xFF22) has a 2 byte value of 0x17 after it.

If you watch this in the game, and set the ADO_offset as a watched variable in IDA, you'll see the game jump from this value to 0x1B32 - How did it know to do that? It's not a multiple... maybe :


AHA!

0x17 * 4 is 0x5C - the ADT is a jump table for various scenes... you'll notice that the CALL function (0xFF23) works in a similar way, but returns to this offset after a while... The first few ADT offsets all point to 0xFF00... it seems to be pretty prominent in the game... the jumps actually skip over them though (they add +2 to the offset after jumping)... are we seeing some kind of RETN opcode???? I think so ;)
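The lookup itself is tiny. Here's a sketch that treats the ADT as a flat array of little-endian 32-bit values (an assumption, but it matches "4 bytes at a time") and indexes it by scene number. The table contents below are fabricated just to show the indexing; the +2 skip over 0xFF00 is left out.

```python
import struct

def read_adt(blob: bytes):
    """Parse the whole ADT into a list of 32-bit entries."""
    count = len(blob) // 4
    return list(struct.unpack_from(f"<{count}I", blob))

def scene_entry(adt, scene: int) -> int:
    """The value a JMP/CALL with this scene number would look up."""
    return adt[scene]

# Made-up table: entry i holds 0x100 + 0x10 * i
fake = b"".join(struct.pack("<I", 0x100 + 0x10 * i) for i in range(0x20))
adt = read_adt(fake)
print(hex(0x17 * 4), hex(scene_entry(adt, 0x17)))  # 0x5c 0x270
```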

You'll notice, however, that the ADT file has various values at the end that are far beyond the size of the ADO file... what gives?! We'll get into that, but it will take a bit more introspection into the game running to determine how these jumps work (specifically watching the jumps).
Dumping the memory, one thing you'll notice is that the ADC object file in memory (CT.ADO) has an int16 value of 0x8000 written in every 0x8000 bytes (32 KB). Besides that, the ADO is unchanged. You can also see in the executable that a function parses the values and skips 2 bytes ahead if it sees this value (akin to a NOP).


As this game splits the data into 32 KB chunks (most likely to reference memory in a more segmented fashion; we're dealing with 2-byte values a ton, and this is important), there has to be some sort of "address translation" for the ADT (as the ADT uses 4 bytes to reference an offset).

So there's a bunch of math here that, if you watched the debugger, you'd see something akin to this:

I didn't find the function originally; I actually went to the end of the ADT file, assuming it held a pointer to the end of the ADO file (the last RETN being at offset 0x253F4). The ADT had this listed as 0x453F4... after looking at various others, I noticed that the translation took the two most significant bytes, halved them, and stuck them back on :)
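One way to write that translation down, in both directions. This is my reading of the numbers rather than the game's actual routine: I'm assuming the high 16 bits of an ADT value count 0x8000-byte chunks and the low 16 bits are the offset inside the chunk. It reproduces the 0x453F4 / 0x253F4 pair above, but that's the only data point it has been checked against here.

```python
CHUNK = 0x8000  # 32 KB

def adt_to_file_offset(value: int) -> int:
    """ADT entry -> flat offset into CT.ADO."""
    chunk, within = value >> 16, value & 0xFFFF
    return chunk * CHUNK + within

def file_offset_to_adt(offset: int) -> int:
    """Flat offset into CT.ADO -> ADT entry (the inverse of the above)."""
    chunk, within = divmod(offset, CHUNK)
    return (chunk << 16) | within

print(hex(adt_to_file_offset(0x453F4)))  # 0x253f4
print(hex(file_offset_to_adt(0x253F4)))  # 0x453f4
```

For offsets below 0x8000 the two forms are identical, which fits the early ADT entries looking like plain file offsets.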

Ok, so far, we can generate ADT files (generating is the opposite: the most significant bytes get multiplied back up, depending on how many intervals of 0x8000 we've passed), we also have a general breakdown of the opcodes, and we know where the strings are. Before we dive into disassembly, let's keep our eye on the prize and go for broke (translation first).


Part 4 (continued) - Enter the CTTDTI Suite
We know what format the strings are in and how to read them out of the ADO file... of course, injecting them back in will involve modifying the ADT offsets as string sizes are going to be much larger or smaller depending. Firstly, let's focus on ripping the ADO strings out into a text file ... something easily editable and that can be read back into another program to mess with them, even still, let's make a format where we can easily inject the strings into a new ADO file and update offsets easily... what do we need???

Well, the offset where the string starts is important, so is the size of the original string in bytes, then the string itself... sort of like

0xE92 25 blahblahblah

Enter cttd.py:
Basically, I renamed CT.ADO to CT_J.ADO for when I generate a new one.
This program reads in the ADO file, finds 0xFF33, skips 6 bytes ahead (to skip the opcode and 2-2byte values), and writes the starting offset of the string, the length of the string, and the string itself in a tab delimited format ending with a newline to a text file - simple :)

You'll notice that I replace any 0x0a value (newline) with [NEWLINE] - this is because I want the whole string to be processed on one line and be able to specify newlines where I want them without having to modify the format of the text file.
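For anyone following along without the script, here's a stand-in sketch of the same idea. It is not the original cttd.py. It assumes 16-bit aligned data starting at 0x100, a string that runs until the next 0xFFxx word, and the two-bytes-per-character text layout noted earlier; all function names are mine.

```python
import struct

OP_TEXT = 0xFF33  # text opcode; stored in the file as bytes 33 FF

def decode_padded(raw: bytes) -> str:
    """Decode 2-byte units: a Shift-JIS pair, or a single-byte char followed by 0x00."""
    out = []
    for i in range(0, len(raw) - 1, 2):
        unit = raw[i:i + 2]
        out.append((unit[:1] if unit[1] == 0 else unit).decode("shift_jis", errors="replace"))
    return "".join(out)

def extract_strings(blob: bytes, start: int = 0x100):
    """Return (offset, length_in_bytes, text) for each 0xFF33 string."""
    rows = []
    off = start
    while off + 6 <= len(blob):
        (word,) = struct.unpack_from("<H", blob, off)
        if word != OP_TEXT:
            off += 2
            continue
        begin = end = off + 6  # skip the opcode and the two 16-bit values
        while end + 1 < len(blob) and blob[end + 1] != 0xFF:
            end += 2           # stop at the next 0xFFxx word
        rows.append((begin, end - begin, decode_padded(blob[begin:end])))
        off = end
    return rows

def format_row(offset: int, length: int, text: str) -> str:
    """One tab-delimited line: hex offset, byte length, text with newlines escaped."""
    escaped = text.replace("\n", "[NEWLINE]")
    return f"0x{offset:X}\t{length}\t{escaped}\n"

fake = bytes(0x100) + b"\x33\xff\x0f\x00\x00\x00" + b"\x82\xa0\n\x00A\x00" + b"\x28\xff"
for row in extract_strings(fake):
    print(format_row(*row), end="")
```

Writing the rows to a file opened with `encoding="utf-8"` keeps the Japanese text readable in a normal editor.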

For fun, let's do something kinda silly - we're gonna parse this text file with translator; a python module that dumps data out to google translate and autodetects language, translates, and returns it in your desired language:
cttt.py:

Let's try a couple of strings with an injector now - the last program in this suite parses the text file, adds null-padding to any ASCII character in the strings, and reads the lines into a dict so we know what offsets are affected. It also rebuilds an ADO from scratch (it reads the ADT, loads all the "scenes" into an array with their offsets, and copies all the data between strings and afterward), and then regenerates an ADT based upon the sizes of the ADO "scenes" while constructing:
ctti.py:


And we test to see if it works:

Nice!
The Engrish in this is gonna be terrible though - thankfully, I found an rtf on some Clocktower Fan Forum ( ;) ) and can manually edit the strings based upon the rough translations.


It's all translated and ready to go! We run it:


GOD DAMMIT!!!

Ok - something's wrong; let's throw it into IDA and see what's up:

So it looks like it's trying to read the ADO file into memory, but it tries to resolve a pointer and can't because nothing exists at that spot!

Seeing the struct a1 and all the malloc'ing it does, this must be the issue - digging back further, you'll find that these pointers are made here:

So the game (based on the cmp against 5) will only make pointers for 5 * 0x8000 chunks of the ADO but will read the ADO data in until EOF (definitely a bug). As a result, we can only load an ADO file of max size 0x28000. Does that stop us!? FUCK NO! Let's dig into this SCE struct in memory a bit more...
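The arithmetic behind that limit, as a sketch you could bolt onto an injector to catch the problem before launching the game. The constant and function names are mine; the only inputs from the analysis are the 0x8000 chunk size and the 5 chunk pointers.

```python
CHUNK = 0x8000
STOCK_POINTERS = 5  # chunk pointers the unpatched executable sets up

def chunks_needed(size: int) -> int:
    """Number of 0x8000-byte chunks an ADO of `size` bytes occupies (ceiling division)."""
    return -(-size // CHUNK)

def fits(size: int, pointers: int = STOCK_POINTERS) -> bool:
    return chunks_needed(size) <= pointers

print(hex(STOCK_POINTERS * CHUNK))  # 0x28000
print(fits(0x28000), fits(0x28001), fits(0x28001, pointers=6))  # True False True
```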

We can change all instances of loading the ADO pointers in from 5 to 6 in order to add another pointer, but what's after that last pointer? Why, the start of the ADT offsets, of course :)

We see that ADT offset 0x00 is at struct_head + 0x2A and goes for 0x7D0... seriously? 0x7D0 pointers??? that's like 0x8000 wait a minute:D:D:D:D:D

As a result of only having 0x4800 bytes used for our ADT file, we can say that bumping the ADT start index down to, say, 0x2E would give us 4 more bytes to write another ADO pointer there and we'd still have a ton of reserved room to spare at the end!

Finding struct references to 0x2A and changing them to 0x2E as well and:

#AWWWWWYEAAAAAAA - Gotta love Object Oriented Reverse-Engineering :)

Ok - so now we have the game completely translated, now what?!

In fact - here are the ADO/ADT files I created (drop them in your SCE directory to play clocktower in English :)))
You'd also need to make binary changes with a hex editor to CT32.exe:
Future Work - Part 5 - SCEDASM - the SCE disassembler
The next logical step would be to disassemble all the other opcodes as well to build a text file that could eventually be read into a game/editor :)

Sort of like, well - this:
scedasm.py (WIP and extremely hacky... an exercise for the reader):


Then of course, we'd need to push this all back into an ADO/ADT pair:

sceasm.py:
**NOW PRINTING**

Future Work - Part 6 - The PSX Version
The PSX version of the game uses ADO/ADT as well!!! We could convert the assets and add the PSX exclusive content to the PC version, it would seem.

Until Next Time :)

36 comments:

  1. Nice job on the write up. I just want to point out something that caught my eye:

    >Note: One thing you'll notice about SHIFT-JIS is that it is NOT null terminated; it can't be. You see, certain programs can operate on a mix of both single and multi-byte character values, but older ones had a more difficult time with this. As a result, you'll notice that all linebreaks (0x0a) have an 0x00 after them. In fact, ANY ASCII character in this has an 0x00 after it (and numbers or english letters). This is a way that the text rendering can support multibyte interpretation with ASCII characters (0-127) and not accidentally fuck up and read a data byte as a control byte and vice-versa.

    Shift-JIS is just as null-terminatable as ASCII. Shift-JIS is compatible with ASCII so it also supports single byte characters by definition. The only reason for the \x0a\x00 thing or even with the ASCII text like \x41\x00, is because the programmers didn't properly implement Shift-JIS detection. It's easier these days because you can just use something like _ismbblead() or write your own detection to determine if the character is single or multibyte (it's a simple range check in most cases, Wikipedia has a good enough Shift-JIS table), but I don't know what kind of resources they had back in 1997 so I can't speak for them. But that's just what happens when the programmers make the assumption that only Japanese will ever be used with their engine. I've had to fix that kind of issue plenty of times (bad programmers still do it to this day!). :P

  2. Ahh - that makes a lot more sense, thanks! :)

  3. Very interesting! Now I've just got to get hold of a legit copy of this version. The link to your patch seems to be dead by the way.

    Replies
    1. Thanks! Fixed :)

      http://s000.tinyupload.com/?file_id=09008423864568776439

  4. You said you found an RTF on a Clock Tower fan forum? Was it Don't Cry Jennifer? Because I made the RTF.

    Replies
    1. Oh nice! Yeah - not too many forums for this game, haha. Thanks for making a decent starting point :)

    2. Sure! I'm glad the PC version is available in English now. Thanks for making this hack!

  5. Hi. I've used your tools to update your patch to fix formatting problems and technical issues, added translations for the credits and the letter scene, as well as misc script changes. If it is ok with you, I intend to upload it to www.romhacking.net

    Both yourself and Flamzeron would be listed as two of the authors as you did all the hard stuff.

    Replies
    1. Absolutely man - collaboration is what it's all about :)

  6. Hey great job man, However I'm trying to wing it too but I just can't seem to get it to work, I have no coding experience so, I just wanted to ask what file did you put on ida, I found an uploaded version of the game online and I fiddled with it but I found almost everything in the screen shot. Also, what version of IDA did you use? I used the freeware 5.0 version. If you have a more thorough version though i'd love to hear about it or if you could possibly upload all the game files?

  7. Copied the two files and made the changes to the CT32.exe, but getting a File Error message. Did I miss something? o.O

  8. DK999 there's more to it than that, which is why i'm asking the owner =/ so far no reply, hope he comes back soon

  9. Damn, would be so sweet to play this on a PC with a mouse :(

  10. maybe you guys have a different version - not sure. Is there some version present in the EXE?

  11. My (somewhat self serving) advice is to use my patch if you're just interested in playing it: https://www.dropbox.com/s/0p6qsljqwwfvruo/CT%20Win95%20English%20Patch.zip?dl=0

    It translates some bits that weren't included in the script and fixes the text formatting problems. Install the game, copy all the files from the CD (or image) to the installation directory, extract the contents of the patch to that directory, and finally run patch.bat as administrator (i.e. right click it and select run as administrator). The game should then be in English.

  12. Just to add. AFAIK there is only one version of the game. There was an original and re-release but I believe their game files are identical. IIRC I successfully used Mario's patch on both a downloaded ISO and my own copy.

  13. Silanda by version you mean? Actually there are 2. One was a re-release. Which came out on 1998. But I meant the version of his IDA.

  14. I'm confused. Do you just want to play the game in English, because if that's all then you don't need to touch IDA pro. Mario's already done the work: use either the patch + hex edits that are provided in the blog post, or use my patch.

  15. That's what i'm saying, I can't fix it because it's vague for me. I have no coding experience, I found some of the stuff in the screenshots that's what i'm verifying what IDA he used or version rather. I don't know if your patch is trust worthy.

  16. Why are you trying to use IDA Pro? It's a disassembler, it's used for code analysis and debugging. Mario used it to understand what the game was doing, it's completely unnecessary in applying the patch. Copy all the files from the disc to the installation directory, download the patch files from the blog and overwrite the original files, and use a hex editor to make the listed alterations to ct32.exe

    Bear in mind though that the credits aren't translated, the "letter" scene isn't translated, and there will be problems with the text fitting into its box. That's why I used Mario's Python scripts to polish the patch. You don't have to trust my patch, run it through VirusTotal if you like. The only executable in it is xdelta3 which is used to patch the files. If you like you can replace that with a version you can download from the xdelta developer's site (just make sure you rename it to xdelta3.exe), and if you edit the patch.bat file you can see exactly what the patch is patching.

  17. Because I'm a nice guy, here's another version of my updated patch: https://www.dropbox.com/s/s7jz80hn5ffdnkb/CT-Latest.zip?dl=0

    All that is in there are the script files and two bitmaps, no executables, nothing that could harm your PC. Just copy the two directories in the zip into the directory you have Clock Tower installed to. Unlike my other patch I linked, you will still need to do the Hex edits to CT32.exe that are listed in the blog, however these are new versions with a few more script tweaks.

  18. THANK YOU SO DAMN MUCH!!
    Yeah it works, just forgot to copy over my CD Files to the CT directory now it works fine :)
    Thanks for the zips as well, saved 'em to my special place ^^
    Well....now I have to see how I get it to Fullscreen...

    Replies
    1. I'll save you some trouble: you can't. There's no full screen mode. The best you can do is to set the compatibility option to run at 640x480, then it's fullscreen except for the window border. Also a tip: if you are running on a laptop, make sure your power plan is set to power saver. For some reason Clock Tower maxes out one CPU core even though it doesn't need to. Running at high performance will drain battery, produce extra heat, and possibly increase fan noise while the game's running. The game will run the same on a modern system regardless of the power plan used.

    2. This comment has been removed by the author.

    3. Hmm, the intro screen and the post in game text is in cantonese though? I used the auto patch thing from Silanda's OP. I see what everyone's pointing out, the original PC version file I found was in the isozone I can't guarantee it's clean i'm using a diff laptop so i'm ok. Just a heads up. Could it be possible I got a diff file set similar to what dk999 pointed out?

    4. Ah, isozone has two versions, listed as part 1 and 2. Whichever version you downloaded, download the other one. I have a feeling that the one marked (J) is the Chinese one, not Japanese. The Japanese version's menus are in English.

  19. My patch almost certainly won't patch the Chinese version, but because I used the @echo off command in patch.bat you won't see the error. To check if that is the case, edit patch.bat to remove the @echo off, open a command prompt, navigate to the install directory and run patch.bat using the command prompt. That way the window won't close when it's finished and you'll be able to see if xdelta produces any errors.

  20. Ah that explains it, thanks i'll do some fiddling. Thanks

  21. Perfect, got it to work, I used the part 2 of the file in iso zone and what happened apparently was that even with running as admin it wouldn't patch so I went on to use cmd prompt to run it. Works fine so far. Much thanks and love to you guys =D

  22. I found a way to play in full screen, by no means perfect, but it's better than the windows menu putting you off the experience...
    1 - Download AutoHotKey (http://www.autohotkey.com/).
    2 - Download this script (To be clear, I have not developed any of this, just picked up loose parts on the various internet tutorials):
    https://mega.co.nz/#!uNg3VTAQ!VLVHoIilEvDIt3XXG9x7tPPxfBtLk9iwVHAyz318XBM
    - This script sets up two hotkeys, one being Ctrl+Alt+F that hides the borders and the title of the window and Ctrl+Alt+G to hide/show the options menu (But you can change to whatever you want).
    3 - Run the script as administrator and run the game, test the hotkeys to see if they work and then you’re set, and then create a batch file if you do not want to open the script every time you wanna play the game..
    I hope it helped, and thanks to the creators of this wonderful patch,
    Sorry for any bad english,.

    Replies
    1. I Forgot to tell to run the game in 640x480 resolution mode.

  23. Sorry, I've been away on real life work lately, but I have been reading these - it's nice to see so many people jumping on improving stuff :)

    I did actually try to hack up the binary to implement fullscreen at one point. The engine is capable, but there are a TON of preset sizes for textures being rendered.

    One of the biggest hurdles with that (besides the fact that most of the blitted texture sizes are written into the ADO file) is that the binary is set to render things certain sizes. While it's definitely possible to patch all those values, it's going to take a while in GDI mode (we would've had some tricks if this damn game rendered in OpenGL).

  24. That sounds like a taxing problem, i'd love to help but i'm clueless about technical stuff, still, it got me curious enough to try everything here :D

  25. Uh, why was the link to it removed? =/

  26. Which link? Mario's, mine, and LEONOX's all seem to be working.
