Hooking Golden Axe VBlanks

If you’re a 90’s kid you’ve likely played Golden Axe either at the arcade or on the SMD. You know what it’s like to have to attempt to kick the creepy little elusive elves for magic pots. If you haven’t played Golden Axe in a while (or at all), see below.

Golden Axe Theives

In this blog post, I’m going to demonstrate the methods that I used to pull off a relatively simple ROM hack to equip Ax Battler with unlimited magic pots. I also am releasing a Binary Ninja script that I wrote to enumerate PC-relative branch table patterns that were not picked up by auto-analysis.

Initial Analysis

For this hack I chose to use Binary Ninja. I started by cloning a fork of a third-party 68k architecture plugin for BN, found here. This fork has fixes for updates made to the API, improvements from the original, and true support for native 68000 (described later). After dropping the ROM in BN, I manually created the interrupt vector table and created functions at program start, the vblank handler, and hblank handler. My goal was to find the logic responsible for maintaining Ax Battler’s magic counter in hopes that I could make a simple patch to max it out and prevent it from decrementing (unlimited magic). After kicking off initial analysis, I realized BN was successful at finding only a small amount of code (Ghidra and IDA yielded even worse results). After some inspection I found that PC-relative jump and call tables are used all over. These tables are sort of unconventional in that the entries contain instructions instead of addresses. They are setup as follows:

0xd50    jsr $10(pc, d0.w)
...
0xd60    rte
0xd62    bra ($00000db6)
0xd66    bra ($00004196)
...

The instruction at 0xd50 is a PC-relative jump to a location in the branch table beginning at 0xd62 (determined by the index value in d0). Likely because the value of d0 can’t be reliably determined, BN, Ghidra, and IDA all terminate disassembly at the return instruction at 0xd60 (leaving 0xd62 and on as undefined). Unwinding these tables manually would have been a pain, so I wrote a script. The script works by operating on BN’s medium level IL (MLIL) representation of the code. I started by writing a function to extract the base addresses of branch tables (depicted below).

def find_call_tables(self):
    base_addrs = []
    for func in self.bv:
        for block in func.medium_level_il.ssa_form:
            for instr in block:
                # If it's a call or jump and the operand contains addition
                branch_ops = [
                    MediumLevelILOperation.MLIL_CALL_UNTYPED_SSA,
                    MediumLevelILOperation.MLIL_JUMP,
                    MediumLevelILOperation.MLIL_GOTO
                ]
                if instr.operation in branch_ops:
                    if type(instr.dest) == long:
                        continue

                    mlil_add = MediumLevelILOperation.MLIL_ADD
                    mlil_const = MediumLevelILOperation.MLIL_CONST
                    if instr.dest.operation == mlil_add:
                        if instr.dest.operands[0].operation == mlil_const:
                            base_addrs.append(instr.dest.operands[0].constant)
    return base_addrs

This function iterates all MLIL instructions in search of instructions containing a branch operation (MLIL_JUMP MLIL_GOTO, or MLIL_CALL_UNTYPED_SSA). When it locates a branch operation, it checks if there is an addition operation taking place to compute the destination. If so, it takes the value of the first operand in the addition operation, which is the address of the base of the table. The function returns a list of branch table base addresses. Next, I wrote a function to iterate the tables, check for bra instructions (depicted below).

def disas_call_tables(self, base_addrs):
    count = 0
    for addr in base_addrs:
        i = addr
        while True:
            self.br.seek(i)
            opcode = self.br.read32be()
            if (opcode >> 24) == 0x60:
                if not self.bv.get_function_at(i):
                    self.bv.add_function(i)
                    count += 1
            else:
                break
            i += 4

    return count

This function is not as flashy, but it gets the job done. It simply iterates the entries in the branch tables (reading a dword at a time). It checks if the top byte of the dword is 0x60, the first byte of a bra instruction. Any entry that contains 0x60 is disassembled, if it hasn’t been disassembled already. Once a dword is hit that isn’t a bra instruction, the next table is processed. As a result, the call target and any other routines in the code path are also disassembled. After running this script a few times, it located and disassembled over 500 bra instructions (and callee’s). This obviously lead to a lot more code that was previously unknown about.

Locating the Magic Count Offset

After reversing the routine handling logic statically, I decided to take snapshots of RAM from Gens (the emulator) before and after using magic to see if I could derive the offset of the magic counter in RAM from a binary diff. After looking through the diff, I found that offset 0xfe80 changed from 0x0101, to 0x0000 after using my single magic pot at the beginning of the first level.

RAM Binary Diff

This seemed like a good candidate. I searched for references to 0xfffe80 (RAM begins at 0xff0000), and I immediately found a move.w #$100,($fffe80).w instruction. I suspected that this instruction might be the intial magic pot value that is set at the beginning of each level. Using BN, I ,changed the instruction in the hex view to move.w #$100, $(fffe80).w, exported the binary, and reset the emulator. Sure enough, Ax started the level with 6 magic pots. This was a step in the right direction, but I wanted to always have 6 magic pots. There were a few other references to 0xfffe80 that I likely could have patched/nopped to acheive my goal. Instead, I chose to place a hook in the vertical blank interrupt handler routine to branch to unused space at the end of the ROM. This method would provide the ability to execute my own code on every vblank interrupt. To add this hook, I searched for a good 6 byte instruction that I could overwrite with a jsr instruction. I like using jsr instructions (over bsr) because they are easy to apply in hex editors since the absolute address of the destination is contained in the bottom dword of the instruction. I found a 6-byte btst #1, ($ffc183).w instruction at 0xd36 that seemed like a good candidate to overwrite for my hook. I overwrote the instruction with a branch to an offset at the end of the ROM.

Trampoline to Hook Handler

After overwriting the btst instruction, I patched in the handler routine to execute the instruction that I overwrote, write 0x600 to the magic pot counter, and return back to the vblank code.

Hook Handler

At this point, I had one last step. Using Binary Ninja’s quick patch functionality, I edited the checksum routine to always branch to the code path for the good checksum (if you want to correct the checksum, bn-genesis contains functionality to do so).

Binary Ninja Quick Patch to Skip Checksum

Shoutouts and Additional Finds

While I was screwing around with Golden Axe, Jason Wright was hard at work improving Alex Forencich’s 68000 Binary Ninja plugin so he could do real work. Jason was working with binaries compiled for a later generation 68000 variant processor that consists of a 32-bit address bus. I explained issues with the plugin that I discovered with incorrect sign extension in move instructions on vanilla 68000, which consists of a 24-bit address bus. Jason took time to modify the plugin to correct the issue (with a minor contribution from myself). Many thanks to both Alex and Jason!

Interestingly, Ghidra made the same mistake. Incorrect disassembly of three consecutive move instructions are depicted below.

Ghidra Incorrect Disassembly

While this may be correct for 68040, I selected 68000 for the processor. These instructions should disassemble to:

00001a52 11  d8  fe  83    move.b     (A0)+,(DAT_fffe83 ).w
00001a56 11  d8  fe  80    move.b     (A0)+,(DAT_fffe80 ).w
00001a5a 31  fc  80        move.w     #-0x7ffe ,(DAT_ffc104 ).w
         02  c1  04

Conclusion

After exporting the patched binary from BN and throwing it into Gens, we now have unlimited magic to unleash on the Death Adder! I have added the call table enumeration functionality to bn-genesis. It can be found here. I also wrote a similar script for IDA, which can be found here. This script was not near as effective as the BN script, because it searches explicitly for jsr and jmp instructions, instead of operating on IL. I’m sure it could be made better by re-writing it to use IDA’s microcode. I’ll leave that as an excercise for the reader 🙂 As always, thanks for reading!

Unlimited Potion FTW!

SMD Hacking with Ghidra

It’s been two weeks since the National Security Agency publicly released GHIDRA, a powerful software reverse engineering framework. Despite that GHIDRA comes with Motorola 68000 and 6502 processor support, the NSA didn’t release any of their retro game hacking scripts leaving the community to wonder how they have been spending our tax dollars. Today, I’d like to share a couple GHIDRA scripts that I wrote along with this blog post to serve as an introduction to GHIDRA for the ROM hacking community.

Introduction to GHIDRA

GHIDRA is a powerful static software analysis tool similar to IDA Pro or Binary Ninja, tools used primarily for analyzing compiled code. It supports multiple architectures to include Motorola 68000, the processor used by the SEGA Genesis console. GHIDRA takes a IL approach by transforming native code into P-code, the tools intermediate representation. Much of the analysis is done on the IL, allowing the majority of its core features (to include the decompiler and integrated assembler) to work across the 20+ supported architectures. There are way to many great features to cover in a blog post, and I’m still learning myself. If you haven’t already, I encourage you to start with ghidra.re to learn the basics.

Loading a SEGA Genesis ROM

To get started, create a project and import a SEGA Genesis ROM binary. You will be prompted to provide the language. Select 68000 (the default variant). This is all that is required to load the binary. You should be ready to use the Code Browser. To do so, drag the binary over top of the dragon in the tool chest. The ROM will open in the Code Browser, and you will be asked if you want to run initial analysis on the binary. Click no. Because you are loading a flat file, initial analysis will not work correctly without telling GHIDRA more information on the file format. You have two options, do everything manually or download my loader script into your ghidra_scripts folder and run it with the script manager. The loader script will setup the vector table and ROM header before kicking off initial disassembly starting at the functions with entries in the vector table.

Patching the ROM

Now that the ROM has been loaded and initial analysis has been ran, we can analyze our ROM and modify instructions. In this blog post I’ll demonstrate a simple patch in Sonic The Hedgehog to jump to the end of the game. For clarity, I’ve labeled the MainGameLoop and the call targets for each game mode in the jump table. To branch to the end of the game let’s modify the branch instruction to the normal level game mode (GM_Level) to branch to GM_Ending, instead. To edit a instruction, simply click on the instruction and type Ctrl+Shift+G. Type a new M68K instruction and hit Enter. GHIDRA even has tab completion!

Checksum Fixup

Sonic, like most SEGA games, calculates the ROM checksum at runtime. Therefore, if you export the patched binary and try to run it in an emulator without fixing the checksum, it won’t run. Instead, you get a red screen. This leads me to my second GHIDRA script. A script that calculates and fixes the ROM checksum. Simply download the script into your ghidra_scripts directory and run it with the Script Manager. After calculating the new checksum, export the binary by hitting O and save it to a file. Then, load the patched ROM into your favorite emulator to be underwhelmed.