Hooking Golden Axe VBlanks

If you’re a 90’s kid you’ve likely played Golden Axe either at the arcade or on the SMD. You know what it’s like to have to attempt to kick the creepy little elusive elves for magic pots. If you haven’t played Golden Axe in a while (or at all), see below.

Golden Axe Theives

In this blog post, I’m going to demonstrate the methods that I used to pull off a relatively simple ROM hack to equip Ax Battler with unlimited magic pots. I also am releasing a Binary Ninja script that I wrote to enumerate PC-relative branch table patterns that were not picked up by auto-analysis.

Initial Analysis

For this hack I chose to use Binary Ninja. I started by cloning a fork of a third-party 68k architecture plugin for BN, found here. This fork has fixes for updates made to the API, improvements from the original, and true support for native 68000 (described later). After dropping the ROM in BN, I manually created the interrupt vector table and created functions at program start, the vblank handler, and hblank handler. My goal was to find the logic responsible for maintaining Ax Battler’s magic counter in hopes that I could make a simple patch to max it out and prevent it from decrementing (unlimited magic). After kicking off initial analysis, I realized BN was successful at finding only a small amount of code (Ghidra and IDA yielded even worse results). After some inspection I found that PC-relative jump and call tables are used all over. These tables are sort of unconventional in that the entries contain instructions instead of addresses. They are setup as follows:

0xd50    jsr $10(pc, d0.w)
...
0xd60    rte
0xd62    bra ($00000db6)
0xd66    bra ($00004196)
...

The instruction at 0xd50 is a PC-relative jump to a location in the branch table beginning at 0xd62 (determined by the index value in d0). Likely because the value of d0 can’t be reliably determined, BN, Ghidra, and IDA all terminate disassembly at the return instruction at 0xd60 (leaving 0xd62 and on as undefined). Unwinding these tables manually would have been a pain, so I wrote a script. The script works by operating on BN’s medium level IL (MLIL) representation of the code. I started by writing a function to extract the base addresses of branch tables (depicted below).

def find_call_tables(self):
    base_addrs = []
    for func in self.bv:
        for block in func.medium_level_il.ssa_form:
            for instr in block:
                # If it's a call or jump and the operand contains addition
                branch_ops = [
                    MediumLevelILOperation.MLIL_CALL_UNTYPED_SSA,
                    MediumLevelILOperation.MLIL_JUMP,
                    MediumLevelILOperation.MLIL_GOTO
                ]
                if instr.operation in branch_ops:
                    if type(instr.dest) == long:
                        continue

                    mlil_add = MediumLevelILOperation.MLIL_ADD
                    mlil_const = MediumLevelILOperation.MLIL_CONST
                    if instr.dest.operation == mlil_add:
                        if instr.dest.operands[0].operation == mlil_const:
                            base_addrs.append(instr.dest.operands[0].constant)
    return base_addrs

This function iterates all MLIL instructions in search of instructions containing a branch operation (MLIL_JUMP MLIL_GOTO, or MLIL_CALL_UNTYPED_SSA). When it locates a branch operation, it checks if there is an addition operation taking place to compute the destination. If so, it takes the value of the first operand in the addition operation, which is the address of the base of the table. The function returns a list of branch table base addresses. Next, I wrote a function to iterate the tables, check for bra instructions (depicted below).

def disas_call_tables(self, base_addrs):
    count = 0
    for addr in base_addrs:
        i = addr
        while True:
            self.br.seek(i)
            opcode = self.br.read32be()
            if (opcode >> 24) == 0x60:
                if not self.bv.get_function_at(i):
                    self.bv.add_function(i)
                    count += 1
            else:
                break
            i += 4

    return count

This function is not as flashy, but it gets the job done. It simply iterates the entries in the branch tables (reading a dword at a time). It checks if the top byte of the dword is 0x60, the first byte of a bra instruction. Any entry that contains 0x60 is disassembled, if it hasn’t been disassembled already. Once a dword is hit that isn’t a bra instruction, the next table is processed. As a result, the call target and any other routines in the code path are also disassembled. After running this script a few times, it located and disassembled over 500 bra instructions (and callee’s). This obviously lead to a lot more code that was previously unknown about.

Locating the Magic Count Offset

After reversing the routine handling logic statically, I decided to take snapshots of RAM from Gens (the emulator) before and after using magic to see if I could derive the offset of the magic counter in RAM from a binary diff. After looking through the diff, I found that offset 0xfe80 changed from 0x0101, to 0x0000 after using my single magic pot at the beginning of the first level.

RAM Binary Diff

This seemed like a good candidate. I searched for references to 0xfffe80 (RAM begins at 0xff0000), and I immediately found a move.w #$100,($fffe80).w instruction. I suspected that this instruction might be the intial magic pot value that is set at the beginning of each level. Using BN, I ,changed the instruction in the hex view to move.w #$100, $(fffe80).w, exported the binary, and reset the emulator. Sure enough, Ax started the level with 6 magic pots. This was a step in the right direction, but I wanted to always have 6 magic pots. There were a few other references to 0xfffe80 that I likely could have patched/nopped to acheive my goal. Instead, I chose to place a hook in the vertical blank interrupt handler routine to branch to unused space at the end of the ROM. This method would provide the ability to execute my own code on every vblank interrupt. To add this hook, I searched for a good 6 byte instruction that I could overwrite with a jsr instruction. I like using jsr instructions (over bsr) because they are easy to apply in hex editors since the absolute address of the destination is contained in the bottom dword of the instruction. I found a 6-byte btst #1, ($ffc183).w instruction at 0xd36 that seemed like a good candidate to overwrite for my hook. I overwrote the instruction with a branch to an offset at the end of the ROM.

Trampoline to Hook Handler

After overwriting the btst instruction, I patched in the handler routine to execute the instruction that I overwrote, write 0x600 to the magic pot counter, and return back to the vblank code.

Hook Handler

At this point, I had one last step. Using Binary Ninja’s quick patch functionality, I edited the checksum routine to always branch to the code path for the good checksum (if you want to correct the checksum, bn-genesis contains functionality to do so).

Binary Ninja Quick Patch to Skip Checksum

Shoutouts and Additional Finds

While I was screwing around with Golden Axe, Jason Wright was hard at work improving Alex Forencich’s 68000 Binary Ninja plugin so he could do real work. Jason was working with binaries compiled for a later generation 68000 variant processor that consists of a 32-bit address bus. I explained issues with the plugin that I discovered with incorrect sign extension in move instructions on vanilla 68000, which consists of a 24-bit address bus. Jason took time to modify the plugin to correct the issue (with a minor contribution from myself). Many thanks to both Alex and Jason!

Interestingly, Ghidra made the same mistake. Incorrect disassembly of three consecutive move instructions are depicted below.

Ghidra Incorrect Disassembly

While this may be correct for 68040, I selected 68000 for the processor. These instructions should disassemble to:

00001a52 11  d8  fe  83    move.b     (A0)+,(DAT_fffe83 ).w
00001a56 11  d8  fe  80    move.b     (A0)+,(DAT_fffe80 ).w
00001a5a 31  fc  80        move.w     #-0x7ffe ,(DAT_ffc104 ).w
         02  c1  04

Conclusion

After exporting the patched binary from BN and throwing it into Gens, we now have unlimited magic to unleash on the Death Adder! I have added the call table enumeration functionality to bn-genesis. It can be found here. I also wrote a similar script for IDA, which can be found here. This script was not near as effective as the BN script, because it searches explicitly for jsr and jmp instructions, instead of operating on IL. I’m sure it could be made better by re-writing it to use IDA’s microcode. I’ll leave that as an excercise for the reader 🙂 As always, thanks for reading!

Unlimited Potion FTW!