Apple II ROM Disassembly

The Apple II ROM can be divided into three distinct sections: peripheral card ROMs, BASIC ROMs, and system monitor ROMs.

Disassemblies that cover a specific area are listed in the sections below.

James Davis created a detailed disassembly of the Apple ][+ ROM that covers the full span, from $C000 to $FFFF. See the HTML listing or download the project file .ZIP.


Peripheral Card ROMs

Booting a 5.25" floppy on an Apple II begins when the firmware in the disk controller card gets control. It has to spin up the floppy drive, seek the head to track 0, watch bytes go by until the start of sector 0 is found, read 342 bytes of raw data, decode the (essentially) base64 encoding to get 256 bytes of actual data, verify the data checksum, and execute the code.

And it has to do all that in 256 bytes of 6502 code.
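The 342 figure follows from the encoding: each raw disk byte carries 6 bits of payload, so 256 bytes (2048 bits) need 342 of them. Here is a minimal sketch of that arithmetic, with the size of a standard base64 encoding shown for comparison. The helper name is made up for illustration.

```python
import base64
import math

SECTOR_BYTES = 256      # decoded sector size, from the text above
BITS_PER_DISK_BYTE = 6  # payload bits per raw disk byte, as in base64


def raw_bytes_needed(data_bytes: int, bits_per_symbol: int) -> int:
    """Hypothetical helper: symbols needed to carry data_bytes of payload."""
    return math.ceil(data_bytes * 8 / bits_per_symbol)


raw = raw_bytes_needed(SECTOR_BYTES, BITS_PER_DISK_BYTE)
print(raw)  # 342

# Standard base64 pads its output to a multiple of 4 characters,
# so the same 256 bytes come out slightly larger.
b64_len = len(base64.b64encode(bytes(SECTOR_BYTES)))
print(b64_len)  # 344
```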

There's a longer explanation of the BOOT0 process here.


BASIC ROMs

If you want to understand how Applesoft works, the most detailed information available is in Bob Sander-Cederlof's disassembly of the Apple ][+ ROM code. This is currently available on his web site as S-C Documentor: Applesoft.

The disassembly is presented as source code for the S-C Assembler, which runs on the Apple II. Due to the constraints of 8-bit computers, the sources are split into 26 separate files, which is somewhat inconvenient on a modern system.

As an exercise, I converted the entire disassembly to a SourceGen project. This allows perusal of the entire program as a single entity, and provides full cross-reference data. It does have a few drawbacks, notably that SourceGen's limitations on operand expressions prevent it from fully expressing the equations in some places.

If you're interested in digital archaeology, the source code for the original Microsoft BASIC has been posted online with some very interesting commentary.


System Monitor ROMs

The Apple II Reference Manual includes the source code for the original monitor ROM, starting on page 155, and the autostart monitor ROM, starting on page 136. The former shipped in the original Apple ][, the latter in the Apple ][+.

As an exercise, I loaded the ROM images into SourceGen and reproduced the contents. This is a fairly faithful rendition, and provides little in the way of additional commentary or improved formatting. It does, however, make it possible to search, and you can use SourceGen's cross-reference features to see how things connect.

An excellent source of information on the Apple II monitor is the book "Apple II Monitors Peeled", published by Apple Computer in 1981.

The Oft-Misunderstood WAIT

The explanation of how long the WAIT routine at $FCA8 takes to run is incorrect in multiple sources. For example, the original monitor ROM listing says:

fcaa: e9 01        WAIT3       sbc     #$01            ;1.0204 usec
fcac: d0 fc                    bne     WAIT3           ;(13+2712*A+512*A*A)

Neither comment is correct. The official Apple documentation, Apple II Monitors Peeled, says:

  2.5A**2 + 13.5A + 13 machine cycles of 1.023 microseconds

William F. Luebbert's What's Where in the Apple says:

  wait estimated at 2.5A^2+13.5A+13 wait cycles of 1.02 microseconds

These are both multiplying the cycle count by the CPU's clock speed (in cycles per second) when they should be using the cycle time (in seconds per cycle). A 2MHz machine would run the code in half the time, not take twice as long.
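To make the distinction concrete, here is a small sketch that converts a cycle count to elapsed time by dividing by the clock frequency. The 1.023 figure is the one quoted above, treated as MHz. The function name and the 1000-cycle example are arbitrary.

```python
def elapsed_usec(cycles: int, clock_mhz: float) -> float:
    """Elapsed microseconds: cycle count divided by frequency in MHz."""
    return cycles / clock_mhz


cycles = 1000  # arbitrary example count
t1 = elapsed_usec(cycles, 1.023)      # nominal figure from the quotes
t2 = elapsed_usec(cycles, 2 * 1.023)  # a machine clocked twice as fast

print(f"{t1:.1f} usec at 1.023 MHz")
print(f"{t2:.1f} usec at 2.046 MHz")

# The mistaken form multiplies instead, so the "time" would grow
# as the clock gets faster.
wrong = cycles * 1.023
```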

So what's the correct answer? Let's start by confirming the cycle count. The code is:

fca8: 38           WAIT     sec                ;2
fca9: 48           WAIT2    pha                ;3
fcaa: e9 01        WAIT3    sbc     #$01       ;2
fcac: d0 fc                 bne     WAIT3      ;2+
fcae: 68                    pla                ;4
fcaf: e9 01                 sbc     #$01       ;2
fcb1: d0 f6                 bne     WAIT2      ;2+
fcb3: 60                    rts                ;6

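The inner SBC/BNE loop usually takes 5 cycles per iteration, because BNE takes 3 cycles when the branch is taken. The last iteration takes one fewer, because the branch falls through. The outer loop decrements A each time, so if initially A=4, the inner loop executes 4+3+2+1 times in total. With A*(A+1)/2 iterations at 5 cycles each, minus one cycle for each of the A times the loop exits, this takes A*(A+1)/2 * 5 - A cycles.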

The outer loop executes A times, and adds 12 cycles of its own per pass (PHA, PLA, SBC, and a taken BNE). Again, the last time through takes one fewer: A*12 - 1.

Outside of that, we have 8 cycles of non-loop stuff (SEC/RTS). If we want to add the JSR that called here that's another 6 cycles, but I prefer to put that in the caller's account instead (could've been a JMP at the end of a function rather than a JSR).

Putting it together yields A*(A+1)/2 * 5 - A + A*12 - 1 + 8. Applying algebra:

  (A*A/2 + A/2) * 5 + A*11 + 7
  A*A*5/2 + A*5/2 + A*11 + 7
  A*A*2.5 + A*13.5 + 7
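As a sanity check on the algebra, the per-instruction costs from the listing can be tallied pass by pass and compared with the closed form. This is a sketch that assumes 1 <= A <= 255 and does not model the carry flag. The function names are invented for this example.

```python
def wait_cycles_tally(a: int) -> int:
    """Cycle count for WAIT with 1 <= a <= 255, excluding the caller's JSR."""
    cycles = 2                        # SEC
    for n in range(a, 0, -1):         # outer pass entered with A == n
        cycles += 3                   # PHA
        cycles += 5 * n - 1           # n passes of SBC+BNE, last BNE not taken
        cycles += 4 + 2               # PLA, SBC
        cycles += 3 if n > 1 else 2   # BNE WAIT2, taken except on the final pass
    return cycles + 6                 # RTS


def wait_cycles_formula(a: int) -> int:
    """Closed form 2.5*A*A + 13.5*A + 7, kept in integer arithmetic."""
    return (5 * a * a + 27 * a) // 2 + 7


assert all(wait_cycles_tally(a) == wait_cycles_formula(a) for a in range(1, 256))
print(wait_cycles_formula(1), wait_cycles_formula(255))  # 23 166012
```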

Throw in the 6-cycle JSR and you get the formula from Apple II Monitors Peeled. So the cycle-count part of their formula is correct. What about the time per cycle?

In a comp.sys.apple2 post, awanderin notes:

The CPU has 64 clock periods of 14 * (1 / 14.318181 MHz) or 0.978µs and one stretched period of 16 * (1 / 14.318181 MHz) or 1.117µs, which gives an average clock period of 0.980µs. That works out to an average clock speed of 1.0205 MHz.

This gives a final result of:

  (A*A*2.5 + A*13.5 + 7) * 0.980 usec

This is about 4% less than the "official" estimate.
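The average period and the resulting delay can be worked out directly from the figures in the quoted post. The sketch below assumes those figures (64 normal cycles plus one stretched cycle, 14.318181 MHz master clock). The names are made up for illustration.

```python
MASTER_MHZ = 14.318181  # master oscillator frequency, from the quoted post

# 64 cycles of 14 master ticks plus one stretched cycle of 16 ticks.
avg_period_usec = (64 * 14 + 16) / 65 / MASTER_MHZ


def wait_usec(a: int, include_jsr: bool = False) -> float:
    """Estimated WAIT delay in microseconds for 1 <= a <= 255."""
    cycles = 2.5 * a * a + 13.5 * a + 7 + (6 if include_jsr else 0)
    return cycles * avg_period_usec


print(f"average period: {avg_period_usec:.4f} usec")
print(f"WAIT with A=255: {wait_usec(255) / 1000:.2f} ms")

# Ratio of the corrected period to the 1.023 multiplier in the older formula.
ratio = avg_period_usec / 1.023
print(f"ratio to the 1.023 multiplier: {ratio:.3f}")
```

For the same cycle count the ratio comes out near 0.958, which is where the "about 4%" difference comes from. For small values of A the 6-cycle JSR term matters more.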

Side note: calling WAIT with A set to zero is *almost* the same as A=256. The code does the subtraction before the zero test, so it doesn't exit immediately. However, the first subtraction clears the carry, which means the next subtraction will subtract 2 instead of 1. So the first two executions of the inner loop have one fewer iteration (the first one because of the inner-loop SBC, the second one because of the outer-loop SBC). So it's 10 cycles short.
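The A=0 case can be checked with a small emulation of the routine that tracks the carry flag the way the 6502's SBC does. This is a sketch covering only the instructions in WAIT, not a general 6502 emulator. The function names are invented for this example.

```python
def sbc1(acc: int, carry: int) -> tuple[int, int]:
    """SBC #$01: subtract 1 plus the borrow (the inverted carry)."""
    result = acc - 1 - (1 - carry)
    return result & 0xFF, 1 if result >= 0 else 0


def wait_cycles_emulated(a: int) -> int:
    """Cycles used by WAIT at $FCA8 for an 8-bit A, excluding the JSR."""
    a &= 0xFF
    cycles, carry = 2, 1              # SEC
    while True:
        saved = a
        cycles += 3                   # PHA
        while True:
            a, carry = sbc1(a, carry)
            cycles += 2               # SBC #$01
            if a == 0:
                cycles += 2           # BNE WAIT3 not taken
                break
            cycles += 3               # BNE WAIT3 taken
        a = saved
        cycles += 4                   # PLA (leaves carry alone)
        a, carry = sbc1(a, carry)
        cycles += 2                   # SBC #$01
        if a == 0:
            cycles += 2               # BNE WAIT2 not taken
            break
        cycles += 3                   # BNE WAIT2 taken
    return cycles + 6                 # RTS


def formula(n: int) -> int:
    """Closed form 2.5*n*n + 13.5*n + 7 in integer arithmetic."""
    return (5 * n * n + 27 * n) // 2 + 7


print(formula(256) - wait_cycles_emulated(0))  # 10
```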


Copyright 2019 by Andy McFadden
