Wednesday, 16 August 2017

IIGS is no slam-dunk. Coco3 is looking good though...

Slow going on the Apple IIGS front, but I've finally figured out my half-screen issue when shadowing is enabled. After banging my head against a brick wall for a few days, I decided to look at the MAME driver source for the IIGS and immediately found the issue. Because the legacy Apple II hires screen memories overlap the SHR screen memory, you also need to disable the respective bits for those in the SHADOW register. And voila - a blank screen!
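
For anyone hitting the same half-screen puzzle, here is a minimal C model of why inhibiting only SHR shadowing isn't enough. The bit assignments are my reading of the SHADOW register ($C035) documentation and should be treated as assumptions. The helper name is hypothetical, and the model ignores the text pages and the auxiliary hires inhibit bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* SHADOW register inhibit bits (assumed layout; a set bit DISABLES shadowing). */
#define SHADOW_INHIBIT_HIRES1 0x02  /* legacy hires page 1, $2000-$3FFF */
#define SHADOW_INHIBIT_HIRES2 0x04  /* legacy hires page 2, $4000-$5FFF */
#define SHADOW_INHIBIT_SHR    0x08  /* SHR buffer, $2000-$9FFF in bank $01 */

/* Hypothetical helper: would a write to bank $01 at `addr` be copied to $E1? */
bool bank1_write_is_shadowed(uint8_t shadow, uint16_t addr)
{
    bool in_shr    = addr >= 0x2000 && addr <= 0x9FFF;
    bool in_hires1 = addr >= 0x2000 && addr <= 0x3FFF;
    bool in_hires2 = addr >= 0x4000 && addr <= 0x5FFF;

    if (in_shr && !(shadow & SHADOW_INHIBIT_SHR))
        return true;
    if (in_hires1 && !(shadow & SHADOW_INHIBIT_HIRES1))
        return true;
    if (in_hires2 && !(shadow & SHADOW_INHIBIT_HIRES2))
        return true;
    return false;
}
```

With only the SHR bit set, writes to $2000-$5FFF are still copied across. That is 16KB of the 32KB SHR buffer, or roughly the top half of the screen at 160 bytes per line.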

Now I've hit another snag. I was intending to use so-called PEI-slamming to update the modified areas of the SHR screen, which requires that the direct page and stack are located in BANK1. However, I can't make any sense of the soft switches that control the bank for these and the current state doesn't make sense either. After putting out a call for help I've had a couple of responses but I'm still none the wiser.

Somewhat discouragingly, I did a quick experiment double-rendering the frame (one with shadowing disabled, the other enabled) and it's running a bit slower than the arcade game. Granted PEI-slamming will help, but it's going to be tight!

Whilst I'm struggling with IIGS technical issues I got impatient and started on the Coco3 (6809) port. Thus far I have a skeleton main loop that initialises the object table and adds the extra lives to the display list. I've coded up a skeleton rendering routine and fleshed out the CUR command and the tokenised command to display an extra life. It's far from optimal but it works!

First rendering on the Coco3

It's been mostly cut-and-paste from 6502 until this point. Again, the one thorn is the indirect indexed addressing mode of the 6502 but, unlike Lode Runner, it's not as extensively used in Asteroids. Other than that, there's endianness to be mindful of and I've actually swapped the byte order of words in the display list for more efficient rendering. I think this port is going to fall out relatively easily!
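
To illustrate the byte-order point: the 6502 stores words low byte first, whereas the 6809 expects the high byte first when loading a 16-bit register. The sketch below shows the swap as a post-pass over a buffer purely for illustration. The function name is hypothetical, and in the port itself the words are presumably just written in the new order to begin with.

```c
#include <stddef.h>
#include <stdint.h>

/* Swap the two bytes of every 16-bit word in place:
   little-endian (6502 order) <-> big-endian (6809 order).
   A trailing odd byte, if any, is left untouched. */
void swap_word_order(uint8_t *buf, size_t len)
{
    for (size_t i = 0; i + 1 < len; i += 2) {
        uint8_t first = buf[i];
        buf[i] = buf[i + 1];
        buf[i + 1] = first;
    }
}
```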

Saturday, 12 August 2017

Half the screen in shadow!?!

Not a lot of progress, or time to work on it.

I did, however, take preliminary steps towards eliminating flicker. The plan is to utilise video shadowing, as described in this article here, by one of the authors of the IIGS port of Wolfenstein 3D.

Currently, I've got shadowing permanently enabled and simply write to bank $01, which is shadowed on-the-fly to the SHR memory in bank $E1. Slow and flickering, but no trickery to implement and of course easier to debug. But now it's time to up my game.

So in preparation, I simply changed my initialisation routine to disable shadowing, rather than enable it. As expected, I now get a blank screen as writes to bank $01 are not copied to the SHR screen at all. So far so good.

Next step is to disable shadowing at the start of every frame render. Since I've never actually enabled it anywhere, I should still get a blank screen - right? However, when I run it, I now see the top half of the screen! And it happens in both MAME and GSPort, so it's not likely to be an emulation bug.

My first thought was Alternate Display Mode, which can be activated from the Control Panel. I can only access it in GSPort, but it's already turned off. Toggling the value regardless doesn't make any difference. And for good measure, I also disabled interrupts.

It has me completely stumped. I've fielded some queries out there but for now, no responses.

With a little more spare time tonight, but no path forward on the IIGS port, I returned to the C port. I fixed the object update and now the asteroids move as expected in attract mode. Next I added some inputs, so you can coin-up and start a game. The next bit will be tricky, rendering the player ship, as that's quite involved.

I don't really have to go through the exercise of emulating the DVG anymore; with my tokenised display list I can simply back-port that to the C version and it'll be sufficient for all platforms that it'll be running on. However I figured there wasn't too much more work to do, and it may help in understanding some nuance at some point (eg. exploding ship), so I'll persist for now.

Hopefully soon I'll learn of my stupid mistake and I can get back to finishing off the IIGS port...

Tuesday, 8 August 2017


I've been digging into Norbert's Atari 800XL Asteroids Emulator and trying to ascertain if/where offsets are used for various objects. To re-iterate; the position of an object is relative to the first point in the draw list of vectors that comprise the object. OTOH the position of a bitmap is (always) relative to one corner of the bounding rectangle. So in theory, displaying bitmaps in place of vector objects should require an offset for each object.

I had luck with the player ship and (all) shots. The ship appears to be a lot closer to its true location now, as evidenced by the much reduced incidence of asteroids merely passing close-by and destroying your ship. Shots are definitely much more accurate - if not perfect - as hitting the smaller asteroids requires you to, well, actually hit them!

In order to maintain the current performance, the offsets are coded directly into the compiled sprites, rather than simply offsetting the object's coordinates and thus requiring additional calculations. It did unfortunately necessitate one extra conditional branch in the ship rendering dispatcher because of an odd-valued X offset, but there was no avoiding it.

Characters don't appear to have any offsets applied, as far as I can see. As for asteroids & saucers - eek! There's a few tables with small values that could feasibly contain offsets (one, for example, is indexed by asteroid shape and size) but I'm not sure I'm going to be able to reverse-engineer exactly how it all works as it looks to my untrained eye that it starts to get into display lists. And those routines appear to be working directly on the vector hardware 1024x1024 coordinates as well.

To further add a spanner in the works, played against the Atari version there do still appear to be some inaccuracies between ship and shot and ship and asteroid, albeit subtle. They'll be fun to track down...

Anyway, I'm happy with the progress thus far, and to be honest, it's probably not going to be noticeable to any but the hard-core Asteroids experts out there... and they're not likely to be playing it on the IIGS I wouldn't think. I will look further into it, and perhaps it's a good excuse to crack open Norbert's C64 version which might be a little easier (for me) to follow.

However, I might actually move on now to the ship explosion and then flicker and sound before returning to this issue. I did notice tonight that with only a few small asteroids on the screen, the game is definitely running way too fast, so I'll look at throttling as well.

Discussions on the IIGS FB page tonight have me interested already in another project, though I would only consider it after I've done a proper feasibility study and I have collaborators. I have to admit that I even loaded up the ROM in IDAPro and had a quick look at the start-up code in my lunch break.

But at this stage it's still only a slight possibility and I still have other plans for Asteroids before I put it to bed.

Monday, 7 August 2017

5 lines of code to fix 3 issues!

With the wife wanting some company in the lounge room tonight while she did some crochet, I turned on the TV for some background noise (Jupiter Ascending - what an absolute FX overload) and fired up the laptop to get some of the easier of the remaining tasks out of the way.

To this end I defined the bitmaps for two extra characters, period and underscore, and updated the arcade code to print those rather than render them with discrete vector commands. The period is used in the high score list display, and the underscore during high score entry. Straightforward.

Next was the 'rubbish' that is left on the screen after coining-up and starting a game. I thought I'd found the culprit and fixed it. Although the subsequent game appeared to be remedied, in later games it reappeared. Back to the drawing board on that one.

Finally, despite the dipswitches being hard-coded for 3 ships/game, the 2nd and subsequent games all start with 4 ships. Found that one rather quickly; the code does an LSR on the (read-only) hardware location, branching on a Carry condition. Unfortunately not so benign when that hardware location is emulated in RAM... loading into A and then shifting was a simple fix.
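
The difference between the two instruction sequences can be modelled in a few lines of C. This is only a sketch: the function names are hypothetical and the dipswitch value used when calling them is made up.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models `LSR addr`: shifts the byte IN MEMORY and returns the carry.
   Harmless on a read-only hardware port, destructive when it lives in RAM. */
bool lsr_memory(uint8_t *location)
{
    bool carry = (*location & 0x01) != 0;
    *location >>= 1;
    return carry;
}

/* Models `LDA addr` followed by `LSR A`: only the copy in A is shifted,
   so the emulated hardware location keeps its value. */
bool lda_then_lsr(const uint8_t *location)
{
    uint8_t a = *location;
    bool carry = (a & 0x01) != 0;
    a >>= 1;            /* result discarded, as in the original test */
    (void)a;
    return carry;
}
```

Calling `lsr_memory` twice on the same RAM byte gives a different carry the second time, which is exactly how the second game ends up with a different ship count.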

Somewhat interestingly, the game appeared to run faster on my laptop. Not sure if that was my imagination, a different version of MAME, different OS, or something else. Makes me even more interested in seeing it run on a real IIGS...

Aside from the above-mentioned rubbish, that only leaves the exploding ship and proper (accurate) alignment of the sprites before I tackle proper game speed throttling, flicker and sound. I might tackle alignment next, because that has the most detrimental effect on game play atm.

Oh and another thing I've forgotten about; 2-player mode. Trivial, but another task on the list.

Friday, 4 August 2017

VCF West Preview

I've had a kind offer to demonstrate Asteroids for the Apple IIGS at VCF West so today I added a quick text-mode splash screen.

Last-minute splash screen and loading bar (mid-load)

Datajerk's c2d utility allows you to display a text splash screen whilst the game is loading. It requires a dump of $400 bytes from text memory which of course on the Apple is a dog's breakfast. So I whipped up a quick C program that would allow me to easily layout my screen and then write out a binary dump compatible with Apple II text screen memory.
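
A minimal sketch of what such a layout tool might look like follows. It is not the actual program; the function names are hypothetical. The row formula is the standard interleaved Apple II text page layout, and characters are written with the high bit set for normal video. Note this sketch also overwrites the 'screen holes' between rows, which may or may not matter for c2d.

```c
#include <stdint.h>
#include <string.h>

#define TEXT_ROWS      24
#define TEXT_COLS      40
#define TEXT_PAGE_SIZE 0x400

/* Offset of a text row from $400 in the interleaved text page layout. */
unsigned text_row_offset(unsigned row)
{
    return (row & 7) * 0x80 + (row >> 3) * 0x28;
}

/* Lay out up to 24 strings (NULL = blank row) into a $400-byte image. */
void build_text_page(const char *rows[TEXT_ROWS], uint8_t page[TEXT_PAGE_SIZE])
{
    memset(page, ' ' | 0x80, TEXT_PAGE_SIZE);       /* normal-video spaces */
    for (unsigned r = 0; r < TEXT_ROWS; r++) {
        if (rows[r] == NULL)
            continue;
        uint8_t *dst = page + text_row_offset(r);
        for (unsigned c = 0; c < TEXT_COLS && rows[r][c] != '\0'; c++)
            dst[c] = (uint8_t)rows[r][c] | 0x80;    /* high bit = normal text */
    }
}
```

The resulting buffer can then be written out with `fwrite` as the binary dump.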

Some eye-candy from the latest build.

End of game - attract mode

High score list

Will be interesting to gauge the reaction of attendees. Unfortunately it's still pre-alpha so there's glitches and flickering, but it was a last-minute offer.

Thursday, 3 August 2017

Closer to an Alpha Release!

Quick update; all of the IIGS code is pure 16-bit now except for two routines - the DVG CUR handler and the support routine that calculates SHR addresses and is only called from there. They need a complete overhaul and merging into one routine. On reflection there might just be enough savings to be had to be noticeable...

All of the rendering is done and optimised (including the small saucer and thrust) except for the exploding ship, plus I need to add data for 'dot' and 'underscore' characters for the high score entry & display. And that's it in terms of optimisation, unless I can coax more out of the hardware by using shadowing and/or blanking to my advantage. It's still around 18% faster from my crude calculations.

For an alpha release I'll rework the CUR routine, add the last pieces of the missing rendering, fix the object alignment, and add a crude text-mode splash screen. I'll tackle the flicker & sound after the alpha is out there.


UPDATE: The DVG CUR routine - and hence all IIGS-specific code in the main execution loop - is now 16-bit and optimised. I replaced the x160 calculation with a table look-up, whose instantiation was facilitated in no small part by CA65's .REPEAT command (nice!)
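
For illustration, here is the same table idea in C (the real thing is a CA65 table built at assembly time). The names are hypothetical, and whether the actual table folds in the $2000 SHR base address is an assumption on my part.

```c
#include <stdint.h>

#define SHR_LINES          200
#define SHR_BYTES_PER_LINE 160
#define SHR_BASE           0x2000

static uint16_t shr_line_addr[SHR_LINES];

/* Build the table once: one 16-bit start address per scan line. */
void init_shr_line_table(void)
{
    for (unsigned y = 0; y < SHR_LINES; y++)
        shr_line_addr[y] = (uint16_t)(SHR_BASE + y * SHR_BYTES_PER_LINE);
}

/* Replaces a multiply-by-160 with a single indexed load plus an add. */
uint16_t shr_address(unsigned x_byte, unsigned y)
{
    return (uint16_t)(shr_line_addr[y] + x_byte);
}
```

The trade-off is 400 bytes of table in exchange for dropping the shifts and adds from every CUR command.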

My latest crude calculation suggests it is ~28% faster with 4 large asteroids on the screen. After the alpha release I'll schedule rendering strictly to the arcade frame rate and see if it can keep up. Having said that, there's no guarantees that the arcade game maintains that frame rate either - there is leeway in the code to skip a frame or two before collapsing in a heap!

Tuesday, 1 August 2017

A shadow of a doubt!?!

I've added the last of the compiled sprites - the only outstanding graphics now are the ship's thrust and the exploding ship. The former I will probably tackle next.

Aside from fixing the saucer rendering, I've been cleaning up - and in the process optimising - the rendering and erase dispatch routines. Originally they were switching back and forth between 8- and 16-bit mode; all the rendering and erase routines themselves are now pure 16-bit code and the dispatchers are almost there. I also shaved some cycles off the asteroid render/erase dispatchers by streamlining my tokenised 'asteroid' instruction.

The DVG CUR handler does some Apple IIGS video calculations and stores values for use by subsequent render/erase routines. It needs a good overhaul - converting to pure 16-bit, doing away with a few values that aren't needed, and adding another that will enable further optimisation of the render/erase routines. However there's not a huge amount of cycles to be saved per frame.

Incidentally, I was studying some IIGS code online and found the switch to change the border colour, so it's now black and does improve the aesthetics somewhat.

Finally, I activated SHR shadowing and expected to see performance gain. Nada. I'll have to go back and study (again) what that actually does and how it is of benefit (if any) in my case.

As for remaining optimisations, there's not a lot else I can come up with atm. The so-called stack-blasting technique isn't really suited to this situation, nor is moving DP to the video memory. They're generally more suited to larger objects and opaque layers. I'd say it's not going to get significantly faster than what it is now, unless shadowing changes something.

One last task I forgot about; there appears to be a requirement for an adjustment factor for the objects rendered as bitmaps. This is due no doubt to the difference between the vector objects having arbitrary 'origins' (defined as the starting point of the beam) versus the origin of a bitmap always being one corner of the bounding rectangle. With luck I can deduce the required values from Norbert's code.

Edging closer to something suitable for an alpha release just to give people a taste of the game...

Monday, 31 July 2017

Location, location, location!

Tonight I completely re-arranged the memory map, compressing all the areas together at the bottom of memory, and also finally did away with the DVG ROM at the same time. The 6502 arcade ROM now resides at $2000 (as opposed to $6800) and together with the IIGS code extensions - including the compiled sprites - now extends to "just" $7055. Plenty of space to finish off the rendering routines now.

I forgot that the English (only) messages are actually stored in the DVG ROM, so when I eliminated the last stray write to sound hardware and got the game running, all was well except for the messages, which were rubbish. It finally twigged and I copied the English message tables into my core asteroids code module (together with the sine table) and it's all working as before.

High Score entry

And all this is now irrefutable proof that my Asteroids "source" code is fully relocatable, including all the ROM, RAM and hardware I/O location references! And now simply changing one assembler directive, for example, I could move it above the Apple II hires screen memory pages if I ever attempt a IIC+ version.

On another note altogether, someone asked me the other day whether I've had to decrease the frame rate. Thus far, the answer is 'no', and I'm hoping that won't change. It did remind me, however, that Norbert's emulator only renders every third frame! That could probably be improved somewhat now with my core, which eliminates all unnecessary display list calculations and operations.

And so onwards with the rendering and optimisations...

Sunday, 30 July 2017

Compiling a to-do list

Real Life has been intervening but I have been chipping away regularly at Asteroids.

I've now got compiled sprites for the bulk of the rendering, and all of the erase routines. And it finally runs faster than the arcade game, albeit only about 18% faster atm. I do still have some optimisations to do - but I also have to remove the flicker and add sound.

To generate code for the compiled sprites (and erases) I simply processed my bitmap .asm file in a quick 'n dirty C program. Not the most efficient tool but... old dog, new tricks...

How my compiled sprites work is actually quite simple. Instead of looking up sprite data in a table and rendering it to the screen in a loop, I have one routine for each and every sprite that simply loads the sprite data into the A register as immediate operands and then writes that to the display using absolute indexed mode, with the X register containing what is effectively the video address of the sprite.

In the case of the IIGS, the bottleneck isn't actually the execution overhead of the loop or the amount of data per se, but rather the number of video accesses required since these are rather slow. Where the compiled sprites make the most improvement in this case is that they are only writing pixels that are set, unlike a non-discerning loop which writes the entire bounding rectangle, set or not. And with the asteroid sprites in particular, where the pixels are either relatively sparse, or the asteroid itself is often a lot smaller than the bounding rectangle, the improvements can be significant.

Aside from skipping zero-pixels (or rather words of pixels), there's also no need to OR the value $FFFF to the display, so there's another video (read) access saved. And a mate had a good suggestion that I sort the values within each sprite and use Y as a temporary store (where appropriate) to save a few more cycles!

For certain sprites I didn't bother with some optimisations... for characters there's still a look-up table of data, since during the game there's only a few digits on the screen and little savings to be had in the character data itself. And for small sprites I simply erase the entire bounding rectangle since there's only 14 words to write.

So now, instead of look-up tables of sprite data, I have look-up tables of sprite rendering routines. The erase routines are similarly optimised; only those pixels that were set are erased, and of course it's sufficient to simply write $0000 to those words. The down side of course is increased memory usage - quite a bit in fact - and I've just hit the requirement to re-arrange my memory map because of it. Not surprising since up until this point I've left all the arcade RAM, ROM & hardware addresses in their original locations. That should be trivial to change with the 'source code' of course and there's still plenty of space left on the Apple so no danger of running out anytime soon.
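
To make the above concrete, here is a rough sketch of a compiled-sprite generator in C. It is not my actual tool: the function name, label format and bitmap layout (row-major 16-bit words) are all assumptions for illustration. It skips zero words, only reloads A when the value changes, and for erase routines writes $0000 to just those words that the sprite sets.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SHR_BYTES_PER_LINE 160

/* Emit 65816 source for one sprite; X is assumed to hold the video address.
   Returns the number of STA instructions (i.e. video writes) emitted. */
unsigned emit_compiled_sprite(FILE *out, const char *label,
                              const uint16_t *words,
                              unsigned width_words, unsigned height,
                              bool erase)
{
    unsigned stores = 0;
    bool a_valid = false;       /* does A already hold a known value? */
    uint16_t a_value = 0;

    fprintf(out, "%s:\n", label);
    for (unsigned row = 0; row < height; row++) {
        for (unsigned col = 0; col < width_words; col++) {
            uint16_t w = words[row * width_words + col];
            if (w == 0)
                continue;                       /* nothing set: no video access */
            uint16_t value = erase ? 0x0000 : w;
            if (!a_valid || a_value != value) { /* reload A only when needed */
                fprintf(out, "    lda #$%04X\n", (unsigned)value);
                a_valid = true;
                a_value = value;
            }
            fprintf(out, "    sta $%04X,x\n",
                    row * SHR_BYTES_PER_LINE + col * 2);
            stores++;
        }
    }
    fprintf(out, "    rts\n");
    return stores;
}
```

For a 2x2-word sprite with only two non-zero words, this emits two stores rather than the four a bounding-rectangle loop would make. Sorting the stores by value before emitting them would cut the number of reloads further.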

Tonight though, before re-arranging the memory map, I thought I'd tackle the 'crash' at the high score entry routine. Turns out it wasn't a crash at all, but rather a combination of a bug on my part, and yet-to-be modified display code on the other; the core code was in fact running perfectly fine.

My rendering routine wasn't setting the start of the display list buffer correctly; when the high score message was being printed the display list exceeded 256 bytes and the MSB of the address incremented. I subsequently set that as the start of the buffer and so it only rendered what was in the 2nd page of the list - the last few words of the message.

Next issue was that I hadn't updated the routine that printed the initials as you entered them, so nothing showed up when changing letters. And finally, to select a letter it was debouncing the hyperspace key; having mapped that only to the Apple II keyboard you could never trigger the selection of a letter. So I mapped it to the 2nd joystick button and it fixed that issue.

So now you can coin up, play a game at (slightly more than) full speed, enter your initials on the high score screen, and start again.

I still have a few niggly bugs. There's an initialisation issue that sometimes results in garbage on the screen and non-zero velocity for the player. And starting a second game you have four lives instead of three. I'm sure they won't be difficult to track down, and possibly even the same bug.

And aside from bugs, there's re-arranging the memory map, adding compiled sprites for the player ship (already generated, I just ran out of space temporarily), adding the 'thrust' pixel, displaying the correct saucer size, and rendering of the exploding player ship. I also need to add special characters for 'underscore' (high score entry) and 'dot' (high score list). Lastly then, remove the flicker, add sound and a main menu screen.

Sunday, 23 July 2017

IIC, or not IIC, that is the question:

Whether 'tis nobler in the mind to suffer
The Apple II hires video memory map,

Due to an SOS call from my wife whilst I was en-route to WOzFest (car trouble) I ended up missing the brief link-up with KFest and people had well and truly broken off into small groups to work on their own projects by the time I arrived, which meant I didn't get the chance to see it running on real hardware.

I did get a chance to do a little more work on it though (despite heckling from the Peanut Gallery - you know who you are) and have now got pixel-shifted graphics rendering. Although somewhat hampered by the flickering graphics, on close inspection it is definitely animating more smoothly now.

I also decided to go down the path of so-called compiled sprites, reasoning that it wouldn't be very difficult to write a C program to parse my ASM bitmap data file to produce the requisite code. I've got one or two minor optimisations to effect first, and then I'll give it a spin. If that doesn't make a marked improvement, I'll be at a bit of a loss in terms of how to proceed further. As a first-pass I'll opt not to use stack-blasting and see where that gets me.

After chatting to a few learned fellow attendees at WOzFest it became apparent that the 4MHz IIC+ would be another good candidate for a port - even more capable than the IIGS in fact - with a faster CPU (same video memory bandwidth) but with a monochrome graphics mode meaning only 1/4 of the graphics data to push around. I'm fast running out of excuses to keep avoiding the legacy Apple II hires video display...

Thursday, 20 July 2017

Game On!

Excellent progress today - in fact it's in good enough shape to demo at WOzFest now although it would be nice to take it along even further if I get the chance!

The bulk of the rendering (sans exploding ship and thrust) and the erasing has been done. I've also managed to pilfer the joystick read routine from Lode Runner and as a result the game is actually playable, albeit slightly slower than the arcade original at this point.


I've still got some optimisations up my sleeve, from simple changes to the display list entry format through to stack-blasting hand-compiled sprites, so I'm still holding out hope that I can get it running fast enough to require being throttled by a IIGS interrupt. And I still haven't worked out the whole video shadowing mechanism; a bug in my code meant that I was never shadowing the SHR screen in the first place - and now when I turn it on, it can't read the keyboard... so there's that to play with as well.

I guess if all else fails I can revert to the legacy hires screen which is a lot less data to move.

From the video it's obvious I've got some graphics tweaks to do, including bit-shifting plus offsets from CUR for each object. There's flickering of course, and the odd glitch and then the minor matter of a complete crash at the end of the game.

As an aside, I got stuck on the joystick not working in the IIGS emulation under MAME. I could move left/up, but centering the joystick read back as $FF, so I couldn't move right/down. That forced my hand in trying to get the disk booting in GSPort, and I eventually realised the floppy disk image should be in slot 5, not slot 7. However GSPort had the same issue...

Then it finally twigged; the routine was written for a 1MHz machine and was running on a 2.8MHz machine. The counter was overflowing before even the centre position was detected! After slowing down the CPU it all started working!
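
A back-of-the-envelope model in C shows why. The numbers are assumptions: roughly 11 CPU cycles per count in a typical paddle read loop, and roughly 1400 microseconds for the paddle timer with the stick centred. The function name is hypothetical, and the real routine may wrap or time out rather than saturate as this sketch does.

```c
#include <stdint.h>

#define PREAD_CYCLES_PER_COUNT 11   /* assumed cost of one loop iteration */

/* timer_us: how long the paddle timer stays active for this stick position.
   cpu_khz:  CPU clock in kHz (about 1023 for a 1MHz Apple II, 2800 for a
             IIGS in fast mode).
   Returns the 8-bit count the loop would reach, pegged at $FF on overflow. */
uint8_t paddle_read(uint32_t timer_us, uint32_t cpu_khz)
{
    uint32_t cycles = timer_us * cpu_khz / 1000;
    uint32_t count = cycles / PREAD_CYCLES_PER_COUNT;
    return count > 0xFF ? 0xFF : (uint8_t)count;
}
```

Under these assumptions the centre position reads about 130 at 1MHz, but would need a count of around 356 at 2.8MHz, so it pegs at $FF. Only positions well left or up of centre still fit in 8 bits.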

So - finish off the graphics, tweak the display positions, optimise the erase/rendering, fix the Game Over bug. Then add sound, title screen, and release! I've got - realistically - only one more night to work on it before its debut.

Wednesday, 19 July 2017

IIGS Take 2

The experimental work I'd done on the IIGS port prior to starting on the port proper has certainly paid off; as of tonight it's rendering the characters, lives, copyright, asteroids and player ship (and the latter only when it should be). I should note that I'm yet to generate or code for the bit-shifted graphics.

It may look no different to the previous version, but the display list is greatly simplified - effectively tokenised - and all the dead code has been removed from the 6502 core, giving me more headroom on the IIGS. And I've still got a little more optimisation to do in my rendering routines.

IIGS Asteroids Take 2 - optimal display list

When I next get the chance I'll continue with the saucer, the shots and the shrapnel which should be as straightforward as the other objects have been until now.  That'll just leave the exploding ship, which I am yet to work on at all.

I think that'll be a good point to research reading the IIGS keyboard; I've been encouraged by reading vague suggestions that it's possible to read the IIGS keyboard directly from the ADB. If that pans out I'll be able to make the game playable, if slow.

Then it'll be time to work on the bit-shifted graphics and erase logic (right now it's clearing the entire screen every frame). That should bring the game back up to speed and I'm hoping it'll actually then be too fast!

I'll be happy if I get to this point by the weekend for WOzFest Slot 7!

Beyond that, there'll be the addition of the exploding ship, sound (samples), support for variable beam brightness, and spit & polish and bells & whistles, such as a title screen, joystick/paddle support etc etc.

Tuesday, 18 July 2017

A token effort

I have to admit, I haven't been able to tear myself away from the C port to make any further progress on the IIGS port. However, it hasn't all been for nought as it has definitely reinforced my understanding of the arcade code, and cemented my decision regarding tokenising (optimising) the display list for the 8-bit ports.

Before I get to that; most of the work on the C code has been 'infrastructure work' and low-level DVG interface routines, which necessarily support both the new abstract display list and the original in parallel - to facilitate debugging and development. What that leaves, then, is the game logic and housekeeping code which generally tends to be easier to translate to C; the upshot of all this is that I don't think the C port is going to take very long to complete at all!

[Just for the record, I have the C port rendering all the text, including scores, and rendering and animating the asteroids themselves. The pseudo-random number generator is also in lock-step with the arcade machine and produces the same output at the appropriate times].

Keep in mind that the arcade code is only 6KB of 6502 - a lot of that munging 16-bit numbers - and it's not surprising that the C port isn't huge. From memory, Knight Lore was ~12KB of Z80 code and translated to ~5K lines of C. I'm around 1,300 lines for Asteroids already, and you could estimate it'll be in the vicinity of 2,500 lines.

Getting back to the IIGS (and 8-bit) ports; aside from the existing CUR (which sets the current beam coordinates) and HALT display list commands, there'll be a distinct command for the rendering of each object in the game, comprising character, extra ship, copyright, asteroid, ship, saucer, shot, shrapnel and exploding ship. I may add one last command to set the brightness - something the arcade code does but Norbert doesn't bother with - simply because the IIGS has the palette to support some variance in brightness.

Some of those commands will have one or two parameters, but all will render at the current beam position. The parameters will be succinct and optimised for the bitmap display routines. What this means is that I can actually remove a lot of code that generates the display list content that is irrelevant for the port, such as DVG subroutine calls or component vector commands. This is one area where I'll be able to improve performance over Norbert's emulators, only because I effectively have the arcade 6502 source that I can modify and re-assemble at will.
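
As a sketch only, a tokenised display list and its interpreter might look something like this in C. The token values, parameter sizes and the little-endian CUR encoding are all invented for illustration; the real format is tuned for the 65816 rendering routines.

```c
#include <stddef.h>
#include <stdint.h>

enum token {
    TOK_HALT, TOK_CUR, TOK_CHAR, TOK_EXTRA_SHIP, TOK_COPYRIGHT,
    TOK_ASTEROID, TOK_SHIP, TOK_SAUCER, TOK_SHOT, TOK_SHRAPNEL,
    TOK_EXPLODING_SHIP, TOK_COUNT
};

struct beam { uint16_t x, y; };

/* Hypothetical number of parameter bytes following each token. */
static const uint8_t param_bytes[TOK_COUNT] = {0, 4, 1, 0, 0, 1, 1, 1, 0, 1, 1};

typedef void (*draw_fn)(enum token t, struct beam at, const uint8_t *params);

/* Walk the list until HALT; every non-CUR token renders at the current
   beam position. Returns the number of objects dispatched. */
size_t run_display_list(const uint8_t *list, draw_fn draw)
{
    struct beam beam = {0, 0};
    size_t objects = 0;

    for (;;) {
        unsigned t = *list++;
        if (t == TOK_HALT || t >= TOK_COUNT)
            break;                              /* end of list or bad token */
        if (t == TOK_CUR) {
            beam.x = (uint16_t)(list[0] | (list[1] << 8));
            beam.y = (uint16_t)(list[2] | (list[3] << 8));
        } else {
            if (draw != NULL)
                draw((enum token)t, beam, list);
            objects++;
        }
        list += param_bytes[t];
    }
    return objects;
}
```

The point is that the interpreter never sees a vector command or a DVG subroutine call, just one token per object plus the occasional CUR.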

I've also identified which of the bitmaps will and won't require bit-shifting, and which will require an extra byte's width to do so. Because, for example, all of the game's text message coordinates are fixed, specified on a 0-255 grid (before being scaled-up in the display list), and also happen to have even X coordinates, I don't have to bit-shift any of the character set for the IIGS 2BPP SHR graphics!

Most of the remaining bitmaps will require bit-shifting, and a few - not all - of those will require an additional byte's width to facilitate it. But that simply boils down to an extra compare and load for each object rendering, unless I need to really wring the performance out of the rendering routines.
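
Generating the shifted data can be sketched as follows, assuming 2 bits per pixel with the leftmost pixel in the high bits of each byte. The function name is hypothetical. The return value reports whether the shift actually spilled into the extra byte, which is how one could tell which bitmaps need the additional width.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_PIXEL  2
#define PIXELS_PER_BYTE (8 / BITS_PER_PIXEL)

/* Shift one bitmap row right by `pixels` (0..3) into dst, which must hold
   width + 1 bytes. Returns true if any pixel landed in the extra byte. */
bool shift_row(const uint8_t *src, size_t width, unsigned pixels, uint8_t *dst)
{
    unsigned bits = (pixels % PIXELS_PER_BYTE) * BITS_PER_PIXEL;
    uint8_t carry = 0;

    for (size_t i = 0; i < width; i++) {
        dst[i] = (uint8_t)(carry | (src[i] >> bits));
        carry = (uint8_t)(src[i] << (8 - bits));   /* bits pushed off the right */
    }
    dst[width] = carry;
    return carry != 0;
}
```

A row of $FF shifted by one pixel becomes $3F, $C0 and needs the extra byte, whereas $F0 shifted by one pixel becomes $3C and fits in the original width.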

My next task now is to generate shifted bitmap data, which is trivial, and essentially start over from scratch with the IIGS port. I'll probably have to stub out all the routines that write to the display list, and then begin work on the so-called tokenising version. None of that should be too difficult...

[UPDATE: I've regenerated the 6502 ASM file from my disassembly, starting the IIGS port from scratch. All of the DVG write routines have been stubbed-out so that only the CUR command is now written to display list. Next is tokenising the character command and then rendering it on the IIGS.]

As for the erasure; I'm planning on (eventually) making use of the ping-pong display list buffer. Immediately before rendering the new list, I'll simply re-parse the old buffer and use it essentially as dirty rectangles. I do have more sophisticated optimisation possibilities up my sleeve; it's useful to know, for example, that all objects are written to the display list in a fixed order. I'll leave all that, however, until I need it - if ever.
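
The buffer flip itself is trivial; a sketch in C, with hypothetical names and an arbitrary buffer size, might be:

```c
#include <stdint.h>

#define LIST_SIZE 1024   /* arbitrary size for illustration */

struct display_lists {
    uint8_t buffer[2][LIST_SIZE];
    unsigned current;            /* index of the list most recently built */
};

/* Flip the buffers at the start of a frame. Returns the previous frame's
   list (to be re-parsed for erasing) and sets *build_into to the buffer
   the game code should fill for the new frame. */
const uint8_t *swap_lists(struct display_lists *dl, uint8_t **build_into)
{
    const uint8_t *previous = dl->buffer[dl->current];
    dl->current ^= 1;
    *build_into = dl->buffer[dl->current];
    return previous;
}
```

Each frame then becomes: flip, build the new list, erase everything named in the old list, render the new one.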

Wednesday, 12 July 2017

Asteroids with a 'C'

On Friday nights my wife & I traditionally watch a show together with which I've become rather bored in recent times. Rather than waste that hour last week I decided to set up the laptop in front of the TV and work on some aspect of Asteroids that required a minimum level of concentration. Ultimately I decided to start work on the C port of Asteroids, mainly because it required a lot of crank-the-handle type coding up front before any real work was required. Like defining data structures for zero page variables and player RAM.

Aside from the aforementioned, I managed to also code the main routine and stubs for all the subroutines called from there. Then over the next few nights I was keen to take it a little further; implementing a rather more 'abstract' display list to aid not only in development and debugging, but also to facilitate the so-called tokenising I'd be doing in the 8-bit ports. That entailed a DVG 'disassembler' of sorts which subsequently morphed itself into a DVG interpreter/emulator which was soon rendering a few vectors on the display.

Of course time is ticking for WOzFest and I do need to bite the bullet on the tokenised display list and optimisations for the IIGS. However it has been a very useful exercise and I've discovered a few subtleties of the DVG which had escaped me until now. Regardless, I really need to put it aside for now and continue on with the IIGS port. In the mean-time, here's a sample rendering of what I have thus far.

Asteroids C port (Win7, GCC, Allegro)

Like my other C ports (Lode Runner and Knight Lore), the C code is as faithful to the original assembler source as practical, whilst optimising aspects of the original code such as using 16- and 32-bit variables rather than multiple bytes for things like addresses, scores, coordinates, etc. I retain all the same subroutines with the same names, albeit adding parameters for values passed in registers, etc. The logic within each routine is representative of the assembly code, differing only to accommodate the aforementioned optimisations and/or clarify the intent, without changing the underlying algorithm or compromising accuracy.

The end result is the same as the 8-bit assembler ports; a game that plays exactly - and looks as far as practical on the target hardware - the same as the original. And as I've discovered in the past, I've even been able to debug aspects of the assembler ports on the C version! In the case of Asteroids, I think the ability to inspect the display list so easily will come in handy down the track.

The C version should be portable to the Amiga and the Neo Geo at the very least. For Lode Runner the C port was an after-thought of the Coco3 (6809) port, but for Knight Lore, I developed it in parallel with the Coco3 port and it was, as I mentioned, very helpful. This time 'round, I'm undecided how I'll proceed once the IIGS port is finished...

Friday, 7 July 2017

To SNES or not to SNES?

No opportunity for any development today but time to ponder random aspects of the project. I was also prompted by gp2000 to look a little further into specific aspects of the code, and discovered something that should have been obvious from the start, but escaped me until today - so thanks George for that inadvertent trigger!

I did tweak some of the coordinate transformation and video address calculations today, converting my 6502 code into 65816 and improving the resolution of some of the calculations. Always good to see a half-page of 8-bit code reduce to a few lines of 16-bit code!

And in the comments of a previous post I pondered the feasibility of porting this to the TRS-80 Model 4. Aside from the effort of porting to yet another CPU (Z80) there's also the fact that the hires board is all-but-crippled by not only port-mapping the hires video memory, but also restricting access to (vertical?) blanking periods. George suggested a hybrid mode mixing the text and hires graphics screens... very interesting but a lot of work none-the-less. I'll put this in the 'maybe' basket.

And on the subject of alternate ports, the SNES sprang to mind! I know little about the technical specifications except for the fact that it is powered by a 65816 (clone). A quick Google reveals it supports 256x224 resolution, allows 128 sprites (up to 32/line) and has the usual tilemap(s).

I'm thinking this would be a no-brainer; text would appear on the tilemap layer, with 27 asteroids, player ship, saucer and 6 shots making up a maximum of 35 sprites on-screen. Extremely unlikely that they'd all appear on the same scan line, but if I was really pedantic about it I could implement a software priority scheme. But with all the arcade 6502 code running, plus the bulk of the IIGS 65816 code available, it wouldn't be a lot of work at all. I'm going to put this in the 'almost certainly' basket, and I might be tempted to tackle it immediately after the IIGS port is done.

EDIT: Doh! It's already been ported to the SNES by Digital Eclipse!

[Makes me wonder if I should be porting that version to IIGS!?!]

That's about it for random musings. A parting fact: whilst the vector display coordinates range from 0-1023, the game's virtual playfield coordinates actually range from 0-8191. Somehow that escaped me... now consider it's all scaled down to 256x192... or in the case of the TRS-80 text mode graphics, 128x48 (128x72 if I get really tricky).

Thursday, 6 July 2017

Use the source, Luke!

In my third blog update for the day, I can report that I've all-but-finished the reverse-engineering of the arcade Asteroids 6502 code.

Aside from temporary storage, all zero page and player RAM variables have been documented. There are no variable addresses remaining for which I do not know the purpose.

About 95% of the code has been commented. There's some particularly nasty code in a few places throughout the ROM that remains uncommented at this stage; aside from some physics there's the exploding ship routine - which seems unnecessarily complex in my opinion - for example.

Importantly, I know what all the code is meant to achieve, even if I don't understand the nitty-gritty of every line in some isolated cases. It's something I'll probably have to rectify once I start transcoding to 6809 and/or C, but for now I'm satisfied that I have a well-enough commented source file on which to base my official Apple IIGS port.

From here I need to re-generate the core .ASM file and re-apply my patches for running on the IIGS. Since I annotated those patches in IDAPro, it should only take about 10 minutes before it's running again with the new source. And thereafter, I can start modifying the code 'for real' this time, including optimising for performance and incorporating pixel-shifted bitmaps.

It'll probably be a few days before I have anything rendering again, and the still screen shots will probably look no different to those I have posted already. The video should look a lot better though...

Wednesday, 5 July 2017

Great minds think alike (or fools never differ).

Interesting to dissect Norbert's Atari800XL Asteroids emulator.

The aforementioned patches to the rendering routines actually implement an alternate display list, of sorts. For all the (alpha-numeric) characters and the extra ships, Norbert adds an entry to his own display list, using the character code directly (no reverse-lookup on JSR address required). He also assigns character code $FF for the DVG CUR command, and inserts the pre-scaled Atari display coordinates. This is essentially what I had in mind for 'tokenising' the display list to optimise for the Apple IIGS.

As described in my last post, the emulator hooks the main loop and calls out to three (3) subroutines.

The first routine is (as I subsequently discovered) the rendering routine and it only renders the display every 3rd call. It does something with self-modifying code that I'm yet to reverse-engineer, before rendering the asteroids directly from the player status RAM area. Next is rendering the player ship or explosion, depending again on the player status - something I'm actually doing now as a 'quick hack'... not so much of a hack as it turns out! And as I suspected, the relative coordinates (offset) of the thrust pixel is stored in a lookup table and plotted discretely. After that, the saucer is rendered, and then the shots (saucer and player), before the alternate display list (characters) is finally rendered. At the end of the routine it appears to handle the high score entry, and then mess with ANTIC registers - and I'm way out of my depth here!

I've missed the copyright message in there somewhere, but perhaps it's done at startup and never deleted from the ANTIC display list? Not worth pursuing further since it's not relevant to the IIGS or likely any other hardware I'll be porting to.

The second hook routine emulates the inputs, and the third the sound.

And as I suspected, when the player status RAM bank switch is hit (changed), the emulator simply swaps 256 bytes between $200 and $300.

So what will I be taking away from this?

I like the idea of the alternate display list, though perhaps with the arcade 'source' it'll be easier for me to simply re-purpose the DVG shared RAM. Certainly it would appear that 'tokenising' the display list is the way to go. I would also eliminate all the dead code that makes up the current display list. And not having to iterate over player status RAM - essentially for the 2nd time each frame - should speed things up a little too.

I'll use Norbert's lookup table for the thrust, but instead use it to pre-render a 2nd set of bitmaps for the player ship. Again, not having the extra look-up and calculations to render a single pixel will increase performance further.

I should also be able to find the ship explosion bitmap(s) in there somewhere, if I can navigate the eccentricities of the Atari 800XL display hardware!

Standing on the shoulders of giants...

Undecided on how best to proceed with the remaining rendering tasks, today at lunchtime I downloaded Norbert's (Atari 800XL) emulator and fired it up in MAME, intending to plot the 'thrust' pixel in each of the 24 player ship bitmaps based purely on observation.

I've documented 12/24 but not surprisingly, it got old pretty quickly and, curiosity getting the better of me, I dumped the first 16KB of the Atari's memory into a binary file and loaded it up into IDAPro.

Before transferring control to the arcade code, Norbert's emulator patches a bunch of addresses in the ROM. Aside from those critical for running on non-Asteroids hardware (ie. the same patches I made) it also patches routines such as the 'CUR' (current) DVG command, character display, display of extra ships, etc.

It also installs a hook in the main game loop. The hook itself calls three subroutines; one to read the Atari joystick inputs and seed the memory-mapped Asteroids inputs, one to play the sounds based on the memory-mapped outputs, and the third I'm yet to ascertain.

Most importantly though, I'm still yet to determine how the emulator goes about rendering the display. From what little I've seen, the display list is somewhat 'corrupted' by the patched routines. However there are other unpatched routines that must still provide data to the emulator via the display list - so I'm not sure how it all works just yet. The waters could also be further muddied by the Atari 800XL's unique display hardware... I'm sure I'll work it all out next session.

Tuesday, 4 July 2017


I converted Norbert's ship data to my IIGS format and checked out exactly what he has rendered. There are 24 bitmaps in total, covering 360 degrees of rotation.

That is contrasted with 64 different renderings of the player ship in the arcade game. However again, at this resolution, it'd be pointless to attempt to render that many different bitmaps.

One interesting thing to note is that the player ship direction is stored as a single byte, the value varying the full range of 0-255 to represent 360 degrees. Each tap of the rotation changes the direction by +/-3, which means that coming full circle, you don't actually end up at 0 again, but rather at 255 or 2 first time 'round, and 254 or 1 next time. Not terribly important, because your direction is effectively right-shifted by 2 bits to determine which ship to render - IOW each tap of the rotation button does not necessarily change the ship rendering.

In the IIGS case, the direction needs to be scaled down to one of 24 bitmaps, i.e. multiplied by 24/256 (=3/32), which is equivalent to right-shifting by 4 and 5 and adding the results, although the resolution of the operation needs to be increased (easily done in 16-bit mode using the XBA instruction) to get all 24 outcomes.
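A minimal C sketch of that scaling, for illustration only. The function names are hypothetical, and the 16-bit widening merely stands in for what XBA would achieve on the 65816 (moving the direction into the high byte before shifting). It shows why the 8-bit shifts alone fall short: truncating each shift separately never reaches bitmap 23.

```c
#include <stdint.h>

/* Naive 8-bit version: each shift truncates on its own, losing precision. */
static uint8_t ship_frame_naive(uint8_t dir)
{
    return (uint8_t)((dir >> 4) + (dir >> 5));
}

/* Higher-resolution version: put the direction in the high byte of a
 * 16-bit value first, shift and add, then keep the high byte of the sum.
 * The result is exactly floor(dir * 24 / 256), i.e. 0..23. */
static uint8_t ship_frame(uint8_t dir)
{
    uint16_t wide = (uint16_t)(dir << 8);
    uint16_t sum  = (uint16_t)((wide >> 4) + (wide >> 5));
    return (uint8_t)(sum >> 8);
}
```

With these definitions a direction of 255 maps to bitmap 23 in the wide version but only 22 in the naive one.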

Unfortunately identifying the ship in the display list is practically impossible. Rather than work out how/where to patch the arcade code, as a quick hack I simply added a routine at the end of the DVG emulation code to (always) render the player ship. Fortunately the arcade code always ends the display list with a CUR command corresponding to the center of the screen, so at least it's always rendered there and the game can be 'played' (more-or-less) as long as you don't use the thrust button!

Player ship is now rendered... sort of...

I should add that I'm yet to implement the 'thrust' indicator on the player ship; Norbert hasn't supplied me with any rendering details for this but from a quick look at his emulator video, it looks like a single pixel is illuminated for each bitmap - I just need to work out exactly which pixel!

That really just leaves the player ship explosion, another list of component vectors patched by the arcade code as it is copied to the display list. Again, there won't be any way to identify it in the display list, so I'll likely extend my quick hack to detect when the ship is exploding and render it there; enough to ensure my rendering algorithm is correct.

And that should be everything that needs to be rendered! From that point on, it's a matter of producing pixel-shifted bitmaps where required, updating the rendering routines to use them, and then finally optimise it all to eliminate the flicker and get it running at full frame rate. There'll be some use of the IIGS interrupts to throttle the frame rate, and of course hooking up proper keyboard/joystick/paddle controls and adding a fancy menu. Unlikely it'll all happen before WOzFest, but I should have a decent demo by then at least!?!


Picked some low-hanging fruit in my lunch break today; rendering the copyright message at a more appropriate size.

Although Norbert didn't supply the source data for the copyright message, it was a trivial matter to load a screen shot from his emulator page into a graphics editor, crop the message, reduce the colour depth and save it as a Portable Bitmap File - a text-based format perfectly suited to turning into assembler source data statements.

And while I was at it, I centered the screen on the IIGS display. Again, trivial, since screen accesses are all performed via an index register relative to the start of SHR memory - a constant defined in my IIGS .inc file. Simply adjusting the constant by 4 lines and 32 pixels ($290) was sufficient to center the display for each and every rendering routine.
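The $290 figure can be checked with a little compile-time arithmetic. This is only a sketch; the constant names are hypothetical, and it assumes the 256x192 play area is centred exactly on the 320x200 SHR screen at 2 pixels per byte.

```c
enum {
    SHR_STRIDE      = 160,               /* bytes per SHR scan line   */
    PIXELS_PER_BYTE = 2,                 /* 4BPP                      */
    OFFSET_LINES    = (200 - 192) / 2,   /* 4 lines of top margin     */
    OFFSET_PIXELS   = (320 - 256) / 2,   /* 32 pixels of left margin  */
    /* Byte offset added to the start-of-SHR constant. */
    CENTRE_OFFSET   = OFFSET_LINES * SHR_STRIDE
                    + OFFSET_PIXELS / PIXELS_PER_BYTE
};
```

That works out to 4 x 160 + 16 = 656 bytes, which is $290.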

Centered and a less obtrusive copyright message

Now for the player ship...

Monday, 3 July 2017

A shot in the dark

Shots, as it turns out, are rendered in the display list as zero-length vectors with scale=7 and maximum brightness and thus can be uniquely identified.
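As a rough C illustration of that check (not the actual 6502 code), assume the vector command has already been decoded into a small struct. The struct, its field names and the value 15 for maximum brightness (a 4-bit field) are assumptions made for this sketch.

```c
#include <stdbool.h>

/* Hypothetical decoded form of a DVG vector command. */
typedef struct {
    int scale;       /* per-vector scale                    */
    int dx, dy;      /* signed deltas                       */
    int brightness;  /* 0..15, assuming a 4-bit field       */
} dvg_vector;

/* A shot is a zero-length vector at scale 7 and full brightness. */
static bool is_shot(const dvg_vector *v)
{
    return v->scale == 7 && v->dx == 0 && v->dy == 0 && v->brightness == 15;
}
```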

So I simply added a check for such in my DVG emulation code and now have shots being displayed for both player and saucer. I added some crude keyboard mappings for fire and left/right rotate, and I can coin up, start a game and take aim at asteroids and destroy them.

Player's ship is yet to be rendered, but asteroids can still be destroyed

That leaves player ship and player explosion. The latter consists of component vector commands copied and patched from the DVG ROM routine. In theory, they are the only two remaining objects in the display list, and it may yet be possible to distinguish between the two... something I need to experiment with in order to confirm. It would be really, really nice if I didn't have to patch the original game - even if just for this exercise - and be able to render all the game graphics!

But next task is to get Norbert's player ship bitmaps converted and displayed in the correct orientation.

Sunday, 2 July 2017

Where's the kaboom? There was supposed to be an earth-shattering kaboom!

It turns out that, as suggested on the Computer Archeology page on the DVG ROM, Asteroids does indeed use the global scale in the animation of the explosion. In fact all-up there are 21 different frames of animation of the explosion, all based on the 4 shrapnel pattern routines in the ROM.

Understandably though, Norbert appears to make do without scaling at all, using 4 patterns as-is. In truth, at 256x192 resolution 21 frames of animation of exploding particles is overkill, and half the frames would probably look the same anyway. The shrapnel bitmaps, like the other objects, are confined to 16x16 pixels and likely don't render quite as large as the shrapnel on the original, but it's not noticeable at all except perhaps for the number of frames each pattern persists for. Either way it doesn't affect game play in any way.

Throughout the original animation, the global scale is changed in 6 steps from 11 thru 15 and finally to 0. What I do is simply ignore the shrapnel pattern number and instead use the global scale to display pattern 0, 1, 2, 2, 3 & 3 since the latter scales are displayed for more frames (somewhat realistically the explosion slows down).
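In C the mapping might look something like the following. It is a sketch only; the function name is hypothetical, and it assumes the six scale values are visited in the order 11, 12, 13, 14, 15, 0 and map onto the patterns in the order listed above.

```c
/* Map the DVG global scale in effect to one of the 4 shrapnel bitmaps.
 * Returns -1 for a scale that is not part of the explosion sequence. */
static int shrapnel_pattern_for_scale(unsigned scale)
{
    switch (scale & 0x0F) {
    case 11: return 0;
    case 12: return 1;
    case 13:
    case 14: return 2;
    case 15:
    case 0:  return 3;
    default: return -1;
    }
}
```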

Kaboom! A saucer hits an asteroid.

Next I'll look at the (saucer) shots, and see if they can be unambiguously identified in the display list... perhaps via their brightness and/or vector length??? If not, it's time for some 'less benign' changes to the original code!

Friday, 30 June 2017

Asteroids come in all shapes and sizes

I added the code for the various asteroid sizes; that was a simple matter of checking the global scale value in effect and adjusting the asteroid bitmap table index accordingly. That entailed 12/19 'rocks' in Norbert's source file.

As for the remainders, turns out they actually comprise the small/large saucer, 4 shrapnel (explosion) patterns, and one bitmap of the player ship at '0 degrees' rotation. Since there only appears to be a single function in the DVG ROM to display the saucer, I'm assuming that the global scale is used for the 2nd saucer - but I need to verify that.

But for now, I've added the large saucer and now I'm at the exact point where I left off the text version. But it's a good indication of how the final game will look on the IIGS. I'll likely retain the 256x192 display area, but center it on the IIGS 320x200 SHR display.

All 3 sizes of asteroids and large saucer

It should be straightforward to render the shrapnel patterns next. Although the DVG notes online suggest the global scaling factor may also be used for explosions, I can't see where that's the case when I run the arcade emulation, and certainly Norbert is not scaling them at all.

I think that just leaves the ship, shots and ship explosion. As I've mentioned in a previous entry, these (mostly) manifest themselves in the display list as component vectors, and it's not possible to differentiate the actual objects from them alone. Again, at this point I will need to decide how to optimise the process - either tokenising the display list or bypassing the list altogether.

I've also had a few more thoughts on the erasure. I'm thinking dirty rectangles is going to be easiest to implement and I'm hoping fast enough. Each time I render an object, I'll add the coordinates and dimensions of the bounding rectangle to a list. When it's time to wipe the frame, I'll iterate through the list and wipe the rectangles. After all, this is what Knight Lore did...
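A minimal C sketch of the dirty-rectangle idea, under stated assumptions: the screen is a flat 160-byte-wide buffer, rectangles are tracked in bytes rather than pixels, and all names plus the list size of 64 are hypothetical. It is not how Knight Lore or the eventual IIGS code does it, just the general shape of the technique.

```c
#include <stdint.h>
#include <string.h>

#define SHR_STRIDE 160   /* bytes per scan line            */
#define SHR_LINES  200
#define MAX_DIRTY  64    /* arbitrary limit for the sketch */

typedef struct { uint16_t x_bytes, y, w_bytes, h; } dirty_rect;
typedef struct { dirty_rect rects[MAX_DIRTY]; int count; } dirty_list;

/* Record the bounding rectangle of an object just rendered.
 * Returns 0 if the list is full or the rectangle is off-screen. */
static int dirty_add(dirty_list *dl, uint16_t x_bytes, uint16_t y,
                     uint16_t w_bytes, uint16_t h)
{
    if (dl->count >= MAX_DIRTY)
        return 0;
    if (x_bytes + w_bytes > SHR_STRIDE || y + h > SHR_LINES)
        return 0;
    dl->rects[dl->count++] = (dirty_rect){ x_bytes, y, w_bytes, h };
    return 1;
}

/* Wipe every recorded rectangle, then empty the list for the next frame. */
static void dirty_wipe(dirty_list *dl, uint8_t *screen)
{
    for (int i = 0; i < dl->count; i++) {
        const dirty_rect *r = &dl->rects[i];
        for (uint16_t row = 0; row < r->h; row++)
            memset(&screen[(r->y + row) * SHR_STRIDE + r->x_bytes],
                   0, r->w_bytes);
    }
    dl->count = 0;
}
```

Each render call would be followed by a dirty_add(), and dirty_wipe() would run once at the start of the next frame, so only the bytes actually touched get cleared.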

Wednesday, 28 June 2017

Shifty operations and bitmaps!

Graphics! Norbert Kehrer was kind enough to send me some of the graphics data from his emulator - notably the character set, the player ship, and the asteroids. Given that his emulators run in 256x192 mode, I thought I'd start with the same to enable me to use his bitmaps as-is. Well, I did have to convert from 1BPP to 4BPP mode, writing a small C program to process Norbert's ASM source file.
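The heart of such a conversion might look like the sketch below. It is not the actual program, and it makes a few assumptions: the source bitmaps store the leftmost pixel in the most significant bit, the SHR byte holds the left pixel in its high nibble, and lit pixels all take a single foreground colour index. The function name is hypothetical.

```c
#include <stdint.h>

/* Expand one byte of 1BPP data (8 pixels) into four bytes of 4BPP data
 * (2 pixels per byte). Lit pixels become colour index fg, others 0. */
static void expand_1bpp_to_4bpp(uint8_t src, uint8_t fg, uint8_t out[4])
{
    fg &= 0x0F;
    for (int i = 0; i < 4; i++) {
        uint8_t left  = (src & 0x80) ? fg : 0;
        uint8_t right = (src & 0x40) ? fg : 0;
        out[i] = (uint8_t)((left << 4) | right);
        src = (uint8_t)(src << 2);   /* move on to the next pixel pair */
    }
}
```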

At this stage I haven't bothered with bit-shifted data - being 4BPP there's only 2 copies anyway - it's enough to see what it's going to look like. Since the arcade Asteroids works with a 1024x1024 coordinate system, I first had to scale down to 256x192. And it's worth noting here that 192=128+64, which means scaling down Y can be done with shift & add operations only.

And somewhat inconveniently, the IIGS SHR screen is 160 bytes wide, so to find the video address of a coordinate, you need (Y*160 + X/2). Fortunately, 160=128+32, so again shift & add operations are sufficient. These calculations are generally only required when the display list contains a command to set the current coordinate (CUR). And at the risk of bragging, my scale and address calculation routine actually worked first go! Of course that's more than offset by all the stupid bugs I had doing trivial stuff.
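Expressed in C rather than 65816, the shift-and-add arithmetic could be sketched as follows. The function names are hypothetical, and the sketch ignores any axis flip between the DVG and the SHR screen, which isn't covered here.

```c
#include <stdint.h>

/* 1024 -> 256: a plain divide by 4. */
static uint16_t scale_x(uint16_t x)
{
    return (uint16_t)(x >> 2);
}

/* 1024 -> 192: y * 192/1024 = y * 3/16 = y/8 + y/16. */
static uint16_t scale_y(uint16_t y)
{
    return (uint16_t)((y >> 3) + (y >> 4));
}

/* Byte offset into SHR memory: y * 160 + x / 2, with 160 = 128 + 32. */
static uint16_t shr_offset(uint16_t x, uint16_t y)
{
    return (uint16_t)((y << 7) + (y << 5) + (x >> 1));
}
```

Because each shift truncates separately, scale_y() can land one line short of the exact y * 3/16 (1023 gives 190 rather than 191), which is harmless at this resolution.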

First task was getting the characters displayed. Rather than use more calculations to find the character data address, I simply use a table of addresses. The routine simply renders 7 lines of 2 words at the current address, then adds 4 bytes to that. It's not perfect because there's no shifted data, but it's close.

It's worth noting that Asteroids uses several different font sizes by changing the global scale factor in the DVG. However Norbert hasn't emulated this behaviour, evidenced by the relative sizes of the score and high score text. Presumably his copyright text is a single purpose-rendered bitmap. I'm undecided at this point whether I'll follow suit.

Next task was getting some sort of representation of the asteroids themselves on the screen. Norbert's file had 19 bitmaps labelled as 'rocks'; I was expecting 4 asteroids in 3 sizes each, or a total of 12 asteroids. But for the moment I'm only rendering each of the 4 asteroids as the largest size and I'll have to investigate what the last 7 bitmaps actually represent at a later date (perhaps shrapnel?)

Lastly, there needs to be some sort of mechanism to wipe data from the previous frame. At 4BPP the SHR screen is 32KB and too big to wipe completely every frame. However, for now, that will have to suffice, so the video is very flickery, and quite slow, atm. Exactly how I optimise this, I'm undecided. It's worth noting here that in 1BPP mode, Norbert would have had to contend with 'only' 6KB of video memory...

Here's a still of the attract mode, showing 4 asteroids.

Yes, the asteroids are the correct shape too!

Next task is handling the different-sized asteroids, which should actually be quite trivial. That's about as far as I took the text version because after that, it all starts to get tricky!

And a 65816 trap-for-young-players - the MVP & MVN instructions change the data bank register! That wasn't documented in the first reference I was using, and I couldn't work out why my code was going into la-la land after using them.

Monday, 26 June 2017

65C816... Meh!

Another first today - my first 65C816 program. I purposefully omitted the exclamation mark from that last sentence because it really is nothing to get excited about. In fact, if you've never written 65C816 code before, don't rush to change that.

I've replaced the apple.asm file in the Asteroids project with another named iigs.asm. Currently the startup code enables the Super High Resolution (SHR) display, sets linear mapping mode, enables shadowing, and then switches to full 16-bit mode to initialise the palette (all 2 colours in one of 16 palettes) and the SCB. The frame rendering code simply switches to 16-bit mode, then immediately back to 8-bit mode before returning to the Asteroids code.

Booting the disk under IIGS emulation eventually - after the machine boots itself and the floppy disk image - results in a black screen. I've also verified that the game is running and the frame renderer is repeatedly being called. And writing values to the SHR memory from the MAME debugger results in pixels appearing on the display. So no crashes so far...

As for the 65C816; no 8-bit memory accesses in 'full' 16-bit mode? OK, perhaps not so much of an issue if the machine is designed from the ground up around the CPU, but when you're running on an architecture with byte-wide softswitches... and interfacing to 8-bit code and data structures... you're in for a bad time.

Then there's the issue, for example, with the assembler not unambiguously knowing whether to load the accumulator with an 8-bit or 16-bit immediate value - because the mnemonics (and, incidentally, the opcode values) are identical. You have to give it hints, and hope it gets it right. A recipe for frustrating bugs if ever I've seen one.

Anyway, as a first pass, I'll be replicating the logic in the text version, and parsing the DVG display list in the same way. I suspect all the parsing code will remain (8-bit) 6502, and I'll only switch into 16-bit mode to render the bitmaps to the SHR. But first I need to prepare said bitmaps for the IIGS display.

Sunday, 25 June 2017

2600 for a day and IIGS video

A little diversion; someone posted on an Atari-related FB group about tinkering with Ms Pacman and not having much luck getting it 'loaded into a disassembler' for more in-depth hacking. I couldn't help myself and started asking questions, and of course ended up doing it myself to satisfy my own curiosity, having never done anything with 2600 before.

The complication in this case is that the 2600 only maps 4KB of cartridge space, and Ms Pacman is 8KB. There are a handful of different banking schemes implemented in various cartridges, Ms Pacman being one of the simplest. Despite that, DiStella, for example, doesn't support banked cartridges, though that is forgivable and not really surprising. Also worth noting that Dan Boris is one of the authors, and there's bound to be a good reason if he elected not to support banking.

After another wildly unsuccessful attempt to understand if/how banking is supported in IDAPro, I forged ahead with Ms Pacman only to discover that the first code bank actually executes at $D000, and not $1000-$1FFF as is documented as the reserved cartridge address space. Of course with the higher address lines missing from the 2600's 6507 CPU, the machine's 8KB of addressable device/memory space is mirrored every $2000.
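The mirroring is easy to illustrate in C: with only 13 address lines on the 6507, the top three bits of a 16-bit address are simply dropped. The function name here is hypothetical.

```c
#include <stdint.h>

/* The 6507 only brings out A0-A12, so every address folds into 8KB. */
static uint16_t addr_6507(uint16_t cpu_addr)
{
    return (uint16_t)(cpu_addr & 0x1FFF);
}
```

Both $D000 and $F000 fold down to $1000, the start of the cartridge window, which is why code assembled at either origin executes happily from the same 4KB slot.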

I then turned my attention to the second bank, loading it into a second IDAPro session - until it was revealed that this bank actually executed at $F000! No doubt making development much, much less painful, it also allowed both banks to be loaded into the same IDAPro disassembly and the banking issue all-but-ignored. I added a few other segments, notably the TIA registers, the zero page area and the PIA registers, and as a result have a ready-to-go base for reverse-engineering.

However, I should note that there's very little chance I'll be tempted to work on this at the expense of Asteroids, or even anytime soon after I'm done! From what very little perusing I did of the source, it really didn't look enticing at all, especially in light of my limited knowledge (having read Racing The Beam) that suggests programming the 2600 is just as much about coding a video hardware controller as it is about coding game logic. And although I briefly mused about porting a 2600 title to another platform, I also quickly realised that much of the code wouldn't resemble the original in the slightest.

So about Asteroids; I've done further reading on the subject of IIGS architecture, and the video memory in particular. At best it looks like you can only write to the video memory at 1MHz, though you can read back at 2.8MHz. I've also read a few interesting articles on optimisation techniques - some specifically for the IIGS - and suspect I'll be employing at least some of them down the track. But for now, I think I'm across the technical aspects enough to choose a tack and begin work on it next session.

So it's time to fire up the 65816 assembler - or rather, 65816 switch on CA65 - and see if I can manage not to crash the IIGS in 4 lines or less...

Tuesday, 20 June 2017

Insert Coin. Press Start. Player 1.

Ordinarily I wouldn't have another update yet, but my 2 yr old came into our bed this morning at 4am and shortly before 5:30am, not having had a minute of sleep since then, I gave up and went out to do some more Asteroids.

The plan was to use the MAME debugger to ascertain which DVG ROM subroutines were yet to be implemented. As I expected, the first to reveal itself was the copyright message at the bottom of the screen. The routine itself draws some discrete vectors (presumably for the © symbol) before calling the character routines for the remainder of the message.

I had two choices here; simply do the same and explicitly call my own character routines in sequence for the entire message, or implement some mechanism to allow me to simply point to the DVG ROM routine and recursively execute DVG instructions. I figured the latter wasn't worth the effort - and would be slower - so I implemented the former.

Someone had also 'complained' about the flickering graphics after I posted my last video. Of course this being purely a development aid I wasn't concerned, but knowing the Apple II had two text pages, curiosity got the better of me. And I'm not claiming to be breaking any new ground here, but I did manage to implement double buffering without any conditional logic involved in the process at all.

There's not a lot more to see in attract mode alone, so I decided it was time to properly initialise some dipswitches and hook up some crude control panel inputs. I settled on two hook routines, apple_reset and apple_start, that get called at the end of the original reset and start routines respectively.

In apple_reset, the hardware I/O locations - such as dipswitches - can be initialised. Since they map to normal Apple II RAM locations, all that is required is to write the appropriate value to the respective address. Thus far I set the coinage and the number of starting lives.

In apple_start, the Apple-specific initialisation code is run. Here I'm currently setting up the page flipping logic, and clearing both text pages.

As I've mentioned in the past, the NMI routine in the arcade code handles the coin switch inputs. Other inputs are read in the main game loop, once per frame. For the moment though, I simply added a few lines to read the Apple II keyboard at the end of my frame rendering routine. Pressing <5> will insert a coin by simply incrementing a zero page shadow value, and pressing <1> will start a game by setting a bit in the hardware I/O location (mapped to Apple II RAM) - for 1 frame. That's enough to get a game started and running.

I then added the display of the remaining ships, mainly because it was trivial. Unfortunately with only 16 lines on the screen, they overwrite the score, but the point is that it's more evidence that things are running as expected. The next obvious object to implement was the player ship...


...and here is where things start to get more complicated. The DVG ROM indeed has a table of 17 subroutines for drawing the ship (and optionally the thrust), not unlike other objects. However, these 17 ships only cover 90 degrees of rotation. As a result, the 6502 can't simply add a JSR to the player ship routines into the display list.

Instead, the 6502 copies the component instructions (vectors) from the above-mentioned DVG ROM routines into the display list, adjusting each on-the-fly for the current direction. So when the Apple II rendering code comes to the player ship, it's simply a list of CUR and VEC instructions - nothing decidedly identifiable as the player ship object!

So how do we solve this? In a rare coincidence, the solution is actually an optimisation as far as an Apple port goes - and there are also a few options. The most straightforward is to replace the 6502 routine for the player ship entirely, bypassing the display list and directly rendering the appropriate bitmap on the Apple display. One step removed from that is to 'tokenise' the display list entry; rather than add the component vectors, simply add a 'token' command to display the player ship that the Apple rendering engine can parse. Both have pros and cons.

At this point though, I think I've taken the text-based proof-of-concept engine as far as I need to. It's time to make the switch to the 2.8MHz IIGS, consider writing the rendering engine in native 65816, start working in graphics mode, and decide how best to solve the latest issue.

Monday, 19 June 2017


I've managed to avoid it for 40 years now, but there was no putting it off any longer; today I wrote my first 6502 code ever (I don't think I'll ever be the same again!) 😮

The first task though was to add another source file to the project and have it link to the main Asteroids code. Thus all the Apple-specific code is contained within the one source file, leaving the original code more-or-less untainted. This code ends up residing at $8000.

Aside from the previously mentioned two (2) simple patches, today I added a subroutine call at the end of the main loop, after the display list for the frame has been generated, to my Apple-specific rendering function in the second source file.

Now the exercise today was to implement something as quickly (easily) as possible in order to see something rendered on the display. To this end, I decided to brave the Apple II video memory mapping, stick to 8-bit mode and render (only) the characters on the text screen.

The rendering routine clears the Apple display, and then iterates through the display list, interpreting the Digital Video Generator (DVG) instructions and updating the Apple video display accordingly. I chose to implement a jump table for the DVG opcode handler routines, and after musing on how this could be done on the 6502, I concluded that one could make use of RTS; and I was subsequently pleased to discover it wasn't a silly idea.

With stub routines in place, the rendering routine iterates through the display list until it encounters the HALT instruction, which the 6502 code places on the end of every frame. It then returns control to the arcade 6502 code to update the game logic and render the next frame.
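As a rough illustration of that loop (in Python rather than 6502, purely for readability), the sketch below walks a display list and dispatches on the opcode until it reaches HALT. The opcode numbering and word counts are my understanding of the Asteroids DVG encoding, not something stated in this post, and the dictionary of handlers merely stands in for the 6502 jump table; the RTS trick itself is 6502-specific and is not modelled here. It walks the list linearly and does not follow jumps.

```python
# Opcode = top nibble of the first 16-bit word (assumed DVG encoding).
# 0x0-0x9 are long vectors; the rest are named below.
CUR, HALT, JSR, RTS, JMP, SVEC = 0xA, 0xB, 0xC, 0xD, 0xE, 0xF


def words_for(opcode):
    """Long vectors (0x0-0x9) and CUR take two words; the rest take one."""
    return 2 if opcode <= CUR else 1


def walk_display_list(mem, handlers):
    """Call a handler per instruction until HALT; return instructions seen."""
    pc = 0
    count = 0
    while True:
        word = mem[pc] | (mem[pc + 1] << 8)  # words are stored little-endian
        opcode = word >> 12
        if opcode == HALT:
            return count
        size = words_for(opcode)
        handler = handlers.get(opcode)       # missing entry == stub routine
        if handler:
            handler(mem[pc:pc + 2 * size])
        pc += 2 * size
        count += 1
```

Opcodes without an entry in `handlers` are simply skipped, which mirrors the stub routines described above. A list with no HALT would run off the end of `mem`, so a real implementation would want a bound check.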

I'll reiterate something I posted earlier in this blog; I've no intention of - and there's no real need to - render individual component vectors for all the objects. The DVG ROM contains subroutines for drawing each of the objects, so it is sufficient - for the most part - to simply ascertain which object is being drawn, and render it as a whole on the Apple II display. Note that this extends to characters as well as graphics objects.

As far as the DVG emulation goes for today's exercise, I need only implement the CUR instruction (which sets the current beam position) and the JSR instruction (which jumps to a DVG ROM routine). For the CUR opcode I simply extract the (10-bit) operands and store them in zero page memory for future reference. At the same time I also create 5-bit (0-31) and 4-bit (0-15) equivalent values for X, Y respectively to represent the Apple text mode coordinates.
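The CUR decoding can be sketched as follows. The bit layout (Y in the low 10 bits of the first word, X in the low 10 bits of the second) is an assumption about the DVG encoding, and the plain right shifts are just the simplest way to get 5-bit and 4-bit values. The DVG's Y axis runs bottom-up whereas text rows run top-down; the post doesn't say how that is handled, so this sketch does not flip it.

```python
def decode_cur(word0, word1):
    """Return (x, y, col, row) from the two words of a CUR instruction."""
    y = word0 & 0x03FF   # assumed: Y is the low 10 bits of the first word
    x = word1 & 0x03FF   # assumed: X is the low 10 bits of the second word
    col = x >> 5         # 10-bit -> 5-bit text column (0-31)
    row = y >> 6         # 10-bit -> 4-bit text row (0-15)
    return x, y, col, row
```

For example, `decode_cur(0xA200, 0x1064)` gives `x = 100`, `y = 512`, and text coordinates `col = 3`, `row = 8`.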

For the JSR opcode, it will ultimately be a huge look-up table of object-drawing subroutine addresses and their corresponding Apple equivalents. As it transpires, the DVG ROM already has such a table for the 37 characters, so my JSR opcode handler checks against this table to see if it's a character. If not, it ignores it and returns, otherwise it converts the subroutine address into the corresponding Apple II character code, and then displays it at the aforementioned 'CUR' address on the screen. (I also cheat a little here; ordinarily the current position will have to be updated after displaying a character, but for today's exercise I simply increment the text mode X coordinate).
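A minimal sketch of that lookup, again in Python for readability. `CHAR_TABLE` is a hypothetical stand-in for the table in the DVG ROM (the values are made up), and the glyph order (space, digits, letters) is an assumption. The screen code and text page 1 addressing follow the usual Apple II conventions: normal text has bit 7 set, and the 24 rows are interleaved in memory.

```python
import string

# 37 glyphs: space, 0-9, A-Z. The ordering is an assumption.
GLYPHS = " " + string.digits + string.ascii_uppercase

# Hypothetical stand-in for the DVG ROM table: one JSR word per glyph.
CHAR_TABLE = [0xC800 + 2 * i for i in range(len(GLYPHS))]


def apple_char_for(jsr_word):
    """Return the Apple II screen code for a character JSR, else None."""
    try:
        index = CHAR_TABLE.index(jsr_word)
    except ValueError:
        return None                       # not a character: ignore it
    return ord(GLYPHS[index]) | 0x80      # normal text has bit 7 set


def text_address(col, row):
    """Address of a cell on text page 1 (40x24, rows interleaved)."""
    return 0x0400 + (row % 8) * 0x80 + (row // 8) * 0x28 + col
```

The handler would then store `apple_char_for(word)` at `text_address(col, row)` and bump `col`, matching the "cheat" of incrementing the text X coordinate.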

The end result is a recognisable display, with Player 1, Player 2 and High Scores, and a flashing "PUSH START". For those wondering, the copyright message at the bottom of the screen has its own DVG subroutine, and will therefore have to be handled explicitly in the DVG emulation.


Not bad for a few hours' work, and my first 6502 program!

UPDATE: Added all three sizes of asteroids (#,*,+) and the UFO (@). In attract mode you can now see the asteroids getting hit by the UFO and breaking into smaller rocks!

Sunday, 18 June 2017

Asteroids on the Apple II - coming soon to a screen near you!

I was woken at 4:30am this morning with the knees of my 2 yr old son wedged against my back. As you do at my age and you're woken through the night, I stumbled to the bathroom and, still half asleep, took a seat.

Why am I telling you this? Because this blog is all about the process, as well as the technical details, and to this day I still marvel at the circumstances under which my brain still manages to have epiphanies. I'm not sure I even consciously realised I was thinking about Asteroids, but at that moment it came to me that the Digital Vector Generator (DVG) ROM was (also) mapped into the 6502 address space, and that the 6502 code was reading it whilst generating the display list. And of course that ROM was conspicuously absent from the Apple II binary image.

Well tonight I rectified that situation and, after battling IDAPro for a while getting a second binary file loaded into the correct segment at the correct address, was soon able to generate a now-12KB binary that included both the DVG ROM image and (patched) 6502 ROM image.
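For what it's worth, merging two images into one contiguous binary amounts to something like the Python sketch below. This is not how it was done here (that was IDAPro plus the assembler); it only illustrates where the 12KB comes from. The $5000 base and 2KB size for the DVG ROM, and the 6KB size for the program ROM, are my assumptions about the arcade memory map; $6800 is the load address of the 6502 code.

```python
DVG_ROM_BASE = 0x5000   # assumption: arcade address of the vector ROM
PRG_ROM_BASE = 0x6800   # load address of the 6502 program ROM


def build_image(segments, fill=0x00):
    """Merge (base, data) segments; return (load_address, image)."""
    start = min(base for base, _ in segments)
    end = max(base + len(data) for base, data in segments)
    image = bytearray([fill]) * (end - start)     # gaps are padded
    for base, data in segments:
        image[base - start:base - start + len(data)] = data
    return start, bytes(image)
```

With a 2KB segment at `DVG_ROM_BASE` and a 6KB segment at `PRG_ROM_BASE`, the result spans $5000-$7FFF, which is 12KB including the padded gap between the two.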

[As an aside, it only occurred to me during all this that the Apple II .BIN file format is woefully crude, lacking not only the ability to load a single file into non-contiguous address spaces, but also lacking an explicit execution address.]

Anyway, first order of business was again comparing the display list of the first frame with that generated on arcade hardware. Gone were the large groups of zero bytes; it looked roughly the same size now, and a lot of the data was the same, but it still differed.

Before going further I needed to confirm that the contents of the display list are completely deterministic. Asteroids explicitly zeroes working RAM, so that wasn't the issue. It also makes many calls to a pseudo RNG routine - it's a 16-bit single-tap (IIRC maximal-length) LFSR for those interested - but thankfully none from the NMI, which isn't running (yet) on the Apple version. I couldn't see any other reason to suggest it wouldn't be deterministic. And to be sure, I ran the arcade emulation twice, and the 600th frame on each occasion was identical.

Since the first byte differed, I set a breakpoint in the MAME debugger where it was written to the shared (display list) memory. Not surprisingly, it was a low-level routine that revealed nothing of the origin. Here's where the trace command in MAME comes to the fore; I was able to manually trace back through the code, and see where either the data, or the execution path, differed between the two platforms.

In this case it happened to be the value read back from a coinage dipswitch (or rather shadow zero page value to be precise) that differed. The Apple II version was, not surprisingly, reading back as zero which was freeplay!

I simply fired up the arcade emulation, changed the dipswitch to read back as zero, and compared the first frame of each again. Identical! Then I compared the 600th frame from the arcade version with the Apple version. Eureka!
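The frame comparison itself is nothing more than finding the first offset at which two memory dumps differ. A small Python sketch, assuming each frame's display list has been saved out as raw bytes (the function name is purely illustrative):

```python
def first_difference(a, b):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        return min(len(a), len(b))   # one dump is a prefix of the other
    return None
```

A result of `None` for both the first and the 600th frame is the "Eureka!" case; any other result is the offset at which to set a write breakpoint in the debugger.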

So now I have the arcade Asteroids 6502 code executing on the Apple II, producing identical output!

To be honest, the whole process has actually been a little less painful than I had expected. All that is required to get this far is patching 4 bytes in the 6502 ROM. I guess like the old joke about the X on the pipe; it's not the value of the 4 bytes that's the hard bit, it's knowing which 4 bytes to patch!

So what's required now to get a playable game?

My next step is (probably) going to be building the code to render the display list to the Apple II video every frame. At this point it'll be a simple matter of calling the routine once per frame from the main loop, after the display list has been generated and immediately before it returns to the start of the loop. At least I'll get to see the attract mode running.

I should note that the game on the Apple is currently not throttled in any way at all - it simply generates frames as fast as the 6502 code runs before looping back for the next frame. On the arcade hardware, the NMI provided a periodic 'interrupt' to drive the timing of the main loop (now patched out). So at some stage I'll have to add that back into the Apple build.

The NMI also had the task of reading all the hardware, debouncing controls, and formatting it all into shadow variables in 6502 RAM. This is where the Apple II code will differ quite a bit, reading keyboard, joysticks and possibly menu settings.

I'll touch on the sound at a later date.

It's quite neat that the core 6502 code will be running pretty much untouched. I can see now why Norbert simply loads the original arcade ROM images into his emulators and (likely) patches a handful of bytes. The Apple II-specific code will be confined to the NMI and display hook routine.

Of course it also allows alternate display hook routines; different video modes and/or even different platforms. Interesting possibilities...

Friday, 16 June 2017

Furphies, running Asteroids code and corrupt display lists!

The undocumented instructions were a bit of a furphy in the end; as George rightly suspected both instances were a result of a bad disassembly. A single byte immediately following a BEQ instruction turned out not to be code, and ignoring that byte produces a more sane disassembly. Unlike the Z80, a good portion of the 6502 instructions affect the Z flag, and in this case the branch will always be taken. A symptom of me not finishing the RE process completely.

On to more interesting developments; simply commenting out two (2) conditional branches in the main loop allows the code to run through unimpeded. FTR the 1st branch is waiting for the 60Hz 'VBLANK' interrupt and the 2nd is waiting for the DVG to finish rendering the previous frame.

As a result, the game code is running in (I'm assuming) attract mode, continually writing display lists to the shared RAM for each frame. And that's actually what I'm seeing in the MAME debugger!

However it's not all good news; although the first frame renders correctly (all of 4 bytes), the second and subsequent frames (in the order of 128 bytes) do not. The data starts off OK, then differs for a bit and then leaves a large gap of zero bytes, before continuing. However the last group of bytes also appear to be correct. And just to cover all bases, I replaced the above-mentioned conditional branches with NOP instructions so that the rest of the code was identical - same result.
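Replacing a conditional branch with NOPs keeps every other address in the ROM unchanged, because a 6502 relative branch is two bytes and NOP ($EA) is one. A hedged Python sketch of such a patch applied to a ROM image; the offset is whatever the disassembly says, and nothing here reflects the actual addresses in the Asteroids ROM:

```python
NOP = 0xEA
# BPL, BMI, BVC, BVS, BCC, BCS, BNE, BEQ
BRANCH_OPCODES = {0x10, 0x30, 0x50, 0x70, 0x90, 0xB0, 0xD0, 0xF0}


def nop_out_branch(rom, offset):
    """Return a copy of rom with the 2-byte branch at offset NOPed out."""
    if rom[offset] not in BRANCH_OPCODES:
        raise ValueError("no conditional branch at offset %#x" % offset)
    patched = bytearray(rom)
    patched[offset:offset + 2] = bytes([NOP, NOP])
    return bytes(patched)
```

Checking the opcode before patching guards against applying the patch at the wrong offset.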

Anyway, I only got a very brief period to work on this tonight, so haven't had the chance to investigate further. And as of right now, I have no concrete theory. Perhaps it's not running attract mode at all, but rather going into Service Mode? But why the zeroes? My next course of action is to feed the display list generated on the Apple II into a DVG emulator, and see what pops out!

Thursday, 15 June 2017

Assemblers, undocumented instructions, and assumed addressing modes.

First order of the day; a helpful fellow developer has pointed me towards c2d, a command-line executable that creates a 'quick booting' Apple II .DSK file from a .BIN. So now simply typing 'make' assembles all my source and subsequently produces - in less than 1 second - an image I can boot in MAME.

Next: getting the arcade Asteroids source listing assembling in CA65. Not surprisingly IDAPro doesn't have direct support for the CA65 assembler. I briefly investigated the option of adding support via the IDAPro SDK, but it requires modifying and rebuilding the processor support module and I haven't had much success in doing so in the past.

Fortunately the supported SVENSON ELECTRONICS 6502/65C02 ASSEMBLER - V.1.0 - MAY, 1988 turns out to be a pretty close match; in fact, ultimately a single search-and-replace is sufficient to fix the pure syntax issues. [This is important since I will need to re-generate the source from IDAPro at some point in the future when I complete the reverse-engineering]. And once I explicitly defined the ZEROPAGE segment, only one syntax error remained.

The assembler had barfed on a DCP instruction. That didn't sound familiar to me, so I consulted my trusty ZAKS 6502 bible. No mention of it. Perhaps it has an alternate mnemonic? Google quickly revealed the problem - it's an undocumented opcode! After some further reading of the CA65 manual, I discovered a command-line switch to enable (some of) these opcodes. With relatively little effort, I now had the arcade Asteroids source code assembling under CA65!

I noticed, however, that the assembly was not producing the same number of bytes as the original, as evidenced by the address of the last (IDAPro auto-generated) label in the assembler output listing. Somewhat fortuitously as it turns out in this case, IDAPro (by default) auto-generates labels that contain the address, making it easy to spot a mismatch against the assembled address.

Tracing back through it, I found the first instance of a mismatch; the code was referencing a zero-page variable via absolute (16-bit) addressing. Since the syntax of CA65 doesn't make a distinction between the two, it was assuming zero-page addressing and generating a different (length) opcode. As it turns out, this is the case in no fewer than 7 instances throughout the code (most in the same subroutine). I suspect the original assembler did make a distinction, and the programmer simply used the wrong addressing mode a few times, or possibly moved a variable from RAM to the zero-page at a later stage of development.

After some further Googling I found the solution - forcing absolute addressing for an instruction - buried in a post on the NESDEV forums.
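To make the size mismatch concrete, here is a small Python sketch of the two encodings of LDA for the same zero-page address. The opcodes ($A5 for zero page, $AD for absolute) are standard 6502; `force_absolute` is a hypothetical flag standing in for whatever the assembler provides. CA65's documentation describes an `a:` address-size prefix (e.g. `lda a:$5B`) for this purpose, though I can't say for certain that is the exact form used here.

```python
LDA_ZP, LDA_ABS = 0xA5, 0xAD


def encode_lda(address, force_absolute=False):
    """Encode LDA using zero-page addressing unless told otherwise."""
    if address < 0x100 and not force_absolute:
        return bytes([LDA_ZP, address])                     # 2 bytes
    return bytes([LDA_ABS, address & 0xFF, address >> 8])   # 3 bytes
```

`encode_lda(0x5B)` is two bytes and `encode_lda(0x5B, force_absolute=True)` is three, so each of the 7 instances shifts everything after it by one byte unless the absolute form is forced.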

Either way it makes no difference to the outcome, but I do (first) want to verify that I am able to produce an exact binary using CA65. And for authenticity, I would prefer it does run the exact same code as far as possible.

One last mismatch was another undocumented instruction - SKW - this time, unsupported by both IDAPro and CA65. IDAPro disassembled the 3 bytes into a single NOP, which of course CA65 in turn assembled to the single byte $EA. No choice in this case but to define three constant bytes in place of the instruction.

Finally, CA65 appears to produce the same number of bytes as the Asteroids ROM. Indeed, after some further munging I have been able to confirm, via binary file compare, that the output is identical.

The issue now is getting the segments and .ORG statements in order to load at the correct address in Apple DOS (right now it produces a contiguous binary that loads at $0000). For that I need to do some more reading, and experimenting. But decent progress thus far.

UPDATE: The binary produced by CA65 now contains only the Asteroids (ROM) code and loads at $6800 in the Apple IIe emulation under MAME. The initialisation code runs, and it loops waiting for the 'VBLANK' (NMI x4) interrupt - as you would expect on non-Asteroids hardware!

Insert naughty words about CiderPress here!

Most of tonight was spent banging my head against a press. CiderPress to be specific.

Tonight started well as I managed to discover an Apple II 6502 assembly hello world project that actually used make and CA65/LD65 as the toolchain - exactly what I was after! This was going to be easy...

Building the example was trivial, and that left me with a .BIN file. The makefile, however, used a utility called dos33 to write to the .DSK file which I do not possess, so I simply INIT'd a new disk with a simple HELLO program (I'm becoming quite the DOS 3.3 guru), loaded it into CiderPress, and after selecting the right options imported the .BIN file. Couldn't be simpler, right?

Except I couldn't execute my .BIN file in DOS 3.3 under MAME. Or more specifically, it would crash to the monitor. Hmm...

After the obligatory delete and try again, I researched how to produce .LST and .MAP files, took a quick look at the .BIN file (noticed a 4-byte header), and then tried to locate my program in Apple II memory under MAME. It simply wasn't being loaded at the correct spot, or indeed anywhere I could ascertain.

I initially suspected it was being overwritten by BASIC as soon as it was loaded ($0803); tried changing that to no avail, but then soon decided this track was a red herring after all.

Time to research Apple II DOS 3.3 .BIN file formats. Somewhat frustratingly, it didn't appear within the first few Google hits, or even the next few. I finally found a paragraph detailing the 4-byte prefix in a text file at the bottom of a locked filing cabinet in a disused lavatory with a sign on the door saying "BEWARE OF THE LEOPARD!" My .BIN file looked good so far.

Then I noticed a column heading in the CiderPress disk viewer called Aux, and wondered what it meant. After taking a little too long to find in the help file, I finally discovered it was supposed to be the execution address of binary files. Mine showed $0000 instead of $0803. Hmm....

Time to grab a random .DSK file from the net and see what CiderPress displays in this column. Apple Panic seemed a good candidate - and each of the several binary files in the image had non-zero addresses. So what was I doing wrong with the import process?

All this had taken a few hours now, and in my desperation - before I succumbed to posting questions on forums - I tried Googling for answers. What I did find was another Apple II hello world tutorial, using what looked like the same source, but this time using CiderPress to transfer the binary to a .DSK file. Bingo!

Let me just say that CiderPress's (seeming) inability to import a standard (adorned) DOS 3.3 .BIN file onto a .DSK image file is, well, simply preposterous! So much so in fact, that I'm not even sure I believe it can't be done! Regardless, as the link implies, I needed to strip off the 4-byte prefix (using dd in my makefile) and then rename the file with the numeric filetype and execution address in the filename. I am speechless. I am without speech.
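For reference, the stripping step looks roughly like this in Python (the build itself uses dd; this is only an illustration). The layout of the 4-byte prefix, a little-endian load address followed by a little-endian length, is my understanding of the DOS 3.3 binary format rather than something spelled out above.

```python
import struct


def strip_dos33_header(data):
    """Split a DOS 3.3 binary into (load_address, length, payload)."""
    load_address, length = struct.unpack_from("<HH", data, 0)
    return load_address, length, data[4:]
```

The load address recovered here is the value that then has to be carried in the renamed file, since the stripped payload no longer records it.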

After all that, I quickly changed a few filenames, modified the example and here we have the very first build of Apple IIGS Asteroids.

My first Apple II program - ever!

Next task is to find a command-line utility that allows me to write my .BIN file to a .DSK image file. There seem to be a few options out there. That's going to be essential as I'll be building this hundreds of times over the next few weeks & months. Right now the biggest bottleneck is waiting for DOS to boot on the Apple II emulation.

Once that's done, I'll need to convert the arcade Asteroids source code to the format that CA65 uses; I'm hoping that won't be too painful as I'll likely have to do it a few times as I finalise the reverse-engineering at a later date.