Thursday, October 20, 2011

0.0.5

0.0.5 has been released. I was finally able to get scrolling working, both horizontal and vertical. Both took a while to figure out. Part of the problem was that I hadn't implemented any of it before now, so in order to get it working I had to implement all of the registers associated with it. As a result, the handling of 0x2006, 0x2005, and 0x2002 was significantly modified. I have also run some of the blargg test roms and corrected some of the problems they reported. DMA is much improved. I'm still seeing timing issues with horizontal scroll, but some games kind of work. SMB kind of works. Everything is way too slow, though.

For the next version I'd like to have at least the most common mapper supported. I'd also like to speed up the code. I've been thinking about redoing part of the application layout as well. I'm no longer a fan of how the sprites and palettes are always visible. It might be nice to put that stuff in separate windows.

I'm not sure if I'll support menus in the next version. To be honest, I like the stripped down way it works. I do a lot of development from the command line and I like the fact that lambnes is pretty command line oriented and has a very limited gui. That said, I'll probably add menus at some point.

Saturday, March 26, 2011

0.0.1

You may be interested to know that I have released version 0.0.1 of my NES emulator: lambnes. It isn't quite perfect, and it doesn't yet have sound, but it is able to play Balloon Fight pretty accurately.

Wednesday, March 9, 2011

I finally decided to do some further testing of the CPU by running one of the test roms that are around. I used nestest. This was insanely helpful. I fixed over a dozen CPU bugs that would have been insanely difficult to track down and fix if I had used a standard rom like Super Mario Bros. or Balloon Fight. This fixed the strange pattern table behavior noted in comments on the previous post.

The NES has two pattern tables, and there are bits in one of the PPU control registers that you set in order to tell the PPU which pattern table to use for sprites and which one to use for background tiles. It looked like I was ignoring the bit for the background pattern table because I was pulling the background tiles from the sprite pattern table, but when I looked in the logs I saw multiple writes to the PPU control register: one that told it to use the right pattern table for the background, and around a dozen others that told it to use the same pattern table for background tiles as for sprite tiles.
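To make those two bits concrete, here is a minimal Java sketch. The class and method names are made up for illustration and are not lambnes code. It assumes the commonly documented layout of the control register at 0x2000, where bit 3 selects the sprite pattern table and bit 4 selects the background pattern table.

```java
// Illustrative sketch only: names are hypothetical, not taken from lambnes.
class PatternTableSelect {
    static final int SPRITE_TABLE_BIT = 0x08;     // bit 3 of the control register (0x2000)
    static final int BACKGROUND_TABLE_BIT = 0x10; // bit 4 of the control register (0x2000)

    /** Base address of the pattern table used for 8x8 sprites. */
    static int spriteTableBase(int control) {
        return (control & SPRITE_TABLE_BIT) != 0 ? 0x1000 : 0x0000;
    }

    /** Base address of the pattern table used for background tiles. */
    static int backgroundTableBase(int control) {
        return (control & BACKGROUND_TABLE_BIT) != 0 ? 0x1000 : 0x0000;
    }

    public static void main(String[] args) {
        int control = 0x10; // background tiles from 0x1000, sprites from 0x0000
        System.out.printf("sprites=0x%04X background=0x%04X%n",
                spriteTableBase(control), backgroundTableBase(control));
    }
}
```

A stray write that clears bit 4 makes both lookups return the same table, which matches the symptom described above.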

I guessed that this was due to CPU bugs and it turns out I was right. Correcting the majority of the CPU issues corrected that issue and has made the PPU much easier to debug. I really should have done it much sooner. Having proper test cases is important. I tried to test extensively as I was writing, but I was hampered by my limited knowledge of the 6502 processor. Thankfully, more knowledgeable individuals had created a very robust set of tests that allowed me to fix most if not all of my CPU bugs.

I've also fixed a couple of the PPU bugs. It used to have vertical hold issues -- the image looped. That's been fixed. I've also implemented a sort of half-assed implementation of the controllers.

Ought to be able to correct the remaining issues with the PPU and the controllers pretty quickly now.

Saturday, February 19, 2011

I started this project with no real knowledge of how the NES works. I had a background in computer science and had been working in IT for more than 5 years, so I had some idea of what assembly was and how it worked, but I'd never worked with it in any extensive capacity and had no experience with electrical engineering. All of the programs I'd written in assembly had been very simple. But I understood how assembly basically worked and I had also had some experience playing around with emulators.

This was important in that the emulator I was thinking about writing is built on conventions used in other emulators. One of the first things I did was determine how I was going to load the instruction set. This was relatively simple. I wanted to be able to load zipped .nes files because as far as I could tell most games were distributed in this format. There was a lot of pretty decent documentation on the .NES format that was pretty easy to find.
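As a rough illustration of what the .NES documentation describes, the sketch below reads the 16-byte header of a file that has already been unzipped into a byte array. The class and field names are hypothetical and this is not the lambnes loader. It assumes the widely documented iNES layout: the magic bytes "NES" followed by 0x1A, the PRG ROM size in 16 KB units in byte 4, the CHR ROM size in 8 KB units in byte 5, and the mapper number split across the high nibbles of bytes 6 and 7.

```java
// Illustrative sketch only: names are hypothetical, not taken from lambnes.
class INesHeader {
    final int prgRomBytes; // program ROM size in bytes
    final int chrRomBytes; // character ROM size in bytes
    final int mapper;      // mapper number

    INesHeader(byte[] file) {
        if (file.length < 16 || file[0] != 'N' || file[1] != 'E'
                || file[2] != 'S' || file[3] != 0x1A) {
            throw new IllegalArgumentException("not an iNES file");
        }
        prgRomBytes = (file[4] & 0xFF) * 16 * 1024;
        chrRomBytes = (file[5] & 0xFF) * 8 * 1024;
        // Low nibble of the mapper is in the top of byte 6, high nibble in the top of byte 7.
        mapper = ((file[6] & 0xF0) >> 4) | (file[7] & 0xF0);
    }

    public static void main(String[] args) {
        byte[] rom = new byte[16]; // header only, made-up values for the demo
        rom[0] = 'N'; rom[1] = 'E'; rom[2] = 'S'; rom[3] = 0x1A;
        rom[4] = 2; // two 16 KB PRG banks
        rom[5] = 1; // one 8 KB CHR bank
        INesHeader header = new INesHeader(rom);
        System.out.println(header.prgRomBytes + " " + header.chrRomBytes + " " + header.mapper);
    }
}
```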

Next, I divided the NES platform into discrete parts. The first part I decided to write was the CPU. I knew very little about the platform, but to my knowledge all other parts of the platform relied on the CPU. Also, the CPU was the part that had the best documentation. The 6502 processor used in the NES is pretty well known. There are entire books and web sites devoted to documenting how it works. Also, the CPU was easy to test. The CPU is relatively simple. It performs one of a limited set of actions on data that is obtained from memory in a limited set of ways and stores this data in memory locations specified in a limited number of ways. Implement all of these actions and ways to obtain and store data and you're done. I'd done programming for a long time, so I had an idea of how the CPU functioned. This was a little lower level of programming than I was used to doing, but I still had a pretty decent idea of what was going on and there was plenty of documentation.

The PPU was a bigger kettle of fish. The PPU, or Picture Processing Unit, generates all of the images on the screen. Like the CPU it has pretty well defined ways of doing what it does, but because I hadn't ever looked at a PPU before or read descriptions of one it was easy for me to make poor assumptions regarding how they do what they do.

For instance, in the very beginning I decided that all of the RAM was going to be stored as integer values. Since I'm a java guy I have the option of using either an int or a byte to represent these values. At first I used a byte, which makes some sense since the NES works with 8-bit values. However, I ran into issues because java bytes are signed. They represent values between -128 and 127. If you put 128 into a byte it will be read back as -128. There are ways around this, but I read some explanations on the internet that suggested it was probably best to just use an int.
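The signed-byte behavior, and one of the usual ways around it, can be shown in a few lines. This is only a sketch with a hypothetical helper name. Masking with 0xFF is the standard Java idiom for reading a byte as an unsigned value.

```java
// Illustrative sketch only: the helper name is hypothetical.
class UnsignedByteDemo {
    /** Reads a Java byte as an unsigned value in the range 0..255. */
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte raw = (byte) 128; // does not fit in a signed byte, wraps to -128
        System.out.println("raw=" + raw + " unsigned=" + toUnsigned(raw));
        // prints "raw=-128 unsigned=128"
    }
}
```

Storing memory as int avoids having to remember the mask on every read, at the cost of four times the storage.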

While working with the CPU, this worked great. I didn't really care very much what part of the memory I was writing to in my early tests. If you look at my test cases for the CPU methods they usually just begin at memory location 0. When I started working with the PPU, though, memory became more important. The CPU interacts with the PPU through specific memory locations. For instance, if the software wants to write to the PPU memory in order to set the palette or the nametable or something, it would first write the address to 0x2006 in two parts, with the high byte sent first and the low byte next. All subsequent writes to 0x2007 are sent to PPU RAM, starting at the address specified through 0x2006 and incrementing with each write.

I don't propagate all writes to 0x2006 to the PPU. Instead, I have the PPU check 0x2006 each cycle for its current value. This was causing a problem in that my emulator had trouble distinguishing when a write had been done to 0x2006. So, it would do two reads from 0x2006 on the first two cycles and start writing to PPU memory address 0x0. Clearly registers didn't function how I thought they did.

The problem now became how to determine when a write had been done? At first I assumed that the first write to 0x2006 would be a value greater than 0. This was slightly better, but I was still seeing a bug. Everything was kind of working, but not quite. I decided to look at the assembly code to see exactly how it was setting the registers.

The assembly looked like this:

adc #$20 ;Load Name and Attribute Table
sta $2006
lda #$00
sta $2006
ldy #$00
ldx #$04
.LoadTitle
lda ($00),Y ;Load Title Image
sta $2007
iny
bne .LoadTitle
inc $01
dex
bne .LoadTitle
rts

First it sets 0x2006 with the PPU memory address of 0x2000 -- the location of nametable0. This determines which background tiles will be placed on the screen. It turned out that the problem was the couple of commands between the writes to 0x2006 and the write to 0x2007. I had assumed that as soon as the full address was written to 0x2006, 0x2007 would start being filled with values that could be written to memory. Another assumption I made was that the address 0x2007 writes to should increment every cycle. Again this was false. It increments on each write, and there may or may not be a write every cycle. It turned out I needed to know exactly when a write was made to each register.
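The behavior the assembly relies on can be sketched as a small write-driven port. This is a simplified illustration with hypothetical names, not the lambnes implementation and not a full model of the hardware. It assumes the address always increments by 1 (the real PPU can also increment by 32, depending on a control register bit) and ignores mirroring.

```java
// Illustrative sketch only: names are hypothetical and the model is simplified.
class PpuAddressPort {
    private final int[] vram = new int[0x4000]; // 14-bit PPU address space
    private int address;
    private boolean expectLowByte; // false: the next 0x2006 write is the high byte

    /** Called only when the CPU actually writes to 0x2006. */
    void writeAddress(int value) {
        if (!expectLowByte) {
            address = ((value & 0x3F) << 8) | (address & 0x00FF);
        } else {
            address = (address & 0x3F00) | (value & 0xFF);
        }
        expectLowByte = !expectLowByte;
    }

    /** Called only when the CPU actually writes to 0x2007. */
    void writeData(int value) {
        vram[address] = value & 0xFF;
        address = (address + 1) & 0x3FFF; // advances per write, not per cycle
    }

    int peek(int addr) {
        return vram[addr & 0x3FFF];
    }

    public static void main(String[] args) {
        PpuAddressPort port = new PpuAddressPort();
        port.writeAddress(0x20); // high byte
        port.writeAddress(0x00); // low byte, address is now 0x2000
        port.writeData(0xAA);    // goes to 0x2000
        port.writeData(0xBB);    // goes to 0x2001
        System.out.printf("0x%02X 0x%02X%n", port.peek(0x2000), port.peek(0x2001));
    }
}
```

Because nothing happens unless a write method is called, the instructions that run between the two stores in the assembly have no effect on the address.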

There are many ways to solve the issue. I personally decided to go with each of the registers being java Integers rather than ints so that they could hold null values. Also, after each cycle I set them back to null. This way, I can do a simple null check to determine if the register holds a value.
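A minimal sketch of the nullable-register idea follows. The class and method names are hypothetical and the real lambnes code may be organized quite differently. The point it shows is that null means "nothing was written this cycle", which is different from "0 was written".

```java
// Illustrative sketch only: names are hypothetical, not taken from lambnes.
class WriteFlaggedRegister {
    private Integer pending; // null means no write happened this cycle

    /** CPU side: record a write to the register. */
    void write(int value) {
        pending = value & 0xFF;
    }

    /** PPU side: fetch the value written this cycle (or null) and reset to null. */
    Integer take() {
        Integer value = pending;
        pending = null;
        return value;
    }

    public static void main(String[] args) {
        WriteFlaggedRegister addressRegister = new WriteFlaggedRegister();
        System.out.println(addressRegister.take()); // no write yet
        addressRegister.write(0x00);                // a real write of zero
        System.out.println(addressRegister.take()); // the zero is visible as a write
        System.out.println(addressRegister.take()); // cleared again
    }
}
```

This is what makes the second store in the assembly above (the low byte 0x00) detectable, which the "first write is greater than 0" guess could not handle.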

This is why disassembled assembly is good to have around during the debugging process. In the end I'm trying to create the environment that the assembly needs in order to function, and sometimes the easiest way to figure out what it needs is to take a look at what it's trying to do.

Tuesday, December 7, 2010

So. I have not been posting, but I have been continuing work and I plan on posting a little bit more about that this week. I hope to post more about what I've learned about the NES through my research, inaccurate though it likely is. However, for the moment just let me say that I am starting to work on the gui. The PPU is coming along and is just about ready to actually do something. I believe most of the registers are more or less working except for the scroll register. I am parsing the character tiles and am starting work on parsing the background tiles, which ought to go easier since it's not too terribly different than the character tiles.

The gui is a little bit fun because it involves working with swing, which I haven't had an awful lot of experience doing. It's also a bit frustrating because it's slow work as I'm having to do a ton of research in order to understand what it is I'm doing.

I received a comment from another guy who is also working on a java based NES emulator that recommended I not start my project based on emulating SMB. This is in many respects a wise bit of advice. SMB is a fairly simple cartridge, but there are simpler ones out there. There are a ton of test cartridges, for example, which test various parts of the system and can help me debug. So, this is what I'm doing currently. I've worked out a lot of bugs in the CPU and in the registers, and those fixes simply aren't represented in the code in the repository at the moment. I will probably correct that in the next few days to weeks. However, I'm really more interested in getting the PPU at least partially displaying something in the GUI before I update the SVN.

I have begun researching the APU. On first glance it appears a little bit simpler than the PPU. We'll see, though.

Sunday, October 24, 2010

I uploaded some code to the svn. It's all of the current source. It compiles, but does not really do very much. The PPU is only partially implemented. I've implemented maybe half of the registers that the CPU and PPU use to interact and there's no attempt yet to implement the graphics.

What I have done is emulated a very small portion of the PPU so that it updates the status register with vblanks occasionally. This allows me to look through the logs and see what the project is doing. I cleared up a couple small bugs that were making the CPU error out and quit. It's still hitting a spot somewhere where it's running into an unemulated opcode and I'm trying to figure out if that's a legitimate code or a spot where my cpu has done something unexpected and has jumped the tracks.

The CPU used by the NES uses a byte of memory for each opcode. This means that there are potentially 256 different opcodes that can exist. However, the CPU only officially uses something like 150 of those opcodes. The other 100 or so opcodes do things in the CPU, and it wouldn't surprise me if they were occasionally used in production code, but I have decided to capture these codes instead of implementing them. At least for the moment.
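One way to "capture" the unimplemented codes is a default branch that fails loudly with the opcode and program counter. The sketch below uses hypothetical names and covers only three official opcodes (0xA9, 0x8D, 0xEA) as stand-ins for the full table. It is an assumption about how such a trap could look, not the lambnes code.

```java
// Illustrative sketch only: names are hypothetical and the table is far from complete.
class OpcodeDecoder {
    /** Returns a mnemonic for a known opcode, or throws for anything not emulated. */
    static String decode(int opcode, int pc) {
        switch (opcode & 0xFF) {
            case 0xA9: return "LDA #imm";
            case 0x8D: return "STA abs";
            case 0xEA: return "NOP";
            default:
                // Either an unofficial opcode or the CPU is reading data as code.
                throw new IllegalStateException(String.format(
                        "Unemulated opcode 0x%02X at PC=0x%04X", opcode & 0xFF, pc));
        }
    }

    public static void main(String[] args) {
        System.out.println(decode(0xA9, 0x8000));
        try {
            decode(0x02, 0x8002); // 0x02 is not an official 6502 opcode
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Including the program counter in the message makes it easier to find the spot in a disassembly and decide which of the two explanations applies.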

So, my CPU hits one of these illegal codes at a point and I'm trying to figure out if this means I need to implement one of these illegal codes that were legitimately put in the source by the programmer, or if it means that my CPU has done something wrong and is now pulling non-opcode data from memory and treating it like it's an opcode. This involves running the CPU for a while and then going through the logs to see what the CPU is doing. I found a disassembled and heavily commented Super Mario source file on the internet and I compare that to what my CPU is doing. I try to figure out what the code is supposed to be doing and make sure the CPU is doing that. It involves looking at a lot of assembly code. It's really gratifying to see my CPU hit these loops in the source and function properly. It's kind of amazing.

So far as I can see, the code never gets to the point that it's setting up the VRAM. This part is essential before I can start implementing the graphical portion of the application. The character ram that is pulled from the cartridge only contains a portion of the actual character information. I'll probably talk more about that when I actually start implementing that.

One thing I'm seeing is that my CPU is SLOW. It also does a metric shit ton of logging. I'm debugging right now, so I'm not all that concerned about it, but at one point or another I will have to do a significant amount of optimization.

I hope to have these very small, but important issues settled in the next few weeks. Once the code appears to be more or less running appropriately and is setting up all of the memory correctly I'll be able to start working on the graphics output. That ought to be kind of fun, but a little daunting.

Wednesday, October 20, 2010

I have pretty much completed the cpu portion of the emulator. Emulating the cpu isn't very difficult. Its functionality is well known. It fetches an opcode, processes it, and updates multiple flags. Probably the hardest thing is finding documentation on what each opcode does exactly and how each flag is updated. Not too bad. Some documentation might be a little misleading or a little tough to read -- some of it is written by professional writers, but not much -- there's a lot of it, though, so if you don't understand something there are about 50 other places to look.
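The fetch, process, update-flags cycle can be sketched with a single instruction. The example below uses hypothetical names and implements only LDA immediate (opcode 0xA9), which loads the accumulator and updates the zero and negative flags. A real CPU core needs the full opcode table, the other flags, and cycle counting.

```java
// Illustrative sketch only: names are hypothetical and only one opcode is handled.
class TinyCpu {
    final int[] memory = new int[0x10000];
    int pc;           // program counter
    int a;            // accumulator
    boolean zero;     // Z flag
    boolean negative; // N flag

    private int fetch() {
        int value = memory[pc] & 0xFF;
        pc = (pc + 1) & 0xFFFF;
        return value;
    }

    /** Fetches one opcode, executes it, and updates the affected flags. */
    void step() {
        int opcode = fetch();
        switch (opcode) {
            case 0xA9: // LDA immediate
                a = fetch();
                zero = (a == 0);
                negative = (a & 0x80) != 0;
                break;
            default:
                throw new IllegalStateException("Unemulated opcode " + opcode);
        }
    }

    public static void main(String[] args) {
        TinyCpu cpu = new TinyCpu();
        cpu.memory[0] = 0xA9; // LDA #$80
        cpu.memory[1] = 0x80;
        cpu.step();
        System.out.println("a=" + cpu.a + " zero=" + cpu.zero + " negative=" + cpu.negative);
    }
}
```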

Emulating the ppu is a lot harder. Now we're getting into the meat of the system. Again, finding documentation is harder. Also, because the exact functionality of this portion of the system isn't as well understood by me I'm a lot more reliant on documentation. Not to mention the fact that the cpu is used all over the place and is well documented by various people. Not so with the ppu. The ppu is proprietary and any documentation that exists was written by individuals who either experimented on an actual NES or read other documentation elsewhere. None of it is written by professional writers and quite frequently is a work in progress. Kind of like this blog.

So now we're getting to the point that my emulator is actually doing something. Not very much, mind you, but something. It runs for the first 20 or so opcodes then it gets into a position where the opcodes are having it read the ppu state registers to determine whether or not the ppu is in a vblank. Because my ppu is largely unwritten and the registers aren't implemented yet, this doesn't happen and the cpu pretty much just sits there waiting.

The systems all interact via various specific positions in memory. For instance, to set various features on the ppu you can write to addresses 0x2000 and 0x2001. To read what state the ppu is currently in you read from 0x2002. In 0x2002 is a number between 0 and 255 (0x00 and 0xFF). Each bit in that number indicates something about the ppu. For instance, the highest bit -- the bit on the furthest left side of the binary number -- indicates whether or not the ppu is in a vblank period, which is the period in which the television is resetting itself so that it's ready for the next frame. Vblank happens about 60 times a second. Interestingly, there are some registers that you should only read from or write to during vblank because accessing them while the ppu is drawing the frame affects that frame.
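Testing that highest bit is a one-line mask. The names in this sketch are hypothetical; the only thing it assumes beyond the text above is that "highest bit" means bit 7, i.e. the value 0x80.

```java
// Illustrative sketch only: names are hypothetical.
class PpuStatus {
    static final int VBLANK_BIT = 0x80; // bit 7, the leftmost bit of the status byte

    /** True when the status value read from 0x2002 has the vblank bit set. */
    static boolean inVblank(int status) {
        return (status & VBLANK_BIT) != 0;
    }

    public static void main(String[] args) {
        System.out.println(inVblank(0x80)); // true
        System.out.println(inVblank(0x7F)); // false, every bit except bit 7 is set
    }
}
```

Game code that polls 0x2002 in a loop is waiting for this bit, so if the emulated ppu never sets it the cpu spins forever.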

So right now I'm in the process of implementing these registers, not to mention the timing mechanism for the system, so that lines are being drawn when they're supposed to be drawn and things run at the expected rate. It's tough work and is quite shittily written at this point. The main loop will probably prove to be pretty buggy. It would clearly have been easier to write a 2d game than to try to emulate the NES. That said, in my opinion this is a far cooler project and one that I'm pretty happy to be involved in. I'm trying to be careful in writing my code, but am also of the opinion that writing dirty code that kind of works helps to understand the problem space and helps you to get to the point that you can clean up the code. It's not an efficient process. It's not agile. But it has its benefits. I feel it works pretty well for hobby projects with a large problem space anyway.
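For a sense of what such a timing mechanism has to track, here is a small sketch of a dot and scanline counter driven by CPU cycles. The names are hypothetical and this is not the lambnes main loop. It assumes the commonly documented NTSC figures (3 ppu dots per cpu cycle, 341 dots per scanline, 262 scanlines per frame, vblank on scanlines 241 to 260) and ignores finer details such as the shortened odd frames.

```java
// Illustrative sketch only: names are hypothetical and NTSC timing is assumed.
class FrameTimer {
    static final int PPU_DOTS_PER_CPU_CYCLE = 3;
    static final int DOTS_PER_SCANLINE = 341;
    static final int SCANLINES_PER_FRAME = 262;

    int dot;      // position within the current scanline
    int scanline; // current scanline within the frame
    long frames;  // completed frames

    /** Advances the ppu position by the number of cpu cycles the last instruction took. */
    void cpuCyclesElapsed(int cpuCycles) {
        for (int i = 0; i < cpuCycles * PPU_DOTS_PER_CPU_CYCLE; i++) {
            dot++;
            if (dot == DOTS_PER_SCANLINE) {
                dot = 0;
                scanline++;
                if (scanline == SCANLINES_PER_FRAME) {
                    scanline = 0;
                    frames++;
                }
            }
        }
    }

    boolean inVblank() {
        return scanline >= 241 && scanline <= 260;
    }

    public static void main(String[] args) {
        FrameTimer timer = new FrameTimer();
        timer.cpuCyclesElapsed(114); // 342 dots: one full scanline plus one dot
        System.out.println("scanline=" + timer.scanline + " dot=" + timer.dot);
    }
}
```

Running at the right wall-clock speed is a separate problem: a loop like this only keeps the cpu and ppu in step with each other, and something else still has to pace it to about 60 frames a second.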