6502
This section of my website is about coding for the 6502 family of processors (650x).
There are several members of this family. In the 21st century, they are mainly used as embedded controllers.
In the past, they were also used as controllers but more often as the primary CPU of products from Apple, Atari, Commodore, and Nintendo (either personal computers or game consoles).
Here are some links to 6502 coding that you may find useful:
The 650x processors do not have a generic multiply (or divide) machine instruction; it must be done in software! (Multiplying or dividing by 2 is possible with a simple Rotate Left or Rotate Right instruction.) A simplistic method has frequently been used and published which is essentially the same as the long multiplication (or division) we all learned (I hope) in elementary school. Several alternatives have been attempted. One uses logarithms, which is analogous to using a slide rule; it can be fast but suffers from accuracy errors. Another method (I'm not sure who developed it) uses a neat algebraic identity: a*b = f(a+b) - f(a-b), where f(x) is simply x*x/4. (Expanding the squares shows why: ((a+b)^2 - (a-b)^2)/4 = 4ab/4 = ab.) This is perfectly accurate and much faster than "long multiply" routines! As with most programming designs, that speed comes at a cost in memory usage.
Below is the fastest code that I have developed (or have read about) to perform an 8-bit by 8-bit multiply (the product is 16 bits). It is for unsigned values. Signed numbers can be used if several changes are made (basically, when a+b and a-b are calculated, the overflow flag must be tested and adjustments made when that flag is set).
The code above requires 46 CPU cycles on average. The "long multiplication" method requires well over 100 cycles! So it is really a trade-off: do you want speed (at the cost of 1280 = 256*5 bytes for tables) or small size (no tables, but more than 2x slower)? Only you can decide!
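For comparison, the "long multiplication" method mentioned above is the classic shift-and-add loop. The following is a minimal Python sketch of that idea, not a 6502 listing; the function name `mul8_shift_add` is made up for illustration. It examines one bit of the multiplier per pass, which is why an 8-bit CPU needs eight passes of shifting, testing, and conditional adding.

```python
def mul8_shift_add(a: int, b: int) -> int:
    """Unsigned 8-bit by 8-bit multiply (16-bit product), one multiplier bit per pass.

    Illustrative model only; the name and structure are assumptions,
    not taken from any published 6502 routine.
    """
    if not (0 <= a <= 255 and 0 <= b <= 255):
        raise ValueError("operands must be unsigned bytes")
    product = 0
    multiplicand = a
    for _ in range(8):          # eight bits in the multiplier
        if b & 1:               # lowest multiplier bit set: add the shifted multiplicand
            product += multiplicand
        multiplicand <<= 1      # next column, like moving one place left on paper
        b >>= 1                 # move on to the next multiplier bit
    return product
```

No tables are needed, so the routine is small, but the loop is where the extra cycles go.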
Some experienced 650x programmers may be wondering where the addition (ADC) is being performed, since I already said the method is based on f(a PLUS b) - f(a-b). The answer is the CPU's own addressing mode: indexed by X. In other words, instructions like LDA sqrLow,X implicitly do the addition (when the CPU adds the X register to the base address). You can reduce the size of the tables further (only 256 entries instead of 512) if you use a real ADC (not the sneaky indexed-by-X trick just described) and apply fix-up code at the end. That version is a bit messy and about 50% slower, so I won't show it here.
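The table-lookup idea can be modeled in a few lines of Python. This is only a sketch of the technique under stated assumptions, not the 6502 routine itself: `SQR_LOW` mirrors the sqrLow table named above, while `SQR_HIGH`, `QUARTER_SQUARES`, and `mul8_quarter_square` are hypothetical names. The tables hold x*x/4 rounded down; because a+b and a-b are always both even or both odd, the discarded quarters cancel in the subtraction, so the product stays exact.

```python
# Python model of the quarter-square multiply; not the author's 6502 code.
# SQR_LOW mirrors the sqrLow table from the text; SQR_HIGH is an assumed
# companion table holding the high bytes of the same values.

QUARTER_SQUARES = [x * x // 4 for x in range(511)]   # f(x) for x = 0..510
SQR_LOW = [v & 0xFF for v in QUARTER_SQUARES]        # low byte of f(x)
SQR_HIGH = [v >> 8 for v in QUARTER_SQUARES]         # high byte of f(x)


def mul8_quarter_square(a: int, b: int) -> int:
    """Unsigned 8-bit by 8-bit multiply via a*b = f(a+b) - f(a-b)."""
    if not (0 <= a <= 255 and 0 <= b <= 255):
        raise ValueError("operands must be unsigned bytes")
    total = a + b        # on the 6502, the indexed addressing mode performs this add
    diff = abs(a - b)    # f(-x) == f(x), so the magnitude is enough
    # 16-bit subtraction done one byte at a time, as an 8-bit CPU would.
    low = SQR_LOW[total] - SQR_LOW[diff]
    borrow = 1 if low < 0 else 0
    high = SQR_HIGH[total] - SQR_HIGH[diff] - borrow
    return ((high & 0xFF) << 8) | (low & 0xFF)
```

Each product costs four table reads and one two-byte subtraction, with no loop at all; the price is the memory the tables occupy.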
There are several ways to accomplish this task (reverse the bits within a byte). A lengthy discussion can be found here. To spare you all the details, below is a quick summary of the results from several contributors:
Mafiosino has the best efficiency: about 3x slower than my code, but he uses 4x fewer bytes! Once again, it really depends on your priority of speed versus size. In summary (0% bias), you should use Mafiosino if size is your primary concern, or a Mega-Table if speed is your primary concern... but I still believe my idea is great if you want to compromise. (I hate to muddy the waters, but there are other factors like RAM and register use to consider too.) Anyway, here is the code...
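Apart from the 6502 routines themselves, the speed-versus-size trade-off can be illustrated with a small Python model. None of these functions is any contributor's routine; the names are invented for this sketch. It assumes that "Mega-Table" means a full 256-entry lookup table, and it uses a generic 16-entry nibble table as one possible middle ground.

```python
def reverse_bits_loop(value: int) -> int:
    """Smallest approach: move the bits across one at a time (no table)."""
    if not 0 <= value <= 255:
        raise ValueError("value must be an unsigned byte")
    result = 0
    for _ in range(8):
        result = (result << 1) | (value & 1)   # lowest input bit becomes the next output bit
        value >>= 1
    return result


# Fastest approach (assumed meaning of "Mega-Table"): one entry per byte value.
REVERSED_BYTE = [reverse_bits_loop(v) for v in range(256)]

# Middle ground: a 16-entry table that reverses 4 bits at a time.
REVERSED_NIBBLE = [0x0, 0x8, 0x4, 0xC, 0x2, 0xA, 0x6, 0xE,
                   0x1, 0x9, 0x5, 0xD, 0x3, 0xB, 0x7, 0xF]


def reverse_bits_table(value: int) -> int:
    """One lookup, 256 bytes of table."""
    return REVERSED_BYTE[value]


def reverse_bits_nibbles(value: int) -> int:
    """Two lookups and a swap of halves, 16 bytes of table."""
    if not 0 <= value <= 255:
        raise ValueError("value must be an unsigned byte")
    return (REVERSED_NIBBLE[value & 0x0F] << 4) | REVERSED_NIBBLE[value >> 4]
```

All three return the same result for every byte; they differ only in how much table memory they trade for fewer steps.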
© H2Obsession 2013, 2014