Arduino library with tables describing the 6502 instruction set architecture (ISA).
The library also includes commands (for a separate command interpreter) that implement a man page browser (for the instructions), an in-line assembler/disassembler, and even a simple "file" based assembler.
There is a test suite for the commands.
This library contains several tables that store the 6502 instruction set. See for example this excellent work. The following tables are available:
- addressing mode
example:ABS
(absolute),ZPG
(zero page) - instruction
example:LDA
(load accumulator),JMP
(jump) - opcode
example: 0xA9 (LDA.IMM
)
In order to save RAM, the tables are stored in flash.
On AVR, that is known as PROGMEM.
Note that the AVRs have a Harvard architecture.
This means that every address occurs twice, once as DATAMEM (RAM), once as PROGMEM (flash).
This means that if a pointer like char * p
has the value 0x1000, it could address location 0x1000 in RAM
or 0x1000 in flash. The compiler doesn't know, the programmer has to help.
Most of this is hidden in the library.
Except for the return values of type char *
.
For example, the library offers the function
/*PROGMEM*/ const char * isa_addrmode_aname ( int aix );
In other words isa_addrmode_aname
returns a pointer to characters in PROGMEM.
- If you want to
print
those, embed them in thef(...)
macro.
Serial.print( f(isa_addrmode_aname(4)) )
- If you want to
printf
those, use the%S
format (capital S).
sprintf(buf, "out[%S]",isa_addrmode_aname(4))
- If you want to compare then, or get their length, use the
_P
version from the standard library:strcmp_P
orstrlen_P
.
if( strcmp_P("ABS",isa_addrmode_aname(4))==0 ) {}
For actual sample code, see isa6502basic or one of the other examples.
For more explanation on PROGMEM see below.
There are examples of increasing code size.
There is a basic example. It includes the 6502 tables from this library, and just prints them.
Note that this example shows how to deal with PROGMEM strings.
For example, f
typecasts the string, so that print
knows to get the chars from PROGMEM.
Serial.print( f(isa_addrmode_aname(aix)) );
This is another example that shows how to deal with PROGMEM strings.
There are _P()
variants of many standard string functions.
Those variants get the chars of the (second) string from PROGMEM.
char aname[4];
strcpy_P( aname, isa_addrmode_aname(aix) );
For details on PROGMEM (f()
and _P()
) see below.
The next example is an interactive variant.
It uses my command interpreter
to offer a man
command, so that the user can query the 6502 tables.
This library comes with multiple commands, but this example, only uses man.
Here is an example of a session.
Welcome to isa6502man, using is6502 lib V5
Type 'help' for help
>> help man
SYNTAX: man
- shows an index of instruction types (eg LDA) and addressing modes (eg ABS)
SYNTAX: man <inst>
- shows the details of the instruction type <inst> (eg LDA)
SYNTAX: man <addrmode>
- shows the details of the addressing mode <addrmode> (eg ABS)
SYNTAX: man <hexnum> | ( <inst> <addrmode> )
- shows the details of the instruction variant with opcode <hexnum>
- alternatively, the variant is identified with type and addressing mode
SYNTAX: man find <pattern>
- lists the instruction types, if <pattern> matches their description
- <pattern> is a series of letters; the match is case insensitive
- <pattern> may contain *, this matches zero or more chars
- <pattern> may contain ?, this matches any char
SYNTAX: man table opcode
- prints a 16x16 table of opcodes
SYNTAX: man table <pattern>
- prints a table of instructions (that match pattern - default pattern is *)
SYNTAX: man regs
- lists details of the registers
>>
>> man LDX
name: LDX (instruction)
desc: load index X with memory
help: X <- M
flag: NvxbdiZc
addr: ABS ABY IMM ZPG ZPY
>>
>> man ZPG
name: ZPG (addressing mode)
desc: Zero page
sntx: OPC *LL
size: 2 bytes
inst: ADC AND ASL BIT CMP CPX CPY DEC EOR INC LDA LDX LDY LSR ORA ROL
inst: ROR SBC STA STX STY
>>
>> man LDX ZPG
name: LDX.ZPG (opcode A6)
sntx: LDX *LL
desc: load index X with memory - Zero page
help: X <- M
flag: NvxbdiZc
size: 2 bytes
time: 3 ticks
>>
>> man A6
name: LDX.ZPG (opcode A6)
sntx: LDX *LL
desc: load index X with memory - Zero page
help: X <- M
flag: NvxbdiZc
size: 2 bytes
time: 3 ticks
>>
>> man find S*X
STX - store index X in memory
STY - store index Y in memory
TAX - transfer accumulator to index X
TAY - transfer accumulator to index Y
TSX - transfer stack pointer to index X
TXA - transfer index X to accumulator
TXS - transfer index X to stack register
TYA - transfer index Y to accumulator
found 8 results
>>
The third example is also based on my command interpreter.
It uses several commands included in this library; not only man
, but also read
and write
, and dasm
and asm
.
Here a demo of a write
that is dasm
'ed.
Welcome to isa6502mem, using is6502 lib V5
Type 'help' for help
>> write fffc 00 00 66 66 78 D8 A2 FF 9A A9 00 8D 00 80 A9 FF 8D 00 80 D0 F4
>> dasm
fffc 00 BRK
fffd 00 BRK
fffe 66 66 ROR *66
0000 78 SEI
0001 d8 CLD
0002 a2 ff LDX #ff
0004 9a TXS
0005 a9 00 LDA #00
>> d
0007 8d 00 80 STA 8000
000a a9 ff LDA #ff
000c 8d 00 80 STA 8000
000f d0 f4 BNE +f4 (0005)
0011 00 BRK
0012 00 BRK
0013 00 BRK
0014 00 BRK
>>
>>
Here a demo of an asm
'd program that is read
.
>> asm
A:0200> sei
A:0201> cld
A:0202> ldx ff
INFO: asm: suggest ZPG instead of ABS (try - for undo)
A:0205> -
A:0202> ldx #ff
A:0204> txs
A:0205> lda #00
A:0207> sta 8000
A:020a> lda #ff
A:020c> sta 8000
A:020f> bne 0205
A:0211>
>>
>> d
0200 78 SEI
0201 d8 CLD
0202 a2 ff LDX #ff
0204 9a TXS
0205 a9 00 LDA #00
0207 8d 00 80 STA 8000
020a a9 ff LDA #ff
020c 8d 00 80 STA 8000
>> d
020f d0 f4 BNE +f4 (0205)
0211 00 BRK
0212 00 BRK
0213 00 BRK
0214 00 BRK
0215 00 BRK
0216 00 BRK
0217 00 BRK
>>
>> r
0200: 78 d8 a2 ff 9a a9 00 8d 00 80 a9 ff 8d 00 80 d0
0210: f4 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0220: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>
Note
- The
asm
command was not given an address, and defaults to0200
(page 0 and 1 are typically reserved on a 6502). - At address 0202 I made a mistake, I typed
ldx ff
where I intendedldx #ff
. - The assembler gave a warning, because the operand size (1 byte) did not match the instruction (2 bytes).
- I used the "undo" feature (
-
goes one instruction back), and corrected. - At address 020f I used absolute addressing syntax for
bne
, but that instruction only has relative addressing. The assembler silently accepts. - I quit the streaming mode at address 0211 by entering a blank line.
- Then I gave twice the disassemble command (which knows the last written or shown address). This confirms the assembled code is correct.
- Finally, a gave a read command (which also knows the last written address).
The fourth example is also based on my command interpreter.
It adds one commands included in this library prog
.
This command implements a simple assembler.
Recall that a pointer char * p
with a value 0x1000 could address location 0x1000 in DATAMEM (RAM) or 0x1000 in PROGMEM (flash).
Technically the AVR has two different instructions for that, and the C compiler must generate those.
An unqualified *p
in C is always a RAM location. To read from flash we need a modifier like pgm_read_byte(p)
. This inserts
an lpm
instruction (load from program memory) instead of ld
(load from ram).
The run-time library supports PROGMEM; for many functions like memcmp(p1,p2)
(where both p1 and p2 are in RAM) there is
a variant memcmp_P(p1,p2)
(where p2
is in PROGMEM).
To store a (scalar) variable in PROGMEM, add attribute PROGMEM
(and make sure it is constant and global - not on the stack).
To print it, use the above mentioned pgm_read_byte()
/pgm_read_word()
on the address of the integer (signed char).
const signed char i PROGMEM = 5;
Serial.println( (signed char) pgm_read_byte(&i) );
For a real integer
we need to replace pgm_read_byte
with pgm_read_word
and hope int
has the same size as word
.
const int i PROGMEM = 5;
Serial.println( (int) pgm_read_word(&i) );
On my machine PROGMEM is defined in
C:\Users\Maarten\AppData\Local\Arduino15\packages\arduino\tools\avr-gcc\7.3.0-atmel3.6.1-arduino5\avr\include\avr\pgmspace.h
,
roughly (simplified) as #define PROGMEM __attribute__((__progmem__))
.
To store a whole (character) array in PROGMEM, again add PROGMEM
but also note the const
(and it must be global - not on the stack).
To print its first character, use the above mentioned pgm_read_byte()
.
const char s[] PROGMEM = "The content";
Serial.println( (char) pgm_read_byte(&s[0]) );
A special case is a printing the whole string. If a string is in PROGMEM, all its characters have an address in PROGMEM.
Arduino could have made a Serial.print_P()
. But print()
is an overloaded (cpp) method of Serial.
It selects the right variant, print(char)
, print(int)
, print(char*)
, etc, based on the type of the parameter.
However there is no type difference between a char *
in RAM and a char *
in PROGMEM.
The trick Arduino uses is to make an empty class (class __FlashStringHelper
),
and cast a char *
to that class if that pointer points to flash. With the trick, the string has a cpp class type,
and overloading can be used again.
const char s[] PROGMEM = "The content";
Serial.println( (__FlashStringHelper *)(s) );
For some reason, Arduino did not make a handy shortcut for that.
So, I made my own f()
macro
#include <avr/pgmspace.h>
#define f(s) ((__FlashStringHelper *)(s))
so that I can write
const char s[] PROGMEM = "The content";
Serial.println( f(s) );
An alternative to having an array s
, is to have a pointer s
to an array, see fragment below.
Note that s
itself is in RAM, but the chars it points to, the actual array, is in PROGMEM.
We need something to map a literal to PROGMEM; the standard header pgmspace.h
defines a helper macro PSTR
for that.
This is the simplified
version (the real is harder):
# define PSTR(s) ((const PROGMEM char *)(s))
So letting a pointer s
point to characters in PROGMEM is now written as
const char * s = PSTR("Content");
Serial.println( (char) pgm_read_byte(&s[0]) );
Serial.println( f(s) );
For literals in print
, there is no need to make an explicit variable:
Serial.println( f( PSTR("Example") ) );
This combination f(PSTR("..."))
occurs so often, that Arduino has made a macro for that: F("...")
.
Serial.println( F("Example") );
You now maybe understand why my macro is called lower-case f()
.
Why F
is published by Arduino and f
not, is a mystery to me.
By the way, on my system F
is published in
wstring.h:
C:\Program Files (x86)\Arduino\hardware\arduino\avr\cores\arduino\WString.h
.
Storing string gets harder when a table of strings needs to be mapped to flash.
In the code below strings
is an array of three pointers, all of which are in PROGMEM.
So, pgm_read_word
is needed to read the pointer.
That pointed is dressed up as a __FlashStringHelper
so that the right version of print
is selected.
That print
uses pgm_read_byte
to read the characters, which are also in PROGMEM.
static const char s0[] PROGMEM = "foo";
static const char s1[] PROGMEM = "bar";
static const char s2[] PROGMEM = "baz";
static const char * const strings[] PROGMEM = { s0, s1, s2 };
Serial.println( f(pgm_read_word(&strings[2])) );
Unfortunately, PSTR
can only be used within a function, so the following does not work.
I do not know why, but even Arduino site
uses the above structure and not the below.
static const char * const strings[] PROGMEM = { PSTR("foo"), PSTR("bar"), PSTR("baz") };
(end of doc)