Darren Kulp edited this page Mar 23, 2012 · 1 revision

The tenyr object format is in flux, so the best source of documentation is the code in src/obj.h and src/obj.c, but a brief description of the elements contained in a serialised tenyr object as of version 0 follows.

Conventions

For the sake of conciseness, any reference to a "word" in the following text can be taken to mean "an unsigned 32-bit quantity representable by the uint32_t type" unless otherwise specified. A recurrent theme in the tenyr object format is the consistent sizing of fields to fit into 32-bit words. Except for strings (used in symbol and relocation names), all fields are 32 bits (one word) wide.

Every list in the tenyr object format is preceded by a word specifying the number of elements in that list.

The entry point to a linked tenyr object is fixed at 0x1000. There is no way to specify an entry point in the version 0 object format, although the entry point can be configured at load time in the simulator.

Structure

Every tenyr object, as of version 0, consists of four main parts:

  1. the header
  2. a list of records
  3. a list of symbols
  4. a list of relocations

Header

The tenyr object header consists of three main parts:

  1. the magic string "TOV"
  2. a version byte (currently only version \0 is valid)
  3. a flags word (currently no object-wide flags are defined)

The magic string and version byte can together be treated as a single one-word-wide field. Thus, the header consists of two words, which can be represented by the following C structure:

struct {
    char     magic[3];
    uint8_t  version;
    uint32_t flags;
} header;
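
As an illustration, a reader might validate the header before trusting the rest of the object. The following is a minimal sketch, not the code in src/obj.c; the names tenyr_header and header_is_valid are hypothetical, and the sketch assumes the two header words have already been read into memory.

```c
#include <stdint.h>
#include <string.h>

/* In-memory view of the two header words described above. */
struct tenyr_header {
    char     magic[3];   /* "TOV", with no NUL terminator */
    uint8_t  version;    /* only version 0 is valid so far */
    uint32_t flags;      /* no object-wide flags are defined yet */
};

/* Hypothetical helper: returns 1 for a well-formed version 0 header, else 0. */
static int header_is_valid(const struct tenyr_header *h)
{
    return memcmp(h->magic, "TOV", 3) == 0 && h->version == 0;
}
```

The flags word is deliberately not checked, since no object-wide flags are defined in version 0.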

Records

The first list, coming directly after the header, is a list of records. A "record" is a region of serialised contiguous memory. The list consists of a count word followed by count record structures. Each record structure consists of an addr word specifying the base address of the record, a size word specifying the number of data words in the record, and size data words. Thus, the records list can be represented by the following pseudo-C structure:

uint32_t count;
struct {
	uint32_t addr;
	uint32_t size;
	uint32_t data[size];
} records[count];

There is no required relationship between the values of the size member of different records; thus, it is not possible to compute the offset of a particular record without first reading all the records that precede it. This is a design tradeoff for simplicity, and may be addressed in a subsequent object format revision (with a different version number).

As of this writing, all tenyr objects produced by the assembler and linker contain exactly one record, though this should not be depended upon.
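
Because record sizes vary, locating a given record means walking the list from the start. The sketch below shows one way to do that over a buffer of already-decoded words; record_offset is a hypothetical helper, not part of the tenyr sources, and byte order on disk is outside its scope.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: `words` holds a records list, count word first.
 * Returns the index of the addr word of record `n`, or (size_t)-1 if the
 * list has no such record or the buffer is too short to hold it.
 */
static size_t record_offset(const uint32_t *words, size_t nwords, uint32_t n)
{
    if (nwords < 1 || n >= words[0])
        return (size_t)-1;

    size_t pos = 1;                          /* skip the count word */
    for (uint32_t i = 0; ; i++) {
        if (nwords - pos < 2)                /* need addr and size words */
            return (size_t)-1;
        uint32_t size = words[pos + 1];
        if (size > nwords - pos - 2)         /* data words must fit */
            return (size_t)-1;
        if (i == n)
            return pos;
        pos += 2 + (size_t)size;             /* skip addr, size and data */
    }
}
```

The cost is linear in the number of preceding records, which is the tradeoff the format accepts in exchange for a simple layout.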

Symbols

The second list, coming directly after the records list, is a list of symbols. A "symbol" is a label exported by an object for linking purposes. The list consists of a count word followed by count symbol structures. Each symbol structure consists of a flags word specifying flags for that symbol (currently no symbol-specific flags are defined), a name string of SYMBOL_LEN 8-bit characters, and a value word specifying the value of the symbol, which is generally its address relative to the beginning of the object (defined as 0x0). Thus, the symbols list can be represented by the following pseudo-C structure:

uint32_t count;
struct {
	uint32_t flags;
	char     name[SYMBOL_LEN];
	uint32_t value;
} symbols[count];

The value of SYMBOL_LEN is currently 32. This includes the required \0 terminator, so a symbol name can be at most 31 8-bit characters long.
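
A writer of objects has to enforce this limit when filling in the fixed-width name field. The following sketch shows one possible check; set_symbol_name is a hypothetical helper, and zero-filling the unused tail of the field is an assumption of this sketch (the format as described only requires the terminator).

```c
#include <string.h>

#define SYMBOL_LEN 32   /* value stated above; includes the \0 terminator */

/*
 * Hypothetical helper: copy `src` into a fixed-width name field.
 * Returns 0 on success, or -1 if `src` is longer than SYMBOL_LEN - 1.
 */
static int set_symbol_name(char dst[SYMBOL_LEN], const char *src)
{
    size_t len = strlen(src);
    if (len > SYMBOL_LEN - 1)
        return -1;
    memset(dst, 0, SYMBOL_LEN);   /* assumption: pad the field with zeros */
    memcpy(dst, src, len);
    return 0;
}
```

Rejecting over-long names, rather than silently truncating them, avoids two distinct symbols collapsing into the same 31-character name.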

Relocations

The third and final list, coming directly after the symbols list, is a list of relocations. A "relocation" is an outstanding modification to be made to the object once it is loaded into memory, in order to render symbol references valid. The list consists of a count word followed by count relocation structures. Each relocation structure consists of a flags word specifying flags for that relocation (currently no relocation-specific flags are defined), a name string of SYMBOL_LEN 8-bit characters specifying the name of the referenced symbol, an addr word specifying the offset into this object of the word to update when relocations are done, and a width word specifying the width in bits of the portion of the target word to update, starting from the least-significant bit. Thus, the relocations list can be represented by the following pseudo-C structure:

uint32_t count;
struct {
	uint32_t flags;
	char     name[SYMBOL_LEN];
	uint32_t addr;
	uint32_t width;
} relocations[count];
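
To make the role of the width field concrete, the sketch below updates only the low width bits of a target word. Both function names are hypothetical, and the sketch assumes the field is overwritten with the resolved symbol value; whether the actual linker overwrites the field or adds to it is not stated here.

```c
#include <stdint.h>

/* Mask covering the low `width` bits; a width of 32 or more selects the whole word. */
static uint32_t reloc_mask(uint32_t width)
{
    return width >= 32 ? UINT32_MAX : (UINT32_C(1) << width) - 1;
}

/*
 * Hypothetical helper: overwrite the low `width` bits of *target with the
 * low bits of `value`, leaving the upper bits untouched.
 * Assumption: the relocation replaces the field rather than adding to it.
 */
static void apply_reloc(uint32_t *target, uint32_t value, uint32_t width)
{
    uint32_t mask = reloc_mask(width);
    *target = (*target & ~mask) | (value & mask);
}
```

The explicit check for widths of 32 or more avoids shifting a 32-bit value by its full width, which is undefined behaviour in C.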

For more information on how relocations work, see the linker document.
