Skip to content

inglorion/ibgc

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Itty-Bitty Garbage Collector

Copyright (c) 2022 Robbert Haarman

The itty-bitty garbage collector (IBGC) is a simple garbage collector designed for systems where available code and data space is limited.

Features of IBGC:

  • A memory allocation consists of one or more cells, each of which is large enough to hold a pointer.
  • IBGC maintains a few bits of metadata for each cell. The algorithm uses 3 bits for the first cell of an allocation and 2 bits for subsequent cells in the same allocation.
  • Metadata is kept separate from allocated data, so that the program can freely use the bits in allocated cells (in other words, there is no need for tag bits inside integers or pointers).
  • The metadata bits can be read and modified by the program, provided that bits which are significant to IBGC are set correctly. One possible use case for this is to store type information in the metadata.
  • IBGC uses a non-copying mark-sweep algorithm to reclaim memory.
  • Both tracing and reclamation require only a small amount of stack space that does not vary with the object graph.

The current implementation is a proof of concept used to experiment with the metadata format and algorithms. At the time of writing, it consists of about 120 lines of C code as counted by David A. Wheeler’s ‘SLOCCount’.

Limitations

The Itty-Bitty Garbage Collector was originally designed for small programming language implementations. It makes some pretty strong assumptions: the memory it manages is a fixed-size array of cells, and the program is responsible for informing the collector which values are pointers to be followed, and which aren’t. These assumptions can easily be fulfilled by a newly developed programming language implementation, but are unlikely to be true of software not specifically written to work with IBGC.

The Itty-Bitty Garbage Collector requires that all values it treats as pointers point to the first cell of the pointed-to object. That is, all values for which the pointer tag is set to 1 in the metadata and all pointers passed to gc_trace() must point to the beginning of the allocation the address is part of.

Building

The IBGC source code includes a test program (ibgc_test.c) and a Makefile that can be used to build and run it:

$ make
cc -o ibgc_test -Wall -Os ibgc_test.c
$ make check
./ibgc_test | diff -u ibgc_test.out.expected -
$

If all goes well, this process produces no errors, and diff shows no difference between the expected output from the test program and its actual output.

Usage

IBGC needs some support from programs that use it. Perhaps most notably, IBGC requires the program to tell it which values it should treat as pointers to follow, and which values not to, by setting the pointer bits in the metadata. This means that every assignment to a cell of memory may also require an update to the metadata for that cell.

The test program ibgc_test.c provides a working example of a program that uses IBGC.

A (hopefully complete) list of things a program needs to do to use IBGC is:

  1. Define cell_t (the type of a cell) and CELL_SZ (the size of a cell in bytes).
  2. Define addr_t (the type of an address) and ADDR_MASK (a value that a number can be bitwise anded with to yield a value in the correct range for an address). Note: addr_t must be an integer type, not a pointer type.
  3. #include “ibgc.c”
  4. Call ibgc_init() before using any of the other functions in IBGC.
  5. Ensure that all values are correctly tagged as pointers that IBGC should trace (pointer bit set to 1) or values that IBGC should not trace (pointer bit 0).
  6. Call gc_trace() for each of the garbage collection roots (objects from which all reachable objects can be reached).
  7. After gc_trace() has been called for all roots, call gc_reclaim() to actually reclaim the memory used by unreachable objects.
  8. After calling gc_reclaim(), invert the mark tag: mark_tag ^= MARK_MASK.

Memory is allocated using alloc(), which takes two parameters: the number of cells to allocate, and a tag to store in the metadata.

The tag corresponding to an allocation can be read using gettag() and written using settag(). Bits that are set to 1 in INFO_MASK can freely be used by the program, whereas the other bits in the tag have special meaning to the collector. Of these, the bit indicated by PTR_MASK must be set to 0 when a non-pointer value is stored in a cell, and to 1 if a pointer value is stored in a cell.

About

itty-bitty garbage collector

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published