In this short paper, the author describes an architecture for a memory system that implements cell reference counting in a way that would be useful for LISP-like languages. He argues that reference counting has a useful place in heap management along with garbage collection because reference counting can be done in a decentralized way, while garbage collection is centralized. In a multiprocessor system, reference-counted memory could increase the time between garbage collections and thus decrease the fraction of the time when processors have to wait for a centralized garbage collector to run.
The author’s system implements LISP CONS cells. Each cell consists roughly of a pair of 32-bit pointers and, for each pointer, an 11-bit reference count and some status bits. Each time a pointer is written, it increments the count in the newly pointed-to cell and decrements the count in the formerly pointed-to cell, using a “back door” that does not require the processor to wait. Free space is just a list of cells with zero reference counts. Since the reference-count field is of finite size, counts can occasionally overflow; at this point, a cell’s count is marked infinite and it will not be reclaimed until the next garbage collection.
The author refines this scheme with some “sting” bits, which let each cell act as a test and set interlock, and a “don’t count” bit, which lets circular lists be counted and reclaimed in the common case where the cell that closes the cycle is known when the cycle is created. The memory can also run in a garbage collection mode in which the same hardware that fixed up the reference counts turns out to be useful in implementing the standard pointer reversing garbage collection scheme. Finally, the author describes his efforts at hardware chip implementation, which he admits haven’t gotten very far yet.
This approach seems plausible, and it is indeed likely that most of the lists created by a LISP program are amenable to reclamation by this scheme. Nonetheless, I cannot help but wonder whether one would do better to use more conventional hardware and better software. His proposed chips store about 64K bits apiece; garbage collection is delayed by a factor of 4 compared to that for a memory made of regular 64K chips. I could do the same by using regular 256K chips and quadrupling the size of the memory. Reports of real or simulated performance of Wise’s system would make his case much more persuasive.