Post by Erik Scheirer
I think a pluggable/loadable GC scheme, as long as it's really simple to use, is perfect.
So would lasting world peace!
Post by Erik Scheirer
There would be some overhead created by making it pluggable, but in the scheme of things it would be well worth the small number of CPU cycles lost.
I just can't see the feasibility of this. Any good GC involves
careful interaction between the parts of the system which mutate
memory and those which manage it. Getting a highly performant GC
almost always involves careful, coordinated design of things like:
The low-level layout of objects.
The division of memory into various collections of objects (e.g. in a
generational GC scheme, old objects and new objects live in different
spaces, and sometimes the new space moves each time a minor GC happens).
Efficient detection of whether an object is old or new.
For a GC requiring a 'write barrier', efficient implementation of that barrier.
And to really get the most out of a GC, some of the low level
decisions can be platform and processor specific.
There are cascading design decisions to be made. For example, let's
say we're making a generation-scavenging GC. We need to capture
enough information as the mutator (the program) runs so that we can
find any new objects which are referenced by an old object. This is
the reason for the write barrier. So there are several issues:
How do we detect a store of a reference to a new object into an
old object with the lowest overhead?
How do we remember a store into an old object with the lowest overhead?
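To make the two questions concrete, here is a minimal sketch of such a write barrier in C. All names (Object, OLD_FLAG, the remembered array) are illustrative assumptions, not taken from any particular VM; a real barrier would be a few inlined instructions, not a function call.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical object layout -- names are illustrative only. */
typedef struct Object {
    uint32_t flags;            /* assumed header word */
    struct Object *fields[4];  /* reference slots */
} Object;

#define OLD_FLAG 0x1           /* assumed: set when an object is tenured */

/* A naive remembered set: old objects that may reference new ones. */
#define REMEMBERED_MAX 1024
static Object *remembered[REMEMBERED_MAX];
static size_t remembered_count;

/* Every reference store in the mutator funnels through this barrier. */
static void write_barrier(Object *holder, size_t slot, Object *value)
{
    holder->fields[slot] = value;
    /* Record only old->new stores; other stores need no bookkeeping. */
    if ((holder->flags & OLD_FLAG) && value && !(value->flags & OLD_FLAG)) {
        if (remembered_count < REMEMBERED_MAX)
            remembered[remembered_count++] = holder;
    }
}
```

At minor-GC time the collector scans only the remembered objects for references into new space, instead of scanning all of old space.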
There are several strategies for detecting old vs new objects, each
with its own tradeoffs, for example:
A flag bit in the object header
Address range checking to see which space it's in, or not in.
On some platforms and processors, one might make use of the virtual
memory hardware and access privileges to detect such stores, but this
is highly non-portable and may or may not outperform other approaches.
Flag bits need to be maintained properly, and are expensive; see below.
Address range checking is more common, and goes back to the
interactions with the overall design of the "VM".
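For illustration, here is what the address-range-checking strategy looks like when new space is one contiguous region. The space size and layout are assumptions for the sketch; the point is that membership is just two pointer comparisons.

```c
#include <stdbool.h>

/* Assumed layout: new space is a single contiguous range, so an
   address comparison decides which generation an object is in. */
static char new_space[64 * 1024];   /* illustrative 64 KiB new space */

static bool is_new(const void *p)
{
    /* Two comparisons -- typically a couple of machine instructions,
       versus the fetch/mask cycle a header flag bit would cost. */
    return (const char *)p >= new_space
        && (const char *)p <  new_space + sizeof new_space;
}
```

Note that this only works because the overall VM design guarantees new objects are allocated in that range, which is exactly the kind of coupling between GC and VM layout described above.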
And what about how to remember the old objects which need to be
considered during a new-object GC?
We could perhaps maintain a linked list or set of "remembered" objects, but
this is expensive both in terms of space and speed.
Most GCs use some form of "card marking", where old space is broken up
into 'cards', each covering a range of memory. Cards are similar to
pages in a virtual memory system, and may or may not be the same size
in a particular GC implementation. In such a scheme, when a new-object
reference is stored into an old object, the fact that this has happened is
recorded as a change to the card in which the old object resides. The
most obvious way to do this is to have a data structure somewhere
which has a bit for each card.
But on most processors setting individual bits is expensive, involving
fetching, masking, and re-storing a larger datatype.
The Self guys recognized this and found that, for the processors they
were working on, using a byte rather than a bit for the mark was much
better overall, despite requiring eight times the space for the marks.
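Putting the pieces together, here is a sketch of a byte-per-card table in C. The card size, old-space size, and names are all assumptions chosen for the example; the key point is that marking a card is a single byte store, with no fetch/mask/re-store cycle.

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative card-marking scheme: old space divided into 512-byte
   cards, with one mark *byte* per card (the Self-style choice
   described above), rather than one bit. */
#define CARD_SHIFT     9                     /* 512-byte cards */
#define OLD_SPACE_SIZE (1u << 20)            /* assumed 1 MiB old space */
#define NUM_CARDS      (OLD_SPACE_SIZE >> CARD_SHIFT)

static char    old_space[OLD_SPACE_SIZE];
static uint8_t card_table[NUM_CARDS];        /* one byte per card */

/* Mark the card covering the slot that was just written.  A shift,
   an index, and a single byte store. */
static void mark_card(const void *slot)
{
    size_t offset = (size_t)((const char *)slot - old_space);
    card_table[offset >> CARD_SHIFT] = 1;
}
```

At minor-GC time, only the objects on marked cards need to be scanned for references into new space, after which the marks are cleared.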
And these are just some of the variations once one has chosen a
particular GC algorithm or perhaps one of a family of GC algorithms.
Now I know that LLVM attempts to do something like this,
but it apparently hasn't been all that successful:
The problem is that LLVM defines the interface between the mutator and
the GC "framework" in terms of C functions and function callbacks,
e.g. for the write barrier, whereas a really efficient GC implements
the write barrier (and other GC bookkeeping tasks) in a few inlined
machine instructions.
I fear that a pluggable GC would only let you play around with pretty
poorly performing GC alternatives.