Implementing a Garbage Collector


In computer science, garbage collection (GC) is a form of automatic memory management. The garbage collector, or just collector, attempts to reclaim garbage, or memory occupied by objects that are no longer in use by the program. Garbage collection was invented by John McCarthy around 1959 to solve problems in Lisp.

This section presents the mark-and-sweep garbage collection algorithm. The mark-and-sweep algorithm was the first garbage collection algorithm to be developed that is able to reclaim cyclic data structures. Variations of the mark-and-sweep algorithm continue to be among the most commonly used garbage collection techniques.

When using mark-and-sweep, unreferenced objects are not reclaimed immediately. Instead, garbage is allowed to accumulate until all available memory has been exhausted. When that happens, the execution of the program is suspended temporarily while the mark-and-sweep algorithm collects all the garbage. Once all unreferenced objects have been reclaimed, the normal execution of the program can resume.

The mark-and-sweep algorithm is called a tracing garbage collector because it traces out the entire collection of objects that are directly or indirectly accessible by the program. The objects that a program can access directly are those referenced by local variables on the processor stack, as well as by any static variables that refer to objects. In the context of garbage collection, these variables are called the roots. An object is indirectly accessible if it is referenced by a field in some other (directly or indirectly) accessible object. An accessible object is said to be live. Conversely, an object which is not live is garbage.

The mark-and-sweep algorithm consists of two phases. In the first phase, it finds and marks all accessible objects. The first phase is called the mark phase. In the second phase, the garbage collection algorithm scans through the heap and reclaims all the unmarked objects. The second phase is called the sweep phase. The algorithm can be expressed as follows:

for each root variable r
    mark (r);
sweep ();

In order to distinguish the live objects from garbage, we record the state of an object in each object. That is, we add a special boolean field to each object called, say, marked. By default, all objects are unmarked when they are created. Thus, the marked field is initially false.

An object p and all the objects indirectly accessible from p can be marked by using the following recursive mark method:

void mark (Object p)
    if (!p.marked)
        p.marked = true;
        for each Object q referenced by p
            mark (q);

Notice that this recursive mark algorithm does nothing when it encounters an object that has already been marked. Consequently, the algorithm is guaranteed to terminate. And it terminates only when all accessible objects have been marked.
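The recursive formulation uses call-stack depth proportional to the longest chain of references, which can be a problem for long lists. The following is a minimal sketch of the same marking logic driven by an explicit worklist instead of recursion. The names `Node`, `mark_iterative`, `WORKLIST_MAX`, and `demo_mark_iterative` are hypothetical and chosen only for illustration; they do not come from the text above, and the fixed worklist capacity is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WORKLIST_MAX 1024  /* hypothetical capacity, chosen for this sketch */

/* Hypothetical object layout: a mark bit and up to two outgoing references. */
typedef struct Node {
    bool marked;
    struct Node *left;
    struct Node *right;
} Node;

/* Marks every node reachable from root without using recursion.
   Returns false if the worklist overflows, in which case marking is incomplete. */
static bool mark_iterative(Node *root)
{
    Node *worklist[WORKLIST_MAX];
    size_t top = 0;

    if (root != NULL && !root->marked) {
        root->marked = true;
        worklist[top++] = root;
    }
    while (top > 0) {
        Node *p = worklist[--top];
        Node *children[2] = { p->left, p->right };
        for (int i = 0; i < 2; i++) {
            Node *q = children[i];
            if (q != NULL && !q->marked) {
                if (top == WORKLIST_MAX)
                    return false;
                q->marked = true;  /* mark when queued so each node is queued once */
                worklist[top++] = q;
            }
        }
    }
    return true;
}

/* Demo: a two-node cycle reachable from the root, plus one unreachable node.
   The result encodes which nodes were marked as a bit mask (a=1, b=2, c=4). */
static int demo_mark_iterative(void)
{
    Node a = { false, NULL, NULL };
    Node b = { false, NULL, NULL };
    Node c = { false, NULL, NULL };  /* never referenced from a */
    a.left = &b;
    b.left = &a;                     /* cycle: a -> b -> a */

    bool ok = mark_iterative(&a);
    assert(ok);
    return (a.marked ? 1 : 0) + (b.marked ? 2 : 0) + (c.marked ? 4 : 0);
}
```

Marking a node at the moment it is queued plays the same role as the `!p.marked` check in the recursive version: an already-marked node is never revisited, so cycles cannot cause an infinite loop.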

In its second phase, the mark-and-sweep algorithm scans through all the objects in the heap, in order to locate all the unmarked objects. The storage allocated to the unmarked objects is reclaimed during the scan. At the same time, the marked field on every live object is set back to false in preparation for the next invocation of the mark-and-sweep garbage collection algorithm:

void sweep ()
    for each Object p in the heap
        if (p.marked)
            p.marked = false
        else
            heap.release (p);
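The pseudocode leaves "for each Object p in the heap" abstract. One common way to make it concrete, sketched below under the assumption that every allocated object is threaded onto a singly linked list, is to walk that list and unlink unmarked entries as they are found. The names `Obj`, `sweep_list`, and `demo_sweep_list` are hypothetical. The sketch uses a pointer to the current link so that removing the first element needs no special case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical heap object: a mark bit plus a link to the next allocated object. */
typedef struct Obj {
    bool marked;
    struct Obj *next;
} Obj;

/* Frees every unmarked object on the list and clears the mark on the rest.
   Returns the number of objects freed. */
static int sweep_list(Obj **head)
{
    int freed = 0;
    Obj **link = head;  /* points at the pointer that refers to the current object */

    while (*link != NULL) {
        Obj *p = *link;
        if (p->marked) {
            p->marked = false;  /* reset for the next collection */
            link = &p->next;
        } else {
            *link = p->next;    /* unlink, then release the storage */
            free(p);
            freed++;
        }
    }
    return freed;
}

/* Demo: four objects, two of them marked. Returns freed * 10 + survivors. */
static int demo_sweep_list(void)
{
    Obj *head = NULL;
    for (int i = 0; i < 4; i++) {
        Obj *p = malloc(sizeof *p);
        if (p == NULL)
            abort();
        p->marked = (i % 2 == 0);
        p->next = head;
        head = p;
    }

    int freed = sweep_list(&head);

    int survivors = 0;
    for (Obj *p = head; p != NULL; p = p->next) {
        assert(!p->marked);  /* marks were cleared by the sweep */
        survivors++;
    }

    /* Release the survivors so the demo itself does not leak. */
    while (head != NULL) {
        Obj *next = head->next;
        free(head);
        head = next;
    }
    return freed * 10 + survivors;
}
```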

The figure illustrates the operation of the mark-and-sweep garbage collection algorithm. Part (a) shows the conditions before garbage collection begins. In this example, there is a single root variable. Part (b) shows the effect of the mark phase of the algorithm. At this point, all live objects have been marked. Finally, part (c) shows the objects left after the sweep phase has been completed. Only live objects remain in memory and the marked fields have all been set to false again.

Figure: Mark-and-sweep garbage collection.

Because the mark-and-sweep garbage collection algorithm traces out the set of objects accessible from the roots, it is able to correctly identify and collect garbage even in the presence of reference cycles. This is the main advantage of mark-and-sweep over the reference counting technique presented in the preceding section. A secondary benefit of the mark-and-sweep approach is that the normal manipulations of reference variables incur no overhead.
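To see why cycles defeat reference counting, consider the following minimal sketch of a naive reference-counted cell. All names here (`Cell`, `cell_new`, `cell_release`, `cell_set_next`, `demo_refcount_cycle`) are hypothetical and serve only as illustration. Two cells point at each other, and after the program drops its own references both counts are still 1, so neither cell is ever freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted cell with a single outgoing reference. */
typedef struct Cell {
    int refcount;
    struct Cell *next;
} Cell;

static int live_cells = 0;  /* number of cells allocated and not yet freed */

static Cell *cell_new(void)
{
    Cell *c = malloc(sizeof *c);
    if (c == NULL)
        abort();
    c->refcount = 1;  /* the creator holds one reference */
    c->next = NULL;
    live_cells++;
    return c;
}

/* Drops one reference; frees the cell when the count reaches zero. */
static void cell_release(Cell *c)
{
    if (c == NULL)
        return;
    if (--c->refcount == 0) {
        cell_release(c->next);
        free(c);
        live_cells--;
    }
}

/* Points c->next at target, adjusting both reference counts. */
static void cell_set_next(Cell *c, Cell *target)
{
    if (target != NULL)
        target->refcount++;
    cell_release(c->next);
    c->next = target;
}

/* Builds a two-cell cycle, drops the program's references, and returns
   how many cells are still allocated at that point. */
static int demo_refcount_cycle(void)
{
    Cell *a = cell_new();
    Cell *b = cell_new();
    cell_set_next(a, b);  /* b's count becomes 2 */
    cell_set_next(b, a);  /* a's count becomes 2 */

    cell_release(a);      /* a's count drops to 1, not 0 */
    cell_release(b);      /* b's count drops to 1, not 0 */
    int leaked = live_cells;

    /* Free by hand so the demo itself does not leak; a real program that
       had dropped its references would have no pointers left to do this. */
    free(a);
    free(b);
    live_cells -= 2;
    return leaked;
}
```

A tracing collector has no such problem: once neither cell is reachable from a root, neither is marked, and both are reclaimed in the sweep.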

The main disadvantage of the mark-and-sweep approach is the fact that normal program execution is suspended while the garbage collection algorithm runs. In particular, this can be a problem in a program that interacts with a human user or that must satisfy real-time execution constraints. For example, an interactive application that uses mark-and-sweep garbage collection becomes unresponsive periodically.


This post implements, in C, the mark-sweep algorithm proposed by John McCarthy.
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#define STACK_MAX 256
#define INITIAL_GC_THRESHOLD 8
#include <stdbool.h>
typedef enum {
    OBJ_INT,
    OBJ_PAIR
}ObjectType;

typedef struct object {
    char marked;
    struct object *next;
    ObjectType type;
    union {
        /* OBJ_INT*/
        int value;
        /* OBJ_PAIR*/
        struct {
            struct object *head;
            struct object *tail;
        };
    };
}object;


typedef struct {
    int num_objects;
    int max_objects;
    object * firstobject;
    object *stack[STACK_MAX];
    int stacksize;
}VM;

VM* newVM();
object *newObject(VM *vm, ObjectType type);
bool isEmpty(VM *vm);
bool isFull(VM *vm);

void push(VM *vm, object *ref);
object *pop(VM *vm);

object *pushPair(VM *vm);
void pushInt(VM *vm, int value);

void mark(object *obj);
void markAll(VM *vm);
void sweep(VM *vm);

void gc(VM *vm);
void freeVM(VM *vm);





VM* newVM()
{
    VM* vm = malloc(sizeof(VM));
    if(vm == NULL)
    {
        fprintf(stderr, "Out of memory\n");
        exit(EXIT_FAILURE);
    }
    vm->stacksize = 0;
    vm->firstobject = NULL;
    vm->num_objects = 0;
    vm->max_objects = INITIAL_GC_THRESHOLD;
    return vm;
}
bool isEmpty(VM *vm)
{
    return vm->stacksize == 0;
}
bool isFull(VM *vm)
{
    return vm->stacksize == STACK_MAX;
}

void push(VM *vm, object *ref)
{
    if(isFull(vm))
    {
        /* perror() would append an unrelated errno message, so print directly. */
        fprintf(stderr, "Stack overflow\n");
        exit(EXIT_FAILURE);
    }
    vm->stack[vm->stacksize ++] = ref;

}

object *pop(VM *vm)
{
    if(isEmpty(vm))
    {
        /* perror() would append an unrelated errno message, so print directly. */
        fprintf(stderr, "Stack underflow\n");
        exit(EXIT_FAILURE);
    }
    return vm->stack[-- vm->stacksize];
}

object *newObject(VM *vm, ObjectType type)
{
    if(vm->num_objects == vm->max_objects)
        gc(vm);
    object *obj = malloc(sizeof(object));
    if(obj == NULL)
    {
        fprintf(stderr, "Out of memory\n");
        exit(EXIT_FAILURE);
    }
    obj->type = type;
    obj->marked = false;

    obj->next = vm->firstobject;
    vm->firstobject = obj;
    vm->num_objects ++;
    return obj;
}

void pushInt(VM *vm, int value)
{
    object *obj = newObject(vm, OBJ_INT);
    obj->value = value;
    push(vm, obj);
}
/* Pops two objects, wraps them in a new pair, pushes the pair and returns it. */

object *pushPair(VM *vm)
{
    object *obj = newObject(vm, OBJ_PAIR);
    obj->tail = pop(vm);
    obj->head = pop(vm);
    push(vm, obj);
    return obj;
}

void markAll(VM *vm)
{
    int i;
    for(i = 0; i < vm->stacksize; i++)
        mark(vm->stack[i]);
}

void mark(object *obj)
{
    /* Stop at objects that are already marked, so reference cycles cannot cause infinite recursion. */
    if(obj->marked)
        return;
    obj->marked = true;
    if(obj->type == OBJ_PAIR)
    {
        mark(obj->head);
        mark(obj->tail);
    }
}

void sweep(VM *vm)
{
    object *prev = NULL;
    object *cur = vm->firstobject;
    while(cur)
    {
        object *next = cur->next;
        if(!cur->marked)
        {
            if(prev)
            {
                prev->next = next;
            }
            else
                vm->firstobject = next;
            free(cur);
            vm->num_objects --;
        }
        else
        {
            prev = cur;
            cur->marked = false;
        }

        cur = next;
    }
}

void gc(VM *vm)
{
    int num_object = vm->num_objects;
    markAll(vm);
    sweep(vm);
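    /* If nothing survived, fall back to the initial threshold; a threshold of 0
       would be passed by the == check in newObject and gc would stop running. */
    vm->max_objects = vm->num_objects == 0 ? INITIAL_GC_THRESHOLD : vm->num_objects * 2;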
    printf("collect %d objects, %d objects remain\n", num_object - vm->num_objects, vm->num_objects);
}

void freeVM(VM *vm)
{
    vm->stacksize = 0;
    gc(vm);
    free(vm);
}

void test1()
{
    printf("test1:\n");
    VM *vm = newVM();
    pushInt(vm, 1);
    pushInt(vm, 2);
    gc(vm);
    assert(vm->num_objects == 2);
    freeVM(vm);
}

void test2()
{
    printf("test2:\n");
    VM *vm = newVM();
    pushInt(vm, 1);
    pushInt(vm, 2);
    pop(vm);
    pop(vm);

    gc(vm);
    assert(vm->num_objects == 0);
    freeVM(vm);
}

void test3()
{
    printf("test3:\n");
    VM *vm = newVM();
    pushInt(vm, 1);
    pushInt(vm, 2);
    pushPair(vm);
    pushInt(vm, 3);
    pushInt(vm, 4);
    pushPair(vm);
    pushPair(vm);

    gc(vm);
    assert(vm->num_objects == 7);
    freeVM(vm);
}

void test4()
{
    printf("test4:\n");
    VM *vm = newVM();
    pushInt(vm, 1);
    pushInt(vm, 2);
    object *obj1 = pushPair(vm);

    pushInt(vm ,3);
    pushInt(vm ,4);
    object *obj2 = pushPair(vm);

    /* Link the two pairs into a cycle; the ints 2 and 4 become unreachable. */
    obj1->tail = obj2;
    obj2->tail = obj1;
    gc(vm);
    assert(vm->num_objects == 4);
    freeVM(vm);
}


int main(void)
{
    test1();
    test2();
    test3();
    test4();
    return 0;
}

