Skip Lists

Skip lists are linked lists that allow you toskipto the correct node. The performance bottleneck inherent in a sequential scan is avoided, while insertion and deletion remain relatively efficient. Average search time is O(lgn). Worst-case search time is O(n), but is extremely unlikely. An excellent reference for skip lists is Pugh [1990].

Theory

The indexing scheme employed in skip lists is similar in nature to the method used to lookup names in an address book. To lookup a name, you index to the tab representing the first character of the desired entry. In Figure 3-8, for example, the top-most list represents a simple linked list with no tabs. Adding tabs (middle figure) facilitates the search. In this case, level-1 pointers are traversed. Once the correct segment of the list is found, level-0 pointers are traversed to find the specific entry.


Figure 3-8: Skip List Construction

The indexing scheme may be extended as shown in the bottom figure, where we now have an index to the index. To locate an item, level-2 pointers are traversed until the correct segment of the list is identified. Subsequently, level-1 and level-0 pointers are traversed.

During insertion the number of pointers required for a new node must be determined. This is easily resolved using a probabilistic technique. A random number generator is used to toss a computer coin. When inserting a new node, the coin is tossed to determine if it should be level-1. If you lose, the coin is tossed again to determine if the node should be level-2. Another loss and the coin is tossed to determine if the node should be level-3. This process repeats until you win. If only one level (level-0) is implemented, the data structure is a simple linked-list with O(n) search time. However, if sufficient levels are implemented, the skip list may be viewed as a tree with the root at the highest level, and search time is O(lgn).

The skip list algorithm has a probabilistic component, and thus a probabilistic bounds on the time required to execute. However, these bounds are quite tight in normal circumstances. For example, to search a list containing 1000 items, the probability that search time will be 5 times the average is about 1 in 1,000,000,000,000,000,000.

Implementation in C

An ANSI-C implementation for skip lists is included. Typedefs recTypekeyType, and comparison operators compLT and compEQ should be altered to reflect the data stored in the list. In addition, MAXLEVEL should be set based on the maximum size of the dataset.

To initialize, initList is called. The list header is allocated and initialized. To indicate an empty list, all levels are set to point to the header. Function insertallocates a new node, searches for the correct insertion point, and inserts it in the list. While searching, the update array maintains pointers to the upper-level nodes encountered. This information is subsequently used to establish correct links for the newly inserted node. The newLevel is determined using a random number generator, and the node allocated. The forward links are then established using information from the update array. Function delete deletes and frees a node, and is implemented in a similar manner. Function find searches the list for a particular value.

/* skip list */

#include <stdio.h>
#include <stdlib.h>

/* implementation dependent declarations */
typedef enum {
    STATUS_OK,
    STATUS_MEM_EXHAUSTED,
    STATUS_DUPLICATE_KEY,
    STATUS_KEY_NOT_FOUND
} statusEnum;

typedef int keyType;            /* type of key */

/* user data stored in tree */
typedef struct {
    int stuff;                  /* optional related data */
} recType;

#define compLT(a,b) (a < b)
#define compEQ(a,b) (a == b)

/* levels range from (0 .. MAXLEVEL) */
#define MAXLEVEL 15

typedef struct nodeTag {
    keyType key;                /* key used for searching */
    recType rec;                /* user data */
    struct nodeTag *forward[1]; /* skip list forward pointer */
} nodeType;

/* implementation independent declarations */
typedef struct {
    nodeType *hdr;              /* list Header */
    int listLevel;              /* current level of list */
} SkipList;

SkipList list;                  /* skip list information */

#define NIL list.hdr

statusEnum insert(keyType key, recType *rec) {
    int i, newLevel;
    nodeType *update[MAXLEVEL+1];
    nodeType *x;

   /***********************************************
    *  allocate node for data and insert in list  *
    ***********************************************/

    /* find where key belongs */
    x = list.hdr;
    for (i = list.listLevel; i >= 0; i--) {
        while (x->forward[i] != NIL && compLT(x->forward[i]->key, key))
            x = x->forward[i];
        update[i] = x;
    }
    x = x->forward[0];
    if (x != NIL && compEQ(x->key, key)) 
        return STATUS_DUPLICATE_KEY;

    /* determine level */
    for (
      newLevel = 0; 
      rand() < RAND_MAX/2 && newLevel < MAXLEVEL; 
      newLevel++);

    if (newLevel > list.listLevel) {
        for (i = list.listLevel + 1; i <= newLevel; i++)
            update[i] = NIL;
        list.listLevel = newLevel;
    }

    /* make new node */
    if ((x = malloc(sizeof(nodeType) + newLevel*sizeof(nodeType *))) == 0)
        return STATUS_MEM_EXHAUSTED;
    x->key = key;
    x->rec = *rec;

    /* update forward links */
    for (i = 0; i <= newLevel; i++) {
        x->forward[i] = update[i]->forward[i];
        update[i]->forward[i] = x;
    }
    return STATUS_OK;
}

statusEnum delete(keyType key) {
    int i;
    nodeType *update[MAXLEVEL+1], *x;

   /*******************************************
    *  delete node containing data from list  *
    *******************************************/

    /* find where data belongs */
    x = list.hdr;
    for (i = list.listLevel; i >= 0; i--) {
        while (x->forward[i] != NIL && compLT(x->forward[i]->key, key))
            x = x->forward[i];
        update[i] = x;
    }
    x = x->forward[0];
    if (x == NIL || !compEQ(x->key, key)) return STATUS_KEY_NOT_FOUND;

    /* adjust forward pointers */
    for (i = 0; i <= list.listLevel; i++) {
        if (update[i]->forward[i] != x) break;
        update[i]->forward[i] = x->forward[i];
    }

    free (x);

    /* adjust header level */
    while ((list.listLevel > 0)
    && (list.hdr->forward[list.listLevel] == NIL))
        list.listLevel--;

    return STATUS_OK;
}

statusEnum find(keyType key, recType *rec) {
    int i;
    nodeType *x = list.hdr;

   /*******************************
    *  find node containing data  *
    *******************************/

    for (i = list.listLevel; i >= 0; i--) {
        while (x->forward[i] != NIL && compLT(x->forward[i]->key, key))
            x = x->forward[i];
    }
    x = x->forward[0];
    if (x != NIL && compEQ(x->key, key)) {
        *rec = x->rec;
        return STATUS_OK;
    }
    return STATUS_KEY_NOT_FOUND;
}

void initList() {
    int i;

   /**************************
    *  initialize skip list  *
    **************************/

    if ((list.hdr = malloc(
            sizeof(nodeType) + MAXLEVEL*sizeof(nodeType *))) == 0) {
        printf ("insufficient memory (initList)\n");
        exit(1);
    }
    for (i = 0; i <= MAXLEVEL; i++)
        list.hdr->forward[i] = NIL;
    list.listLevel = 0;
}

int main(int argc, char **argv) {
    int i, maxnum, random;
    recType *rec;
    keyType *key;
    statusEnum status;


    /* command-line:
     *
     *   skl maxnum [random]
     *
     *   skl 2000
     *       process 2000 sequential records
     *   skl 4000 r
     *       process 4000 random records
     *
     */

    maxnum = atoi(argv[1]);
    random = argc > 2;

    initList();

    if ((rec = malloc(maxnum * sizeof(recType))) == 0) {
        fprintf (stderr, "insufficient memory (rec)\n");
        exit(1);
    }
    if ((key = malloc(maxnum * sizeof(keyType))) == 0) {
        fprintf (stderr, "insufficient memory (key)\n");
        exit(1);
    }

    if (random) {
        /* fill "a" with unique random numbers */
        for (i = 0; i < maxnum; i++) key[i] = rand();
        printf ("ran, %d items\n", maxnum);
    } else {
        for (i = 0; i < maxnum; i++) key[i] = i;
        printf ("seq, %d items\n", maxnum);
    }

    for (i = 0; i < maxnum; i++) {
        status = insert(key[i], &rec[i]);
        if (status) printf("pt1: error = %d\n", status);
    }

    for (i = maxnum-1; i >= 0; i--) {
        status = find(key[i], &rec[i]);
        if (status) printf("pt2: error = %d\n", status);
    }

    for (i = maxnum-1; i >= 0; i--) {
        status = delete(key[i]);
        if (status) printf("pt3: error = %d\n", status);
    }
    return 0;
}


  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值