Skip List Implementation: Insert, Remove, and Search Algorithms Explained

FreeeLinux

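This article takes a close look at the basic concepts of the skip list and how to implement it efficiently. By comparing it with balanced tree structures, it shows how a skip list uses linked lists to reach O(log n) lookups. Detailed code and test cases are included.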

(Article address: http://blog.csdn.net/freeelinux/article/details/53114687)

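I had heard of skip lists for a long time. I have written AVL trees and red-black trees before, both of which are balanced structures. A skip list, however, is not a tree; it is built on linked lists.

As we know, finding an element in a linked list takes O(n) time.

In a sorted linked list, if we keep one extra pointer to the node in the middle of the list, finding an element takes at most about n/2+1 comparisons: one against the middle node, then a scan of one half. This is similar to binary search.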
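The n/2+1 bound can be checked with a small sketch. It is only an illustration of the idea, not part of the skip list code in this article: Node and search_with_middle are hypothetical names, and the list is assumed to be sorted in ascending order with `middle` pointing at its middle node.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node of a sorted singly linked list (ascending order).
struct Node {
    int   value;
    Node* next;
};

// Searches for key using one extra pointer to the middle node.
// steps counts comparisons: one against the middle node, plus one per node scanned.
bool search_with_middle(Node* head, Node* middle, int key, int& steps) {
    steps = 1;  // the comparison against the middle node
    bool start_at_middle = (middle != NULL && middle->value <= key);
    Node* p    = start_at_middle ? middle : head;   // scan only one half
    Node* stop = start_at_middle ? NULL   : middle; // the first half ends before middle
    for (; p != stop; p = p->next) {
        ++steps;
        if (p->value == key) return true;
        if (p->value > key)  return false;          // sorted, so key cannot appear later
    }
    return false;
}
```

For a list of 8 nodes, the worst case visits 4 nodes after the single comparison against the middle node, which gives 8/2+1 = 5 steps.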

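With this idea in hand, we can now build a skip list.

Look at the figure first to get an impression; the explanation follows.

As the figure shows, every node has a pointer-array field whose entries point to its next node. The ordinary linked list is called the level-0 chain, and adding one level on top of it gives the level-1 chain. The pointer array of a node therefore holds at most level+1 entries, where level is the highest level.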

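Why a pointer array? We exploit array indexing: all the elements of one level's chain are tied to the same index, and the pointer stored at a given index holds the address of the next element on that same level.

For example, there are two ways to reach 20 (we call the first node the head node head_, and link_[] is the pointer array; see the definitions below for details):

1. head_->link_[0]

2. head_->link_[1]

What about head_->link_[2]? It jumps straight to 40. With this layered structure it is easy to traverse the chain of any level, as the code below also shows.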

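The pointer array has a second benefit. To find an element we start from the skip list's current highest level. Say we are looking for 30: we start at level i=2 (indices start at 0) and first reach 40. Since 30<40, we drop one level by decrementing the index. This is where the array pays off: i is now 1, and following link_[1] leads to 24. This may feel familiar: the search range has just been cut roughly in half. Isn't this binary search in another form?

Yes. While searching, whenever the next key on the current level is not smaller than the target, we drop a level. At level 0 a single candidate remains, and either it is the target or the target is not in the skip list. This brings the lookup cost down from the O(n) of an ordinary linked list to O(log n) on average, which is a big leap in performance.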
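The descent can be traced on a small hand-built structure. The following is a minimal sketch under stated assumptions, not the article's implementation: DemoNode, build_demo, demo_search and kMaxLevels are hypothetical names, and the keys 10 to 70 are chosen for illustration (they are not the keys of the figure). Every second node joins level 1 and every fourth node joins level 2, which is the ideal layout.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

const int kMaxLevels = 3;  // hypothetical: levels 0..2

struct DemoNode {
    int       data;
    DemoNode* link[kMaxLevels];  // link[i] is the next node on level i
};

// Builds keys 10, 20, ..., 70 in a caller-provided array of 7 nodes.
// tail holds INT_MAX and acts as the sentinel on every level.
void build_demo(DemoNode* head, DemoNode* tail, DemoNode* nodes) {
    tail->data = INT_MAX;
    DemoNode* prev[kMaxLevels] = { head, head, head };
    for (int n = 0; n < 7; ++n) {
        nodes[n].data = (n + 1) * 10;
        int height = ((n + 1) % 4 == 0) ? 2 : ((n + 1) % 2 == 0) ? 1 : 0;
        for (int i = 0; i <= height; ++i) {
            prev[i]->link[i] = &nodes[n];
            prev[i] = &nodes[n];
        }
    }
    for (int i = 0; i < kMaxLevels; ++i)
        prev[i]->link[i] = tail;
}

// Starts on the top level, moves right while the next key is smaller than key,
// then drops one level. visited counts the nodes stepped onto.
bool demo_search(DemoNode* head, int top_level, int key, int& visited) {
    visited = 0;
    DemoNode* p = head;
    for (int i = top_level; i >= 0; --i) {
        while (p->link[i]->data < key) {
            p = p->link[i];
            ++visited;
        }
    }
    return p->link[0]->data == key;  // the only remaining candidate
}
```

Searching for 70 steps onto 40 on level 2 and onto 60 on level 1, and then finds 70 as the level-0 successor of 60. A plain level-0 scan would have stepped through six nodes first.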

In the ideal case, a skip list with n elements has log2(n) chain levels, so its highest level is log2(n)-1.
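As a rough guide for choosing a level cap, the ideal level count can be computed directly. This is a sketch with hypothetical helper names (ideal_level_count, highest_level_index); it rounds log2(n) up when n is not a power of two, which is an assumption the article does not state.

```cpp
#include <cassert>

// Number of halvings needed to shrink n down to 1, i.e. ceil(log2(n)).
// Returns at least 1 so that even a tiny list has a level-0 chain.
int ideal_level_count(unsigned long n) {
    int levels = 0;
    while (n > 1) {
        n -= n / 2;   // keep the larger half, which rounds up
        ++levels;
    }
    return levels > 0 ? levels : 1;
}

// Levels are numbered from 0, so the highest level is one less than the count.
int highest_level_index(unsigned long n) {
    return ideal_level_count(n) - 1;
}
```

For n = 8 this gives 3 levels with highest level 2, and for n = 1000 it gives 10 levels with highest level 9.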

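The pointer arrays in effect create several extra virtual linked lists. The real data lives only on level 0; every index above level 0 points to those same level-0 elements.

Another distinctive part of the design is last_[], which stores the predecessor, on every level, of the element being searched for. When removing or inserting an element, this makes it easy to update the chain on every level instead of breaking one of them. One more point: when the current level count levels_ is raised, the new entry last_[levels_] must be set to head_ (as the insert code below does), because the new level is still empty and save_search has not filled in that entry. Otherwise later inserts and removes, which walk down from the highest level levels_, will go wrong.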
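The role of the predecessor array can be sketched in isolation. This is a simplified illustration with fixed-size arrays and int keys, not the article's class: SketchNode, SketchList, init_sketch, find_predecessors, splice_in and kLevels are hypothetical names, and the caller is assumed to pass keys that are smaller than INT_MAX and not already present.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

const int kLevels = 4;  // hypothetical cap: levels 0..3

struct SketchNode {
    int         data;
    SketchNode* link[kLevels];
};

struct SketchList {
    SketchNode  head;           // sentinel, holds no key
    SketchNode  tail;           // sentinel, holds INT_MAX
    SketchNode* last[kLevels];  // predecessor of the searched key on each level
    int         levels;         // current highest level in use
};

void init_sketch(SketchList& sl) {
    sl.tail.data = INT_MAX;
    for (int i = 0; i < kLevels; ++i) {
        sl.head.link[i] = &sl.tail;
        sl.tail.link[i] = NULL;
    }
    sl.levels = 0;
}

// Records, for every level in use, the last node whose key is smaller than key.
SketchNode* find_predecessors(SketchList& sl, int key) {
    SketchNode* p = &sl.head;
    for (int i = sl.levels; i >= 0; --i) {
        while (p->link[i]->data < key)
            p = p->link[i];
        sl.last[i] = p;
    }
    return p->link[0];
}

// Links node into levels 0..height using the recorded predecessors.
void splice_in(SketchList& sl, SketchNode* node, int height) {
    find_predecessors(sl, node->data);
    if (height > sl.levels) {
        if (sl.levels < kLevels - 1) {
            ++sl.levels;                    // grow by one level at a time
            sl.last[sl.levels] = &sl.head;  // the new level is empty: head -> tail
        }
        height = sl.levels;
    }
    for (int i = 0; i <= height; ++i) {
        node->link[i] = sl.last[i]->link[i];
        sl.last[i]->link[i] = node;
    }
}
```

Without the assignment to last[levels] after the level count grows, the splice loop would read an entry that the search never filled in, which is the failure the paragraph above warns about.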

Below is the definition of the skip_node type:

const int DEFAULT_SIZE = 100;

template <typename T>
struct skip_node {
    T            data_;
    skip_node<T> **link_;    //pointer array size default 100
    skip_node(int size = DEFAULT_SIZE){
        link_ = new(std::nothrow) skip_node<T>*[size];
        assert(link_ != NULL);
    }   
    ~skip_node(){
        delete []link_;
    }
};

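The skip list node above uses templates to be generic. data_ can be an integer, a character, or even a user-defined type; a user-defined type must overload the operators the code relies on (operator<, operator>=, operator==, and operator<< for traverse).

Below is the definition of skip_list: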

template <typename T>
class skip_list {
public:
    skip_list(const T& large, int max_level);
    ~skip_list();
public:
    bool insert(const T& el);
    bool remove(const T& el);
    bool search(const T& el);
    void traverse() const;
protected:
    bool assert_is_valid(const T& el);
    int  pick_level();
    skip_node<T>* save_search(const T& k);
private:
    int          max_level_;
    int          levels_;
    T            tail_key_;
    skip_node<T> *head_;
    skip_node<T> *tail_;
    skip_node<T> **last_;
};

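max_level_: the maximum level allowed. Too many levels hurt efficiency, so the level count needs an upper bound.

levels_: the current highest level of the skip list.

tail_key_: the maximum value of your type T. If T is a struct, you must pass in a suitably constructed maximum value of that struct. It is stored in the tail_ node.

head_: the head node of the list. To keep the algorithms uniform it stores no data; only its pointers to the nodes on each level are used.

tail_: the tail node of the skip list. It holds the maximum value defined above and acts as a sentinel that stops every search; it stores no user data.

last_: stores the predecessor, on every level, of the node being looked up. It is filled in by save_search.

search() is the public function for users; save_search() is used internally by insert and remove. Because inserting or removing a node changes every level it sits on, the node's predecessor on each of those levels has to be saved for the update.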

The implementation is as follows:

#ifndef SKIP_LIST_H
#define SKIP_LIST_H

#include <iostream>
#include <new>        // std::nothrow
#include <stdlib.h>   // rand, RAND_MAX
#include <assert.h>
#include <limits.h>

const int DEFAULT_SIZE = 100;

template <typename T>
struct skip_node {
	T            data_;
	skip_node<T> **link_;    //pointer array size default 100
	skip_node(int size = DEFAULT_SIZE){
		link_ = new(std::nothrow) skip_node<T>*[size];
		assert(link_ != NULL);
	}
	~skip_node(){
		delete []link_;
	}
};

template <typename T>
class skip_list {
public:
	skip_list(const T& large, int max_level);
	~skip_list();
public:
	bool insert(const T& el);
	bool remove(const T& el);
	bool search(const T& el);
	void traverse() const;
protected:
	bool assert_is_valid(const T& el);
	int  pick_level(); 
	skip_node<T>* save_search(const T& k);
private:
	int          max_level_;
	int          levels_;
	T            tail_key_;
	skip_node<T> *head_;
	skip_node<T> *tail_;
	skip_node<T> **last_;
};

template <typename T>
skip_list<T>::skip_list(const T& large, int max_level)
	: max_level_(max_level), tail_key_(large)    //listed in member declaration order
{
	head_ = new(std::nothrow) skip_node<T>(max_level_+1);
	assert(head_ != NULL);
	tail_ = new(std::nothrow) skip_node<T>(0);
	assert(tail_ != NULL);
	tail_->data_ = large;
	last_ = new(std::nothrow) skip_node<T>*[max_level_+1];
	assert(last_ != NULL);
		
	for(int i=0; i<=max_level_; ++i)
		head_->link_[i] = tail_;
	levels_ = 0;
}

template <typename T>
skip_list<T>::~skip_list()
{
	//walk the level-0 chain from head_, freeing every node before tail_
	skip_node<T>* tmp = NULL;
	while(tmp != tail_){
		tmp = head_->link_[0];
		delete head_;
		head_ = tmp;
	}
	delete tail_;
	delete []last_;
}

template <typename T>
bool skip_list<T>::search(const T& key)
{
	if(!assert_is_valid(key))
		return false;
	
	skip_node<T>* p = head_;
	for(int i=levels_; i>=0; --i){
		//key < tail_key_, so the walk always stops before tail_ and never reads tail_->link_
		while(p->link_[i]->data_ < key)
			p = p->link_[i];
	}
	return p->link_[0]->data_ == key;
}

template <typename T>
skip_node<T>* skip_list<T>::save_search(const T& key)
{
	if(!assert_is_valid(key))
		return NULL;
	
	skip_node<T>* p = head_;
	for(int i=levels_; i>=0; --i){
		while(p->link_[i]->data_ < key)    //operator <
			p = p->link_[i];
		last_[i] = p;        //last_[i] stores the predecessor of key on level i
	}
	return p->link_[0];
}

template <typename T>
int skip_list<T>::pick_level()
{
	int cnt = 0;
	while(rand() < (RAND_MAX>>1))
		++cnt;
	return cnt > max_level_ ? max_level_ : cnt;
}

template <typename T>
bool skip_list<T>::assert_is_valid(const T& el)
{
	return (el >= (T)0 && el < tail_key_);    //class types must overload >= and < and be constructible from 0
}


template <typename T>
bool skip_list<T>::insert(const T& el)
{
	skip_node<T>* p = save_search(el);
	if(p == NULL || p->data_ == el)
		return false;
	
	int lev = pick_level();
	if(lev > levels_){
		lev = ++levels_;         //grow by at most one level per insert
		last_[lev] = head_;      //the new top level is empty, so its predecessor is head_
	}

	skip_node<T>* new_node = new(std::nothrow) skip_node<T>(lev+1);
	assert(new_node != NULL);
	new_node->data_ = el;

	for(int i=lev; i>=0; --i){   //link the node only into levels 0..lev
		new_node->link_[i] = last_[i]->link_[i];
		last_[i]->link_[i] = new_node;
	}
	return true;
}

template <typename T>
bool skip_list<T>::remove(const T& el)
{
	skip_node<T>* p = save_search(el);
	if(p == NULL || p == tail_ || !(p->data_ == el))   //el is not in the list
		return false;
	
	//upper levels may not contain the node, so stop at the first level where last_[i] is not followed by p
	for(int i=0; i<=levels_ && last_[i]->link_[i] == p; ++i)
		last_[i]->link_[i] = p->link_[i];
	while(levels_ > 0 && head_->link_[levels_] == tail_)
		--levels_;
	delete p;   //release the unlinked node
	return true;
}

template <typename T>
void skip_list<T>::traverse() const
{
	skip_node<T> *p = NULL;
	for(int i=levels_; i>=0; --i){
		std::cout<<"head->";
		p = head_->link_[i];
		while(p->data_ < tail_key_){
			std::cout<<p->data_<<"->";
			p = p->link_[i];
		}
		std::cout<<"tail";
		std::cout<<std::endl;
	}
}

#endif

Test cases:

#include "skip_list.h"

#include <iostream>
#include <limits.h>
using namespace std;

struct key_value {
	int key_;
	int value_;
	key_value() : key_(0), value_(0) {}      //required: skip_node default-constructs data_ before a value is assigned
	key_value(int key, int value = 0)
		: key_(key), value_(value)
	{}
	~key_value() {}
	bool operator<(const key_value& other) const{
		return key_ < other.key_;
	}
	bool operator>=(const key_value& other) const{
		return key_ >= other.key_;
	}
	key_value& operator=(const key_value& other){
		if((void *)this != (void *)&other){
			key_ = other.key_;
			value_ = other.value_;
		}
		return *this;
	}
	bool operator==(const key_value& other) const{
		return key_ == other.key_;
	}
	friend ostream& operator<<(ostream& out, const key_value& obj);
};

ostream& operator<<(ostream& out, const key_value& obj)
{
	out<<'('<<obj.key_<<','<<obj.value_<<')';
	return out;
}

#if 1
int main()
{
	key_value v1(1, 1);
	key_value v2(2, 1);
	key_value v3(2, 1);
	key_value v4(3, 1);
	key_value v5(4, 1);
	key_value v6(9, 1);
	key_value large(INT_MAX, 0);
	skip_list<key_value> sl(large, 7);

	sl.insert(v1);
	sl.insert(v2);
	sl.insert(v3);
	sl.insert(v4);
	sl.insert(v5);
	sl.insert(v6);

	sl.traverse();

	sl.remove(v6);
	sl.remove(v1);
	sl.remove(v3);
	sl.remove(v4);
	sl.remove(v5);
	sl.remove(v2);

	sl.traverse();

	return 0;
}
#endif

#if 0
int main()
{
	skip_list<int> sl(INT_MAX, 7);

	sl.insert(1);
	sl.insert(2);
	sl.insert(3);
	sl.insert(2);
	sl.insert(9);
	sl.insert(4);
	sl.insert(5);
	sl.insert(10);
	sl.insert(300);
	
	sl.traverse();

	cout<<sl.search(1)<<endl;	
	cout<<sl.search(2)<<endl;	
	cout<<sl.search(3)<<endl;	
	cout<<sl.search(4)<<endl;	
	cout<<sl.search(5)<<endl;	
	cout<<sl.search(9)<<endl;	
	cout<<sl.search(10)<<endl;	
	cout<<sl.search(300)<<endl;	
	cout<<sl.search(250)<<endl;	
	cout<<sl.search(-1)<<endl;	

/*
	sl.traverse();
	sl.remove(2);
	sl.traverse();
	sl.remove(300);
	sl.traverse();
	sl.remove(1);
	sl.traverse();
	sl.remove(4);
	sl.traverse();
	sl.remove(9);
	sl.remove(3);
	sl.remove(5);
	sl.remove(300);
	sl.remove(10);
	sl.traverse();
*/
	return 0;
}
#endif

User-defined types must overload the relevant operators. All of the code above passed my tests, and a valgrind run reported no memory leaks.