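A sparse table is an important data structure. When both dimensions of a table are large but most of its cells are empty, storing only the occupied cells saves a great deal of space.
A sparse table can be implemented with linked lists. It keeps two one-dimensional arrays, one per dimension: one indexes the rows and the other indexes the columns. Each array element is the head of the linked list for that row or column.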
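Take an online store with 10000 products and 100000 customers. Each customer buys only a small fraction of the products, and no product is bought by every customer, so customers and products form a sparse table across the two dimensions. An occupied cell means that the customer bought the product and, equally, that the product was bought by that customer. Each record is therefore reachable through two linked lists, one from the customer side and one from the product side.
A course enrollment system is another example: with 500 courses and 10000 students, the table is sparse as well.
In neither example can we afford a full table of 10000 products * 100000 customers or 500 courses * 10000 students; almost all of it would be wasted space.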
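To put rough numbers on the waste, here is a small sketch. The table sizes come from the examples above; the figure of 20 purchases per customer is purely my assumption for illustration, and the helper names are made up.

```cpp
#include <cassert>
#include <cstdint>

// Number of cells in a dense rows x cols table.
constexpr std::uint64_t denseCells(std::uint64_t rows, std::uint64_t cols)
{
    return rows * cols;
}

// Number of nodes a sparse table holds if every row has avgPerRow entries.
constexpr std::uint64_t sparseNodes(std::uint64_t rows, std::uint64_t avgPerRow)
{
    return rows * avgPerRow;
}

// Store example: 100000 customers x 10000 products.
constexpr std::uint64_t kStoreDense = denseCells(100000, 10000);
// ASSUMPTION: 20 purchases per customer is an invented figure.
constexpr std::uint64_t kStoreSparse = sparseNodes(100000, 20);

static_assert(kStoreDense == 1000000000ULL, "one billion dense cells");
static_assert(kStoreSparse == 2000000ULL, "two million sparse nodes");

// Share of the dense table that would stay empty, in whole percent.
std::uint64_t emptyPercent(std::uint64_t dense, std::uint64_t used)
{
    assert(dense > 0 && used <= dense);
    return (dense - used) * 100 / dense;
}
```

Under that assumption about 99% of the dense table would never be used. A sparse node does cost more than a dense cell, since it carries two indices and two pointers, but that overhead is small next to a saving of this size.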
The data structure can be defined as follows:
template<class T1, class T2>
struct SparseLinkNode
{
    unsigned int firstDataIndex;
    T1 firstData;
    unsigned int secondDataIndex;
    T2 secondData;
    SparseLinkNode<T1, T2>* firstLevelNext;
    SparseLinkNode<T1, T2>* secondLevelNext;
};
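This is the data held by one node. T1 is the data type of the first dimension and T2 that of the second. For example, firstDataIndex could be the customer ID and firstData one or more customer attributes (name, gender, age and so on); secondDataIndex could be the product ID and secondData some product attributes (price, quantity and so on). firstLevelNext links the nodes that share the same first index (one row), and secondLevelNext links the nodes that share the same second index (one column). Note that the indices in both dimensions must be ordered and unique: each list is kept sorted, and a given pair of indices may appear only once.
Every node that exists in the table is therefore meaningful: it represents an actual product/customer relationship, and it can be found by searching from either the product side or the customer side.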
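As a rough illustration of searching from either side, the sketch below walks one row list and one column list. Node is a simplified stand-in for SparseLinkNode; the quantity field and the function names are assumptions made for this example only.

```cpp
#include <cassert>

// Simplified stand-in for SparseLinkNode with a single int payload.
struct Node
{
    unsigned int row;   // first-dimension index, e.g. customer ID
    unsigned int col;   // second-dimension index, e.g. product ID
    int quantity;       // hypothetical payload
    Node* nextInRow;    // plays the role of firstLevelNext
    Node* nextInCol;    // plays the role of secondLevelNext
};

// Total quantity bought by one customer: follow the row list.
int sumRow(const Node* rowHead)
{
    int total = 0;
    for (const Node* n = rowHead; n != nullptr; n = n->nextInRow)
        total += n->quantity;
    return total;
}

// Total quantity sold of one product: follow the column list.
int sumCol(const Node* colHead)
{
    int total = 0;
    for (const Node* n = colHead; n != nullptr; n = n->nextInCol)
        total += n->quantity;
    return total;
}

// Customer 1 bought product 2 (x5) and product 3 (x7);
// customer 2 bought product 3 (x4).
int demoTotals(bool byCustomer)
{
    Node c = {2, 3, 4, nullptr, nullptr};
    Node b = {1, 3, 7, nullptr, &c};   // same column (product 3) as c
    Node a = {1, 2, 5, &b, nullptr};   // same row (customer 1) as b
    assert(a.nextInRow == &b && b.nextInCol == &c);
    const Node* customer1 = &a;        // head of the row for customer 1
    const Node* product3 = &b;         // head of the column for product 3
    return byCustomer ? sumRow(customer1) : sumCol(product3);
}
```

demoTotals(true) adds 5 and 7 along the row of customer 1, while demoTotals(false) adds 7 and 4 along the column of product 3. The node for (1, 3) is visited by both walks, which is exactly the point of linking it twice.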
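When inserting into a sparse table, make sure the target coordinate is empty (otherwise the operation is an accumulation rather than an insertion), and insert the node into both the row list and the column list.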
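The insertion rule can be sketched in a compact form. This is not the implementation listed later in this post: it uses a pointer-to-pointer walk instead of a preIter variable, and Node, occupied and link are names I made up for the sketch.

```cpp
#include <cassert>

struct Node
{
    unsigned int row;
    unsigned int col;
    Node* nextInRow;   // next node in the same row, with a larger col
    Node* nextInCol;   // next node in the same column, with a larger row
};

// True if the sorted row list already holds a node at this column.
bool occupied(const Node* rowHead, unsigned int col)
{
    for (const Node* n = rowHead; n != nullptr && n->col <= col; n = n->nextInRow)
        if (n->col == col) return true;
    return false;
}

// Link node into its row list and its column list, keeping both sorted.
// Refuses a coordinate that is already occupied.
bool link(Node*& rowHead, Node*& colHead, Node* node)
{
    if (occupied(rowHead, node->col)) return false;   // check before touching any list

    Node** slot = &rowHead;                            // row list is ordered by col
    while (*slot != nullptr && (*slot)->col < node->col) slot = &(*slot)->nextInRow;
    node->nextInRow = *slot;
    *slot = node;

    slot = &colHead;                                   // column list is ordered by row
    while (*slot != nullptr && (*slot)->row < node->row) slot = &(*slot)->nextInCol;
    node->nextInCol = *slot;
    *slot = node;
    return true;
}

// Insert (1,3), then (1,2), then a duplicate (1,2).
int demoInsert()
{
    Node* row1 = nullptr;
    Node* col2 = nullptr;
    Node* col3 = nullptr;
    Node a = {1, 3, nullptr, nullptr};
    Node b = {1, 2, nullptr, nullptr};
    Node dup = {1, 2, nullptr, nullptr};
    bool ok = link(row1, col3, &a) && link(row1, col2, &b);
    bool dupOk = link(row1, col2, &dup);
    assert(ok && !dupOk);
    // Row 1 now reads col 2 -> col 3; encode that order as the number 23.
    return static_cast<int>(row1->col * 10 + row1->nextInRow->col);
}
```

Checking for the duplicate before modifying either list means a rejected insert leaves the table untouched, so the caller only has to dispose of the node it allocated.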
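When deleting, make sure the target coordinate actually holds data, and unlink the node from both the row list and the column list. This matters especially in C++: the node's memory must be released, and released exactly once.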
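Here is a minimal sketch of the "unlink twice, free once" rule. The live-node counter exists only to make the effect observable, and every name in the block is an assumption of the sketch rather than part of the class listed later.

```cpp
#include <cassert>

struct Node
{
    unsigned int row;
    unsigned int col;
    Node* nextInRow;
    Node* nextInCol;
};

// Number of nodes currently allocated; only used to show that nothing leaks.
static int g_liveNodes = 0;

Node* makeNode(unsigned int row, unsigned int col)
{
    ++g_liveNodes;
    return new Node{row, col, nullptr, nullptr};
}

// Unlink the node at the given column from a sorted row list.
// Returns the node, or nullptr if the coordinate is empty.
Node* unlinkFromRow(Node*& rowHead, unsigned int col)
{
    Node** slot = &rowHead;
    while (*slot != nullptr && (*slot)->col < col) slot = &(*slot)->nextInRow;
    if (*slot == nullptr || (*slot)->col != col) return nullptr;
    Node* found = *slot;
    *slot = found->nextInRow;
    return found;
}

// Unlink an already identified node from its column list.
void unlinkFromCol(Node*& colHead, Node* node)
{
    Node** slot = &colHead;
    while (*slot != nullptr && *slot != node) slot = &(*slot)->nextInCol;
    if (*slot == node) *slot = node->nextInCol;
}

// Remove one coordinate: unlink from both lists first, then free once.
bool erase(Node*& rowHead, Node*& colHead, unsigned int col)
{
    Node* node = unlinkFromRow(rowHead, col);
    if (node == nullptr) return false;   // nothing stored at that coordinate
    unlinkFromCol(colHead, node);
    delete node;                         // the single point of release
    --g_liveNodes;
    return true;
}

int demoErase()
{
    Node* row1 = makeNode(1, 2);
    Node* col2 = row1;                   // the same node heads both lists
    bool first = erase(row1, col2, 2);   // succeeds
    bool second = erase(row1, col2, 2);  // already gone, must not free again
    assert(first && !second && row1 == nullptr && col2 == nullptr);
    return g_liveNodes;                  // 0 when every node was freed once
}
```

Keeping the delete in exactly one place is the design choice that matters here; which of the two unlink steps runs first is secondary, as long as neither of them frees the node on its own.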
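Further thoughts:
The same idea is used widely in databases. To record which customers bought which products, we would create a separate purchase table; we would never add customer columns to the product table or product columns to the customer table.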
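Notes:
1. With templates, VS2015 sometimes fails to recognize the code and offers no code-completion hints. That alone does not mean there is a syntax error. Try building: if the code compiles and links, the syntax is fine.
2. I had defined only two template parameters, yet the compiler kept reporting error C2977 (too many template arguments). It turned out that a struct with the same name was defined in another file, and that one took only one template parameter. Renaming the struct made the error go away. The compiler had matched the name to the other declaration, which accepts only one argument.
3. If a local variable has the same name as a function, the compiler can no longer resolve a call to that function. I originally wrote bool isOverStack= isOverStack(firstIndex, secondIndex); and the compiler reported that the call could not accept arguments. This is not specific to templates: a variable is in scope as soon as its name has been declared, which is before its initializer, so the isOverStack in the call refers to the bool variable rather than to the function.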
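Notes 2 and 3 can be reproduced in a few lines. The sketch below is only an illustration: the namespaces legacy and sparse, the Pair structs and the Checker class are all invented here. Namespaces are one way to let two same-named templates coexist (renaming, as I did, is another), and this-> is one way to reach a member function that a local variable hides.

```cpp
#include <cassert>

// Note 2: two templates may share a name if they live in different namespaces.
// ASSUMPTION: these namespace and struct names are invented for the sketch.
namespace legacy
{
    template<class T>
    struct Pair { T value; };
}

namespace sparse
{
    template<class T1, class T2>
    struct Pair { T1 first; T2 second; };
}

// Note 3: a local variable hides a member function of the same name
// from the point where the variable's name is declared.
template<class T>
class Checker
{
public:
    explicit Checker(unsigned int limit) : limit_(limit) {}

    bool check(unsigned int index)
    {
        // bool isOverStack = isOverStack(index);     // error: the name is the bool here
        bool isOverStack = this->isOverStack(index);  // OK: member access finds the function
        return isOverStack;
    }

private:
    bool isOverStack(unsigned int index) const { return index >= limit_; }
    unsigned int limit_;
};

int demoNames()
{
    legacy::Pair<int> one = {1};
    sparse::Pair<int, int> two = {2, 3};
    Checker<int> checker(10);
    assert(checker.check(10) && !checker.check(9));
    return one.value + two.first + two.second;
}
```

Giving the variable a different name, as the listing later in this post does with isOverFlow, is the simpler fix; this-> is shown only to make clear that the function itself was never the problem.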
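How C++ and C++ templates are compiled:
The C++ compiler turns each .cpp file, together with the .h files it includes, into an .obj file, and the linker combines the .obj files into an executable. Without templates, the .h file holds a function's declaration and the .cpp file holds its implementation. When the function is called, the linker can find it, so compiling and linking both succeed. The size of a C++ object can be determined at compile time.
Templates change this. While compiling the template itself, the compiler does not know how much space a concrete object will occupy; it only knows once the template is instantiated with concrete types. The compiler can therefore parse a template definition and check its syntax, but it cannot generate object code for the member functions, because that requires concrete types rather than template parameters. At the point of instantiation the compiler looks for the corresponding function. No code was generated for it earlier, so the compiler tries to generate the concrete definition there and then, which only works if the template's implementation is visible in the file where the instantiation occurs. That is why template implementations are usually written in the .h file; otherwise the instantiating file has to use something like #include "xxx.cpp". Either way, the goal is that the function definitions are known at the point of instantiation.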
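The following single-file sketch shows the two arrangements side by side. Box is a hypothetical class used only for illustration, and the comments marking Box.h and Box.cpp indicate where each part would live in a real project. Explicit instantiation is a standard C++ alternative that the text above does not cover; I include it only as an option, under the assumption that the set of needed types is known in advance.

```cpp
#include <cassert>

// --- would live in Box.h ---------------------------------------------
template<class T>
class Box
{
public:
    explicit Box(const T& value);
    const T& get() const;
private:
    T value_;
};

// Approach 1: keep the definitions in the header, next to the declaration,
// so every file that includes Box.h can instantiate them.
template<class T>
Box<T>::Box(const T& value) : value_(value) {}

template<class T>
const T& Box<T>::get() const { return value_; }

// --- alternative, would live in Box.cpp -------------------------------
// Approach 2: if the definitions above were moved into Box.cpp, an explicit
// instantiation would make the compiler emit the code for Box<int> in that
// translation unit, and other files would need only the declarations.
template class Box<int>;

int demoBox()
{
    Box<int> box(42);
    assert(box.get() == 42);
    return box.get();
}
```

The trade-off with explicit instantiation is that every type has to be listed by hand; a type that was not listed fails at link time, which is the same symptom as forgetting to make the definitions visible.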
Test the code as follows. Memory usage stays flat while the loop runs, which indicates that no memory is leaking.
while (true)
{
    SparseTable<int, string>* sparseTable = new SparseTable<int, string>();
    // insert(firstIndex, firstData, secondIndex, secondData)
    sparseTable->insert(92, -17, 2, "922922");
    sparseTable->insert(2, 23, 3, "2323");
    sparseTable->insert(1, 12, 2, "1212");
    sparseTable->insert(1, 13, 3, "1313");
    sparseTable->insert(2, 21, 1, "2121");
    sparseTable->insert(3, 32, 2, "3232");
    sparseTable->insert(2, 22, 2, "2222");
    sparseTable->deleteViaIndex(2, 2);   // present: the node is removed
    sparseTable->deleteViaIndex(0, 3);   // nothing stored there: returns false
    sparseTable->deleteViaIndex(45, 32); // nothing stored there: returns false
    delete sparseTable;                  // the destructor frees the remaining nodes
    sparseTable = NULL;
}
// SparseTable.h
// Guard renamed: names starting with an underscore and a capital letter are reserved.
#ifndef SPARSE_TABLE_H_
#define SPARSE_TABLE_H_
#include "Constant.h" // expected to define SPARSE_TABLE_ROW and SPARSE_TABLE_COL

// One occupied cell, linked into one row list and one column list.
template<class T1, class T2>
struct SparseLinkNode
{
    unsigned int firstDataIndex;
    T1 firstData;
    unsigned int secondDataIndex;
    T2 secondData;
    SparseLinkNode<T1, T2>* firstLevelNext;  // next node in the same row
    SparseLinkNode<T1, T2>* secondLevelNext; // next node in the same column
};

// Head of one row list or one column list.
template<class T1, class T2>
struct IndexNode
{
    SparseLinkNode<T1, T2>* next;
};

template<class T1, class T2>
class SparseTable
{
public:
    SparseTable();
    ~SparseTable();
    // Copying would make two tables free the same nodes, so it is disabled.
    SparseTable(const SparseTable&) = delete;
    SparseTable& operator=(const SparseTable&) = delete;
    bool insert(const unsigned int firstIndex, const T1 firstData, const unsigned int secondIndex, const T2 secondData);
    bool deleteViaIndex(const unsigned int firstIndex, const unsigned int secondIndex);
private:
    bool isOverStack(const unsigned int firstIndex, const unsigned int secondIndex);
    bool insertAtRow(const unsigned int firstIndex, const unsigned int secondIndex, SparseLinkNode<T1, T2> *node);
    bool insertAtCol(const unsigned int firstIndex, const unsigned int secondIndex, SparseLinkNode<T1, T2> *node);
    bool delAtRow(const unsigned int firstIndex, const unsigned int secondIndex);
    bool delAtCol(const unsigned int firstIndex, const unsigned int secondIndex);
    IndexNode<T1, T2> rowIndexNodeArray[SPARSE_TABLE_ROW];
    IndexNode<T1, T2> colIndexNodeArray[SPARSE_TABLE_COL];
};
#endif // !SPARSE_TABLE_H_
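// SparseTable.cpp
// These are template definitions, so the file that instantiates SparseTable
// must be able to see them (for example by including this file).
#include <cstddef> // NULL
#include "SparseTable.h"

template<class T1, class T2>
SparseTable<T1, T2>::SparseTable()
{
    // Every row list and column list starts out empty.
    for (int i = 0; i < SPARSE_TABLE_ROW; i++)
    {
        rowIndexNodeArray[i].next = NULL;
    }
    for (int i = 0; i < SPARSE_TABLE_COL; i++)
    {
        colIndexNodeArray[i].next = NULL;
    }
}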
template<class T1, class T2>
SparseTable<T1, T2>::~SparseTable()
{
    // Pass 1: detach every node from its row list. Nothing is freed here,
    // because each node is still linked into a column list.
    for (int i = 0; i < SPARSE_TABLE_ROW; i++)
    {
        SparseLinkNode<T1, T2>* rowIter = rowIndexNodeArray[i].next;
        SparseLinkNode<T1, T2>* needDelIter = NULL;
        while (NULL != rowIter)
        {
            needDelIter = rowIter;
            rowIter = rowIter->firstLevelNext;
            needDelIter->firstLevelNext = NULL;
        }
        rowIndexNodeArray[i].next = NULL;
    }
    // Pass 2: walk the column lists and free each node exactly once.
    for (int i = 0; i < SPARSE_TABLE_COL; i++)
    {
        SparseLinkNode<T1, T2>* colIter = colIndexNodeArray[i].next;
        SparseLinkNode<T1, T2>* needDelIter = NULL;
        while (NULL != colIter)
        {
            needDelIter = colIter;
            colIter = colIter->secondLevelNext;
            needDelIter->secondLevelNext = NULL;
            delete needDelIter;
            needDelIter = NULL;
        }
        colIndexNodeArray[i].next = NULL;
    }
}
template<class T1, class T2>
bool SparseTable<T1, T2>::insert(const unsigned int firstIndex, const T1 firstData, const unsigned int secondIndex, const T2 secondData)
{
    bool rs = true;
    bool isOverFlow = isOverStack(firstIndex, secondIndex);
    if (!isOverFlow)
    {
        SparseLinkNode<T1, T2> *node = new SparseLinkNode<T1, T2>();
        node->firstDataIndex = firstIndex;
        node->firstData = firstData;
        node->secondDataIndex = secondIndex;
        node->secondData = secondData;
        node->firstLevelNext = NULL;
        node->secondLevelNext = NULL;
        if (!insertAtRow(firstIndex, secondIndex, node))
        {
            //insertAtRow fails only when the coordinate is already occupied.
            //The node was never linked, so free it here to avoid a leak.
            delete node;
            node = NULL;
            rs = false;
        }
        else
        {
            //The row list did not hold this coordinate, so the column list
            //does not hold it either and this insertion succeeds.
            rs = insertAtCol(firstIndex, secondIndex, node);
        }
    }
    else
    {
        rs = false;
    }
    return rs;
}
template<class T1, class T2>
bool SparseTable<T1, T2>::isOverStack(const unsigned int firstIndex, const unsigned int secondIndex)
{
    //True when either index falls outside the table
    return (firstIndex >= SPARSE_TABLE_ROW) || (secondIndex >= SPARSE_TABLE_COL);
}
template<class T1, class T2>
bool SparseTable<T1, T2>::insertAtRow(const unsigned int firstIndex, const unsigned int secondIndex, SparseLinkNode<T1, T2> *node)
{
    bool rs = false;
    SparseLinkNode<T1, T2> *iter = rowIndexNodeArray[firstIndex].next;
    if (NULL == iter)
    {
        //This row is empty, insert the node after head
        rowIndexNodeArray[firstIndex].next = node;
        rs = true;
    }
    else
    {
        //The row list is sorted by secondDataIndex
        SparseLinkNode<T1, T2> *preIter = NULL;
        while (NULL != iter)
        {
            unsigned int secondDataIndex = iter->secondDataIndex;
            if (secondIndex > secondDataIndex)
            {
                preIter = iter;
                iter = iter->firstLevelNext;
                continue;
            }
            else if (secondIndex == secondDataIndex)
            {
                //duplicate, error case
                rs = false;
                break;
            }
            else //Need insert
            {
                if (NULL == preIter)
                {
                    //At row level, insert the node after head and before iter
                    node->firstLevelNext = iter;
                    rowIndexNodeArray[firstIndex].next = node;
                    rs = true;
                    break;
                }
                else
                {
                    //At row level, insert the node before iter and after preIter
                    node->firstLevelNext = iter;
                    preIter->firstLevelNext = node;
                    rs = true;
                    break;
                }
            }
        }
        if (!rs && (NULL == iter))
        {
            //Reached the end of the row list: append the node after preIter
            if (NULL == preIter)
            {
                //Impossible, the list was not empty
                rs = false;
            }
            else
            {
                preIter->firstLevelNext = node;
                rs = true;
            }
        }
    }
    return rs;
}
template<class T1, class T2>
bool SparseTable<T1, T2>::insertAtCol(const unsigned int firstIndex, const unsigned int secondIndex, SparseLinkNode<T1, T2> *node)
{
    bool rs = false;
    SparseLinkNode<T1, T2> *iter = colIndexNodeArray[secondIndex].next;
    if (NULL == iter)
    {
        //This column is empty, insert the node after head
        colIndexNodeArray[secondIndex].next = node;
        rs = true;
    }
    else
    {
        //The column list is sorted by firstDataIndex
        SparseLinkNode<T1, T2> *preIter = NULL;
        while (NULL != iter)
        {
            unsigned int firstDataIndex = iter->firstDataIndex;
            if (firstIndex > firstDataIndex)
            {
                preIter = iter;
                iter = iter->secondLevelNext;
                continue;
            }
            else if (firstIndex == firstDataIndex)
            {
                //duplicate, error case
                rs = false;
                break;
            }
            else //Need insert
            {
                if (NULL == preIter)
                {
                    //At col level, insert the node after head and before iter
                    node->secondLevelNext = iter;
                    colIndexNodeArray[secondIndex].next = node;
                    rs = true;
                    break;
                }
                else
                {
                    //At col level, insert the node before iter and after preIter
                    node->secondLevelNext = iter;
                    preIter->secondLevelNext = node;
                    rs = true;
                    break;
                }
            }
        }
        if (!rs && (NULL == iter))
        {
            //Reached the end of the column list: append the node after preIter
            if (NULL == preIter)
            {
                //Impossible, the list was not empty
                rs = false;
            }
            else
            {
                preIter->secondLevelNext = node;
                rs = true;
            }
        }
    }
    return rs;
}
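template<class T1, class T2>
bool SparseTable<T1, T2>::deleteViaIndex(const unsigned int firstIndex, const unsigned int secondIndex)
{
    bool rs = false;
    /****** Note: do not name this local variable isOverStack.
    A variable is in scope from its own declaration onward, so inside its
    initializer the name would refer to the variable, not to the member
    function, and the call would not compile. ******/
    bool isOverFlow = isOverStack(firstIndex, secondIndex);
    if (!isOverFlow)
    {
        //The order matters. && evaluates left to right and short-circuits:
        //delAtRow only unlinks the node, and delAtCol, which frees it,
        //runs only if delAtRow found the node.
        rs = (delAtRow(firstIndex, secondIndex) && (delAtCol(firstIndex, secondIndex)));
    }
    else
    {
        rs = false;
    }
    return rs;
}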
template <class T1, class T2>
bool SparseTable<T1, T2>::delAtRow(const unsigned int firstIndex, const unsigned int secondIndex)
{
    bool rs = false;
    SparseLinkNode<T1, T2> *iter = rowIndexNodeArray[firstIndex].next;
    if (NULL == iter)
    {
        rs = false;
    }
    else
    {
        SparseLinkNode<T1, T2> *preIter = NULL;
        while (NULL != iter)
        {
            unsigned int secondLevelIndex = iter->secondDataIndex;
            if (secondIndex > secondLevelIndex)
            {
                preIter = iter;
                iter = iter->firstLevelNext;
                continue;
            }
            else if (secondIndex == secondLevelIndex) //Found the node, unlink it
            {
                //Be careful: do not call delete here, the column list still links to this node
                if (NULL == preIter) //unlink the first node
                {
                    rowIndexNodeArray[firstIndex].next = iter->firstLevelNext;
                    iter->firstLevelNext = NULL;
                    rs = true;
                    break;
                }
                else
                {
                    preIter->firstLevelNext = iter->firstLevelNext;
                    iter->firstLevelNext = NULL;
                    rs = true;
                    break;
                }
            }
            else //The list is sorted, so the node cannot appear later: not found
            {
                rs = false;
                break;
            }
        }
    }
    return rs;
}
template <class T1, class T2>
bool SparseTable<T1, T2>::delAtCol(const unsigned int firstIndex, const unsigned int secondIndex)
{
    bool rs = false;
    SparseLinkNode<T1, T2> *iter = colIndexNodeArray[secondIndex].next;
    if (NULL == iter)
    {
        rs = false;
    }
    else
    {
        SparseLinkNode<T1, T2> *preIter = NULL;
        while (NULL != iter)
        {
            unsigned int firstLevelIndex = iter->firstDataIndex;
            if (firstIndex > firstLevelIndex)
            {
                preIter = iter;
                iter = iter->secondLevelNext;
                continue;
            }
            else if (firstIndex == firstLevelIndex) //Found the node, unlink and free it
            {
                //Be careful: this is the only place where the node's memory is released
                if (NULL == preIter) //delete the first node
                {
                    colIndexNodeArray[secondIndex].next = iter->secondLevelNext;
                    iter->secondLevelNext = NULL;
                    delete iter;
                    iter = NULL;
                    rs = true;
                    break;
                }
                else
                {
                    preIter->secondLevelNext = iter->secondLevelNext;
                    iter->secondLevelNext = NULL;
                    delete iter;
                    iter = NULL;
                    rs = true;
                    break;
                }
            }
            else //The list is sorted, so the node cannot appear later: not found
            {
                rs = false;
                break;
            }
        }
    }
    return rs;
}