C++ Knowledge Point 33: Using the C++ Standard Library (the unordered associative containers unordered_(multi)map and unordered_(multi)set)

Latest recommended article published on 2023-07-21 22:07:33

Master Cui


Category: C++ Basics. Tags: c++, data structures

Article link: https://blog.csdn.net/Master_Cui/article/details/108765842


C++ provides four unordered associative containers: unordered_map, unordered_set, unordered_multimap, and unordered_multiset.

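The biggest difference from the ordered associative containers is that the unordered associative containers do not sort their elements by key or value, so the elements are stored in no particular order. An unordered associative container can tell you whether an element is present, but it cannot give you the elements in sorted order.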

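Unordered associative containers are implemented with a hash table, as shown in the figure below.

[Figure: a hash table in which each bucket holds a linked list of elements]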

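A hash function assigns each element to a bucket, and each bucket is typically a linked list. Because a lookup only has to hash the key and search one bucket, lookups in an unordered container are generally faster (average constant time) than in an ordered associative container (logarithmic time).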

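The difference between unordered_map and unordered_multimap mirrors the difference between map and multimap.

The difference between unordered_set and unordered_multiset mirrors the difference between set and multiset.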

1. Declarations and constructors of the four unordered containers

template < class Key,                                    // unordered_map::key_type
           class T,                                      // unordered_map::mapped_type
           class Hash = hash<Key>,                       // unordered_map::hasher
           class Pred = equal_to<Key>,                   // unordered_map::key_equal
           class Alloc = allocator< pair<const Key,T> >  // unordered_map::allocator_type
           > class unordered_map;

template < class Key,                        // unordered_set::key_type/value_type
           class Hash = hash<Key>,           // unordered_set::hasher
           class Pred = equal_to<Key>,       // unordered_set::key_equal
           class Alloc = allocator<Key>      // unordered_set::allocator_type
           > class unordered_set;

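The declarations of unordered_map and unordered_multimap differ only in the class name. The same holds for unordered_set and unordered_multiset.

class Hash specifies the hash function. The default is hash<Key>, a function object declared in the header <functional>; it usually does not need to be replaced.

class Pred specifies the equivalence criterion. It takes two arguments of the same type and must return bool. By default it uses operator== to decide whether two keys are equal. For any two distinct elements in an unordered_map or unordered_set, this function object returns false when applied to their keys. It usually does not need to be replaced either.

class Alloc is the memory allocator used to allocate storage. It usually does not need to be reimplemented.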

Constructors of unordered_map, unordered_multimap, unordered_set, and unordered_multiset

unordered_map();// default constructor
explicit unordered_map ( size_type n,
                         const hasher& hf = hasher(),
                         const key_equal& eql = key_equal(),
                         const allocator_type& alloc = allocator_type() );// constructor taking a bucket count; n is the minimum number of buckets
explicit unordered_map ( const allocator_type& alloc );
unordered_map ( size_type n, const allocator_type& alloc );
unordered_map ( size_type n, const hasher& hf, const allocator_type& alloc );

template <class InputIterator>// constructor that initializes from an iterator range
unordered_map ( InputIterator first, InputIterator last,
                size_type n = /* see below */,
                const hasher& hf = hasher(),
                const key_equal& eql = key_equal(),
                const allocator_type& alloc = allocator_type() );

template <class InputIterator>
unordered_map ( InputIterator first, InputIterator last,
                size_type n, const allocator_type& alloc );// initializes from an iterator range and specifies the bucket count

template <class InputIterator>
unordered_map ( InputIterator first, InputIterator last,
                size_type n, const hasher& hf, const allocator_type& alloc );

unordered_map ( const unordered_map& ump );// copy constructor
unordered_map ( const unordered_map& ump, const allocator_type& alloc );

unordered_map ( initializer_list<value_type> il,
                size_type n = /* see below */,
                const hasher& hf = hasher(),
                const key_equal& eql = key_equal(),
                const allocator_type& alloc = allocator_type() );// list initialization
unordered_map ( initializer_list<value_type> il,
                size_type n, const allocator_type& alloc );
unordered_map ( initializer_list<value_type> il,
                size_type n, const hasher& hf, const allocator_type& alloc );

The constructors of the other three containers differ from those of unordered_map only in name.

Example

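#include <iostream>
#include <string>
#include <unordered_map>
#include <unordered_set>
using namespace std;

void unorderedinit()
{
	unordered_set<int> s1={5,4,7,8,2};   // list initialization
	unordered_multiset<int> s2(15);      // request at least 15 buckets
	cout<<s2.bucket_count()<<endl;       // >= 15; the exact value is implementation-defined

	unordered_map<string, int> m1;
	m1["one"]=1;
	m1["two"]=2;
	m1["three"]=3;

	// Copy the range [begin, end) and request at least 20 buckets
	unordered_multimap<string, int> m2(m1.begin(), m1.end(), 20);
	cout<<m2.bucket_count()<<endl;       // >= 20

	unordered_map<string, int> m3={{"four", 4}, {"five", 5}};
}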

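The bucket count passed to a constructor is only a minimum: the container's actual bucket count is greater than or equal to the requested value.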

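2. Assignment of the four unordered containers

The four unordered containers can only be assigned with = or swap. Like the four ordered associative containers, they have no assign operation.

In addition, as with the ordered associative containers, both containers must have identical template argument types (the concrete implementations may differ); otherwise assignment is not possible.

For examples, see the blog post https://blog.csdn.net/Master_Cui/article/details/108690877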

3. Lookup in the four unordered containers

size_type count ( const key_type& k ) const;

iterator find ( const key_type& k );
const_iterator find ( const key_type& k ) const;

pair<iterator,iterator> equal_range ( const key_type& k );
pair<const_iterator,const_iterator> equal_range ( const key_type& k ) const;

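Lookup in the four unordered containers is similar to lookup in the ordered associative containers. upper_bound and lower_bound are missing, but the three functions above have the same meaning as in the ordered associative containers.

For examples, see the blog post https://blog.csdn.net/Master_Cui/article/details/108690877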

4. Insertion into the four unordered containers

pair<iterator,bool> insert ( const value_type& val );// insert for unordered_map and unordered_set
iterator insert ( const value_type& val );// insert for unordered_multimap and unordered_multiset

iterator insert ( const_iterator hint, const value_type& val );

template <class InputIterator>
void insert ( InputIterator first, InputIterator last );

void insert ( initializer_list<value_type> il );

See the blog posts https://blog.csdn.net/Master_Cui/article/details/108690877, https://blog.csdn.net/Master_Cui/article/details/108696599 and https://blog.csdn.net/Master_Cui/article/details/108712358

5. Erasing from the four unordered containers

iterator erase ( const_iterator position );
size_type erase ( const key_type& k );
iterator erase ( const_iterator first, const_iterator last );

The erase operations have exactly the same form in all four unordered containers.

See the blog posts https://blog.csdn.net/Master_Cui/article/details/108690877, https://blog.csdn.net/Master_Cui/article/details/108696599 and https://blog.csdn.net/Master_Cui/article/details/108712358

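6. Subscript operations of the four unordered containers

Of the four unordered containers, only unordered_map provides [] and at, and they behave the same way as in map.

See the blog post https://blog.csdn.net/Master_Cui/article/details/108690877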

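7. General operations of the four unordered containers

General operations such as size, empty, and clear are the same as in the ordered associative containers. Because the elements are unordered, however, the unordered associative containers have no > or < operators; the only comparison operators are == and !=.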

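8. Special interfaces of the four unordered containers

The four unordered containers have special interfaces because of their internal structure, so recall the hash-table figure from the beginning of the article.

In that figure, a bucket is one slot of the hash table inside the container. Elements are assigned to buckets according to the hash value of their key. The more elements a bucket holds, the longer it takes to find a particular element in it.

To keep lookups fast, an unordered container maintains a load factor, which is the ratio of the number of elements to the number of buckets: load_factor = size / bucket_count. The larger the load factor, the more likely it is that two elements land in the same bucket, and the clearer the sign that the container has too few buckets. When an insertion would push the load factor above max_load_factor, the container automatically increases the number of buckets so that the load factor stays at or below max_load_factor. This keeps each bucket short and lookups fast. Every increase in the bucket count triggers a rehash, in which all elements are redistributed into the new buckets according to their hash values, so iterators saved earlier are invalidated.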

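This internal structure is the reason for the following nine interfaces.

size_type bucket ( const key_type& k ) const;// returns the index of the bucket that holds the element with key k
size_type bucket_size ( size_type n ) const;// returns the number of elements in bucket n

size_type bucket_count() const noexcept;// returns the current number of buckets in the container
size_type max_bucket_count() const noexcept;// returns the maximum number of buckets the container can have

float load_factor() const noexcept;// returns the current load factor of the container
float max_load_factor() const noexcept;// returns the maximum load factor of the container
void max_load_factor ( float z );// sets the maximum load factor of the container

void rehash( size_type n );// sets the number of buckets to at least n
void reserve ( size_type n );// prepares the container to hold n elements without rehashing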

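About rehash: a rehash rebuilds the hash table. Whenever the load factor would exceed max_load_factor, the container rehashes automatically. All elements are redistributed into new buckets according to their hash values, which invalidates iterators.

If n is greater than the current number of buckets, a rehash is forced. The new bucket count is greater than or equal to n.

If n is less than the current number of buckets, the call may have no effect on the bucket count and may not force a rehash.

About reserve:

If n is greater than the current bucket_count multiplied by max_load_factor, the container's bucket_count is increased and a rehash is performed.

If n is less than that value, a rehash may not happen.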

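Because of this structure, the unordered containers also provide iterator functions that traverse the elements of a single bucket.

local_iterator begin ( size_type n );// returns an iterator to the first element in bucket n
const_local_iterator begin ( size_type n ) const;

local_iterator end (size_type n);// returns the past-the-end iterator of bucket n
const_local_iterator end (size_type n) const;

Example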

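#include <cstddef>
#include <iostream>
#include <string>
#include <unordered_map>
using namespace std;

void buckettest()
{
	unordered_map<string, int> m1;
	m1["one"]=1;
	m1["two"]=2;
	m1["three"]=3;

	// Number of buckets and current load factor (size / bucket_count)
	cout<<m1.bucket_count()<<","<<m1.load_factor()<<endl;

	// Number of elements in each bucket
	for (size_t i=0;i<m1.bucket_count();++i) {
		cout<<m1.bucket_size(i)<<endl;
	}

	// Walk each bucket with a local_iterator
	for (size_t i=0;i<m1.bucket_count();++i) {
		for (unordered_map<string, int>::local_iterator it=m1.begin(i);
			it!=m1.end(i);++it) {
			cout<<it->first<<it->second<<"in"<<i<<endl;
			cout<<m1.bucket(it->first)<<endl;   // same index as i
		}
	}
}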

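The code above prints the bucket count and the load factor; their product equals the number of elements (up to floating-point rounding).

It also traverses the container bucket by bucket with local iterators, printing how many elements each bucket holds and the index of the bucket each element lives in. Note that the iterator type for a bucket is local_iterator.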

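9. The four unordered containers and iterator invalidation

Because of their structure, unordered containers may rehash, so there are more ways for iterators to become invalid than in the ordered associative containers.

The cases that invalidate iterators of ordered associative containers also apply to unordered containers; see the blog posts https://blog.csdn.net/Master_Cui/article/details/108690877, https://blog.csdn.net/Master_Cui/article/details/108696599 and https://blog.csdn.net/Master_Cui/article/details/108712358

In principle, iterators of an unordered container can be invalidated by a rehash, whether it is triggered directly or indirectly.

A rehash can be caused in the following ways:

1. The program calls rehash directly

2. Inserting an element pushes the load factor above the maximum load factor, so the container adds buckets and rehashes

3. The program calls reserve directly

In practice, however, an experiment on Ubuntu 18.04 showed that neither local (bucket) iterators nor ordinary iterators appeared to be invalidated by a rehash: they could still be dereferenced afterwards. This is a property of the standard library implementation used in that experiment, not a guarantee. The C++ standard specifies that rehashing invalidates iterators (pointers and references to elements remain valid), so portable code should not rely on the behavior observed below.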

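#include <cstddef>
#include <iostream>
#include <string>
#include <unordered_map>
using namespace std;

void iteratorfailed()
{
	unordered_map<string, int> m1;
	m1["one"]=1;
	m1["two"]=2;
	m1["three"]=3;
	for (auto n:m1) {
		cout<<n.first<<n.second<<endl;
	}

	// Save an ordinary iterator before the rehash
	unordered_map<string, int>::iterator it=
		m1.find("one");

	cout<<m1.bucket_count()<<endl;

	m1.rehash(100);
	//m1.reserve(1000);
	cout<<m1.bucket_count()<<endl;
	// Worked in the experiment, but the standard treats "it" as
	// invalidated here, so this use is not portable
	cout<<it->first<<it->second<<endl;

	for (auto n:m1) {
		cout<<n.first<<n.second<<endl;
	}
}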

void iteratorfailed2()
{
	unordered_map<string, int> m1;
	m1["one"]=1;
	m1["two"]=2;
	m1["three"]=3;
	unordered_map<string, int>::local_iterator itv;

	// Find the local (bucket) iterator that points at "one"
	for (size_t i=0;i<m1.bucket_count();++i) {
		if (m1.bucket_size(i)>0) {
			for (unordered_map<string, int>::local_iterator it=m1.begin(i);it!=m1.end(i);++it) {
				if (it->first=="one") {
					itv=it;
					break;
				}
			}
		}
	}

	m1.rehash(1000);
	//m1.reserve(1000);
	cout<<m1.bucket_count()<<endl;
	// Worked in the experiment, but the standard treats "itv" as
	// invalidated here, so this use is not portable
	cout<<itv->first<<itv->second<<endl;
}