C++【STL】【map和set】

桜キャンドル淵

已于 2022-10-29 10:26:06 修改

阅读量1.3k

点赞数

分类专栏：数据结构(C/C++) 文章标签： c++ 开发语言数据结构

于 2022-10-29 00:23:42 首次发布

本文链接：https://blog.csdn.net/weixin_62684026/article/details/127580730

版权

数据结构(C/C++) 专栏收录该内容

55 篇文章 17 订阅

订阅专栏

lower_bound和upper_bound

一、什么是map和set

map和set是关联性容器

在我们的STL中分为两类

序列式容器：vector list deque

关联式容器：map set

二、set

set的本质是我们之前学的key的模型（确认在不在）

C++【搜索二叉树】_桜キャンドル淵的博客-CSDN博客

初始化

①一个一个插入数据

    set<int> s;
	s.insert(4);
	s.insert(2);
	s.insert(1);

②在c++11中支持的，直接初始化多个值

	set<int> s = { 1, 2, 1, 6, 3, 8, 5 };

③使用迭代器区间进行初始化

	int a[] = { 1, 2, 1, 6, 3, 8, 5 };
	set<int> s(a, a + sizeof(a) / sizeof(int));

迭代器

// 排序 + 去重
	set<int>::iterator it = s.begin();
	while (it != s.end())
	{
		//*it = 10;
		cout << *it << " ";
		++it;
	}
	cout << endl;

范围for

（只要支持了迭代器就能够支持范围for，从底层上说，这个范围for和迭代器是一样的）

for (auto e : s)
	{
		cout << e << " ";
	}
	cout << endl;

我们这里打印出来的顺序都是从小到大的，因为它底层就是一棵搜索二叉树，所以能帮助我们排序和去重

void test_set()
{

	int a[] = { 1, 2, 1, 6, 3, 8, 5 };
	set<int> s(a, a + sizeof(a) / sizeof(int));
//	set<int, greater<int>> s(a, a + sizeof(a) / sizeof(int));

	// 排序 + 去重
	set<int>::iterator it = s.begin();
	while (it != s.end())
	{
		//*it = 10;
		cout << *it << " ";
		++it;
	}
	cout << endl;

	for (auto e : s)
	{
		cout << e << " ";
	}
	cout << endl;

}

降序（仿函数）

#include <iostream>
#include <functional>
#include <set>
void test_set()
{

	int a[] = { 1, 2, 1, 6, 3, 8, 5 };
//	set<int> s(a, a + sizeof(a) / sizeof(int));
	set<int, greater<int>> s(a, a + sizeof(a) / sizeof(int));

	// 排序 + 去重
	set<int>::iterator it = s.begin();
	while (it != s.end())
	{
		//*it = 10;
		cout << *it << " ";
		++it;
	}
	cout << endl;

	for (auto e : s)
	{
		cout << e << " ";
	}
	cout << endl;
}

降序（反向迭代器）

    set<int>::reverse_iterator rit = s.rbegin();
	while (rit != s.rend())
	{
		//*it = 10;
		cout << *rit << " ";
		++rit;
	}
	cout << endl;

这里我们的迭代器并没有暴露底层的数据结构

迭代器是不支持修改的！

// 排序 + 去重
	set<int>::iterator it = s.begin();
	while (it != s.end())
	{
		*it = 10;
		cout << *it << " ";
		++it;
	}

erase

#include <iostream>
#include <functional>
#include <set>
#include <string>
using namespace std;

void test_set()
{

	int a[] = { 1, 2, 1, 6, 3, 8, 5 };
	set<int> s(a, a + sizeof(a) / sizeof(int));

	// 排序 + 去重
	set<int>::iterator it = s.begin();
	while (it != s.end())
	{
		cout << *it << " ";
		++it;
	}
	cout << endl;

	s.erase(3);
	for (auto e : s)
	{
		cout << e << " ";
	}
	cout << endl;
}

find+erase组合使用

#include <iostream>
#include <functional>
#include <set>
#include <string>
using namespace std;

void test_set()
{

	int a[] = { 1, 2, 1, 6, 3, 8, 5 };
	set<int> s(a, a + sizeof(a) / sizeof(int));

	// 排序 + 去重
	set<int>::iterator it = s.begin();
	while (it != s.end())
	{
		cout << *it << " ";
		++it;
	}
	cout << endl;

	set<int>::iterator pos = s.find(2);
	if (pos != s.end())
	{
		s.erase(pos);
	}

	for (auto e : s)
	{
		cout << e << " ";
	}
	cout << endl;
}

上面两种find有什么区别呢？

第一种find是没有问题的，如果这个数在的话，就直接删掉了，不存在就不删

第二种find+erase的组合，如果不存在就会报错，因为我们的find如果没有找到目标的数据，就会返回end，然后我们的erase就会直接删除这个end所指向的位置，就会报错。

所以我们需要加上下面的判断，来保证我们找到了才删除

set<int>::iterator pos = s.find(20);
	if (pos != s.end())
	{
		s.erase(pos);
	}

count统计某个值的计数

可以用来判断某个值在不在

#include <iostream>
#include <functional>
#include <set>
#include <string>
using namespace std;

void test_set()
{
	int a[] = { 1, 2, 1, 6, 3, 8, 5 };
	set<int> s(a, a + sizeof(a) / sizeof(int));

	// 排序 + 去重
	set<int>::iterator it = s.begin();
	while (it != s.end())
	{
		cout << *it << " ";
		++it;
	}
	cout << endl;
	cout << s.count(3) << endl;
	cout << s.count(30) << endl;
}

这里的count是为了保证我们接口的一致性

我们的multiple-key set允许键值冗余，比方说插入了两个相同的2，这个容器是允许的

为了让这两个容器兼容，所以这个count都有。

lower_bound和upper_bound

void test_set2()
{
	std::set<int> myset;
	std::set<int>::iterator itlow, itup;

	for (int i = 1; i < 10; i++)
		myset.insert(i * 10); // 10 20 30 40 50 60 70 80 90

	itlow = myset.lower_bound(25); // >= val
	itup = myset.upper_bound(60);  // > val             //
	cout << "["<<*itlow <<","<<*itup<<"]"<<endl;

	myset.erase(itlow, itup);                     // 10 20 70 80 90

	std::cout << "myset contains:";
	for (std::set<int>::iterator it = myset.begin(); it != myset.end(); ++it)
		std::cout << ' ' << *it;
	std::cout << '\n';
}

我们这里erase(itlow,itup)之后，就将我们下限到上限之间的区间的所有的数都删除掉了

这里我们的指定的下限是25，但是它其实帮我们改成了30，我们的下限指定的是60，但是它其实帮我们改成了70.

所以 low_bound其实帮我们找的是大于等于我们给定的val的值

所以upper_bound其实是帮助我们找到小于等于我们给定的val的值

pair二元组

下面就是pair的定义

template <class T1, class T2>
struct pair
{
    typedef T1 first_type;
    typedef T2 second_type;
    T1 first;
    T2 second;
    pair(): first(T1()), second(T2())
    {}
    pair(const T1& a, const T2& b): first(a), second(b)
    {}
};

用来表示具有一一对应关系的一种结构，该结构中一般只包含两个成员变量key和value，key代表键值，value表示与key对应的信息。比如：现在要建立一个英汉互译的字典，那该字典中必然有英文单词与其对应的中文含义，而且，英文单词与其中文含义是一一对应的关系，即通过该应该单词，在词典中就可以找到与其对应的中文含义。

pair二元组和equal_range

void test_set2()
{
	std::set<int> myset;

	for (int i = 1; i <= 5; i++) myset.insert(i * 10);   // myset: 10 20 30 40 50

	std::pair<std::set<int>::const_iterator, std::set<int>::const_iterator> ret;
	ret = myset.equal_range(40); // xx <= val < yy

	std::cout << "the lower bound points to: " << *ret.first << '\n';
	std::cout << "the upper bound points to: " << *ret.second << '\n';
}

这里的equal_range中传入40，它就会帮助我们找到第一个小于等于40的值，也就是我们这里的40，和第一个大于等于我们的40的值，也就是我们这里的50

所以这两个值也就分别变成了我们这个二元素的first和second

multiset

multiset和set的区别不是很大，做大的区别就是允许键值冗余

void test_set3()
{
	int a[] = { 3,1, 2, 1, 6, 3, 8,3, 5,3};
	multiset<int> s(a, a + sizeof(a) / sizeof(int));

	// 排序
	for (auto e : s)
	{
		cout << e << " ";
	}
	cout << endl;
}

count返回对应的个数

void test_set3()
{
	// 21:16继续
	int a[] = { 3,1, 2, 1, 6, 3, 8,3, 5,3};
	multiset<int> s(a, a + sizeof(a) / sizeof(int));

	// 排序
	for (auto e : s)
	{
		cout << e << " ";
	}
	cout << endl;

	cout << s.count(1) << endl;
}

find

有多个时，返回中序的第一个

void test_set3()
{
	int a[] = { 3,1, 2, 1, 6, 3, 8,3, 5,3};
	multiset<int> s(a, a + sizeof(a) / sizeof(int));

	// 排序
	for (auto e : s)
	{
		cout << e << " ";
	}
	cout << endl;

	// find时，如果有多个值，返回中序的第一个
	auto pos = s.find(3);
	while (pos != s.end())
	{
		cout << *pos << " ";
		++pos;
	}
	cout << endl;
}

下面我们打印出来的就是中序的第一个3

erase

直接删除的话，就是删除所有的目标数据，

如果删除的是迭代器位置的话，就仅仅是删除那一个

直接删除目标数据

void test_set3()
{

	int a[] = { 3,1, 2, 1, 6, 3, 8,3, 5,3};
	multiset<int> s(a, a + sizeof(a) / sizeof(int));

	// 排序
	for (auto e : s)
	{
		cout << e << " ";
	}
	cout << endl;


//	// 删除所有的3
	s.erase(3);
	for (auto e : s)
	{
		cout << e << " ";
	}
	cout << endl;
}

删除迭代器指向的位置

void test_set3()
{
	int a[] = { 3,1, 2, 1, 6, 3, 8,3, 5,3};
	multiset<int> s(a, a + sizeof(a) / sizeof(int));

	// 排序
	for (auto e : s)
	{
		cout << e << " ";
	}
	cout << endl;

	// find时，如果有多个值，返回中序的第一个
	auto pos = s.find(3);
	pos = s.find(1);
	if (pos != s.end())
	{
		s.erase(pos);
	}
	for (auto e : s)
	{
		cout << e << " ";
	}
	cout << endl;
}

那如果我们找到了第一个3，我们想要找到第二个3怎么办？

我们找到的第一个3的位置++，也就是将我们的迭代器++，也就是找到了我们的第二个3

三、map

初始化

map<string, string> dict;

insert

pair<string, string> kv1("sort", "排序");
dict.insert(kv1);

void test_map1()
{
	map<string, string> dict;

	dict.insert(pair<string, string>("sort", "排序"));
	dict.insert(pair<string, string>("test", "测试"));
	dict.insert(pair<string, string>("string", "字符串"));
    //或者直接typedef一下，这样就不用输入pair<string, string>，而是直接输入DictKv就可以了
    //typedef就是为了解决我们这里类型太长的问题的。
	typedef pair<string, string> DictKV;
	dict.insert(DictKV("string", "xxx"));
    //直接使用make_pair，使用函数模板
	dict.insert(make_pair("left", "左边"));

	//map<string, string>::iterator it = dict.begin();
	auto it = dict.begin();
	while (it != dict.end())
	{
		//cout << (*it).first << (*it).second <<endl;
		cout << it->first << it->second << endl;
		++it;
	}
	cout << endl;

	for (const auto& kv : dict)
	{
		cout << kv.first << ":" << kv.second << endl;
	}
	cout << endl;
}

这里我们观察到我们的map其实是按照键值进行排序的，也就是ASC码值，并且具有去重功能

要是我们的键值对索引已经存在了！也就是key已经存在

那么要是再输入一个相同的key的键值对，无论value是不是相同的，这里的都不会更新的

也就是我们这里的xxx是不会把“字符串”给更新掉的！

make_pair

在上面的测试代码中，我们其实已经使用过make_pair来创建二元组了，下面我们再来深入了解一下make_pair

它是一个函数模板，优势是可以自动推导传入的参数类型，其实就是构造一个二元组进行返回。

这里的好处就是不用显式地去写模板参数

这个函数因为比较简短，所以其实一般都会被定义成inline函数，所以还是运行效率比较高的。

map的遍历

上面的测试代码中其实已经说过，我们这里再重新那说明一下

使用迭代器遍历

void test_map1()
{
	map<string, string> dict;

	dict.insert(pair<string, string>("sort", "排序"));
	dict.insert(pair<string, string>("test", "测试"));
	dict.insert(pair<string, string>("string", "字符串"));

	//map<string, string>::iterator it = dict.begin();
    //如果感觉太长了就直接用auto
	auto it = dict.begin();
	while (it != dict.end())
	{
        //我们可以将这个二元组的first元素和second元素都分别打印出来
		//cout << (*it).first << (*it).second <<endl;
        //it->->第一个->调用operator->，表示我们这一整个二元组的指针，也就是我们的pair*
        //第二个箭头就是用于指向我们的二元组中的第一个元素和第二个元素，
        //但是这样写太丑了，所以我们的编译器将这两个箭头合并成了一个箭头
		cout << it->first << it->second << endl;
		++it;
	}
	cout << endl;

}

范围for遍历

void test_map1()
{
	map<string, string> dict;

	dict.insert(pair<string, string>("sort", "排序"));
	dict.insert(pair<string, string>("test", "测试"));
	dict.insert(pair<string, string>("string", "字符串"));

	for (const auto& kv : dict)
	{
		cout << kv.first << ":" << kv.second << endl;
	}
	cout << endl;
}

map::operator[]

词频统计

#include<iostream>
#include<string>
#include<map>
using namespace std;

void test_map1()
{
    string arr[]={"苹果","香蕉","西瓜","苹果","香蕉","苹果","香蕉","西瓜","苹果","香蕉","猕猴桃","八月炸"};
    map<string,int> countMap;
    for(auto& str:arr)
    {
        map<string,int>::iterator it=countMap.find(str);
        if(it!=countMap.end())
        {
//            (*it).second++;
            it->second++;
        }
        else
        {
            countMap.insert(make_pair(str,1));
        }
    }
    map<string ,int>::iterator it=countMap.begin();
    while(it!=countMap.end())
    {
        cout<<it->first<<":"<<it->second<<endl;
        it++;
    }
}
int main()
{
    test_map1();
    return 0;
}

使用operator[]的词频统计

我们原来的数组之类的[]是用来支持随机访问的，这里的[]是给一个key，返回&value

这里我们的呢key就是水果，value就是次数

#include<iostream>
#include<string>
#include<map>
using namespace std;

void test_map1()
{
    string arr[]={"苹果","香蕉","西瓜","苹果","香蕉","苹果","香蕉","西瓜","苹果","香蕉","猕猴桃","八月炸"};
    map<string,int> countMap;
    for(auto& str:arr)
    {
        countMap[str]++;
    }
    map<string ,int>::iterator it=countMap.begin();
    while(it!=countMap.end())
    {
        cout<<it->first<<":"<<it->second<<endl;
        it++;
    }
}
int main()
{
    test_map1();
    return 0;
}

所以说这样的写法都是可以的

    map<string ,string >dict;
    dict.insert(make_pair("sort","排序"));
    dict["insert"];
    dict["insert"]="插入";
    dict["left"]="左边";

dist["insert"]就是插入的功能

dist["insert"]="插入"就是查找+修改的功能

dist["left"]="左边"就是插入+修改的功能

1、map中有这个key，返回value的引用。（查找，修改value）

2、map中没有key，就调用默认构造，插入一个新元素pair(key,v());，返回value的引用（插入+修改）

（这里的v()是一个匿名对象，也就是调用一个默认构造）

与at对比一下

相比于我们上面的[]，at的功能仅仅是查找+修改，找不到就抛异常

所以我们再来看我们上面的词频统计的代码

如果

1、str不在countMap中，插入pair(str,int())，然后再对返回的次数++

2、str在countMap中，返回value（次数）的引用，次数++

那是如何实现的呢

mapped_type& operator[] (const key_type& k);
{
    return (*((this->insert(make_pair(k,mapped_type()))).first)).second;
}

这里的insert，我们先去查找，如果已经有了，就不插入，这里的bool就是FALSE，

如果没有，就插入，这里的bool就是TRUE

这里的pair的模板的第一个参数是一个迭代器，

指向新插入的元素或者是跟当前的key相等的元素。

1、key已经在map中，返回pair(key_iterator,false)，（这里的key_iterator就是key所在的迭代器）

2、key不在map中，返回pair(new_key_iterator,true)，

理解了insert的返回值之后，我们再来看上面之前的operator[]的实现

这里我们将那个复杂的式子拆开，像下面这样进行查看

V& operator[](const K&key)
{
    pair<iterator,bool> ret=insert(make_pair(key,V()));
    //返回这里的迭代器指向的元素的second
    return ret.first->second;

}

这里的insert能返回iterator的好处就是我们这里的operator[]就能够直接获得这个迭代器的位置，否则我们的operator[]还需要先找我们的key，找得到的话返回其value，如果没有的话再调用insert，进行插入。

这样就需要遍历两边我们的树，计较麻烦。

multimap

multimap相对于map没有了operator[]，因为multimap允许键值冗余，所以我们根本不知道应该返回的是哪一个冗余的数据的value。

key不管你有没有存在，都进行插入，每一次插入都相当于是新的插入

    multimap<string ,string > mdict;
    mdict.insert(make_pair("sort","排序"));
    mdict.insert(make_pair("left","左边"));
    mdict.insert(make_pair("left","左边"));
    mdict.insert(make_pair("left","剩余"));
    mdict.insert(make_pair("sort","排序"));