[UVA10815] Andy's First Dictionary: Solution (a detailed look at set)

Article link: https://blog.csdn.net/qq_73635134/article/details/128228829

Andy, 8, has a dream - he wants to produce his very own dictionary. This is not an easy task for him, as the number of words that he knows is, well, not quite enough. Instead of thinking up all the words himself, he has a brilliant idea. From his bookshelf he would pick one of his favourite story books, from which he would copy out all the distinct words. By arranging the words in alphabetical order, he is done! Of course, it is a really time-consuming job, and this is where a computer program is helpful. You are asked to write a program that lists all the different words in the input text. In this problem, a word is defined as a consecutive sequence of alphabets, in upper and/or lower case. Words with only one letter are also to be considered. Furthermore, your program must be CaSe InSeNsItIvE. For example, words like “Apple”, “apple” or “APPLE” must be considered the same.

Input

The input file is a text with no more than 5000 lines. An input line has at most 200 characters. Input is terminated by EOF.

Output

Your output should give a list of different words that appear in the input text, one in a line. The words should all be in lower case, sorted in alphabetical order. You can be sure that the number of distinct words in the text does not exceed 5000.

Sample Input

Adventures in Disneyland

Two blondes were going to Disneyland when they came to a fork in the road. The sign read: "Disneyland Left."

So they went home.

Sample Output

a
adventures
blondes
came
disneyland
fork
going
home
in
left
read
road
sign
so
the
they
to
two
went
were
when

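Problem summary

Given a text, find all the distinct words (maximal runs of consecutive letters) and print them in ascending dictionary order. Words are case-insensitive.

Solution

About set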

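set is a container that stores elements of a single type. Every element in a set is unique, and the container keeps its elements sorted by value automatically. Note that the value of an element in a set cannot be modified in place. In the C++ STL, the standard associative containers set, multiset, map, and multimap are typically implemented with an efficient self-balancing binary search tree: the red-black tree, also called the RB tree. The C++ standard does not require this structure, but the major implementations use it. A red-black tree usually needs fewer rebalancing rotations on insertion and deletion than a strictly balanced tree such as an AVL tree, which is the reason commonly given for choosing it as the internal structure of the associative containers.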
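Because an element cannot be changed in place, the usual way to "modify" a value is to erase the old one and insert the new one. The sketch below illustrates this; replace_value is a hypothetical helper name chosen for this example, not part of the STL.

```cpp
#include <set>
#include <string>

// Replace old_value with new_value in the set.
// Returns false (and leaves the set untouched) if old_value is absent.
// replace_value is a hypothetical helper used only for illustration.
bool replace_value(std::set<std::string>& s,
                   const std::string& old_value,
                   const std::string& new_value) {
    std::set<std::string>::iterator it = s.find(old_value);
    if (it == s.end()) return false;  // nothing to replace
    s.erase(it);                      // remove the old element
    s.insert(new_value);              // the set places the new one in sorted position
    return true;
}
```

Erasing and reinserting lets the tree put the new value in its correct sorted position, which is why direct assignment through an iterator is not allowed.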

Note:

1. The elements of a set are always kept in sorted order.

2. A set contains no duplicate elements.

Common set methods
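Both properties can be seen in a few lines. This is a minimal sketch; sorted_unique is a hypothetical helper name introduced only for this example.

```cpp
#include <set>
#include <string>
#include <vector>

// Insert every word into a std::set and return its contents in iteration order.
// sorted_unique is a hypothetical helper used only for illustration.
std::vector<std::string> sorted_unique(const std::vector<std::string>& words) {
    std::set<std::string> s(words.begin(), words.end());  // duplicates are dropped on insertion
    return std::vector<std::string>(s.begin(), s.end());  // iteration visits elements in ascending order
}
```

Passing the words the, apple, the, road, apple returns apple, road, the: each word once, in dictionary order. This is exactly the behavior the dictionary problem needs.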

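begin()                  // returns an iterator to the first element
end()                    // returns an iterator to the position just past the last element
clear()                  // removes all elements
empty()                  // returns true if the set has no elements
max_size()               // returns the maximum number of elements the set could hold
size()                   // returns the current number of elements
rbegin()                 // returns a reverse iterator to the last element
rend()                   // returns a reverse iterator to the position just before the first element
count(key_value)         // returns how many elements equal key_value (0 or 1 for a set)
erase(iterator)          // removes the element the iterator points to
erase(first,second)      // removes the elements in the range [first, second)
erase(key_value)         // removes the element equal to key_value, if present
find(key_value)          // returns an iterator to key_value, or end() if it is not found
lower_bound(key_value)   // returns an iterator to the first element >= key_value
upper_bound(key_value)   // returns an iterator to the first element > key_value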
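The two bound queries are easy to confuse, so here is a small sketch of how they combine. lower_bound gives the first element not less than a key and upper_bound gives the first element strictly greater than a key, so together they delimit a closed range of values. The helper names count_in_range and largest are hypothetical and exist only for this example.

```cpp
#include <iterator>
#include <set>

// Number of elements x with lo <= x <= hi, found with the two bound queries.
// count_in_range is a hypothetical helper used only for illustration.
int count_in_range(const std::set<int>& s, int lo, int hi) {
    if (lo > hi) return 0;
    std::set<int>::const_iterator first = s.lower_bound(lo);  // first element >= lo
    std::set<int>::const_iterator last = s.upper_bound(hi);   // first element > hi
    return static_cast<int>(std::distance(first, last));
}

// Largest element, read through a reverse iterator. The set must be non-empty.
int largest(const std::set<int>& s) {
    return *s.rbegin();  // rbegin() refers to the last element, not to end()
}
```

For the set {10, 20, 30, 40}, count_in_range(s, 15, 30) counts 20 and 30, and largest(s) is 40.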

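Also note that begin() and end() do not check whether the set is empty. On an empty set they return the same iterator, and dereferencing it is undefined behavior, so it is best to check with empty() before dereferencing begin().

Reference solution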
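A guarded access might look like the following sketch. first_or is a hypothetical helper name, and returning a fallback value is just one possible way to handle the empty case.

```cpp
#include <set>
#include <string>

// Returns the smallest word in the set, or fallback when the set is empty.
// first_or is a hypothetical helper used only for illustration.
std::string first_or(const std::set<std::string>& s, const std::string& fallback) {
    if (s.empty()) return fallback;  // begin() == end() here, so *begin() would be undefined
    return *s.begin();               // safe: at least one element exists
}
```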

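#include<cctype>
#include<iostream>
#include<string>
#include<set>
#include<sstream>
using namespace std;

set<string> dict;				// set of distinct words

int main(){
	string s,buf;
	while(cin>>s){				// read one whitespace-delimited token at a time
		for(size_t i=0;i<s.length();i++){
			unsigned char c=s[i];	// isalpha/tolower require a non-negative value
			if(isalpha(c)){
				s[i]=tolower(c);	// convert letters to lower case
			}else{
				s[i]=' ';		// replace any non-letter with a space so it separates words
			}
		}
		stringstream ss(s);		// split the cleaned token on spaces
		while(ss>>buf){
			dict.insert(buf);	// the set ignores duplicates
		}
	}
	for(set<string>::iterator it=dict.begin();it!=dict.end();++it){
		cout<<*it<<"\n";		// iteration is in ascending (alphabetical) order
	}

	return 0;
}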
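To check the logic without feeding standard input, the same steps can be factored into a function that takes the text as a string. This is a sketch that mirrors the solution above under that assumption; collect_words is a hypothetical name, and it processes the whole text at once rather than token by token.

```cpp
#include <cctype>
#include <set>
#include <sstream>
#include <string>

// Collect the distinct lower-cased words of text.
// collect_words is a hypothetical helper used only for illustration.
std::set<std::string> collect_words(std::string text) {
    for (char& c : text) {
        unsigned char u = static_cast<unsigned char>(c);
        // Letters become lower case; every other character becomes a separator.
        c = std::isalpha(u) ? static_cast<char>(std::tolower(u)) : ' ';
    }
    std::set<std::string> dict;
    std::istringstream in(text);
    std::string word;
    while (in >> word) dict.insert(word);  // the set handles sorting and deduplication
    return dict;
}
```

Calling collect_words("Apple apple APPLE a") yields a set holding only a and apple, which matches the case-insensitivity requirement and the rule that one-letter words count.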

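An iterator is an object used to traverse the elements of a container. It acts as the intermediary between a container and the algorithms that operate on it. C++ tends to favor iterators over array subscripts because the standard library defines an iterator type for every standard container (such as vector, map, and list), while only a few containers (such as vector) support positional subscript access. An iterator refers to an element of the container, and for an iterator x, *x yields that element's value. This closely resembles the pointers we already know.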
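As a small illustration of explicit iterator use on a set, the sketch below walks the container from begin() to end() and dereferences each position. The function name join is hypothetical.

```cpp
#include <set>
#include <string>

// Join the elements of a set with commas by walking it with an explicit iterator.
// join is a hypothetical helper used only for illustration.
std::string join(const std::set<std::string>& s) {
    std::string out;
    for (std::set<std::string>::const_iterator it = s.begin(); it != s.end(); ++it) {
        if (it != s.begin()) out += ",";  // separator before every element except the first
        out += *it;                       // *it reads the element the iterator refers to
    }
    return out;
}
```

A set has no operator[], so an iterator (or a range-based for loop, which uses iterators internally) is the way to visit its elements.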

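C has pointers, which are flexible and efficient to use.
C++ has iterators, which offer richer functionality than pointers.

vector is implemented as an array, so once the address of the first element is known, every later element can be reached. We can therefore traverse the elements of a vector through its iterator.
list is implemented as a linked list, whose elements are stored at non-contiguous addresses, so the next element has to be reached through a next pointer. The list iterator takes care of following that pointer, so we can traverse a list through its iterator in the same way.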
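The point of the comparison is that the traversal code is identical even though the storage differs. A minimal sketch, where sum is a hypothetical function template written for this example:

```cpp
#include <list>
#include <vector>

// Sum any container of ints through its iterators; no subscripting is needed.
// sum is a hypothetical helper used only for illustration.
template <typename Container>
int sum(const Container& c) {
    int total = 0;
    for (typename Container::const_iterator it = c.begin(); it != c.end(); ++it) {
        total += *it;  // ++it moves to the next slot in a vector, or follows the link in a list
    }
    return total;
}
```

The same function accepts a std::vector<int> and a std::list<int>, because each container supplies an iterator that knows how to advance through its own layout.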