POJ - 2408 Anagram Groups (a painfully fiddly string bucket sort)

Anagram Groups
Time Limit: 1000MS Memory Limit: 65536K
Description
World-renowned Prof. A. N. Agram's current research deals with large anagram groups. He has just found a new application for his theory on the distribution of characters in English language texts. Given such a text, you are to find the largest anagram groups. 
A text is a sequence of words. A word w is an anagram of a word v if and only if there is some permutation p of character positions that takes w to v. Then, w and v are in the same anagram group. The size of an anagram group is the number of words in that group. Find the 5 largest anagram groups.
Input
The input contains words composed of lowercase alphabetic characters, separated by whitespace (or newlines). It is terminated by EOF. You can assume there will be no more than 30000 words.
Output
Output the 5 largest anagram groups. If there are fewer than 5 groups, output them all. Sort the groups by decreasing size. Break ties lexicographically by the lexicographically smallest element. For each group output, print its size and its member words. Sort the member words lexicographically and print equal words only once.
Sample Input
undisplayed
trace
tea
singleton
eta
eat
displayed
crate
cater
carte
caret
beta
beat
bate
ate
abet
Sample Output
Group of size 5: caret carte cater crate trace .
Group of size 4: abet bate beat beta .
Group of size 4: ate eat eta tea .
Group of size 1: displayed .
Group of size 1: singleton .

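The task is mostly sorting, but it helps to define a struct so that all words in the same anagram group share one canonical form: the string obtained by rearranging a word's characters into the lexicographically smallest order.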
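The canonical form can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the solution's own routine: `canonicalKey` is a hypothetical helper name, and it uses `std::sort` on a `std::string` instead of the letter-counting approach used in the solution's code.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper (not part of the original solution): returns the
// letters of `word` in ascending order. Every anagram of `word` yields the
// same result, so the result can serve as a key for the anagram group.
std::string canonicalKey(std::string word) {
    std::sort(word.begin(), word.end());
    return word;
}
```

For instance, `canonicalKey("trace")` and `canonicalKey("crate")` both give `"acert"`, while `"displayed"` and `"undisplayed"` produce different keys because their lengths differ.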

#include <iostream>
#include <cstdio>
#include <cstring>
#include <string>
#include <map>
#include <algorithm>
using namespace std;

map<string,int> ma;///canonical form -> number of input words with that form

map<string,bool> ak;///words that have already been printed
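struct node{
	char init[25];///the word as read from the input
	char str[25];///canonical form: the word's letters in ascending order
	int len;
}str[30005];
struct work{
	bool flag;
	char init[25];
}fa[5][30005];///buckets indexed by group size, with up to five groups per size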
bool cmp (const node &a,const node &b)
{
	///strict "less than": returning true for equal words would violate the strict weak ordering that std::sort requires
	return strcmp(a.init,b.init) < 0;
}
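void fun(node* s) ///counting sort on the letters: rearranges the word into its canonical form, which identifies its anagram group
{
	int code[26] = {0},len = strlen(s->init);
	for (int i = 0;i < len; i++){
		code[s->init[i] - 'a']++;
	}
	int por = 0;
	for (int i = 0;i < 26;i ++){
		while (code[i]--){
			s->str[por++] = 'a'+i;
		}
	}
	s->str[por] = '\0';///terminate the string explicitly instead of relying on zero-initialized globals
}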
int main ()
{
	int i = 0;
	while (~scanf ("%s",str[i].init)){
		fun(&str[i]);
		str[i].len = strlen(str[i].init);
		i++;
	}
	sort(str,str+i,cmp);///sort by the original word, so the first word met from each group is that group's smallest member
	for (int j = 0;j < i; j++){
		ma[str[j].str]++;///count the words sharing this canonical form; the count is the bucket index used later
	}
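	for (int j = 0;j < i; j++){
		for (int k = 0;k < 5; k++){///a second bucket pass: groups are bucketed by size, and each bucket stores a canonical form
			///Several groups can share the same size, so each size gets five slots; if one slot is taken, the next is used.
			///Words are visited in lexicographic order and at most five groups are printed, so a sixth group of the
			///same size always ranks after the first five and does not need to be stored.
			///Five slots per size are therefore enough.
			if (!fa[k][ma[str[j].str]].flag){
				strcpy(fa[k][ma[str[j].str]].init,str[j].str);
				fa[k][ma[str[j].str]].flag = true;
				ma[str[j].str] = 0;///the group's remaining words now land in bucket 0, which is never printed
				break;
			}
		}
	}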
	int cut = 0;
	for (int j = i;j > 0; j--){///walk the buckets from the largest possible size (all i words in one group) downward
		for (int k = 0;k < 5; k++){
			if (fa[k][j].flag){///this slot holds a group
				cout << "Group of size "<< j << ":";
				for (int cur = 0;cur < i; cur++){ ///print every input word whose canonical form matches this group
					if (!strcmp(fa[k][j].init,str[cur].str) && ak[str[cur].init] != true){
						cout << ' ' <<str[cur].init;
						ak[str[cur].init] = true;///print each distinct word only once
					}
				}
				cout << " .\n";
				cut++;
			}
			if (cut == 5){///stop after five groups have been printed
				j = 0;
				break;
			}
		}
	}
	return 0;
}
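For comparison, the same grouping and ranking rules can be expressed with standard containers. This is only a sketch under stated assumptions, not the approach used in the solution: `Group` and `largestGroups` are hypothetical names, the words are assumed to be already read into a vector, and `std::map`/`std::set` replace the fixed-size buckets.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical type: one anagram group.
struct Group {
    int size = 0;                 // counts every occurrence, duplicates included
    std::set<std::string> words;  // distinct members, kept in sorted order
};

// Hypothetical helper: returns up to `limit` groups ordered by decreasing
// size, with ties broken by the lexicographically smallest member.
std::vector<Group> largestGroups(const std::vector<std::string>& words,
                                 std::size_t limit = 5) {
    std::map<std::string, Group> bySignature;
    for (const std::string& w : words) {
        std::string key = w;
        std::sort(key.begin(), key.end());  // sorted letters identify the group
        Group& g = bySignature[key];
        ++g.size;
        g.words.insert(w);
    }

    std::vector<Group> groups;
    for (const auto& entry : bySignature) groups.push_back(entry.second);

    std::sort(groups.begin(), groups.end(), [](const Group& a, const Group& b) {
        if (a.size != b.size) return a.size > b.size;  // larger groups first
        return *a.words.begin() < *b.words.begin();    // then smallest member
    });
    if (groups.size() > limit) groups.resize(limit);
    return groups;
}
```

Printing each returned group as `Group of size N:` followed by its words and a trailing ` .` would reproduce the required output format. The trade-off is extra memory for the containers in exchange for not having to reason about bucket slots.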
