【CROC 2016 - Qualification C】【STL all-in-one】Hostname Aliases: output all strings that share the same set of suffixes

Article link: https://blog.csdn.net/snowy_smile/article/details/50935129

There are some websites that are accessible through several different addresses. For example, for a long time Codeforces was accessible with two hostnames codeforces.com and codeforces.ru.

You are given a list of page addresses being queried. For simplicity we consider all addresses to have the form http://<hostname>[/<path>], where:

<hostname> — server name (consists of words and maybe some dots separating them),
/<path> — optional part, where <path> consists of words separated by slashes.

We consider two <hostname> to correspond to one website if for each query to the first <hostname> there will be exactly the same query to the second one and vice versa — for each query to the second <hostname> there will be the same query to the first one. Take a look at the samples for further clarifications.

Your goal is to determine the groups of server names that correspond to one website. Ignore groups consisting of the only server name.

Please note, that according to the above definition queries http://<hostname> and http://<hostname>/ are different.
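
To make the distinction between a missing path and an empty path concrete, here is a minimal sketch of how an address could be split. The helper name `splitAddress` is hypothetical, and the `"^"` marker for "no path" is borrowed from the solution further down; it is safe only because a real path contains nothing but lowercase letters, dots and slashes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: split "http://<hostname>[/<path>]" into
// (hostname including the "http://" prefix, path key).
// A missing path is encoded as "^", an empty path as "".
std::pair<std::string, std::string> splitAddress(const std::string& url) {
    const std::size_t start = 7;  // length of "http://"
    const std::size_t slash = url.find('/', start);
    if (slash == std::string::npos)
        return {url, "^"};  // no slash after the hostname: no path at all
    // Everything after the first slash is the path; it may be empty.
    return {url.substr(0, slash), url.substr(slash + 1)};
}
```

With this split, `http://abacaba.com` and `http://abacaba.com/` yield the same hostname but different path keys, so they count as different queries.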

Input

The first line of the input contains a single integer n (1 ≤ n ≤ 100 000) — the number of page queries. Then follow n lines each containing exactly one address. Each address is of the form http://<hostname>[/<path>], where:

<hostname> consists of lowercase English letters and dots, there are no two consecutive dots, <hostname> doesn't start or finish with a dot. The length of <hostname> is positive and doesn't exceed 20.
<path> consists of lowercase English letters, dots and slashes. There are no two consecutive slashes, <path> doesn't start with a slash and its length doesn't exceed 20.

Addresses are not guaranteed to be distinct.

Output

First print k — the number of groups of server names that correspond to one website. You should count only groups of size greater than one.

Next k lines should contain the description of groups, one group per line. For each group print all server names separated by a single space. You are allowed to print both groups and names inside any group in arbitrary order.

Examples

input

10
http://abacaba.ru/test
http://abacaba.ru/
http://abacaba.com
http://abacaba.com/test
http://abacaba.de/
http://abacaba.ru/test
http://abacaba.de/test
http://abacaba.com/
http://abacaba.com/t
http://abacaba.com/test

output

1
http://abacaba.de http://abacaba.ru

input

14
http://c
http://ccc.bbbb/aba..b
http://cba.com
http://a.c/aba..b/a
http://abc/
http://a.c/
http://ccc.bbbb
http://ab.ac.bc.aa/
http://a.a.a/
http://ccc.bbbb/
http://cba.com/
http://cba.com/aba..b
http://a.a.a/aba..b/a
http://abc/aba..b/a

output

2
http://cba.com http://ccc.bbbb 
http://a.a.a http://a.c http://abc

#include<stdio.h>
#include<iostream>
#include<string.h>
#include<string>
#include<ctype.h>
#include<math.h>
#include<set>
#include<map>
#include<vector>
#include<queue>
#include<bitset>
#include<algorithm>
#include<time.h>
using namespace std;
void fre() { freopen("c://test//input.in", "r", stdin); freopen("c://test//output.out", "w", stdout); }
#define MS(x,y) memset(x,y,sizeof(x))
#define MC(x,y) memcpy(x,y,sizeof(x))
#define MP(x,y) make_pair(x,y)
#define ls o<<1
#define rs o<<1|1
typedef long long LL;
typedef unsigned long long UL;
typedef unsigned int UI;
template <class T1, class T2>inline void gmax(T1 &a, T2 b) { if (b>a)a = b; }
template <class T1, class T2>inline void gmin(T1 &a, T2 b) { if (b<a)a = b; }
const int N = 1e5+10, M = 0, Z = 1e9 + 7, ms63 = 0x3f3f3f3f;
int casenum, casei;
int n;
char a[N][50],s[50];
map<string, set<string>> SerToPath;
map<set<string>, set<string>> PathToSer;
map<set<string>, set<string>>::iterator it;
set<string>::iterator itt;
int main()
{
	while (~scanf("%d", &n))
	{
		SerToPath.clear();
		PathToSer.clear();
		//Step 1: insert every suffix (path) seen for a hostname into that hostname's set
		for (int i = 1; i <= n; ++i)
		{
			scanf("%s", s); 
			int l = strlen(s);
			int j; for (j = 7; j<l; ++j)if (s[j] == '/')break;//http://
			if (j == l) SerToPath[s].insert("^");
			else
			{
				s[j] = 0;
				SerToPath[s].insert(s + j + 1);
			}
			strcpy(a[i], s);
		}
		//Step 2: map each hostname's suffix set to the set of hostnames that have it
		for (int i = 1; i <= n; ++i)
		{
			strcpy(s, a[i]);
			if (SerToPath.count(s))
			{
				PathToSer[SerToPath[s]].insert(s);
				SerToPath.erase(s);
			}
		}
		int ans= 0;
		for (it = PathToSer.begin(); it != PathToSer.end(); ++it)
		{
			//|| *it->first.begin() == ""
			//if (it->first.size() == 1 && (*it->first.begin() == "^" ))continue;
			if (it->second.size() > 1)++ans;
		}
		printf("%d\n", ans);
		for (it = PathToSer.begin(); it != PathToSer.end(); ++it)
		{
			//|| *it->first.begin() == ""
			//if (it->first.size() == 1 && (*it->first.begin() == "^"))continue;
			if (it->second.size() > 1)
			{
				for (itt = it->second.begin(); itt != it->second.end(); ++itt)
				{
					cout << *itt << " ";
				}
				puts("");
			}
		}
	}
	return 0;
}
/*
【trick && rant】
Ignore groups consisting of the only server name.
you should count only groups of size greater than one.

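"the only" here means "the sole", so the two sentences above say the same thing.
They do not mean that you should skip a group like the one below, where the addresses are nothing but a server name: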
2
http://123.com
http://456.com

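【Problem】
all addresses to have the form http://<hostname>[/<path>],
<hostname> — server name (consists of words and maybe some dots separating them),
Only lowercase English letters and '.'; no two consecutive '.'; does not start or end with '.'; length in [1,20]

/<path> — optional part, where <path> consists of words separated by slashes.
Only lowercase English letters, '.' and '/'; no two consecutive '/'; does not start with '/'; length in [0,20]

Group the different hostnames that belong to the same website; ignore groups that contain only one server name and output the groups with size>1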

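Output:
k, the number of groups
For each group, print all of its server names; the server names in a group share the same set of path suffixes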

【Type】
STL-SET && STL-MAP

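【Analysis】
Take any hostname, collect every suffix (path) queried for it, and put them into a set, which sorts and deduplicates them.
Then use that set of suffixes as a key that maps to all the prefixes (hostnames) sharing it.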

【Time complexity && optimization】
O(>_<)

*/
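
The two-map idea from the analysis can also be sketched without global state or C-style buffers. This is only an illustrative variant under stated assumptions, not the submitted solution: the function name `groupHosts` is hypothetical, it takes the addresses as a vector instead of reading stdin, and it reuses the `"^"` sentinel for "no path".

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: returns the groups (size > 1) of hostnames that were
// queried with exactly the same set of paths.
std::vector<std::vector<std::string>> groupHosts(const std::vector<std::string>& urls) {
    // Step 1: hostname -> set of paths queried for it (sorted, deduplicated).
    std::map<std::string, std::set<std::string>> hostToPaths;
    for (const std::string& url : urls) {
        const std::size_t slash = url.find('/', 7);  // skip "http://"
        if (slash == std::string::npos)
            hostToPaths[url].insert("^");            // sentinel: no path at all
        else
            hostToPaths[url.substr(0, slash)].insert(url.substr(slash + 1));
    }
    // Step 2: set of paths -> hostnames that have exactly that set.
    std::map<std::set<std::string>, std::vector<std::string>> pathsToHosts;
    for (const auto& entry : hostToPaths)
        pathsToHosts[entry.second].push_back(entry.first);
    // Keep only groups with more than one hostname.
    std::vector<std::vector<std::string>> groups;
    for (const auto& entry : pathsToHosts)
        if (entry.second.size() > 1)
            groups.push_back(entry.second);
    return groups;
}
```

Using `std::set<std::string>` directly as a map key works because sets compare lexicographically, so two hostnames land in the same bucket exactly when their path sets are equal.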