Restoration of string
A substring of some string is called the most frequent, if the number of its occurrences is not less than number of occurrences of any other substring.
You are given a set of strings. A string (not necessarily from this set) is called good if all elements of the set are the most frequent substrings of this string. Restore the non-empty good string with minimum length. If several such strings exist, restore lexicographically minimum string. If there are no good strings, print "NO" (without quotes).
A substring of a string is a contiguous subsequence of letters in the string. For example, "ab", "c", "abc" are substrings of string "abc", while "ac" is not a substring of that string.
The number of occurrences of a substring in a string is the number of starting positions in the string where the substring occurs. These occurrences could overlap.
String a is lexicographically smaller than string b, if a is a prefix of b, or a has a smaller letter at the first position where a and b differ.
Input
The first line contains integer n (1 ≤ n ≤ 10^5), the number of strings in the set.
Each of the next n lines contains a non-empty string consisting of lowercase English letters. It is guaranteed that the strings are distinct.
The total length of the strings doesn't exceed 10^5.
Output
Print the non-empty good string with minimum length. If several good strings exist, print lexicographically minimum among them. Print "NO" (without quotes) if there are no good strings.
Examples
Input
4
mail
ai
lru
cf
Output
cfmailru
Input
3
kek
preceq
cheburek
Output
NO
One can show that in the first sample only two good strings with minimum length exist: "cfmailru" and "mailrucf". The first string is lexicographically minimum.
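Given a set of strings, construct a string in which every given string is a most frequent substring. Output the shortest such string; if there are several, output the lexicographically smallest one.
First, every letter appears at most once in the answer: if a letter appeared several times, that single letter would itself become a most frequent substring.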
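To experiment with this observation on small strings, here is a brute-force checker. It is only a sketch for tiny inputs and is not part of the solution; the names `countSubstrings` and `isGood` are made up for this illustration.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Count the occurrences of every substring of s (overlaps allowed).
// Brute force: only meant for very short strings.
std::map<std::string, int> countSubstrings(const std::string& s) {
    std::map<std::string, int> cnt;
    for (std::size_t i = 0; i < s.size(); ++i)
        for (std::size_t len = 1; i + len <= s.size(); ++len)
            ++cnt[s.substr(i, len)];
    return cnt;
}

// True if every word in `words` is a most frequent substring of s,
// i.e. it occurs in s and no substring of s occurs more often.
bool isGood(const std::string& s, const std::vector<std::string>& words) {
    std::map<std::string, int> cnt = countSubstrings(s);
    int best = 0;
    for (const auto& kv : cnt) best = std::max(best, kv.second);
    for (const std::string& w : words) {
        auto it = cnt.find(w);
        if (it == cnt.end() || it->second != best) return false;
    }
    return true;
}
```

For the first sample, `isGood("cfmailru", {"mail", "ai", "lru", "cf"})` holds because every substring occurs exactly once. In contrast, `isGood("aba", {"ab"})` fails: the letter "a" occurs twice while "ab" occurs only once.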
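Next, the letters of the answer have a fixed relative order, and that order must agree with the order inside every given word. An ordering constraint suggests building a directed graph and topologically sorting it: treat each letter as a vertex (at most 26 vertices), and for each pair of adjacent letters c1, c2 in a word add an edge from c1 to c2, meaning that c1 must come immediately before c2 in the answer.
In fact, because the ordering constraints must not contradict each other, the graph (after removing duplicate edges) can only be a forest of disjoint chains. It must contain no cycles and no self-loops; otherwise the answer is NO.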
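The chain condition can also be checked without an adjacency matrix. The sketch below is an alternative formulation, not the code of this post: it stores at most one successor and one predecessor per letter and rejects the input as soon as two words disagree. The `Links` struct and `buildLinks` are hypothetical names, and cycles are deliberately not detected at this stage.

```cpp
#include <array>
#include <string>
#include <vector>

struct Links {
    std::array<int, 26> nxt;  // nxt[c]: letter that must directly follow c, or -1
    std::array<int, 26> prv;  // prv[c]: letter that must directly precede c, or -1
};

// Fills g from the adjacent letter pairs of every word.
// Returns false if a letter would need two different successors or
// predecessors, or if a letter must follow itself (a self-loop).
// Cycles such as a->b->a are NOT detected here.
bool buildLinks(const std::vector<std::string>& words, Links& g) {
    g.nxt.fill(-1);
    g.prv.fill(-1);
    for (const std::string& w : words) {
        for (std::size_t j = 0; j + 1 < w.size(); ++j) {
            int u = w[j] - 'a';
            int v = w[j + 1] - 'a';
            if (u == v) return false;                          // e.g. "aa"
            if (g.nxt[u] != -1 && g.nxt[u] != v) return false;  // e.g. abc, ade
            if (g.prv[v] != -1 && g.prv[v] != u) return false;  // e.g. apq, beq
            g.nxt[u] = v;
            g.prv[v] = u;
        }
    }
    return true;
}
```

Repeating the same edge (the pair "ai" occurs in both "mail" and "ai") is harmless, which corresponds to removing duplicate edges in the graph.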
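Then traverse all the chains and concatenate the words they spell; that is the answer. Since all letters are distinct, taking the chains in increasing order of their first letter gives the lexicographically smallest result.
One case needs care: a string of length 1. If that single letter has no edges, the traversal may skip it, so handle it as a special case and add it to the answer.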
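As a compact illustration of the whole idea, here is a minimal sketch under the same reasoning. It assumes lowercase input and uses a hypothetical function `restore`. Marking every letter that occurs as `present` makes an isolated single-letter word a chain of length one, so no separate special case is needed in this variant.

```cpp
#include <array>
#include <string>
#include <vector>

// Returns the shortest, lexicographically smallest good string, or "NO".
std::string restore(const std::vector<std::string>& words) {
    std::array<int, 26> nxt, prv;
    std::array<bool, 26> present{}, used{};
    nxt.fill(-1);
    prv.fill(-1);
    for (const std::string& w : words) {
        for (char ch : w) present[ch - 'a'] = true;
        for (std::size_t j = 0; j + 1 < w.size(); ++j) {
            int u = w[j] - 'a', v = w[j + 1] - 'a';
            if (u == v) return "NO";                          // self-loop
            if (nxt[u] != -1 && nxt[u] != v) return "NO";     // two successors
            if (prv[v] != -1 && prv[v] != u) return "NO";     // two predecessors
            nxt[u] = v;
            prv[v] = u;
        }
    }
    std::string ans;
    // Chain heads are letters without a predecessor; visiting them in
    // alphabetical order yields the lexicographically smallest answer.
    for (int c = 0; c < 26; ++c) {
        if (!present[c] || prv[c] != -1) continue;
        for (int v = c; v != -1; v = nxt[v]) {
            used[v] = true;
            ans += static_cast<char>('a' + v);
        }
    }
    // A letter that was never reached has no chain head: it lies on a cycle.
    for (int c = 0; c < 26; ++c)
        if (present[c] && !used[c]) return "NO";
    return ans;
}
```

On the two samples this returns "cfmailru" and "NO" respectively.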
code:
#include <iostream>
#include <cstdio>
#include <cstring>
#include <string>
using namespace std;
const int maxn = 1e5+10;
int in[35],out[35];
int G[35][35];
string a[maxn],ans;
int single[35],vis[35];
bool dfs(int now){ // walk one chain recursively, appending its letters to ans
    vis[now] = 1;
    ans += (char)(now + 'a');
    for(int i = 0; i < 26; i++){
        if(G[now][i]){ // if the successor was already visited, a letter would appear twice: return false
            if(vis[i]) return false;
            return dfs(i);
        }
    }
    return true; // reached the end of the chain
}
int main(){
    int n;
    scanf("%d",&n);
    memset(G,0,sizeof(G));
    memset(single,0,sizeof(single));
    for(int i = 1; i <= n; i++){
        cin >> a[i];
        int len = a[i].length();
        for(int j = 0; j < len-1; j++){
            G[a[i][j]-'a'][a[i][j+1]-'a'] = 1;
        } // build the graph: each letter points to the next one, preserving the order
        if(len == 1){ // a one-letter string is recorded separately
            single[a[i][0]-'a'] = 1;
        }
    }
    for(int i = 0; i < 26; i++){
        if(G[i][i]){ // self-loop, e.g. aa
            printf("NO\n");
            return 0;
        }
    }
    memset(in,0,sizeof(in));
    memset(out,0,sizeof(out));
    for(int i = 0; i < 26; i++){
        for(int j = 0; j < 26; j++){
            if(G[i][j]){
                in[j]++;
                out[i]++; // compute the in-degree and out-degree of every vertex
            }
        }
    }
    for(int i = 0; i < 26; i++){
        // a letter with two different predecessors or two different successors
        // would have to appear more than once, e.g. abc ade, or apq beq
        if(in[i] > 1 || out[i] > 1){
            printf("NO\n");
            return 0;
        }
    }
    memset(vis,0,sizeof(vis));
    ans = "";
    for(int i = 0; i < 26; i++){
        if(out[i] != 0 && in[i] == 0){ // i is the head of a chain
            // defensive check: the chain ran into an already visited letter
            // (inputs such as ahpq bcpz are already rejected by the degree check above)
            if(!dfs(i)){
                printf("NO\n");
                return 0;
            }
        }
        // an isolated letter has in-degree and out-degree 0, so the branch above
        // skips it; add it here, e.g. a lone "a"
        if(single[i] && in[i] == 0 && out[i] == 0)
            ans += (char)(i + 'a');
    }
    // on a pure cycle every vertex has in-degree 1 (never 0), so the loop above
    // never starts from it; detect that case separately
    for(int i = 0; i < 26; i++){
        if(out[i] != 0 || in[i] != 0){
            if(!vis[i]){ // must be a cycle, e.g. abca
                printf("NO\n");
                return 0;
            }
        }
    }
    cout << ans << endl;
    return 0;
}