(HDU) 1247 - Hat’s Words [Trie] + attention to detail, or [the cheeky map trick]

ACDoge

Article link: https://blog.csdn.net/qq_35504607/article/details/60783378

Hat’s Words

Time Limit: 2000/1000 MS (Java/Others) Memory Limit: 65536/32768 K (Java/Others)
Total Submission(s): 14476 Accepted Submission(s): 5179

Problem Description

A hat’s word is a word in the dictionary that is the concatenation of exactly two other words in the dictionary.
You are to find all the hat’s words in a dictionary.

Input

Standard input consists of a number of lowercase words, one per line, in alphabetical order. There will be no more than 50,000 words.
Only one case.

Output

Your output should contain all the hat’s words, one per line, in alphabetical order.

Sample Input

a
ahat
hat
hatword
hziee
word

Sample Output

ahat
hatword

Author

戴帽子的

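Problem: given a list of strings, find every string that can be formed by concatenating two strings from the list.

Analysis: I got WA many times on this problem, so here are some extra test cases as hints.

Sample input 1:

ab
c
abc (with a leading space)

Sample output 1:

abc

Sample input 2:

hat
word
hatword
hatword
hathat

Sample output 2:

hatword
hatword
hathat (so a string can be used twice)

Sample input 3:

abcde
ab
cd
cde
abcd

Sample output 3:

abcde (some people may miss this one)
abcd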
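
Sample 1 above seems to hinge on how each line is read. A minimal sketch, assuming the note means that the line carries a stray leading space that should be ignored: `operator>>` on a `std::string` skips leading whitespace, whereas line-based reads keep it. The helper name `readWords` is hypothetical and is not part of the solutions in this post.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads whitespace-separated words until end of input.
std::vector<std::string> readWords(std::istream& in) {
    std::vector<std::string> words;
    std::string w;
    // operator>> skips spaces and newlines before each word,
    // so " abc" is stored as "abc".
    while (in >> w) {
        words.push_back(w);
    }
    return words;
}
```

Feeding it a `std::istringstream` built from "ab\nc\n abc\n" yields the three words ab, c and abc, with the leading space gone.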

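We first insert every string into the trie, then query them one by one (I have not yet found a way to insert and check in a single pass).

The problem requires output in alphabetical order, but since the input is already given in alphabetical order, no extra sorting is needed.

You should already know how to write the trie structure and the insert function; the hard part is the query. Here is my AC code after N rounds of fixes (array-based trie).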

#include <bits/stdc++.h>

using namespace std;

struct node
{
    int next[26];
    bool isword;
    void init()
    {
        isword=false;
        memset(next,-1,sizeof(next));
    }
}T[1000000];

int tot,n;
char s[50010][500];

void insert(char* s)
{
    int i,p=0,len=strlen(s);
    for(i=0;i<len;i++)
    {
        int x=s[i]-'a';
        if(T[p].next[x]==-1)
        {
            T[tot].init();
            T[p].next[x]=tot++;
        }
        p=T[p].next[x];
    }
    T[p].isword=true;
}

bool search(char* s)           // lookup function
{
    int flag=0;
    int i,j,p=0,len=strlen(s);
    for(i=0;i<len;i++)         // outer loop: find every split point
    {
        int x=s[i]-'a';
        if(T[p].next[x]==-1) return false;
        else p=T[p].next[x];
        if(T[p].isword&&p)        // the string can be split after position i
        {
            int q=0;               // key step: restart the lookup from the root
            for(j=i+1;j<len;j++)          // inner loop: check whether the rest matches up to the end of the string
            {
                int y=s[j]-'a';
                if(T[q].next[y]==-1) break; /*return false;*/
                else q=T[q].next[y];
            }
            if(T[q].isword&&q&&j==len) flag=1;  // the suffix must also end on a complete word
        }
    }
    return flag;
}

int main()
{
    T[0].init(); tot=1; n=0;
    while(scanf("%499s",s[n])==1)   // gets() was removed in C++14; %s also skips stray whitespace
    {
        insert(s[n++]);
    }
    for(int i=0;i<n;i++)
    {
        if(search(s[i]))
            printf("%s\n",s[i]);
    }
    return 0;
}

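The search function deserves a bit more explanation.

Mistake (1): in the inner for loop, int q=0 was originally written as p=0. Even though the lookup has to restart from the root, p must not be reset, because the outer loop is still using it. Once p is reset, the outer loop continues from wherever the inner loop stopped, so the rest of the outer traversal is corrupted.

Mistake (2): at the break in the inner loop, I originally wrote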

/*return false;*/

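This mistake breaks cases like auxiliary sample 3. Suppose the input is abcde, abc, de, ab, cd. abcde can be formed from abc and de, but the outer loop tries ab as the prefix first; the inner loop then matches cd, fails on e, and returns false, which is clearly wrong. The lesson: do not return early from inside a loop, or later candidates may be skipped.

Mistake (3): checking that the suffix ends on a word. Correct version: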

if(T[q].isword&&q&&j==len) flag=1;

Wrong version:

if(T[q].isword&&q) flag=1;

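First, why I like to add &&q: q=0 is the root node, and treating the root as the end of a word can cause bugs that are hard to notice.

Then why add j==len? Suppose we leave it out and the input is abcde, ab, cd. For abcde, once ab is accepted as the prefix, the inner loop matches cd and breaks at e, yet the wrong if statement still holds, because the word cd does exist. Adding j==len ensures the match ran all the way to the end of the original string.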
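
The three mistakes can be reproduced side by side. The following is a rough sketch rather than the AC code: it rebuilds the array trie on top of `std::vector`, and the names `Trie`, `Mode` and `searchHat` are made up for this illustration. Each `Mode` value re-enables one of the mistakes described above.

```cpp
#include <array>
#include <string>
#include <vector>

// Array-based trie, same idea as T[] in the AC code (illustrative names).
struct Trie {
    struct Node {
        std::array<int, 26> next;
        bool isWord = false;
        Node() { next.fill(-1); }
    };
    std::vector<Node> nodes;
    Trie() : nodes(1) {}                          // node 0 is the root

    void insert(const std::string& w) {
        int p = 0;
        for (char c : w) {
            int x = c - 'a';
            if (nodes[p].next[x] == -1) {
                int fresh = static_cast<int>(nodes.size());
                nodes.emplace_back();
                nodes[p].next[x] = fresh;
            }
            p = nodes[p].next[x];
        }
        nodes[p].isWord = true;
    }
};

// Hypothetical switch: which mistake to reproduce.
enum class Mode { Correct, ResetP, EarlyReturn, NoEndCheck };

bool searchHat(const Trie& t, const std::string& s, Mode mode) {
    const int len = static_cast<int>(s.size());
    bool found = false;
    int p = 0;                                    // outer cursor
    for (int i = 0; i < len; ++i) {
        int np = t.nodes[p].next[s[i] - 'a'];
        if (np == -1) return false;
        p = np;
        if (!t.nodes[p].isWord) continue;         // not a split point
        int q = 0, j = i + 1;                     // inner cursor restarts at the root
        for (; j < len; ++j) {
            int nq = t.nodes[q].next[s[j] - 'a'];
            if (nq == -1) {
                if (mode == Mode::EarlyReturn) return false;   // mistake (2)
                break;
            }
            q = nq;
        }
        // Mistake (3) drops the j == len requirement.
        bool reachedEnd = (j == len) || mode == Mode::NoEndCheck;
        if (q != 0 && t.nodes[q].isWord && reachedEnd) found = true;
        if (mode == Mode::ResetP) p = q;          // mistake (1): outer cursor overwritten
    }
    return found;
}
```

With the dictionary abcde, abc, de, ab, cd, `Mode::Correct` accepts abcde while `Mode::ResetP` and `Mode::EarlyReturn` both reject it. With the dictionary abcde, ab, cd, `Mode::Correct` rejects abcde while `Mode::NoEndCheck` wrongly accepts it.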

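P.S. On whether search needs the helper variable flag: when I debugged with plain return statements, I kept missing cases or getting wrong results, so I fell back on this safer approach. I personally think flag can be avoided.

After all that, the search function is still hard to write. While reading other people's code, I saw one solution that uses a pointer-based trie. The rest is essentially the same as mine, but its search function cleverly takes an extra parameter and uses a temporary string. After finding a prefix that is a word, it copies the remainder into the temporary string and calls the same function to check whether that remainder is a word in the trie. Brilliant. Doge kneels in respect.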

#include<iostream>
#include<cstdio>
#include<algorithm>
#include<cstring>
using namespace std;
struct node{
    node *next[26];
    int mark;
}*head,point[90000];
int iii=0;
char chr[200];
char cha[50004][200];
node *new_node(){
    node *p=&point[iii++];
    for(int i=0;i<26;i++)
        p->next[i]=NULL;
    p->mark=0;
    return p;
}
void insert_node(char cha[]){
    node *p=head;
    for(int i=0;cha[i];i++){
        if(p->next[cha[i]-'a']==NULL)
            p->next[cha[i]-'a']=new_node();
        p=p->next[cha[i]-'a'];
    }
    p->mark=1;
}
int find_node(char cha[],int flag){
    node *p=head;
    for(int i=0;cha[i];i++){
        if(p->next[cha[i]-'a']==NULL)
            return 0;
        p=p->next[cha[i]-'a'];
        if(p->mark&&flag){
            int j;
            for(j=i+1;cha[j];j++)
                chr[j-(i+1)]=cha[j];
            chr[j-(i+1)]='\0';
            if(find_node(chr,0)==1)
                return 2;
        }
    }
    if(p->mark) return 1;
    else return 0;
}
int main(){
    int num=0;
    head=new_node();
    while(~scanf("%s",cha[num])){
        insert_node(cha[num++]);
    }
    for(int i=0;i<num;i++){
        if(find_node(cha[i],1)==2)
            printf("%s\n",cha[i]);
    }
    return 0;
}
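
The same "check the remainder with a second lookup" idea can also be sketched without the temporary buffer by passing a start offset, which incidentally removes the need for a flag variable. This is only an illustration under my own naming (`PNode`, `insertWord`, `isWordFrom`, `isHatWord` are hypothetical), not the quoted author's code.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Pointer-based trie node (illustrative).
struct PNode {
    std::unique_ptr<PNode> next[26];
    bool mark = false;
};

void insertWord(PNode& root, const std::string& w) {
    PNode* p = &root;
    for (char c : w) {
        std::unique_ptr<PNode>& child = p->next[c - 'a'];
        if (!child) child.reset(new PNode());
        p = child.get();
    }
    p->mark = true;
}

// True if s[from..] is non-empty and is a complete word in the trie.
bool isWordFrom(const PNode& root, const std::string& s, std::size_t from) {
    if (from >= s.size()) return false;
    const PNode* p = &root;
    for (std::size_t i = from; i < s.size(); ++i) {
        p = p->next[s[i] - 'a'].get();
        if (!p) return false;
    }
    return p->mark;
}

// True if s splits into two dictionary words.
// Returning true on the first valid split is safe; a failed split
// simply moves on to the next prefix instead of returning.
bool isHatWord(const PNode& root, const std::string& s) {
    const PNode* p = &root;
    for (std::size_t i = 0; i + 1 < s.size(); ++i) {
        p = p->next[s[i] - 'a'].get();
        if (!p) return false;                 // no dictionary word starts with s[0..i]
        if (p->mark && isWordFrom(root, s, i + 1)) return true;
    }
    return false;
}
```

With the sample dictionary from the problem statement inserted, `isHatWord` accepts ahat and hatword and rejects the other four words.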

The most impressive one, though, is the unconventional map + string solution:

#include <iostream>
#include <string>
#include <map>

using namespace std;

map <string, int> m_v;

string str[50006];

int main() {
    int k = 0;                                 // number of words read
    while(cin >> str[k]) {
        m_v[str[k]] = 1;                       // mark the word as present
        k++;
    }
    for(int i = 0; i < k; i++) {               // try to split every word
        int len = str[i].size();
        for(int j = 1; j < len; j++) {         // j is the split position; both parts are non-empty
            string s1(str[i], 0, j);           // first part
            string s2(str[i], j);              // second part
            // count() does not insert missing keys, unlike operator[]
            if(m_v.count(s1) && m_v.count(s2)) {
                cout << str[i] << endl;
                break;
            }
        }                                      // neat!
    }
    return 0;
}