字典树（）第一接触字典树感觉都是板子吧

最新推荐文章于 2024-01-09 11:05:17 发布

SDUTyangkun

最新推荐文章于 2024-01-09 11:05:17 发布

阅读量822

点赞数

分类专栏：图论

图论专栏收录该内容

22 篇文章 0 订阅

订阅专栏

转自http://blog.csdn.net/piaocoder/article/details/47836559

字典树

字典树，又称单词查找树，Trie树，是一种树形结构，哈希表的一个变种。用于统计，排序和保存大量的字符串（也可以保存其

的）。

优点就是利用公共的前缀来节约存储空间。在这举个简单的例子：比如说我们想储存3个单词，nyist、nyistacm、nyisttc。如果只是

单纯的按照以前的字符数组存储的思路来存储的话，那么我们需要定义三个字符串数组。但是如果我们用字典树的话，只需要定义

一个树就可以了。在这里我们就可以看到字典树的优势了。

基本的操作

1.定义（即定义结点）

struct node{  
    int cnt;  
    struct node *next[26];  
    node(){  
        cnt=0;  
        memset(next,0,sizeof(next));  
    }  
};

next是表示每层有多少种类的数，如果只是小写字母，则26即可，若改为大小写字母，则是52，若再加上数字，则是62了，这里根

据题意来确定。

cnt可以表示一个字典树到此有多少相同前缀的数目，这里根据需要应当学会自由变化。

2.插入（即建树过程）

构建Trie树的基本算法也很简单，无非是逐一把每则单词的每个字母插入Trie。插入前先看前缀是否存在。如果存在，就共享，否则

创建对应的节点和边。比如要插入单词add（已经插入了单词“ad”），就有下面几步：

考察前缀"a"，发现边a已经存在。于是顺着边a走到节点a。

考察剩下的字符串"dd"的前缀"d"，发现从节点a出发，已经有边d存在。于是顺着边d走到节点ad

考察最后一个字符"d"，这下从节点ad出发没有边d了，于是创建节点ad的子节点add，并把边ad->add标记为d。

<pre name="code" class="cpp">void buildtrie(char *s){  
    node *p = root;  
    node *tmp = NULL;  
    int l = strlen(s);  
    for(int i = 0; i < l; ++i){  
        if(p->next[s[i]-'a'] == NULL){  
            tmp = new node;  
            p->next[s[i]-'a'] = tmp;  
        }  
        p = p->next[s[i]-'a'];  
        p->cnt++;  
    }  
}

3.查找

（1）每次从根结点开始进行搜索；

（2）取要查找关键词的第一个字母，并根据该字母选择对应的子树并转到该子树继续进行检索；

（3）在相应的子树上，取得要查找关键词的第二个字母,并进一步选择对应的子树进行检索；　

（4）迭代刚才过程。。。

（5）直到在某个结点处：

——关键词的所有字母都被取出，则读取附在该结点上的信息，即完成查找。

——该结点没有任何信息，则输出该关键词不在此字典树里。

void findtrie(char *s){  
    node *p = root;  
    int l = strlen(s);  
    for(int i = 0; i < l; ++i){
        if(p->next[s[i]-'a'] == NULL){
            printf("0\n");  
            return;  
        }  
        p = p->next[s[i]-'a'];  
    }
    printf("%d\n",p->cnt);  
}

4.释放内存

有些题目，数据比较大，需要查询完之后释放内存（比如：hdu1671 Phone List）

递归释放内存

void del(node *root){
	for(int i = 0; i < 10; ++i)
		if(root->next[i])
            del(root->next[i]);
    delete(root);
}

注意事项：

1.用G++交会出现Memory Limit Exceeded（就算释放了内存，还是Memory Limit Exceeded）

2.根结点要初始化root=new node;

练习：

hdu 1251 统计难题

题目链接：http://acm.hdu.edu.cn/showproblem.php?pid=1251

解题思路：http://blog.csdn.net/piaocoder/article/details/41552691

hdu 2072 单词数

题目链接：http://acm.hdu.edu.cn/showproblem.php?pid=2072

解题思路：http://blog.csdn.net/piaocoder/article/details/41902793

hdu 1671 Phone List

题目链接：http://acm.hdu.edu.cn/showproblem.php?pid=1671

解题思路：http://blog.csdn.net/piaocoder/article/details/47951011

POJ 2001 Shortest Prefixes

题目链接：http://poj.org/problem?id=2001

解题思路：http://blog.csdn.net/piaocoder/article/details/47731321

POJ 2418 Hardwood Species

题目链接：http://poj.org/problem?id=2418

解题思路：http://blog.csdn.net/piaocoder/article/details/47731453

POJ 2503 Babelfish

题目链接：http://poj.org/problem?id=2503

解题思路：http://blog.csdn.net/piaocoder/article/details/47731701

Hardwood Species

Time Limit: 10000MS		Memory Limit: 65536K
Total Submissions: 23210		Accepted: 9032

Description

Hardwoods are the botanical (植物学的) group of trees that have broad leaves, produce a fruit or nut, and generally go dormant (休眠的) in the winter.
America's temperate (温和的) climates produce forests with hundreds of hardwood species -- trees that share certain biological characteristics. Although oak, maple and cherry all are types of hardwood trees, for example, they are different species. Together, all the hardwood species represent 40 percent of the trees in the United States.

On the other hand, softwoods, or conifers (针叶树), from the Latin word meaning "cone-bearing," have needles. Widely available US softwoods include cedar, fir, hemlock (铁杉), pine, redwood, spruce (云杉) and cypress. In a home, the softwoods are used primarily as structural lumber such as 2x4s and 2x6s, with some limited decorative applications.

Using satellite imaging technology, the Department of Natural Resources has compiled an inventory of every tree standing on a particular day. You are to compute the total fraction of the tree population represented by each species.

Input

Input to your program consists of a list of the species of every tree observed by the satellite; one tree per line. No species name exceeds 30 characters. There are no more than 10,000 species and no more than 1,000,000 trees.

Output

Print the name of each species represented in the population, in alphabetical order, followed by the percentage of the population it represents, to 4 decimal places.

Sample Input

Red Alder
Ash
Aspen
Basswood
Ash
Beech
Yellow Birch
Ash
Cherry
Cottonwood
Ash
Cypress
Red Elm
Gum
Hackberry
White Oak
Hickory
Pecan
Hard Maple
White Oak
Soft Maple
Red Oak
Red Oak
White Oak
Poplan
Sassafras
Sycamore
Black Walnut
Willow

Sample Output

Ash 13.7931
Aspen 3.4483
Basswood 3.4483
Beech 3.4483
Black Walnut 3.4483
Cherry 3.4483
Cottonwood 3.4483
Cypress 3.4483
Gum 3.4483
Hackberry 3.4483
Hard Maple 3.4483
Hickory 3.4483
Pecan 3.4483
Poplan 3.4483
Red Alder 3.4483
Red Elm 3.4483
Red Oak 6.8966
Sassafras 3.4483
Soft Maple 3.4483
Sycamore 3.4483
White Oak 10.3448
Willow 3.4483
Yellow Birch 3.4483

Hint

This problem has huge input, use scanf instead of cin to avoid time limit exceeded.

#include<stdio.h>
#include<string.h>
using namespace std;
struct node
{
    int cnt;
    struct node *next[130];
    node()
    {
        cnt=0;
        memset(next,0,sizeof(next));

    }
};

struct node *root=NULL;
int n;
void build(char *s)
{
    struct node *p=root,*tmp;
    int i,len =strlen(s);
    for(i=0;i<len;i++)
    {
        if(p->next[s[i]]==NULL)
        {
            tmp=new node;
            p->next[s[i]]=tmp;
        }
        p=p->next[s[i]];
    }
    p->cnt++;

}
void dfs(struct node *root)
{
    static int pos;
    static char word[31];

    int i;
    if(root->cnt)
    {
        word[pos]='\0';
        printf("%s %.4lf\n",word,root->cnt*100*1.0/n);
    }
    for(i=0;i<130;i++)
    {
        if(root->next[i])
        {
            word[pos++]=i;
            dfs(root->next[i]);
            pos--;
        }
    }
}
int main()
{
    char str[31];
     n=0;
     root=new node;
     while(gets(str)&&str[0]!='\0')
     {
         build(str);
         n++;
     }
     dfs(root);
     return 0;
}