A Learning Journey with Tries

Author: 池塘的蜗牛

Category: Algorithms

Article link: https://blog.csdn.net/zh533749/article/details/17224165

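A trie (dictionary tree) is a tree-shaped data structure somewhat like a binary search tree, except that each edge stands for one character of a key, so every node corresponds to a prefix. None of the data structure books I have read so far covers it.

Many articles already describe tries in detail, so I will not repeat that here. For reference, see:

http://www.cnblogs.com/tanky_woo/archive/2010/09/24/1833717.html

http://blog.csdn.net/v_july_v/article/details/6897097

Both articles explain the trie thoroughly, but their implementation code has a few problems, so here I only walk through an implementation: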

#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include<unistd.h>
//###############################################

#define Max 26
#define Debug

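/* One trie node.
 * next[c] points to the child reached by the letter 'a' + c.
 * In a node reached by letter id, v[id] counts the inserted words
 * that pass through that node; the other slots of v stay 0. */
struct Tree 
{
	struct Tree *next[Max];
	int v[Max]; 
};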

struct Tree *Head;

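/* Allocate the root node; it stands for the empty prefix. */
void CreateHead()
{
	int i = 0;
	Head = (struct Tree *)malloc(sizeof(struct Tree));
	if(Head == NULL)
	{
		perror("malloc");
		exit(1);
	}
	for(i = 0; i < Max; i++)
	{
		Head->next[i] = NULL;
		Head->v[i] = 0;
	}
}
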

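/* Insert the leading run of lowercase letters of str into the trie. */
void CreateTree(char *str)
{
	int len = strlen(str);
	int i,j;
	struct Tree* ptr = Head;
	struct Tree* temp = NULL;
	int id;
	for(i = 0; i < len; i++)
	{
		id = str[i] - 'a';
		if(id < 0 || id >= Max)   /* stop at '\n' or any character outside 'a'..'z' */
			break;

		if(ptr->next[id] == NULL)
		{
			/* first word with this prefix: create the child node */
			temp = (struct Tree *)malloc(sizeof(struct Tree));
			if(temp == NULL)
			{
				perror("malloc");
				exit(1);
			}
			for(j = 0; j < Max; j++)
			{
				temp->next[j] = NULL;
				temp->v[j] = 0;
			}
			temp->v[id] = 1;
			ptr->next[id] = temp;
		}
		else
		{
			/* one more word shares this prefix */
			(ptr->next[id]->v[id])++;
		}
		ptr = ptr->next[id];
	}
}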

/* Look up str as a prefix. Prints the result and returns the number of
 * inserted words that start with str (0 if there are none). */
int FindTree(char *str)
{
	int len = strlen(str);
	struct Tree *ptr = Head;
	int i;
	int id = -1;
	for(i = 0; i < len; i++)
	{
		id = str[i] - 'a';
		if(id < 0 || id >= Max)   /* only 'a'..'z' are stored */
			return 0;

		if(ptr->next[id] == NULL)
		{
			printf("%s: prefix not found\n", str);
			return 0;
		}
		ptr = ptr->next[id];
	}

	if(id < 0)   /* empty string: nothing to look up */
		return 0;

	/* ptr is the node reached by the last letter, so v[id] is its counter */
	printf("%s is a prefix\n", str);
	printf("number of words sharing this prefix: %d\n", ptr->v[id]);
	sleep(1);
	return ptr->v[id];
}

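int main()
{
	int num;
	FILE *fp;
	char *string[26] = {"a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"};
	char str[64];   /* a 15-byte buffer would split long words across two fgets calls */
	int i = 0;

	fp = fopen("1.txt", "r");
	if(fp == NULL)
	{
		perror("1.txt");
		return 1;
	}
	CreateHead();
	/* Test fgets itself: looping on !feof(fp) can insert the last line twice. */
	while(fgets(str, sizeof(str), fp) != NULL)
	{
		CreateTree(str);   // build the tree
		//printf("%d----------%s",i, str);
		i++;
	}
	fclose(fp);

	for(num=0; num < 26; num ++)
		FindTree(string[num]);   // count the words starting with each letter; you can also pass "abc" to count the words that start with abc
	//printf("num is :%d\n",i);
	return 0;
}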
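
For comparison, here is a minimal sketch of a more compact layout, assuming each node keeps a single counter instead of a whole v[Max] array. The names TrieNode, trie_new, trie_insert, trie_count_prefix and trie_free are hypothetical and do not appear in the program above; the sketch also assumes an ASCII-compatible character set, where 'a'..'z' are contiguous.

```c
#include <assert.h>
#include <stdlib.h>

#define ALPHABET 26

/* Hypothetical compact node: one counter per node instead of v[Max]. */
struct TrieNode {
    struct TrieNode *next[ALPHABET];
    int count;   /* number of inserted words that pass through this node */
};

/* Allocate an empty node; returns NULL if malloc fails. */
struct TrieNode *trie_new(void)
{
    struct TrieNode *n = malloc(sizeof *n);
    int i;
    if (n == NULL)
        return NULL;
    for (i = 0; i < ALPHABET; i++)
        n->next[i] = NULL;
    n->count = 0;
    return n;
}

/* Insert the leading run of lowercase letters of word.
 * Returns 0 on success, -1 if an allocation fails. */
int trie_insert(struct TrieNode *root, const char *word)
{
    struct TrieNode *p = root;
    assert(root != NULL);
    for (; *word != '\0'; word++) {
        int id = *word - 'a';
        if (id < 0 || id >= ALPHABET)   /* stop at '\n' or other characters */
            break;
        if (p->next[id] == NULL) {
            p->next[id] = trie_new();
            if (p->next[id] == NULL)
                return -1;
        }
        p = p->next[id];
        p->count++;
    }
    return 0;
}

/* Number of inserted words that start with prefix; 0 for an empty
 * prefix, an unknown prefix, or one containing a non-lowercase character. */
int trie_count_prefix(const struct TrieNode *root, const char *prefix)
{
    const struct TrieNode *p = root;
    assert(root != NULL);
    if (*prefix == '\0')
        return 0;
    for (; *prefix != '\0'; prefix++) {
        int id = *prefix - 'a';
        if (id < 0 || id >= ALPHABET || p->next[id] == NULL)
            return 0;
        p = p->next[id];
    }
    return p->count;
}

/* Release a node and everything below it. */
void trie_free(struct TrieNode *node)
{
    int i;
    if (node == NULL)
        return;
    for (i = 0; i < ALPHABET; i++)
        trie_free(node->next[i]);
    free(node);
}
```

With this layout a caller would create the root with trie_new(), call trie_insert() once per line read from the word list, and then query with trie_count_prefix(root, "a") and so on. Each node shrinks from 26 pointers plus 26 ints to 26 pointers plus one int, and trie_free() gives the memory back at the end.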

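To test the program I downloaded a dictionary of more than 7,000 words from http://wenku.baidu.com/view/5c993ed5360cba1aa811daec.html. The file also contains many Chinese characters and other non-letter characters, so I wrote a Python script to filter the txt file.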

#!/usr/bin/env python3
import re

# Match the leading run of ASCII letters on each line. In Python 3, \w would
# also match Chinese characters, so the character class is spelled out.
pattern = re.compile(r'^[A-Za-z]+')
#pattern = re.compile(r'^abc*')

# The encoding of direc.txt is not known here; errors="ignore" skips any
# bytes that fail to decode. open(..., "w") creates 2.txt if it is missing.
with open("direc.txt", "r", errors="ignore") as fp, open("2.txt", "w") as fp1:
	for line in fp:
		match = pattern.match(line)
		if match:
			# the C program only stores 'a'..'z', so write lowercase
			word = match.group().lower()
			print(word)
			fp1.write(word + "\n")
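
The same filtering rule (keep only the leading run of ASCII letters on each line) can also be sketched in C, in case you would rather do the cleanup inside the trie program itself. This is only a sketch: leading_word is a hypothetical helper, not part of the code above, and it assumes the default "C" locale, in which isalpha accepts only A-Z and a-z.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copy the leading run of ASCII letters of line into
 * out, lowercased. Writes at most outsize - 1 letters plus a terminating
 * '\0' and returns the number of letters copied. */
size_t leading_word(const char *line, char *out, size_t outsize)
{
    size_t n = 0;
    assert(line != NULL && out != NULL);
    if (outsize == 0)
        return 0;
    /* The unsigned char casts keep bytes above 127 (for example the bytes
     * of Chinese characters) from being passed to <ctype.h> as negative
     * values. */
    while (n + 1 < outsize && isalpha((unsigned char)line[n])) {
        out[n] = (char)tolower((unsigned char)line[n]);
        n++;
    }
    out[n] = '\0';
    return n;
}
```

A line whose first byte is not a letter yields an empty string and a return value of 0, so the caller can simply skip it.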

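This produces the dictionary we need in 2.txt (the C program above opens 1.txt, so rename the file or change the path passed to fopen).

A small aside: I ran into a problem while writing this Python script. At 11 pm that night I ran it once, the result looked right, so I left it running on its own. The screen kept scrolling; I assumed it was just slow and went back to my dorm. The next day the program had stopped, but the output file would not open. The file properties showed it was more than 200 MB: the program had been stuck in an infinite loop. Scary. Luckily this did not happen at a company, otherwise... (I am actually still a student.)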

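The program is fairly simple, so I will not explain it further. I wrote this post mainly to leave a few footprints of my own learning process.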
