Counting Word Frequencies in an Article with a Trie in C++

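Algorithm:

1. Convert all uppercase letters in the article to lowercase.

2. Scan the string and build the trie.

3. Traverse the trie to read off the count of each word.

This structure yields the count of every word, and building it takes O(n) time for a text of n characters, which works well for large inputs.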

Code:

#include <cstdlib>	//malloc, free
#include <iostream>

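//Trie node: one node per letter on the path from the root
typedef struct word_node {
	char s;				//letter stored at this node
	word_node *nextChar[26];	//children, indexed by letter - 'a'
	int count;			//number of words that end at this node
}word_node,*word_tree;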

//Insert the word that starts at str[index] into the trie.
//On return, index points at the first character after the word.
void addNode(word_tree &w_node, char *str, int &index) {
	if (w_node == NULL) {
		w_node = (word_tree)malloc(sizeof(word_node));
		for (int i = 0; i < 26; i++)
			w_node->nextChar[i] = NULL;
		w_node->count = 0;
	}
	w_node->s = str[index];
	index++;
	if (str[index] >= 'a'&&str[index] <= 'z')
		addNode(w_node->nextChar[str[index] - 'a'], str, index);
	else w_node->count++;	//end of word reached
}

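//Print one word per call: follow the first non-empty branch down to a leaf,
//print the word and its count, then free the leaf so it cannot be visited again.
//The caller starts over from the root for the next word.
void getWord(word_tree &tree) {
	std::cout << tree->s;
	bool show = false;	//true if this node had a child to descend into
	for (int i = 0; i < 26; i++) {
		if (tree->nextChar[i] != NULL) {
			show = true;
			getWord(tree->nextChar[i]);
			break;
		}
	}
	//check whether any child is left after the recursive call
	int i = 0;
	for (; i < 26; i++)
		if (tree->nextChar[i] != NULL)
			break;
	if (i >= 26) {
		if (show == false) {
			//leaf: the path from the root spells a complete word
			std::cout << ":" << tree->count << std::endl;
			free(tree);
			tree = NULL;
		}
		else if (tree->count == 0) {
			//inner node that ends no word and has no children left
			free(tree);
			tree = NULL;
		}
	}
}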

int main()
{
	char str[] = "There are moments in life when you miss someone so much that you just want to pick them from your dreams and hug them for real! Dream what you want to dream;go where you want to go;be what you want to be,because you have only one life and one chance to do all the things you want to do."\
"May you have enough happiness to make you sweet, enough trials to make you strong, enough sorrow to keep you human, enough hope to make you happy ? Always put yourself in others’shoes.If you feel that it hurts you, it probably hurts the other person, too."\
"The happiest of people don’t necessarily have the best of everything; they just make the most of everything that comes along their way.Happiness lies for those who cry, those who hurt, those who have searched, and those who have tried, for only they can appreciate the importance of people"\
"who have touched their lives.Love begins with a smile, grows with a kiss and ends with a tear.The brightest future will always be based on a forgotten past, you can’t go on well in lifeuntil you let go of your past failures and heartaches."\
"When you were born, you were crying and everyone around you was smiling.Live your life so that when you die, you're the one who is smiling and everyone around you is crying."\
"Please send this message to those people who mean something to you, to those who have touched your life in one way or another, to those who make you smile when you really need "\
"it, to those that make you see the brighter side of things when you are really down, to those who you want to let them know that you appreciate their friendship.And if you don’t, don’t worry, nothing bad will happen to you, you will just miss out on the opportunity to brighten someone’s day with this message.";
	std::cout << str << std::endl;
	int length = 0;
	while (str[length] != '\0')length++;

	//Convert everything to lowercase
	for (int i = 0; i < length; i++) {
		if (str[i] >= 'A'&&str[i] <= 'Z') {
			str[i] = str[i] + ('a' - 'A');
		}
	}

	//Initialize the root of the trie (the root holds no letter itself)
	word_tree w_tree;
	w_tree = (word_tree)malloc(sizeof(word_node));
	for (int j = 0; j < 26; j++)
		w_tree->nextChar[j] = NULL;
	w_tree->count = 0;

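	//Build the trie
	int index = 0;
	while(str[index]!='\0'){
		if (str[index] >= 'a'&&str[index] <= 'z') {
			//addNode leaves index on the first character after the word
			addNode(w_tree->nextChar[str[index]-'a'], str, index);
		}
		else index++;	//skip a separator; never step past the terminating '\0'
	}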

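	//Keep extracting words from the trie until it is empty
	int i = 0;
	while(i<26){
		while (i < 26 && w_tree->nextChar[i] == NULL)i++;	//bound check avoids reading nextChar[26]
		if(i<26)getWord(w_tree->nextChar[i]);
	}
	free(w_tree);	//release the root node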
	return 0;
}
