Learning Algorithms and Data Structures: A Simple Trie Implementation

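A trie (dictionary tree), as I would summarize it, stores information in a tree-shaped data structure by mapping each unit of that information, in order, to one level of the tree. For example, each letter of a word becomes one node on a path from the root. For background, see: here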
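To make the idea of "one node per letter, in order" concrete, here is a minimal sketch. It is not the implementation from this post: the class name `TrieConceptSketch`, the `Node` type and the map-based children are assumptions chosen only for illustration. It counts how many new nodes each inserted word needs, which shows how words with a common prefix share a path.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not part of the implementation in this post.
class TrieConceptSketch {
    // Each node maps a character to the child node reached by that character.
    static class Node {
        Map<Character, Node> children = new HashMap<>();
        boolean endOfWord = false;
    }

    // Inserts a word one character at a time and returns how many new nodes were created.
    static int insert(Node root, String word) {
        int created = 0;
        Node current = root;
        for (char c : word.toCharArray()) {
            Node next = current.children.get(c);
            if (next == null) { // no path for this letter yet, so grow a new branch
                next = new Node();
                current.children.put(c, next);
                created++;
            }
            current = next;
        }
        current.endOfWord = true; // mark that a complete word ends here
        return created;
    }

    public static void main(String[] args) {
        Node root = new Node();
        System.out.println("the  -> new nodes: " + insert(root, "the"));  // 3
        System.out.println("then -> new nodes: " + insert(root, "then")); // 1, reuses t-h-e
        System.out.println("to   -> new nodes: " + insert(root, "to"));   // 1, reuses t
    }
}
```

Because "then" reuses the three nodes already created for "the", only the final letter costs a new node.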

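My goal is to take the words of an English article, add a branch to the trie for each of them one by one, and then use the finished tree to check whether a given word is in it.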

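In my test, looking up a word in the trie took about 5/6 less time than a sequential search, that is, roughly 1/6 of the time. The figure comes from a single timed lookup of one word, so treat it as a rough indication only.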
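A single `System.nanoTime()` measurement around one call is easily distorted by JIT compilation and timer resolution. The sketch below shows one possible way to get a steadier number by averaging many calls after a warm-up. The class `LookupTimingSketch`, the method `averageNanos` and the chosen iteration counts are assumptions for illustration, not part of the original test.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical helper; names and iteration counts are illustrative assumptions.
class LookupTimingSketch {
    // Written once per measurement so the lookup results are actually used.
    static volatile boolean sink;

    // Runs the lookup `warmup` times untimed, then `iterations` times timed,
    // and returns the average nanoseconds per timed call.
    static double averageNanos(BooleanSupplier lookup, int warmup, int iterations) {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        boolean result = false;
        for (int i = 0; i < warmup; i++) {
            result ^= lookup.getAsBoolean();
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            result ^= lookup.getAsBoolean();
        }
        long elapsed = System.nanoTime() - start;
        sink = result;
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "group", "seeking", "an",
                "international", "ban", "on", "nuclear", "weapons");
        double listNanos = averageNanos(() -> words.contains("weapons"), 10_000, 100_000);
        System.out.println("list lookup: about " + listNanos + " ns per call");
    }
}
```

To time the trie the same way, one could pass `() -> tree.hasWord("the")` as the lookup. Even then the result is only a rough comparison, not a rigorous benchmark.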

Below are my own trie implementation, the test code for it, and the test input.


TrieTree.java

public class TrieTree {

	TrieNode root;
	
	public TrieTree() {
		root = new TrieNode();
	}
	
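	public void add(String word) {
		char[] cs = word.toCharArray();
		for (char c : cs) {
			if (c < 'a' || c > 'z') { // only lowercase a-z fit into the 26-slot child array
				throw new IllegalArgumentException("'" + word + "' is not a lowercase English word.");
			}
		}
		root.add(cs, 0);
	}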
	
	public boolean hasWord(String word) {
		char[] cs = word.toCharArray();
		for (char c : cs) {
			if (c < 'a' || c > 'z') { // the tree only stores lowercase a-z, so anything else cannot be in it
				System.err.println("'" + word + "' is not a lowercase English word.");
				return false;
			}
		}
		return root.find(cs, 0) == 1;
	}
	
	class TrieNode {
		char value;
		TrieNode[] childNodes = new TrieNode[26];
		boolean isLeaf = false;
		
		public TrieNode() {
			value = '\0';
			init(childNodes);
		}
		
		public TrieNode(char v) {
			value = v;
			init(childNodes);
		}
		
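		/**
		 * Searches for a word by scanning its letters one at a time.
		 * @param cs the word, stored as a char array
		 * @param index index of the letter examined at this level of the recursion
		 * @return 1 if the word is found, 0 if it is not
		 */
		public int find(char[] cs, int index) {
			if (index > cs.length-1) { // reached the end of the word
				if (isLeaf) { // this node is marked as the end of a stored word, so this is the word
					return 1;
				} else { // no stored word ends here, so the word is not in the tree
					return 0;
				}
			}
			if (hasChild(cs[index]) == false) { // no node for this letter, so the word is not in the tree
				return 0;
			} else { // this letter is present; recurse to check the next letter
				TrieNode tNode = childNodes[(int)(cs[index]) - 97];
				return tNode.find(cs, index+1);
			}
		}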
		
		public void add(char[] cs, int index) {
			if (index > cs.length-1) { // reached the end of the word; mark this node as the end of a word
				isLeaf = true;
				return ;
			}
			if (hasChild(cs[index])) { // a node for this letter at this position already exists; keep building below it
				TrieNode tNode = childNodes[(int)(cs[index]) - 97];
				tNode.add(cs, index+1);
			} else { // no node for this letter at this position yet; create it, then keep building below it
				TrieNode tNode = new TrieNode(cs[index]);
				childNodes[(int)(cs[index]) - 97] = tNode;
				tNode.add(cs, index+1);
			}
		}
		
		public boolean hasChild(char c) {
			// 'a' is 97, so c - 'a' is the slot of this letter in the 26-entry child array
			return childNodes[c - 'a'] != null;
		}
		
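		/**
		 * Initializes this node; every branch starts out as null.
		 * (Java already fills a new array with null, so this only makes the intent explicit.)
		 * @param a the child array to reset
		 */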
		private void init(TrieNode[] a) {
			for (int i=0; i<a.length; i++) {
				a[i] = null;
			}
		}
	}
}
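
The recursive `add` and `find` above advance one letter per call. The same walk can also be written as a loop. The following is a stand-alone sketch of that variant, assuming the same 26-slot child array and lowercase a-z input. The class name `IterativeTrie`, the helper `indexOf` and the field `endOfWord` are illustrative names, not taken from the code above.

```java
// Hypothetical stand-alone variant; names are illustrative and not from the post.
class IterativeTrie {
    private static final int ALPHABET = 26;

    private static class Node {
        Node[] children = new Node[ALPHABET]; // Java fills new arrays with null
        boolean endOfWord;
    }

    private final Node root = new Node();

    // Returns the child slot for c, or -1 if c is not a lowercase a-z letter.
    private static int indexOf(char c) {
        return (c >= 'a' && c <= 'z') ? c - 'a' : -1;
    }

    public void add(String word) {
        Node current = root;
        for (int i = 0; i < word.length(); i++) {
            int idx = indexOf(word.charAt(i));
            if (idx < 0) {
                // Nodes created before this point stay in the tree but are not marked as words.
                throw new IllegalArgumentException("'" + word + "' is not a lowercase English word.");
            }
            if (current.children[idx] == null) {
                current.children[idx] = new Node();
            }
            current = current.children[idx];
        }
        current.endOfWord = true;
    }

    public boolean hasWord(String word) {
        Node current = root;
        for (int i = 0; i < word.length(); i++) {
            int idx = indexOf(word.charAt(i));
            if (idx < 0 || current.children[idx] == null) {
                return false; // illegal letter or missing branch: the word is not stored
            }
            current = current.children[idx];
        }
        return current.endOfWord; // a prefix of a stored word does not count as a word
    }

    public static void main(String[] args) {
        IterativeTrie trie = new IterativeTrie();
        trie.add("the");
        trie.add("then");
        System.out.println(trie.hasWord("the"));  // true
        System.out.println(trie.hasWord("th"));   // false, only a prefix
        System.out.println(trie.hasWord("has2")); // false, contains a digit
    }
}
```

The loop avoids one stack frame per letter, which matters little for English words but keeps the control flow in one place.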


TestTrieTree.java

import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;

public class TestTrieTree {
	public static void main(String[] args) {
		System.out.println("test trie tree\n------------------------------");
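		StringBuilder sb = new StringBuilder();
		// try-with-resources closes the reader even if reading fails
		try (BufferedReader br = new BufferedReader(new FileReader(new File("src/article1.txt")))) {
			String line;
			while ((line = br.readLine()) != null) { // readLine returns null at end of file
				String trimmed = line.trim();
				if (!trimmed.isEmpty()) {
					// keep one space between lines so words on adjacent lines do not run together
					sb.append(trimmed.toLowerCase()).append(' ');
				}
			}
		} catch (FileNotFoundException e) {
			e.printStackTrace();
			return ;
		} catch (IOException e) {
			e.printStackTrace();
			return ;
		}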
		String article = sb.toString();
		// strip punctuation that is not part of a word
		article = article.replaceAll("\"", "");
		article = article.replaceAll("\\.", " ");
		article = article.replaceAll("  ", " ");
		article = article.replaceAll(",", "");
		article = article.replaceAll("-", "");
		article = article.replaceAll("'", "");
		article = article.replaceAll("\\(", "");
		article = article.replaceAll("\\)", "");
		System.out.println("article content:");
		System.out.println(article);
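		// split on runs of whitespace so repeated spaces do not produce empty tokens
		String[] words = article.trim().split("\\s+");
		ArrayList<String> wordList = new ArrayList<>();
		System.out.println("\ndrop words:");
		for (String word : words) {
			boolean legal = !word.isEmpty(); // an empty token is not a word
			for (char c : word.toCharArray()) {
				if (c < 'a' || c > 'z') { // keep only words made entirely of lowercase a-z
					System.out.print(word + ", ");
					legal = false;
					break;
				}
			}
			if (legal) {
				wordList.add(word);
			}
		}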
		System.out.println("\nretain words:");
		System.out.println(wordList);
		System.out.println();

		System.out.println("construct trie tree...");
		TrieTree tree = new TrieTree();
		for (String word : wordList) {
			tree.add(word);
		}
		System.out.println("construct trie tree complete");

		System.out.println("------------------------------");
		System.out.println("find test 1st");
		System.out.println("is 'the' in tree: " + tree.hasWord("the"));
		
		System.out.println("------------------------------");
		System.out.println("find test 2nd");
		System.out.println("is 'cat' in tree: " + tree.hasWord("cat"));
		
		System.out.println("------------------------------");
		System.out.println("find test 3rd");
		System.out.println("is 'has2' in tree: " + tree.hasWord("has2"));

		System.out.println("------------------------------");
		System.out.println("find test 4th");
		System.out.println("is 'weapons' in tree: " + tree.hasWord("weapons"));

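		// compare trie lookup with sequential lookup in the list; in my run the trie lookup took roughly 1/6 of the time
		// (each side is timed with a single call, so the numbers are only a rough indication)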
		System.out.println("------------------------------");
		System.out.println("time consumption difference:");
		long start = System.nanoTime();
		boolean result = tree.hasWord("the");
		long end = System.nanoTime();
		System.out.println("is 'the' in tree: " + result + " in " + (end - start) + "ns");
		start = System.nanoTime();
		result = wordList.contains("the");
		end = System.nanoTime();
		System.out.println("is 'the' in list: " + result + " in " + (end - start) + "ns");
	}
}
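
The cleanup in `main` removes punctuation with a chain of `replaceAll` calls and then drops any token that still contains a non-letter. An alternative is to pull out runs of lowercase letters directly with a regular expression. The sketch below assumes that approach. The class `WordExtractionSketch` and the method `extractWords` are illustrative names, and the result differs from the original cleanup: a hyphenated or apostrophized word such as "re-establish" is split into "re" and "establish" instead of being joined.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical alternative to the replaceAll-based cleanup; not the method used in the test above.
class WordExtractionSketch {
    // One or more lowercase ASCII letters; digits and punctuation act as separators.
    private static final Pattern WORD = Pattern.compile("[a-z]+");

    static List<String> extractWords(String text) {
        List<String> words = new ArrayList<>();
        // Locale.ROOT keeps lowercasing independent of the machine's default locale.
        Matcher m = WORD.matcher(text.toLowerCase(Locale.ROOT));
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [the, nobel, committee, said, ican, won, in]
        System.out.println(extractWords("The Nobel committee said ICAN won, in 2017."));
    }
}
```

Every word returned this way contains only a-z, so it can be passed to `tree.add` without a separate legality check.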


The English article used for the test; it sits in the src directory together with the code.

article1.txt

A group seeking an international ban on nuclear weapons has won the 2017 Nobel Peace Prize.

The Norwegian Nobel Committee is giving the prize to the International Campaign to Abolish Nuclear Weapons, or ICAN.

The head of the committee, Berit Reiss-Andersen, made the announcement on Friday.

Beatrice Fihn, executive director of the International Campaign to Abolish Nuclear Weapons (ICAN), celebrates after winning the Nobel Peace Prize 2017, in Geneva, Switzerland, Oct. 6, 2017. 
Beatrice Fihn, executive director of the International Campaign to Abolish Nuclear Weapons (ICAN), celebrates after winning the Nobel Peace Prize 2017, in Geneva, Switzerland, Oct. 6, 2017.
She said, "We live in a world where the risk of nuclear weapons being used is greater than it has been for a long time."

The Nobel committee said ICAN won for its work to bring attention to the catastrophic humanitarian effects of any use of nuclear weapons. The statement also praised the group for "its ground-breaking efforts to achieve a treaty-based prohibition of such weapons."

ICAN describes itself as a coalition of non-government groups in more than 100 countries. It began in Australia and was officially launched in Vienna in 2007.

ICAN's main goal is to support enactment of a United Nations treaty banning nuclear weapons. The treaty was approved in New York on July 7, 2017. The agreement, however, did not include nuclear powers, such as Britain, China, France, Russia and the United States.

The peace prize announcement comes as U.S. President Donald Trump has threatened Iran and North Korea over their nuclear activities.

President Trump told the U.N. General Assembly last month that he may be forced to "totally destroy" North Korea because of its nuclear program.

U.S. officials now say Trump is likely to decertify the international nuclear agreement with Iran. He has called the agreement the "worst deal ever negotiated."

The president is expected to announce his plans in a speech next week. The officials expect him to say the deal is not in the U.S. national interest. This would not cancel the 2015 agreement, but instead return it to Congress. Lawmakers would then have 60 days to decide whether to re-establish sanctions that were suspended under the agreement.

A decertification could possibly lead to talks on renegotiating the deal, although Iranian President Hassan Rouhani has said that is not under consideration.

I'm Caty Weaver.

VOANews reported this story. George Grow adapted the report for Learning English. His story includes information from the Associated Press and the Reuters news agency. Caty Weaver was the editor.

We want to hear from you. Write to us in the Comments Section.

That's all. If you find any bugs, feel free to leave a comment.
