ID3 Algorithm Implemented with MapReduce

(Prerequisites: a working Hadoop distributed environment with the JDK configured, and Eclipse for Linux installed.)

Data source: source.txt

青绿,蜷缩,浊响,清晰,凹陷,硬滑,是
乌黑,蜷缩,沉闷,清晰,凹陷,硬滑,是
乌黑,蜷缩,浊响,清晰,凹陷,硬滑,是
青绿,蜷缩,沉闷,清晰,凹陷,硬滑,是
浅白,蜷缩,浊响,清晰,凹陷,硬滑,是
青绿,稍蜷,浊响,清晰,稍凹,软粘,是
乌黑,稍蜷,浊响,稍糊,稍凹,软粘,是
乌黑,稍蜷,浊响,清晰,稍凹,硬滑,是
乌黑,稍蜷,沉闷,稍糊,稍凹,硬滑,否
青绿,硬挺,清脆,清晰,平坦,软粘,否
浅白,硬挺,清脆,模糊,平坦,硬滑,否
浅白,蜷缩,浊响,模糊,平坦,软粘,否
青绿,稍蜷,浊响,稍糊,凹陷,硬滑,否
浅白,稍蜷,沉闷,稍糊,凹陷,硬滑,否
乌黑,稍蜷,浊响,清晰,稍凹,软粘,否
浅白,蜷缩,浊响,模糊,平坦,硬滑,否
青绿,蜷缩,沉闷,稍糊,稍凹,硬滑,否


Upload the data source to HDFS:

hdfs dfs -mkdir /id3
hdfs dfs -mkdir /id3/source
hdfs dfs -put source.txt /id3/source/
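
To double-check the upload before running the jobs, the standard HDFS shell can be used:

hdfs dfs -ls /id3/source
hdfs dfs -cat /id3/source/source.txt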

The first job groups the rows and counts, for each column (independent variable) and each of its attribute values, how many times the value appears under each decision label, plus the overall label counts.

CountData.java ------------------------------ the first MapReduce job

import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;

import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class CountData {
	private String[] path;
	private static List<String> already;
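	// Note: `already` is handed to the mapper through a static field. That only works because
	// the jobs here run in-process (local runner, single JVM); on a real cluster the list would
	// have to be passed to the tasks via the Configuration or the distributed cache instead.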

	public CountData(String[] path, List<String> already) {
		this.path = path;
		CountData.already = already;
	}

	public static class EntropyMapper extends Mapper<Object, Text, Text, IntWritable> {
		public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
			IntWritable one = new IntWritable(1);
			StringTokenizer sto = new StringTokenizer(value.toString());
			String[] rowtitle = { "1", "2", "3", "4", "5", "6", "7" };
			while (sto.hasMoreTokens()) {

				String[] rowditails = sto.nextToken("\n").split(",");
				List<String> rowlist = Arrays.asList(rowditails);
				boolean b = true;
				for (int j = 0; j < already.size(); j++) {
					b = b && rowlist.contains(already.get(j));
				}
				if (b) {
					String keyString = rowditails[rowditails.length - 1];
					Text k = new Text(keyString);
					context.write(k, one);
					if (already.size() == 0) {
						for (int i = 0; i < rowditails.length - 1; i++) {

							String keyString1 = rowtitle[i] + "&" + rowditails[i] + "&"
									+ rowditails[rowditails.length - 1];
							Text k1 = new Text(keyString1);
							context.write(k1, one);
						}
					} else {
						for (int i = 0; i < rowditails.length - 1; i++) {
							if (already.contains(rowditails[i])) {
								System.out.println(rowditails[i]);
							} else {
								String keyString1 = rowtitle[i] + "&" + rowditails[i] + "&"
										+ rowditails[rowditails.length - 1];
								Text k1 = new Text(keyString1);
								context.write(k1, one);
							}
						}
					}

				}

			}
		}
	}

	public static class Entropyreducer extends Reducer<Text, IntWritable, Text, IntWritable> {
		private IntWritable result = new IntWritable();

		public void reduce(Text key, Iterable<IntWritable> values, Context context)
				throws IOException, InterruptedException {
			int sum = 0;
			for (IntWritable val : values) {
				sum += val.get();
			}
			result.set(sum);
			context.write(key, result);
		}

	}

//	public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
//		Configuration conf = new Configuration();
//		conf.set("fs.defaultFS", "hdfs://127.0.0.1:9000");
//		conf.set("mapred.textoutputformat.separator", "&");
//		Job job = Job.getInstance(conf, "countdata");
//		job.setInputFormatClass(TextInputFormat.class);
//		FileInputFormat.setInputPaths(job, new Path("/id3/source/*"));
//		job.setJarByClass(CountData.class);
//		job.setMapperClass(EntropyMapper.class);
//		job.setMapOutputKeyClass(Text.class);
//		job.setMapOutputValueClass(IntWritable.class);
//
//		job.setReducerClass(Entropyreducer.class);
//
//		job.setOutputKeyClass(Text.class);
//		job.setOutputValueClass(IntWritable.class);
//		job.setOutputFormatClass(TextOutputFormat.class);
//		FileOutputFormat.setOutputPath(job, new Path("/id3/System_output2"));
//		job.waitForCompletion(true);
//	}

	public void domr() throws IOException, ClassNotFoundException, InterruptedException {
		Configuration conf = new Configuration();
		conf.set("fs.defaultFS", "hdfs://127.0.0.1:9000");
		conf.set("mapred.textoutputformat.separator", "&");
		Job job = Job.getInstance(conf, "countdata");
		job.setInputFormatClass(TextInputFormat.class);
		FileInputFormat.setInputPaths(job, new Path(path[0]));
		job.setJarByClass(CountData.class);
		job.setMapperClass(EntropyMapper.class);
		job.setMapOutputKeyClass(Text.class);
		job.setMapOutputValueClass(IntWritable.class);

		job.setReducerClass(Entropyreducer.class);

		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(IntWritable.class);
		job.setOutputFormatClass(TextOutputFormat.class);
		FileOutputFormat.setOutputPath(job, new Path(path[1]));
		job.waitForCompletion(true);
	}

}

Result of the first pass (output of the first job). Each line is columnIndex&attributeValue&label&count; e.g. 1&乌黑&否&2 means the value 乌黑 in column 1 appears twice with label 否. The last two lines are the overall label counts.

 hdfs dfs -cat /id3/system_output/*

1&乌黑&否&2
1&乌黑&是&4
1&浅白&否&4
1&浅白&是&1
1&青绿&否&3
1&青绿&是&3
2&硬挺&否&2
2&稍蜷&否&4
2&稍蜷&是&3
2&蜷缩&否&3
2&蜷缩&是&5
3&沉闷&否&3
3&沉闷&是&2
3&浊响&否&4
3&浊响&是&6
3&清脆&否&2
4&模糊&否&3
4&清晰&否&2
4&清晰&是&7
4&稍糊&否&4
4&稍糊&是&1
5&凹陷&否&2
5&凹陷&是&5
5&平坦&否&4
5&稍凹&否&3
5&稍凹&是&3
6&硬滑&否&6
6&硬滑&是&6
6&软粘&否&3
6&软粘&是&2
否&9
是&8


Compute the information entropy of each attribute value of each column (each independent variable).

DataEntroy.java ------------------------------------------------- the second MapReduce job


import java.io.IOException;
import java.util.List;
import java.util.Set;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class DataEntroy {
	public static class EntropyMapper extends Mapper<Object, Text, Text,Text> {
		public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
			//IntWritable one = new IntWritable(1);
			StringTokenizer sto = new StringTokenizer(value.toString());
			//String[] rowtitle = {"1","2","3","4","5","6","7"};
			while (sto.hasMoreTokens()) {
				String[] rowditails = sto.nextToken("\n").split("&");
				String keyString;
				String valueString;
				if(rowditails.length==4) {
					keyString = rowditails[0]+"&"+rowditails[1];
					valueString = rowditails[2]+"&"+rowditails[3];
				}else {
					keyString = "system&";
					valueString = rowditails[0]+"&"+rowditails[1];
				}
					
				
				Text k = new Text(keyString);
				Text v = new Text(valueString);
				
				context.write(k, v);
			}
		}
	}

	public static class Entropyreducer extends Reducer<Text, Text, Text, Text> {
		//private DoubleWritable result = new DoubleWritable();

		public void reduce(Text key, Iterable<Text> values, Context context)
				throws IOException, InterruptedException {
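			// Assumes each key receives at most two values (one count per label, 是/否),
			// which holds for this binary-label dataset; val[] stores the two counts and
			// sum is their total.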
			double sum = 0;
			int i=0;
			double[] val = new double[2];
			String[] type = new String[2];
			for (Text val1 : values) {
				type[i]=val1.toString().split("&")[0];
				sum += Integer.parseInt(val1.toString().split("&")[1]);
				val[i++]=Integer.parseInt(val1.toString().split("&")[1]);
			}
			if(val[1]!=0) {
				double n1 = (val[0]/sum);
				double n2 = (val[1]/sum);
				double r = -n1*Math.log(n1)/Math.log(2)-n2*Math.log(n2)/Math.log(2);
				Text  result = new Text(r+"&"+sum);
				context.write(key, result);
			}else {
				if(type[0].trim().equals("是")) {
					
					DoubleWritable result1 = new DoubleWritable(1);
					Text  result = new Text(result1+"&"+sum);
					context.write(key, result);
				}else {
					DoubleWritable result1 = new DoubleWritable(0);
					Text  result = new Text(result1+"&"+sum);
					context.write(key, result);
				}
				
			}
			
		}

	}

//	public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
//		Configuration conf = new Configuration();
//		conf.set("fs.defaultFS", "hdfs://127.0.0.1:9000");
//		conf.set("mapred.textoutputformat.separator", "&");
//		Job job = Job.getInstance(conf, "countdata");
//		job.setInputFormatClass(TextInputFormat.class);
//		FileInputFormat.setInputPaths(job, new Path("/id3/System_output/*"));
//		job.setJarByClass(DataEntroy.class);
//		job.setMapperClass(EntropyMapper.class);
//		job.setMapOutputKeyClass(Text.class);
//		job.setMapOutputValueClass(Text.class);
//
//		job.setReducerClass(Entropyreducer.class);
//
//		job.setOutputKeyClass(Text.class);
//		job.setOutputValueClass(DoubleWritable.class);
//		job.setOutputFormatClass(TextOutputFormat.class);
//		FileOutputFormat.setOutputPath(job, new Path("/id3/SystemEntroy_output8"));
//		job.waitForCompletion(true);
//	}
	
	
	
	

}

Result of the first pass (output of the second job). Each line is columnIndex&attributeValue&entropy&subsetSize; the last line is the entropy of the whole data set.

 hdfs dfs -cat /id3/system_output1/*

1&乌黑&0.9182958340544896&6.0
1&浅白&0.7219280948873623&5.0
1&青绿&1.0&6.0
2&硬挺&0.0&2.0
2&稍蜷&0.9852281360342516&7.0
2&蜷缩&0.954434002924965&8.0
3&沉闷&0.9709505944546688&5.0
3&浊响&0.9709505944546688&10.0
3&清脆&0.0&2.0
4&模糊&0.0&3.0
4&清晰&0.7642045065086203&9.0
4&稍糊&0.7219280948873623&5.0
5&凹陷&0.863120568566631&7.0
5&平坦&0.0&4.0
5&稍凹&1.0&6.0
6&硬滑&1.0&12.0
6&软粘&0.9709505944546688&5.0
system&&0.9975025463691153&17.0
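
As a quick sanity check on the last line: the label counts from the first pass are 是 = 8 and 否 = 9, so the entropy of the whole data set is

H(D) = -(8/17)·log2(8/17) - (9/17)·log2(9/17) ≈ 0.9975

which matches the 0.9975025463691153 printed above; the per-value lines are computed the same way from their own label counts.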


Compute the relative information gain (the per-column weighted entropy). Since information gain = system entropy minus the column's conditional entropy, the column with the smallest weighted entropy has the largest information gain, so that minimum is used to pick the split column.

EntroyCreamnet.java ------------------------------------------ the third MapReduce job



import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;


public class EntroyCreamnet {
	public static class EntropyMapper extends Mapper<Object, Text, Text,DoubleWritable> {
		public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
			//IntWritable one = new IntWritable(1);
			StringTokenizer sto = new StringTokenizer(value.toString());
			//String[] rowtitle = {"1","2","3","4","5","6","7"};
			while (sto.hasMoreTokens()) {
				String[] rowditails = sto.nextToken("\n").split("&");
				if(!(rowditails[0].equals("system"))) {
					String keyString = rowditails[0];
					double valueString =Double.parseDouble(rowditails[2])*Double.parseDouble(rowditails[3]);
						
					
					Text k = new Text(keyString);
					DoubleWritable v = new DoubleWritable(valueString);
					
					context.write(k, v);
				}
				
			}
		}
	}

	public static class Entropyreducer extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
		private DoubleWritable result = new DoubleWritable();

		public void reduce(Text key, Iterable<DoubleWritable> values, Context context)
				throws IOException, InterruptedException {
			double sum = 0;
			for (DoubleWritable val : values) {
				sum += val.get();
			}
			result.set(sum);
			context.write(key, result);
		}
	}


//	public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
//		Configuration conf = new Configuration();
//		conf.set("fs.defaultFS", "hdfs://127.0.0.1:9000");
//		conf.set("mapred.textoutputformat.separator", "&");
//		Job job = Job.getInstance(conf, "EntroyCreamnet");
//		job.setInputFormatClass(TextInputFormat.class);
//		FileInputFormat.setInputPaths(job, new Path("/id3/SystemEntroy_output8/*"));
//		job.setJarByClass(EntroyCreamnet.class);
//		job.setMapperClass(EntropyMapper.class);
//		job.setMapOutputKeyClass(Text.class);
//		job.setMapOutputValueClass(DoubleWritable.class);
//
//		job.setReducerClass(Entropyreducer.class);
//
//		job.setOutputKeyClass(Text.class);
//		job.setOutputValueClass(DoubleWritable.class);
//		job.setOutputFormatClass(TextOutputFormat.class);
//		FileOutputFormat.setOutputPath(job, new Path("/id3/Systemcreament_output1"));
//		job.waitForCompletion(true);
//	}
//	
	
	
	
}

Result of the first pass (output of the third job). Each line is the column index followed by the sum of entropy × subset size over that column's values.

 hdfs dfs -cat /id3/system_output2/*

1&15.119415478763749
2&14.532068975639483
3&14.564258916820032
4&10.487481033014394
5&12.041843979966417
6&16.854752972273346
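
Relating these numbers to the usual ID3 formulation: each line sums entropy × subset size over one column's attribute values. For column 4 this is 0.7642×9 + 0.7219×5 + 0×3 ≈ 10.49, the smallest of the six values, so column 4 has the largest information gain and becomes the root split. Dividing each sum by the total row count (17) would give the true conditional entropy, but since the divisor is the same for every column the ranking, and therefore the chosen column, is unchanged.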

The main algorithm driver

ID3.java



import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class ID3 {
	public static GTree<String> tree = new GTree<>();// a generic tree that holds the decision tree
	public String[] title = {"1","2","3","4","5","6","7"};// column indices (independent variables)
	public String[][]  a = {
			{"青绿","乌黑","浅白"},
			{"蜷缩","稍蜷","硬挺"},
			{"浊响","沉闷","清脆"},
			{"清晰","稍糊","模糊"},
			{"凹陷","稍凹","平坦"},
			{"硬滑","软粘"}
	};// matrix of attribute values per column; the values must match those in source.txt

	//count, per (column, attribute value) pair, the occurrences under each decision label
	public static void countData(String[] path,List<String> already) throws IOException, ClassNotFoundException, InterruptedException {
		CountData  countData = new CountData(path,already);
		countData.domr();
	}
	
	//compute the information entropy of each attribute value of each column
	public static void dataEntroy(String[] path) throws IOException, ClassNotFoundException, InterruptedException {
		Configuration conf = new Configuration();
		conf.set("fs.defaultFS", "hdfs://127.0.0.1:9000");
		conf.set("mapred.textoutputformat.separator", "&");
		Job job = Job.getInstance(conf, "DataEntroy");
		job.setInputFormatClass(TextInputFormat.class);
		FileInputFormat.setInputPaths(job, new Path(path[0]));
		job.setJarByClass(DataEntroy.class);
		job.setMapperClass(DataEntroy.EntropyMapper.class);
		job.setMapOutputKeyClass(Text.class);
		job.setMapOutputValueClass(Text.class);

		job.setReducerClass(DataEntroy.Entropyreducer.class);

		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(Text.class);
		job.setOutputFormatClass(TextOutputFormat.class);
		FileOutputFormat.setOutputPath(job, new Path(path[1]));
		job.waitForCompletion(true);
	}
	//compute the relative information gain (per-column weighted entropy): since information gain = system entropy minus the column's conditional entropy, the column with the smallest weighted entropy has the largest gain
	public static void entroyCreamnet(String[] path) throws IOException, ClassNotFoundException, InterruptedException {
		Configuration conf = new Configuration();
		conf.set("fs.defaultFS", "hdfs://127.0.0.1:9000");
		conf.set("mapred.textoutputformat.separator", "&");
		Job job = Job.getInstance(conf, "EntroyCreamnet");
		job.setInputFormatClass(TextInputFormat.class);
		FileInputFormat.setInputPaths(job, new Path(path[0]));
		job.setJarByClass(EntroyCreamnet.class);
		job.setMapperClass(EntroyCreamnet.EntropyMapper.class);
		job.setMapOutputKeyClass(Text.class);
		job.setMapOutputValueClass(DoubleWritable.class);

		job.setReducerClass(EntroyCreamnet.Entropyreducer.class);

		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(DoubleWritable.class);
		job.setOutputFormatClass(TextOutputFormat.class);
		FileOutputFormat.setOutputPath(job, new Path(path[1]));
		job.waitForCompletion(true);
	}
	
	
	
	
	
	
	// build the decision tree
	
	public TreeNode<String> makeTree(String[] path,String name,TreeNode root,List<String> already) throws ClassNotFoundException, IOException, InterruptedException {

		
		countData(path,already);
		String[] path1 = {path[1]+"/*","/id3/"+name+"_output1"};
		dataEntroy(path1);
		String[] path2= {path1[1]+"/*","/id3/"+name+"_output2"};
		entroyCreamnet(path2);
		Configuration conf = new Configuration();
		HdfsDAO hdao = new HdfsDAO(conf);
		String all = hdao.cat(path2[1]+"/part-r-00000");
		String[] lines = all.split("\n");
		// find the column with the largest information gain (smallest weighted entropy)
		String maxname="";
		double min=10000;
		for(int i=0;i<lines.length;i++) {
			String[] details = lines[i].split("&");
			if(min>Double.parseDouble(details[1].trim()) ){
				min = Double.parseDouble(details[1].trim());
				maxname = details[0];
				
			}
		
		}
		int index = Arrays.binarySearch(title, maxname);
		TreeNode node = new TreeNode(maxname,null);
//		tree.insert(null, node);
		if (root == null) {
			
			root = node;
			tree.insert(null, root);
		} else {
			

			tree.insert(root, node);
		}
		//get the list of attribute values for the chosen column
		String[] clomndetails= a[index];
		index=index+1;
		Double[] entroy = new Double[clomndetails.length];
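		// Entries stay null for attribute values that never occur in the current (filtered) subset;
		// the comparisons below (entroy[k]==0, entroy[k]==1) implicitly assume every value of the
		// chosen column appears at least once, otherwise unboxing null throws a NullPointerException.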
		//read the *_output1 file (per-value entropies)
		String t = hdao.cat(path1[1]+"/part-r-00000");
		String[] tdatils =  t.split("\n");
		//look up the entropy of each attribute value of the chosen column
		for(int i=0;i<tdatils.length;i++) {
			for(int j=0;j<clomndetails.length;j++) {
				if(tdatils[i].split("&")[0].equals(""+index)&&tdatils[i].split("&")[1].equals(clomndetails[j])) {
					entroy[j]=Double.parseDouble(tdatils[i].split("&")[2]);
				}
			}
		}
		for(int k=0;k<clomndetails.length;k++) {
			List<String> newalready = new ArrayList<>(already);// copy of the attribute values already fixed on this branch
			if(entroy[k]==0) {
				TreeNode node1 = new TreeNode(clomndetails[k],null);
				tree.insert(node, node1);
				TreeNode n = new TreeNode("no",null);
				tree.insert(node1, n);
			}else if(entroy[k]==1) {
				TreeNode node1 = new TreeNode(clomndetails[k],null);
				tree.insert(node, node1);
				TreeNode n = new TreeNode("yes",null);
				tree.insert(node1, n);
			}else {
				newalready.add(clomndetails[k]);
				TreeNode n = new TreeNode(clomndetails[k],null);
				tree.insert(node, n);
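				// The recursion always restarts from the full source file; the growing `already`
				// list is what restricts CountData to the rows matching the attribute values fixed
				// along the current branch.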
				String[] newpath = {"/id3/source/*","/id3/"+clomndetails[k]+"_output"};
				makeTree(newpath,clomndetails[k],n,newalready);
			}
		}
		return root;

	}

	public static void main(String[] args) throws ClassNotFoundException, IOException, InterruptedException {
		ID3 id3 = new ID3();
		String[] path = {"/id3/source/*","/id3/system_output"};
		List<String> already = new ArrayList<>();
		TreeNode<String> treenode = id3.makeTree(path,"system",null,already);
		tree.Travelsal1(treenode, 1);
	}

}
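
The HdfsDAO helper used in makeTree() is not included in the original listings. A minimal sketch of what its cat(String) method has to do, assuming a constructor that takes a Hadoop Configuration and that fs.defaultFS points at the same NameNode as the jobs above (via core-site.xml or an explicit conf.set), could look like this:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical stand-in for the HdfsDAO class referenced by ID3.makeTree();
// the real helper is not shown in the post.
public class HdfsDAO {
	private final Configuration conf;

	public HdfsDAO(Configuration conf) {
		this.conf = conf;
	}

	// Read a file on HDFS and return its whole content as one string,
	// so the caller can split it into lines with "\n".
	public String cat(String remotePath) throws IOException {
		FileSystem fs = FileSystem.get(conf);
		StringBuilder sb = new StringBuilder();
		try (BufferedReader reader = new BufferedReader(
				new InputStreamReader(fs.open(new Path(remotePath)), StandardCharsets.UTF_8))) {
			String line;
			while ((line = reader.readLine()) != null) {
				sb.append(line).append("\n");
			}
		}
		return sb.toString();
	}
}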

Supporting data structures



import java.util.ArrayList;
import java.util.List;

//node of a generic tree
public class TreeNode<T>{
	private Object value;//payload (the node's label)
	private List<TreeNode<T>> childlist;//child nodes
	public TreeNode(){	
		value = null;
		childlist = new ArrayList<>();
	}
	
	public TreeNode(Object value,List<TreeNode<T>> childList) {
		this.value = value;
		if(childList!=null) {
			this.childlist = childList;
		}else {
			this.childlist=new ArrayList<>();
		}
		
	}

	public Object getValue() {
		return value;
	}

	public void setValue(Object value) {
		this.value = value;
	}

	public List<TreeNode<T>> getChildlist() {
		return childlist;
	}

	public void setChildlist(List<TreeNode<T>> childlist) {
		this.childlist = childlist;
	}
	
	
}


public class GTree<T> {
	// root node
	public TreeNode<T> root = null;

	// insert a node under the given parent
	public boolean insert(TreeNode<T> parent, TreeNode<T> node) {
		if (root == null) {
			root = node;
			return true;
		} else {
			if (findOne(root, parent)) {

				// TODO: this adds the node directly to the parent's existing child list;
				// consider whether a defensive copy of the list is needed
				return parent.getChildlist().add(node);
			}
		}
		return false;
	}

	/**
	 * 
	 * @param tRoot the root node to search from
	 * @param one the node to look for
	 * @return whether the node exists under tRoot
	 */
	public boolean findOne(TreeNode<T> tRoot, TreeNode<T> one) {
		boolean b = false;
		// if the reference root is null, the node cannot exist
		if (tRoot == null) {
			return false;
		}
		//
		if (tRoot == one) {
			return true;
		}

		if (tRoot.getChildlist() != null) {
			int length = tRoot.getChildlist().size();
			for (int i = 0; i < length; i++) {
				TreeNode<T> node = tRoot.getChildlist().get(i);
				if (node == one) {
					return true;
				} else {
					if (node.getChildlist().size() != 0) {
						b = b || findOne(node, one);
					}
				}
			}
		} else {
			return false;
		}

		return b;
	}

	// traversal
	/**
	 * 
	 * @param root
	 *            the root node
	 * @param l
	 *            the level (depth)
	 */
	public void Travelsal(TreeNode<String> root, int l) {
		int temp = l * 10;

		if (root != null) {
			if (l == 1) {
				System.out.printf("|--%-10s--", root.getValue().toString());
			}

			if (root.getChildlist() != null && root.getChildlist().size() != 0) {
				l++;
				int length = root.getChildlist().size();
				for (int i = 0; i < length; i++) {
					TreeNode<String> node = root.getChildlist().get(i);
					System.out.printf("|--%-10s--", node.getValue());

					Travelsal(node, l);

					System.out.print("\n");
					int temp1 = temp;
					temp = temp + (temp / 10) * 5;
					System.out.printf("%-" + temp + "s", " ");
					temp = temp1;

				}
			}
		}
	}

	public void Travelsal1(TreeNode<String> root, int l) {

		System.out.print("|");
		int length = l * 3;
		for (int i = 1; i < length + 1; i++) {
			System.out.print("-");
			if(i%3==0) {
				System.out.print("|");
			}
		}
		//System.out.print("|");
		System.out.println(root.getValue());
		int clength = root.getChildlist().size();
		for (int j = 0; j < clength; j++) {
			Travelsal1(root.getChildlist().get(j), l + 1);
		}
	}

}
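
A tiny standalone usage sketch (not part of the original post) showing how GTree is driven and how Travelsal1 produces the indented output used for the final result below:

public class GTreeDemo {
	public static void main(String[] args) {
		GTree<String> tree = new GTree<>();
		TreeNode<String> root = new TreeNode<>("4", null);
		TreeNode<String> child = new TreeNode<>("清晰", null);
		tree.insert(null, root);   // the first insert becomes the root of the tree
		tree.insert(root, child);  // attach a child node under the root
		tree.Travelsal1(root, 1);
		// prints:
		// |---|4
		// |---|---|清晰
	}
}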

Final result (the decision tree printed by Travelsal1; digits are the column indices chosen as splits, yes/no are the leaf decisions)

|---|4
|---|---|清晰
|---|---|---|3
|---|---|---|---|浊响
|---|---|---|---|---|1
|---|---|---|---|---|---|青绿
|---|---|---|---|---|---|---|yes
|---|---|---|---|---|---|乌黑
|---|---|---|---|---|---|---|6
|---|---|---|---|---|---|---|---|硬滑
|---|---|---|---|---|---|---|---|---|yes
|---|---|---|---|---|---|---|---|软粘
|---|---|---|---|---|---|---|---|---|no
|---|---|---|---|---|---|浅白
|---|---|---|---|---|---|---|yes
|---|---|---|---|沉闷
|---|---|---|---|---|yes
|---|---|---|---|清脆
|---|---|---|---|---|no
|---|---|稍糊
|---|---|---|6
|---|---|---|---|硬滑
|---|---|---|---|---|no
|---|---|---|---|软粘
|---|---|---|---|---|yes
|---|---|模糊
|---|---|---|no

Intermediate directories generated on HDFS

hdfs dfs -ls /id3/

hadoop@ubuntu:~/id3$ hdfs dfs -ls /id3/
18/09/16 15:37:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 16 items
drwxr-xr-x   - hadoop supergroup          0 2018-09-16 14:53 /id3/source
drwxr-xr-x   - hadoop supergroup          0 2018-09-16 14:53 /id3/system_output
drwxr-xr-x   - hadoop supergroup          0 2018-09-16 14:53 /id3/system_output1
drwxr-xr-x   - hadoop supergroup          0 2018-09-16 14:53 /id3/system_output2
drwxr-xr-x   - hadoop supergroup          0 2018-09-16 14:53 /id3/乌黑_output
drwxr-xr-x   - hadoop supergroup          0 2018-09-16 14:53 /id3/乌黑_output1
drwxr-xr-x   - hadoop supergroup          0 2018-09-16 14:53 /id3/乌黑_output2
drwxr-xr-x   - hadoop supergroup          0 2018-09-16 14:53 /id3/浊响_output
drwxr-xr-x   - hadoop supergroup          0 2018-09-16 14:53 /id3/浊响_output1
drwxr-xr-x   - hadoop supergroup          0 2018-09-16 14:53 /id3/浊响_output2
drwxr-xr-x   - hadoop supergroup          0 2018-09-16 14:53 /id3/清晰_output
drwxr-xr-x   - hadoop supergroup          0 2018-09-16 14:53 /id3/清晰_output1
drwxr-xr-x   - hadoop supergroup          0 2018-09-16 14:53 /id3/清晰_output2
drwxr-xr-x   - hadoop supergroup          0 2018-09-16 14:53 /id3/稍糊_output
drwxr-xr-x   - hadoop supergroup          0 2018-09-16 14:53 /id3/稍糊_output1
drwxr-xr-x   - hadoop supergroup          0 2018-09-16 14:53 /id3/稍糊_output2
hadoop@ubuntu:~/id3$ 

