Cygwin Installation and Configuration
Reference: http://blog.csdn.net/liu_jason/article/details/7705484 — follow this document to complete the Cygwin installation.
Add Cygwin to the Windows PATH environment variable (append C:\cygwin\bin to PATH), otherwise running the programs later will fail.
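A quick way to verify the PATH change (my own check, assuming the default C:\cygwin install directory) is to open a new cmd window and run a Cygwin binary directly:

C:\> ls --version

If ls is not found, the PATH entry has not taken effect; note that cmd windows opened before the change will not see the updated PATH.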
Hadoop Installation and Configuration
Reference: http://blog.csdn.net/liu_jason/article/details/7706781
Note: I am using hadoop 0.20.2 here. Also pay attention to the directory layout in that document; all files under sysdata are generated automatically.
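For reference, the usual bring-up after editing the configuration files (standard Hadoop 0.20.2 commands run from the Cygwin shell in the Hadoop directory, not quoted from the linked guide):

$ bin/hadoop namenode -format
$ bin/start-all.sh

start-all.sh launches the NameNode, DataNode, JobTracker and TaskTracker; if any of them fails to come up, the files under logs/ are the first place to look.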
Running in Eclipse
Reference: http://www.cnblogs.com/flyoung2008/archive/2011/12/09/2281400.html
Note: for the second screenshot in that document, since the whole pseudo-distributed environment is set up on the local machine, the two locations are configured as:
Map/Reduce Master: localhost, port 9001
DFS Master: localhost, port 9000
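These two entries in the Eclipse plugin must match the pseudo-distributed configuration. As a sketch, the corresponding Hadoop 0.20 settings (the host/port values are the ones above; the property names are the standard 0.20 ones, not quoted from the linked guide) are:

conf/core-site.xml:
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>

conf/mapred-site.xml:
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value>
</property>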
The program arguments change from
hdfs://centos1:9000/tmp/wordcount/word.txt hdfs://centos1:9000/tmp/wordcount/out
to
hdfs://localhost:9000/tmp/wordcount/word.txt hdfs://localhost:9000/tmp/wordcount/out

If the run reports "Output directory hdfs://localhost:9000/tmp/wordcount/out already exists", delete the tmp/wordcount/out directory.

If the error shown in the log below is reported, add the following to the hdfs-site.xml configuration file:
<property>
  <name>dfs.datanode.max.transfer.threads</name>
  <value>8096</value>
</property>
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>8096</value>
</property>
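For the "already exists" error above, the old output directory can also be removed from the command line (standard HDFS shell usage, my own note):

$ bin/hadoop fs -rmr /tmp/wordcount/out

After adding the properties to hdfs-site.xml, restart HDFS (bin/stop-all.sh, then bin/start-all.sh) so the DataNode picks up the new limit.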
- $ bin/hadoop fs -cat /tmp/wordcount/word.txt
- 12/06/29 18:07:53 INFO hdfs.DFSClient: No node available for block: blk_-6169034246478912245_1004 file=/tmp/wordcount/word.txt
- 12/06/29 18:07:53 INFO hdfs.DFSClient: Could not obtain block blk_-6169034246478912245_1004 from any node: java.io.IOException: No live nodes contain current block
- 12/06/29 18:07:56 INFO hdfs.DFSClient: No node available for block: blk_-6169034246478912245_1004 file=/tmp/wordcount/word.txt
- 12/06/29 18:07:56 INFO hdfs.DFSClient: Could not obtain block blk_-6169034246478912245_1004 from any node: java.io.IOException: No live nodes contain current block
- 12/06/29 18:07:59 INFO hdfs.DFSClient: No node available for block: blk_-6169034246478912245_1004 file=/tmp/wordcount/word.txt
- 12/06/29 18:07:59 INFO hdfs.DFSClient: Could not obtain block blk_-6169034246478912245_1004 from any node: java.io.IOException: No live nodes contain current block
- 12/06/29 18:08:02 WARN hdfs.DFSClient: DFS Read: java.io.IOException: Could not obtain block: blk_-6169034246478912245_1004 file=/tmp/wordcount/word.txt
- at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1812)
- at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1638)
- at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1767)
- at java.io.DataInputStream.read(DataInputStream.java:83)
- at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:47)
- at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
- at org.apache.hadoop.fs.FsShell.printToStdout(FsShell.java:114)
- at org.apache.hadoop.fs.FsShell.access$100(FsShell.java:49)
- at org.apache.hadoop.fs.FsShell$1.process(FsShell.java:352)
- at org.apache.hadoop.fs.FsShell$DelayedExceptionThrowing.globAndProcess(FsShell.java:1898)
- at org.apache.hadoop.fs.FsShell.cat(FsShell.java:346)
- at org.apache.hadoop.fs.FsShell.doall(FsShell.java:1543)
- at org.apache.hadoop.fs.FsShell.run(FsShell.java:1761)
- at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
- at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
- at org.apache.hadoop.fs.FsShell.main(FsShell.java:1880)
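The repeated "No live nodes contain current block" messages above usually mean no DataNode is live or registered with the NameNode. A quick way to check (my own suggestion, using the standard admin command) is:

$ bin/hadoop dfsadmin -report

which reports how many datanodes are live and the capacity each one advertises.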