1. Installing Hadoop
======================================
1. Find several machines that can ping each other (I used 2).
======================================
2. Append the following to the end of /etc/hosts (identical on both machines).
/etc/hosts maps hostnames to IP addresses, so that, for example, ping ubuntu becomes ping 192.168.56.102.
192.168.56.102 ubuntu ubuntu
192.168.56.101 ubuntu1 ubuntu1
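A quick way to confirm the mapping is to look a name up the way the resolver would. The sketch below is only an illustration: lookup_host is a hypothetical helper, and it reads a temporary sample file rather than the real /etc/hosts so it can be run anywhere.

```shell
#!/bin/sh
# lookup_host NAME FILE: print the first IP whose hosts-style line lists NAME.
# (Hypothetical helper for illustration; not part of Hadoop.)
lookup_host() {
    awk -v name="$1" '
        /^[ \t]*#/ { next }                 # skip comment lines
        { for (i = 2; i <= NF; i++)         # field 1 is the IP, the rest are names
              if ($i == name) { print $1; exit } }
    ' "$2"
}

# Sample file holding the two entries from this step.
hosts_file=$(mktemp)
cat > "$hosts_file" <<'EOF'
192.168.56.102 ubuntu ubuntu
192.168.56.101 ubuntu1 ubuntu1
EOF

lookup_host ubuntu  "$hosts_file"
lookup_host ubuntu1 "$hosts_file"
```

On the real machines, ping -c 1 ubuntu1 performs the same check against the live /etc/hosts.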
======================================
3. Hadoop requires the same installation directory on every node, so create a user named hadoop on both machines:
adduser hadoop
I used hadoop-1.0.4.
=======================================
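4. Install SSH and set up passwordless login. Here the master is ubuntu and the slave is ubuntu1; the master must be able to ssh into every slave without a password. Remember to start sshd each time. I built SSH from source, which requires downloading three files:
zlib-1.2.3.tar.gz openssl-1.0.1.tar.gz openssh-5.3p1.tar.gz
Then build and install them with make. Passwordless login requires generating a public/private key pair first:
ssh-keygen -t rsa
Append the contents of the master's id_rsa.pub to .ssh/authorized_keys on each slave to establish trust. Because the master is also listed in the slaves file (step 5), append it to the master's own authorized_keys as well. Then check that it works:
ssh ubuntu1 (run on the master)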
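The append step can be made repeatable. The sketch below is an assumption about how one might script it, not part of the original procedure: append_key is a made-up helper, and the demo uses a temporary directory and a placeholder key string instead of a real ~/.ssh so it is safe to run. The permissions are tightened because sshd, with its default StrictModes setting, rejects key files that other users can write to.

```shell
#!/bin/sh
# append_key PUBKEY_FILE SSH_DIR: add a public key to SSH_DIR/authorized_keys
# exactly once and tighten permissions. (Hypothetical helper for illustration.)
append_key() {
    pub=$1
    dir=$2
    mkdir -p "$dir" && chmod 700 "$dir"
    touch "$dir/authorized_keys"
    # -x whole line, -F fixed string: skip the append if the key is already present
    grep -qxF -f "$pub" "$dir/authorized_keys" || cat "$pub" >> "$dir/authorized_keys"
    chmod 600 "$dir/authorized_keys"
}

demo=$(mktemp -d)
# Placeholder standing in for the master's ~/.ssh/id_rsa.pub
echo "ssh-rsa AAAAB3placeholder hadoop@ubuntu" > "$demo/id_rsa.pub"
append_key "$demo/id_rsa.pub" "$demo/slave_ssh"
append_key "$demo/id_rsa.pub" "$demo/slave_ssh"   # second call adds nothing
wc -l < "$demo/slave_ssh/authorized_keys"
```

On a real cluster the key has to cross the network first, for example: cat ~/.ssh/id_rsa.pub | ssh hadoop@ubuntu1 'cat >> ~/.ssh/authorized_keys'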
===============================================
5. Next, configure Hadoop. In the conf directory:
*************************Master configuration********************************
Master ubuntu:
① In hadoop-env.sh, set the Java path:
export JAVA_HOME=/usr/src/jdk1.6.0_37
②core-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hadoop-1.0.4/hadoop-${user.name}</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://ubuntu:9000</value>
  </property>
</configuration>
③hdfs-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
④mapred-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>ubuntu:9001</value>
  </property>
</configuration>
⑤masters:
ubuntu
⑥slaves: (the master also serves as a DataNode and TaskTracker)
ubuntu1
ubuntu
****************************************************
*********************Slave configuration*************************
Slave ubuntu1:
① In hadoop-env.sh, set the Java path:
export JAVA_HOME=/usr/src/jdk1.6.0_37
②core-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hadoop-1.0.4/hadoop-${user.name}</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <!-- Must name the NameNode host (the master, ubuntu), not this slave. -->
    <value>hdfs://ubuntu:9000</value>
  </property>
</configuration>
③hdfs-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
④mapred-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <!-- Must name the JobTracker host (the master, ubuntu), not this slave. -->
    <value>ubuntu:9001</value>
  </property>
</configuration>
⑤masters:
ubuntu
⑥slaves:
ubuntu1
ubuntu
****************************************************
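fs.default.name and mapred.job.tracker have to name the master on every node, so one option is to generate the site files once from a single hostname and copy the result to each machine. The sketch below is an assumption about workflow, not something Hadoop requires; MASTER, CONF_DIR and write_site are illustrative names, and the files go into a temporary directory rather than the real conf directory.

```shell
#!/bin/sh
MASTER=ubuntu                 # NameNode and JobTracker host from this guide
CONF_DIR=$(mktemp -d)         # stand-in for /home/hadoop/hadoop-1.0.4/conf

# write_site FILE NAME VALUE: emit a one-property Hadoop site file.
write_site() {
    cat > "$CONF_DIR/$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

write_site hdfs-site.xml   dfs.replication    2
write_site mapred-site.xml mapred.job.tracker "$MASTER:9001"

# core-site.xml has two properties; \$ keeps ${user.name} literal for Hadoop to expand.
cat > "$CONF_DIR/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hadoop-1.0.4/hadoop-\${user.name}</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://$MASTER:9000</value>
  </property>
</configuration>
EOF

printf '%s\n' ubuntu         > "$CONF_DIR/masters"
printf '%s\n' ubuntu1 ubuntu > "$CONF_DIR/slaves"
grep -h '<value>' "$CONF_DIR"/*-site.xml
```

The generated files could then be copied to the slave, for example with scp "$CONF_DIR"/* hadoop@ubuntu1:/home/hadoop/hadoop-1.0.4/conf/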
===========================================================
6. Start the Hadoop cluster
Note: everything below is run on the master.
First go to the bin directory:
./hadoop namenode -format
sh start-all.sh
----------Command output----------
starting namenode, logging to /home/hadoop/hadoop-1.0.4/libexec/../logs/hadoop-hadoop-namenode-ubuntu.out
ubuntu1: starting datanode, logging to /home/hadoop/hadoop-1.0.4/libexec/../logs/hadoop-hadoop-datanode-ubuntu1.out
ubuntu: starting datanode, logging to /home/hadoop/hadoop-1.0.4/libexec/../logs/hadoop-hadoop-datanode-ubuntu.out
ubuntu: starting secondarynamenode, logging to /home/hadoop/hadoop-1.0.4/libexec/../logs/hadoop-hadoop-secondarynamenode-ubuntu.out
starting jobtracker, logging to /home/hadoop/hadoop-1.0.4/libexec/../logs/hadoop-hadoop-jobtracker-ubuntu.out
ubuntu1: starting tasktracker, logging to /home/hadoop/hadoop-1.0.4/libexec/../logs/hadoop-hadoop-tasktracker-ubuntu1.out
ubuntu: starting tasktracker, logging to /home/hadoop/hadoop-1.0.4/libexec/../logs/hadoop-hadoop-tasktracker-ubuntu.out
Then use the jps command to check the Hadoop processes on each machine:
Master:
4176 SecondaryNameNode
4375 TaskTracker
3966 NameNode
4074 DataNode
4260 JobTracker
4415 Jps
Slave:
2329 DataNode
2408 TaskTracker
2455 Jps
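Comparing the jps listing against the expected daemons by eye gets tedious, so here is a small sketch that does it. missing_daemons is a hypothetical helper of my own, and the sample input is simply the slave listing above pasted into a variable.

```shell
#!/bin/sh
# missing_daemons EXPECTED...: read jps output on stdin and print every
# expected daemon name that does not appear. (Hypothetical helper.)
missing_daemons() {
    seen=$(awk '{ print $2 }')          # second column of jps output is the class name
    for d in "$@"; do
        echo "$seen" | grep -qxF "$d" || echo "$d"
    done
}

# Sample taken from the slave listing above.
slave_jps='2329 DataNode
2408 TaskTracker
2455 Jps'

# Prints nothing: both slave daemons are present.
echo "$slave_jps" | missing_daemons DataNode TaskTracker
# Prints NameNode and JobTracker, which only run on the master.
echo "$slave_jps" | missing_daemons NameNode JobTracker DataNode
```

On a live master the call would be: jps | missing_daemons NameNode SecondaryNameNode JobTracker DataNode TaskTracker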
wordcount test:
./hadoop dfs -mkdir test
./hadoop dfs -ls
./hadoop dfs -put ../conf/* test
./hadoop jar ../hadoop-examples-1.0.4.jar wordcount test/* out1
View the results:
./hadoop dfs -cat out1/*
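To sanity-check the job output, the same kind of counts can be produced locally on a small sample. This is only an emulation with standard Unix tools, assuming words are separated by whitespace; it is not how Hadoop runs the job, and local_wordcount is a name I made up.

```shell
#!/bin/sh
# local_wordcount: read text on stdin, print "word<TAB>count" sorted by word.
local_wordcount() {
    tr -s ' \t' '\n\n' |      # one word per line
        grep -v '^$' |        # drop empty lines
        LC_ALL=C sort |       # bring identical words together
        uniq -c |             # count each run of identical words
        awk '{ print $2 "\t" $1 }'
}

printf 'hello world\nhello hadoop\n' | local_wordcount
```

For the sample input this prints hadoop 1, hello 2 and world 1, one tab-separated pair per line.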
==========================================================
Note: Hadoop's logs are in the logs directory. If something goes wrong, read the logs and search Google.
2. Compiling the examples
====================================================
How to compile the bundled examples:
The bundled examples are in /home/hadoop/hadoop-1.0.4/src/examples/org/apache/hadoop/examples. Compile the .java files into .class files, package them into a jar, and run the jar on the Hadoop cluster.
mkdir classes
javac -classpath hadoop-core-1.0.4.jar:commons-cli-1.2.jar WordCount.java -d classes/
jar -cvf wc.jar -C classes/ .
added manifest
adding: org/(in = 0) (out= 0)(stored 0%)
adding: org/apache/(in = 0) (out= 0)(stored 0%)
adding: org/apache/hadoop/(in = 0) (out= 0)(stored 0%)
adding: org/apache/hadoop/examples/(in = 0) (out= 0)(stored 0%)
adding: org/apache/hadoop/examples/WordCount$TokenizerMapper.class(in = 1790) (out= 765)(deflated 57%)
adding: org/apache/hadoop/examples/WordCount.class(in = 1911) (out= 996)(deflated 47%)
adding: org/apache/hadoop/examples/WordCount$IntSumReducer.class(in = 1789) (out= 746)(deflated 58%)
That's it. Try running it:
cp wc.jar ~/hadoop-1.0.4/bin
cd ~/hadoop-1.0.4/bin
./hadoop jar wc.jar org.apache.hadoop.examples.WordCount input ret
From here you can start studying the example source code.
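3. Building the project
1. Environment: ubuntu + eclipse + hadoop-1.0.4 + jdk1.6.0_37 + ant 1.8.4
Ant configuration:
export ANT_HOME=/usr/src/apache-ant-1.8.4
export PATH=$ANT_HOME/bin:$PATH
2. After unpacking, run ant, then ant eclipse, then import hadoop-1.0.4 into Eclipse.
Many errors may come up; you may need to install SVN, autoconf, libtool, and so on. Search Google for each error.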
Problems encountered:
DataNode and NameNode IDs do not match: http://blog.chinaunix.net/uid-28379399-id-3567167.html
The specified directory must exist.
Passwordless SSH login fails:
http://bbs.csdn.net/topics/370109654
NameNode stays in safe mode:
The reported blocks 6093 needs additional 42 blocks to reach the threshold 0.9990 of total blocks 6141. Safe mode will be turned off automatically.
http://stackoverflow.com/questions/4966592/hadoop-safemode-recovery-taking-too-long
Command to fix it: hadoop dfsadmin -safemode leave
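The numbers in that message can be checked by hand. The sketch below reproduces the 42, assuming the required count is total blocks times the threshold, rounded up to a whole block; I have not verified how Hadoop rounds internally, and blocks_needed is just an illustrative name.

```shell
#!/bin/sh
# blocks_needed TOTAL THRESHOLD REPORTED: how many more block reports are
# required before REPORTED reaches THRESHOLD * TOTAL (rounded up).
blocks_needed() {
    awk -v total="$1" -v thr="$2" -v rep="$3" 'BEGIN {
        need = total * thr               # 6141 * 0.9990 = 6134.859
        target = int(need)
        if (target < need) target++      # round up to a whole block: 6135
        print target - rep               # 6135 - 6093
    }'
}

blocks_needed 6141 0.9990 6093   # prints 42
```

So the NameNode is only waiting for the last few DataNode block reports. Forcing it out with -safemode leave skips that wait, so it is worth checking that all DataNodes are actually up first.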
================================================================================
hadoop-2.0.0-cdh4.1.2: reading and writing HDFS from C
http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-project-dist/hadoop-hdfs/LibHdfs.html
echo $LD_LIBRARY_PATH
:/usr/java/jdk1.6.0_30/jre/lib/amd64:/usr/java/jdk1.6.0_30/jre/lib/amd64/server:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/lib/native/
The first two entries hold the JRE's shared libraries; the last holds Hadoop's native shared libraries.
echo $CLASSPATH
/home/gaoxun/hadoop-2.0.0-cdh4.1.2/src/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/mapred/test.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/tools/lib/hadoop-streaming-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/tools/lib/hadoop-archives-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/tools/lib/hadoop-rumen-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/tools/lib/hadoop-distcp-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/tools/lib/hadoop-extras-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/tools/lib/hadoop-datajoin-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/tools/lib/hadoop-gridmix-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/hadoop-yarn-site-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/hadoop-yarn-server-tests-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/hadoop-yarn-common-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/lib-examples/hsqldb-2.0.0.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/lib/hadoop-annotations-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/lib/jackson-mapper-asl-1.8.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/lib/paranamer-2.3.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/lib/guice-3.0.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/lib/avro-1.7.1.cloudera.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop
/yarn/lib/aopalliance-1.0.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/lib/jersey-server-1.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/lib/commons-io-2.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/lib/javax.inject-1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/lib/jersey-guice-1.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/lib/log4j-1.2.17.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/lib/jackson-core-asl-1.8.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/lib/jersey-core-1.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/lib/protobuf-java-2.4.0a.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/lib/asm-3.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/lib/snappy-java-1.0.4.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/lib/guice-servlet-3.0.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/lib/netty-3.2.4.Final.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/hadoop-yarn-server-common-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/hadoop-yarn-api-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/yarn/hadoop-yarn-server-tests-2.0.0-cdh4.1.2-tests.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/commons-beanutils-1.7.0.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/hadoop-annotations-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/jettison-1.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/jaxb-impl-2.2.3-1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/jackson-mapper-asl-1.8.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/commons-daemon-1.0.3.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/
httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/commons-lang-2.5.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/hadoop-common-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/jaxb-api-2.2.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/stax-api-1.0.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/commons-math-2.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/paranamer-2.3.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/hadoop-hdfs-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/commons-configuration-1.6.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/avro-1.7.1.cloudera.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/hadoop-auth-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/commons-collections-3.2.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/jackson-xc-1.8.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/commons-logging-1.1.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/jersey-server-1.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/commons-net-3.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/commons-io-2.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/activation-1.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/commons-cli-1.2.jar:/h
ome/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/jsr305-1.3.9.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/xmlenc-0.52.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/slf4j-log4j12-1.6.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/jline-0.9.94.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/zookeeper-3.4.3-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/log4j-1.2.17.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/jsch-0.1.42.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/jackson-core-asl-1.8.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/guava-11.0.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/kfs-0.3.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/json-simple-1.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/commons-beanutils-core-1.8.0.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/jersey-json-1.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/slf4j-api-1.6.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/commons-codec-1.4.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/jersey-core-1.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/commons-digester-1.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/protobuf-java-2.4.0a.jar:/home
/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/jackson-jaxrs-1.8.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/asm-3.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/snappy-java-1.0.4.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/bin/bootstrap.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/bin/tomcat-juli.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/bin/commons-daemon.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/lib/catalina.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/lib/catalina-tribes.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/lib/jsp-api.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/lib/tomcat-i18n-ja.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/lib/jasper.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/lib/tomcat-i18n-es.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/lib/catalina-ha.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/lib/ecj-3.3.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/lib/annotations-api.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/lib/catalina-ant.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/lib/tomcat-i18n-fr.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/lib/jasper-el.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/lib/tomcat-coyote.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/lib/servlet-api.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/lib/tomcat-dbcp.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/httpfs/tomcat/lib/el-api.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/hadoop-hdfs-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.
2/share/hadoop/hdfs/hadoop-hdfs-2.0.0-cdh4.1.2-tests.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/hadoop-hdfs-2.0.0-cdh4.1.2-test-sources.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/hadoop-hdfs-2.0.0-cdh4.1.2-sources.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/jetty-6.1.26.cloudera.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/jackson-mapper-asl-1.8.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/commons-daemon-1.0.3.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/commons-lang-2.5.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/jsp-api-2.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/jasper-runtime-5.5.23.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/commons-logging-1.1.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/jersey-server-1.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/commons-io-2.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/jsr305-1.3.9.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/xmlenc-0.52.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/jline-0.9.94.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/zookeeper-3.4.3-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/jackson-core-asl-1.8.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/guava-11.0.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/commons-codec-1.4.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/jersey-core-1.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/protobuf-java-2.4.0a.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/jetty-util-6.1.26.cloudera.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/asm-3.2.jar:/home/gaoxun/had
oop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/commons-el-1.0.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/hdfs/lib/servlet-api-2.5.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/lib-examples/hsqldb-2.0.0.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/lib/hadoop-annotations-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/lib/jackson-mapper-asl-1.8.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/lib/paranamer-2.3.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/lib/guice-3.0.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/lib/avro-1.7.1.cloudera.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/lib/jersey-server-1.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/lib/commons-io-2.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/lib/javax.inject-1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/lib/jersey-guice-1.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/lib/jackson-core-asl-1.8.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/lib/jersey-core-1.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/lib/protobuf-java-2.4.0a.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/l
ib/asm-3.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/lib/snappy-java-1.0.4.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/lib/netty-3.2.4.Final.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.0.0-cdh4.1.2-tests.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/hadoop-common-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/hadoop-common-2.0.0-cdh4.1.2-sources.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/hadoop-common-2.0.0-cdh4.1.2-test-sources.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/hadoop-common-2.0.0-cdh4.1.2-tests.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/commons-httpclient-3.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/hadoop-annotations-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/jettison-1.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/jetty-6.1.26.cloudera.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/jackson-mapper-asl-1.8.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/commons-lang-2.5.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/stax-api-1.0.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/jsp-api-2.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/commons-math-2.1.jar:/home/gaoxun/hadoop-2.0.0-cdh
4.1.2/share/hadoop/common/lib/paranamer-2.3.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/commons-configuration-1.6.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/jets3t-0.6.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/mockito-all-1.8.5.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/avro-1.7.1.cloudera.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/hadoop-auth-2.0.0-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/commons-collections-3.2.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/jackson-xc-1.8.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/jasper-runtime-5.5.23.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/commons-logging-1.1.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/jersey-server-1.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/jasper-compiler-5.5.23.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/commons-net-3.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/commons-io-2.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/activation-1.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/commons-cli-1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/jsr305-1.3.9.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/xmlenc-0.52.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/slf4j-log4j12-1.6.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/jline-0.9.94.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/zookeeper-3.4.3-cdh4.1.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/log4j-1.2.17.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/jsch-0.1.42.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/jackson-core-asl-1.8.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/guava-11.0.2.jar:
/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/kfs-0.3.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/jersey-json-1.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/slf4j-api-1.6.1.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/commons-codec-1.4.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/jersey-core-1.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/junit-4.8.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/commons-digester-1.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/protobuf-java-2.4.0a.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/jackson-jaxrs-1.8.8.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/jetty-util-6.1.26.cloudera.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/asm-3.2.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/commons-el-1.0.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/servlet-api-2.5.jar:/home/gaoxun/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/snappy-java-1.0.4.1.jar:
In other words, put every jar file under the CDH directory on the classpath.
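Typing a list like that by hand is not realistic, so it is easier to generate it. A minimal sketch, assuming find, sort and paste are available; jars_classpath is a hypothetical helper, and the demo builds a tiny fake directory tree instead of touching a real CDH install.

```shell
#!/bin/sh
# jars_classpath DIR: print every .jar under DIR joined with ':'.
jars_classpath() {
    find "$1" -name '*.jar' | LC_ALL=C sort | paste -s -d: -
}

# Fake tree standing in for the CDH directory.
demo=$(mktemp -d)
mkdir -p "$demo/share/hadoop/common/lib" "$demo/share/hadoop/hdfs"
touch "$demo/share/hadoop/common/lib/a.jar" \
      "$demo/share/hadoop/hdfs/b.jar" \
      "$demo/README.txt"

CLASSPATH=$(jars_classpath "$demo")
export CLASSPATH
echo "$CLASSPATH"
```

For the real install the call would be: export CLASSPATH=$(jars_classpath /home/gaoxun/hadoop-2.0.0-cdh4.1.2)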
gcc testHdfs.c -I../include/ -l hdfs -o testHdfs
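This assumes libjvm.so and libhdfs.so have both been placed in /usr/lib; pointing at them with -L sometimes still gives a "not found" error. Creating symbolic links works: ln -s <original> <link>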
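When a library is reported as not found, it helps to see which directory on a search path actually contains it. The sketch below walks a colon-separated list such as LD_LIBRARY_PATH; find_lib is a made-up helper, and the demo uses empty placeholder files in a temporary directory.

```shell
#!/bin/sh
# find_lib NAME PATHLIST: print the first directory in the colon-separated
# PATHLIST that contains NAME; return 1 if none does. (Hypothetical helper.)
find_lib() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        if [ -n "$dir" ] && [ -e "$dir/$name" ]; then
            IFS=$old_ifs
            echo "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

demo=$(mktemp -d)
mkdir -p "$demo/jre" "$demo/native"
touch "$demo/native/libhdfs.so"          # empty placeholder, not a real library

find_lib libhdfs.so ":$demo/jre:$demo/native"
find_lib libjvm.so  ":$demo/jre:$demo/native" || echo "libjvm.so not found"
```

Against the real environment: find_lib libjvm.so "$LD_LIBRARY_PATH"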
Hadoop Streaming programming:
Streaming framework:
Example:
mapper.sh
#!/bin/sh
grep -E "is|code|people"
exit 0
Run it:
../hadoop/bin/hadoop streaming \
    -input /app/ecom/fcr/wjp/test_input \
    -output /app/ecom/fcr/wjp/streaming_output \
    -mapper "mapper.sh" \
    -reducer "cat" \
    -file mapper.sh \
    -jobconf mapred.job.name="dist-grep"
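Before submitting, the mapper and reducer can be tried locally as an ordinary pipeline. This is only a rough stand-in for the framework: sort plays the part of the shuffle, and run_local and the sample input are my own. It also shows why the mapper ends with exit 0: grep returns 1 when nothing matches, and a non-zero exit status from a streaming task is normally treated as a failure.

```shell
#!/bin/sh
work=$(mktemp -d)

# The mapper from the example above.
cat > "$work/mapper.sh" <<'EOF'
#!/bin/sh
grep -E "is|code|people"
exit 0
EOF

# run_local INPUT: mapper, then sort (standing in for the shuffle), then the "cat" reducer.
run_local() {
    sh "$work/mapper.sh" < "$1" | LC_ALL=C sort | cat
}

printf 'this is a test\nno match here\nsource code\n' > "$work/input.txt"
run_local "$work/input.txt"

# A split with no matching line still exits with status 0.
printf 'nothing\n' > "$work/none.txt"
sh "$work/mapper.sh" < "$work/none.txt"
echo "mapper exit status: $?"
```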