Configuring Hadoop.
Perform the following steps on master.
1. Extract hadoop-2.3.0-cdh5.0.0-src.tar.gz into /usr/cdh and set up the HADOOP_HOME environment variable: edit /etc/profile, add export HADOOP_HOME=/usr/cdh/hadoop-2.3.0-cdh5.0.0, and append $HADOOP_HOME/bin:$HADOOP_HOME/sbin to the exported PATH;
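For reference, a minimal sketch of the /etc/profile additions described in step 1 (adjust the path if your install location differs):

# Append to /etc/profile, then reload it with: source /etc/profile
export HADOOP_HOME=/usr/cdh/hadoop-2.3.0-cdh5.0.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin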
2. Edit the four configuration files core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml under $HADOOP_HOME/etc/hadoop.
core-site.xml

<configuration>
  <property>
    <name>io.native.lib.available</name>
    <value>true</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
    <description>The name of the default file system. Either the literal string "local" or a host:port for NDFS.</description>
    <final>true</final>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop/hadoop-hadoop</value>
  </property>
</configuration>
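As an optional sanity check, hdfs getconf can echo a key back from the active configuration once $HADOOP_HOME/bin is on the PATH; a sketch:

# Should print hdfs://master:9000 if core-site.xml is being picked up
hdfs getconf -confKey fs.default.name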
hdfs-site.xml
dfs.namenode.name.dir
/usr/cdh/hadoop/dfs/name
Determines where on the local filesystem the DFS name node should store the name table.If this is a comma-delimited list of directories,then name table is replicated in all of the directories,for redundancy.
true
dfs.datanode.data.dir
/usr/cdh/hadoop/dfs/data
Determines where on the local filesystem an DFS data node should store its blocks.If this is a comma-delimited list of directories,then data will be stored in all named directories,typically on different devices.Directories that do not exist are ignored.
true
dfs.replication
1
dfs.permission
false
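The name and data directories above should exist and be writable by the user that runs HDFS (assumed here to be the hadoop user used for scp later on); a sketch:

# Create the NameNode and DataNode directories and hand them to the hadoop user (assumed)
mkdir -p /usr/cdh/hadoop/dfs/name /usr/cdh/hadoop/dfs/data
chown -R hadoop:hadoop /usr/cdh/hadoop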
mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.job.tracker</name>
    <value>hdfs://master:9001</value>
    <final>true</final>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>1536</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1024M</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>3072</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx1024M</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>512</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.factor</name>
    <value>100</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.parallelcopies</name>
    <value>50</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/tmp/hadoop/mapred/system</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>/tmp/hadoop/mapred/local</value>
    <final>true</final>
  </property>
</configuration>
yarn-site.xml

<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8080</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8081</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8082</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
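Once the daemons are started in step 3 below, you can confirm that the NodeManagers have registered with the ResourceManager addresses configured above; a sketch:

# Lists the NodeManagers known to the ResourceManager (run only after start-all.sh)
yarn node -list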
3. Pre-run preparation
a) In $HADOOP_HOME/bin, run hdfs namenode -format
b) In $HADOOP_HOME/sbin, run ./start-all.sh (a combined sketch follows)
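Put together, a minimal sketch of these two commands plus the jps check mentioned below, assuming HADOOP_HOME is set as in step 1:

# Format HDFS once, start all daemons, then verify the running processes
$HADOOP_HOME/bin/hdfs namenode -format
$HADOOP_HOME/sbin/start-all.sh
jps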
If the NameNode or DataNode fails to start, check the logs to see whether the failure is caused by leftover NameNode/DataNode data from a previous run; the locations of these data directories are configured in $HADOOP_HOME/etc/hadoop/hdfs-site.xml.
If everything runs correctly on this machine, i.e. the jps command shows NameNode, SecondaryNameNode, ResourceManager, NodeManager, and DataNode, then the installation and configuration are working. Continue with the following steps.
1. Edit the $HADOOP_HOME/etc/hadoop/slaves file (a heredoc sketch follows the list) so that its contents are
slave1
slave2
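A sketch of writing that file with a heredoc:

# Overwrite the slaves file with the two worker hostnames
cat > $HADOOP_HOME/etc/hadoop/slaves <<EOF
slave1
slave2
EOF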
2. Copy the installation to slave1 and slave2. First create the /usr/cdh directory on slave1 and slave2, then run
scp -r /usr/cdh/hadoop-2.3.0-cdh5.0.0 hadoop@slave1:/usr/cdh
scp -r /usr/cdh/hadoop-2.3.0-cdh5.0.0 hadoop@slave2:/usr/cdh
3. On slave1 and slave2, change the owner of /usr/cdh to the hadoop user used above and set its permissions to 700 (see the sketch below).
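A sketch of steps 2 and 3 run from master, assuming the hadoop user can sudo on the slaves (hostnames as above):

# Before the scp commands above: create the target directory on each slave
ssh hadoop@slave1 "sudo mkdir -p /usr/cdh && sudo chown hadoop:hadoop /usr/cdh"
ssh hadoop@slave2 "sudo mkdir -p /usr/cdh && sudo chown hadoop:hadoop /usr/cdh"
# After the copy: set owner and 700 permissions as described in step 3
ssh hadoop@slave1 "sudo chown -R hadoop:hadoop /usr/cdh && sudo chmod 700 /usr/cdh"
ssh hadoop@slave2 "sudo chown -R hadoop:hadoop /usr/cdh && sudo chmod 700 /usr/cdh"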
Note:
When running hadoop fs -ls to browse the file system you may see the warning "WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable". The cause is a missing libhadoop.so. Rebuild it in the src directory or in the hadoop-common subproject with:
mvn package -DskipTests -Pdist,native,docs -Dtar
The build may then fail with "[ERROR] class file for org.mortbay.component.AbstractLifeCycle not found". This is a known bug; following the official notes at https://issues.apache.org/jira/browse/HADOOP-10110, add the following dependency to hadoop-common-project/hadoop-auth/pom.xml:
<dependency>
  <groupId>org.mortbay.jetty</groupId>
  <artifactId>jetty-util</artifactId>
  <scope>test</scope>
</dependency>
Compiling again may fail with "Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.6:run (make) on project hadoop-common:"; this happens when zlib1g-dev is not installed, and it can be installed with apt-get. Finally, copy all of the generated .so files into the lib/native/ directory and run hadoop fs -ls again; the warning no longer appears.
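Pulled together, a hedged sketch of the rebuild described in this note; the source directory and the path of the built native libraries are assumptions, so adjust them to your source layout:

# Build the native libraries from the Hadoop source tree (also needs Maven, a JDK, protobuf, cmake)
sudo apt-get install zlib1g-dev
cd /usr/cdh/hadoop-2.3.0-cdh5.0.0-src          # assumed location of the extracted source tree
mvn package -DskipTests -Pdist,native,docs -Dtar
# Copy the generated .so files into the runtime's lib/native/ directory (built path is assumed)
cp hadoop-dist/target/hadoop-2.3.0-cdh5.0.0/lib/native/* $HADOOP_HOME/lib/native/
hadoop fs -ls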