SSH setup
hadoop001 172.16.100.31 Namenode
hadoop002 172.16.100.32 Datanode01
hadoop003 172.16.100.33 Datanode02
hadoop004 172.16.100.34 Datanode03
Do on all four machines:
Make sure sshd is installed.
service sshd start   (add this to /etc/rc.d/rc.local so sshd starts on boot)
Create the group and user:
groupadd hadoop
useradd hadoop -g hadoop
passwd hadoop
vi /etc/ssh/sshd_config
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
On hadoop001:
su - hadoop
ssh-keygen -t dsa   (creates /home/hadoop/.ssh containing two files, id_dsa and id_dsa.pub)
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
su root
chmod 644 /home/hadoop/.ssh/authorized_keys
chmod 755 /home/hadoop/.ssh/
scp ~/.ssh/id_dsa.pub hadoop@172.16.100.32:~/
scp ~/.ssh/id_dsa.pub hadoop@172.16.100.33:~/
scp ~/.ssh/id_dsa.pub hadoop@172.16.100.34:~/
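The three scp commands above can be wrapped in a loop. A minimal sketch (push_key and the RUN=echo dry-run switch are additions of mine, not part of the original steps; the slave IPs come from the host table at the top):

```shell
#!/bin/sh
# Dry run by default: RUN=echo prints each command instead of running it.
# Set RUN="" on hadoop001 to actually copy the keys.
RUN=${RUN:-echo}

# Copy the master's public key to each slave's home directory.
push_key() {
    for ip in "$@"; do
        $RUN scp ~/.ssh/id_dsa.pub hadoop@"$ip":~/
    done
}

push_key 172.16.100.32 172.16.100.33 172.16.100.34
```

With RUN left at its default the script only prints the scp commands, which makes it safe to run anywhere for a first look.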
Log in to each of the other three machines:
cd /home/hadoop
mkdir .ssh
chmod 711 .ssh
su - hadoop
cat id_dsa.pub >> .ssh/authorized_keys
chmod 644 /home/hadoop/.ssh/authorized_keys
vi /etc/ssh/sshd_config
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
The steps above let the master, hadoop001, ssh to the other three machines without a password.
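A quick way to confirm this from hadoop001 (check_slaves is a hypothetical helper of mine; BatchMode=yes makes ssh fail instead of prompting for a password, and RUN=echo keeps it a dry run until you clear it):

```shell
#!/bin/sh
# Verify passwordless SSH from the master to every slave.
RUN=${RUN:-echo}   # set RUN="" to really connect

check_slaves() {
    for host in "$@"; do
        # BatchMode=yes: fail immediately rather than ask for a password.
        $RUN ssh -o BatchMode=yes hadoop@"$host" hostname
    done
}

check_slaves hadoop002 hadoop003 hadoop004
```

When run for real, each slave should print its own hostname; any password prompt or error means that slave's key setup is incomplete.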
(Configuring passwordless public-key SSH is one of the spots where a Hadoop build most often gets stuck. The common questions, summarized:
1. Why passwordless SSH from master to slaves: the master starts the slaves remotely, so it must be able to log in to each slave without a password and run the startup scripts under bin/.
2. It follows that in the passwordless-SSH handshake the slave is the authenticating server and the master is the client.
3. Roughly, public-key SSH works by generating a key pair on the client (id_rsa.pub / id_rsa); the private key stays local and the public key is uploaded to the server, which uses it to authenticate the client. (Glance at a public-key file: it is a key string plus the client's host.)
With these basics understood, there is no need to memorize the configuration steps ^_^
1. Create a .ssh directory under the startup user's home on both master and slaves.
2. On the master run ssh-keygen -t rsa, rename the .pub file to authorized_keys, and distribute it to each slave's .ssh directory.
Then comes the trap: the .ssh directory and the key files all have strict permission requirements.
1. Home directory: 755 or 700, never 77*
2. .ssh directory: 755
3. .pub / authorized_keys: 644
4. Private key: 600
If the permissions are wrong, passwordless SSH may fail (the exact reason is left as an open question). If the steps above still do not get you there, debugging tools become essential. There are two:
1. ssh -v: prints detailed debug output for the whole SSH handshake
2. /var/log/secure: this log records the reason for the failure
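The four permission rules can also be checked mechanically. A sketch with a hypothetical check_perm helper (GNU stat assumed; the paths assume the hadoop user's home):

```shell
#!/bin/sh
# Print a warning for every path whose mode would break passwordless SSH.
# check_perm PATH MODE...  accepts several allowed modes.
check_perm() {
    path=$1; shift
    [ -e "$path" ] || { echo "MISSING $path"; return 0; }
    got=$(stat -c %a "$path")
    for want in "$@"; do
        [ "$got" = "$want" ] && return 0
    done
    echo "BAD $path: mode $got, expected one of: $*"
}

H=${H:-/home/hadoop}
check_perm "$H" 755 700                    # home: 755 or 700, never 77*
check_perm "$H/.ssh" 755                   # .ssh directory
check_perm "$H/.ssh/authorized_keys" 644   # authorized_keys / .pub
check_perm "$H/.ssh/id_dsa" 600            # private key
```

Run it as the hadoop user on each machine; silence means every checked path already has an acceptable mode.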
The section above comes from the Linuxidc site (www.linuxidc.com); original article: http://www.linuxidc.com/Linux/2012-07/65253.htm)
JDK installation
java -version
Check whether Java is already installed.
If it is:
yum remove java   (uninstall)
mkdir /usr/java
cp jdk-6u45-linux-i586.bin.zip /usr/java/
cd /usr/java
unzip jdk-6u45-linux-i586.bin.zip
./jdk-6u45-linux-i586.bin
Configure the environment variables:
vi /etc/profile
JAVA_HOME=/usr/java/jdk1.6.0_45
PATH=$JAVA_HOME/bin:$PATH
CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib
export JAVA_HOME PATH CLASSPATH
java -version   (confirm the version is 1.6.0)
Hadoop installation
cp hadoop-1.0.2-bin.tar.gz /home/hadoop
cd /home/hadoop
tar -zxvf hadoop-1.0.2-bin.tar.gz
Configure the environment variables:
vi /etc/profile
export HADOOP_HOME=/home/hadoop/hadoop-1.0.2
export PATH=$HADOOP_HOME/bin:$PATH
Hadoop configuration files
◆ Configure hadoop-env.sh
cd /home/hadoop/hadoop-1.0.2/conf
-- set the JDK install path
[root@hadoop001 conf]# vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.6.0_45
Configure the namenode by editing the site files.
cd /home/hadoop/hadoop-1.0.2
mkdir tmp
chmod 777 tmp
-- edit core-site.xml
[root@hadoop001 conf]# vi core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://hadoop001:9000</value> <!-- fully distributed: do not use localhost; use the master node's IP or hostname -->
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop-1.0.2/tmp</value>
</property>
</configuration>
Note: fs.default.name is the NameNode's IP address (or hostname) and port.
-- edit hdfs-site.xml
[hadoop@hadoop001 hadoop-1.0.2]$ mkdir data
[root@hadoop001 conf]# cat hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.data.dir</name>
<value>/home/hadoop/hadoop-1.0.2/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
--修改mapred-site.xml文件
[root@hadoop001 conf]# cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>hadoop001:9001</value>
</property>
</configuration>
[root@hadoop001 conf]#
Configure the masters and slaves files
[root@hadoop001 conf]# vi masters
hadoop001
[root@hadoop001 conf]# vi slaves
hadoop002
hadoop003
hadoop004
Copy hadoop to the datanode nodes
scp -rp hadoop-1.0.2 hadoop@hadoop002:/home/hadoop/
scp -rp hadoop-1.0.2 hadoop@hadoop003:/home/hadoop/
scp -rp hadoop-1.0.2 hadoop@hadoop004:/home/hadoop/
Install Java on the datanode nodes
Install it the same way as on the namenode.
Configure the environment variables:
JAVA_HOME=/usr/java/jdk1.6.0_45
PATH=$JAVA_HOME/bin:$PATH
CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib
export JAVA_HOME PATH CLASSPATH
export HADOOP_HOME=/home/hadoop/hadoop-1.0.2
export PATH=$HADOOP_HOME/bin:$PATH
Verify that /home/hadoop/hadoop-1.0.2/data exists; if it does not, create it:
mkdir -p /home/hadoop/hadoop-1.0.2/data
chown -R hadoop:hadoop /home/hadoop/
Format the namenode
su - hadoop
cd /home/hadoop/hadoop-1.0.2/bin
./hadoop namenode -format
Unexpectedly, after following the steps found online the datanode nodes refused to start. Analyzing the startup logs showed that the directory set by dfs.data.dir must have permissions of exactly 755; otherwise the datanode fails its startup permission check and shuts itself down.
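Given that finding, a plausible fix is to force the data directory to exactly 755 on each datanode. A sketch (DATA_DIR defaults to the dfs.data.dir value from hdfs-site.xml above):

```shell
#!/bin/sh
# The datanode's startup permission check requires dfs.data.dir to be 755;
# anything else and the datanode shuts itself down.
DATA_DIR=${DATA_DIR:-/home/hadoop/hadoop-1.0.2/data}

mkdir -p "$DATA_DIR" 2>/dev/null || echo "cannot create $DATA_DIR"
chmod 755 "$DATA_DIR" 2>/dev/null
echo "$DATA_DIR mode: $(stat -c %a "$DATA_DIR" 2>/dev/null)"
```

Run it as the hadoop user on each datanode, then restart the datanode process.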
Problem:
[hadoop@hadoop001 ~]$ hadoop dfsadmin -report
Configured Capacity: 0 (0 KB)
Present Capacity: 0 (0 KB)
DFS Remaining: 0 (0 KB)
DFS Used: 0 (0 KB)
DFS Used%: ?%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 0 (0 total, 0 dead)
Checking via http://172.16.100.31:50070 shows nothing either:
Configured Capacity: 0 (0 KB)
Present Capacity: 0 (0 KB)
DFS Remaining: 0 (0 KB)
DFS Used: 0 (0 KB)
DFS Used%: ?%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Analysis:
1. Run jps on the namenode and the datanodes; everything appears normal.
On the namenode, jps shows:
3765 SecondaryNameNode
3850 JobTracker
5105 Jps
3584 NameNode
On the datanode nodes, jps shows:
[hadoop@hadoop003 hadoop-1.0.2]$ jps
4511 Jps
4329 DataNode
4425 TaskTracker
So why is there no Configured Capacity? There are three possibilities:
1. The namespaceIDs are inconsistent; the symptom can be that the DataNode process (the 4329 DataNode above) never truly comes up,
and stop-all.sh will not stop the datanode either.
2. The slave cannot ssh to the master.
The datanode then reports:
INFO org.apache.hadoop.ipc.RPC: Server at /192.168.0.100:9000 not available yet, Zzzzz...
3. /etc/hosts is wrong: if the 127.0.0.1 mapping for the hostname is not removed, you get
failed on local exception: java.net.NoRouteToHostException: No route to host
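For cause 1, the namespaceID stored on the datanode must equal the namenode's. A sketch for comparing the two VERSION files (ns_id is a hypothetical helper; the default paths assume the namenode metadata lives under hadoop.tmp.dir and the datanode data under the dfs.data.dir configured above):

```shell
#!/bin/sh
# Extract "namespaceID=NNN" from an HDFS VERSION file.
ns_id() {
    grep '^namespaceID=' "$1" | cut -d= -f2
}

NN_VERSION=${NN_VERSION:-/home/hadoop/hadoop-1.0.2/tmp/dfs/name/current/VERSION}
DN_VERSION=${DN_VERSION:-/home/hadoop/hadoop-1.0.2/data/current/VERSION}

if [ -f "$NN_VERSION" ] && [ -f "$DN_VERSION" ]; then
    if [ "$(ns_id "$NN_VERSION")" = "$(ns_id "$DN_VERSION")" ]; then
        echo "namespaceID matches"
    else
        # Usual remedy on a test cluster: wipe the datanode's data dir
        # (losing its blocks) and restart so it re-registers.
        echo "namespaceID MISMATCH"
    fi
else
    echo "VERSION file(s) not found; run on a node that holds HDFS data"
fi
```

On a mismatch after a re-format of the namenode, clearing the datanode's data directory before restarting is the common workaround for a test cluster like this one.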
ZooKeeper installation and configuration
Download zookeeper-3.4.3.tar.gz
cp zookeeper-3.4.3.tar.gz /home/hadoop/
tar -zxvf zookeeper-3.4.3.tar.gz
chown -R hadoop:hadoop zookeeper-3.4.3/
mv zookeeper-3.4.3 zookeeper   (the paths below all use /home/hadoop/zookeeper)
Rename conf/zoo_sample.cfg to zoo.cfg and change the following settings (unchanged lines omitted):
su - hadoop
dataDir=/home/hadoop/zookeeper/data
server.1=hadoop002:2888:3888
server.2=hadoop003:2888:3888
server.3=hadoop004:2888:3888
su - hadoop
mkdir -p /home/hadoop/zookeeper/data
cd /home/hadoop/zookeeper/data
vi myid
On hadoop002 the content is 1
On hadoop003 the content is 2
On hadoop004 the content is 3
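Since each node must hold a different number, a small sketch that derives myid from the hostname avoids writing the wrong value (myid_for is a helper of mine; the mapping follows the list above):

```shell
#!/bin/sh
# Write this node's ZooKeeper myid based on its hostname.
myid_for() {
    case "$1" in
        hadoop002) echo 1 ;;
        hadoop003) echo 2 ;;
        hadoop004) echo 3 ;;
        *) echo "not a quorum member: $1" >&2 ;;
    esac
}

DATA_DIR=${DATA_DIR:-/home/hadoop/zookeeper/data}
id=$(myid_for "$(hostname -s)" 2>/dev/null)
if [ -n "$id" ]; then
    mkdir -p "$DATA_DIR"
    echo "$id" > "$DATA_DIR/myid"
fi
```

Run the same script unchanged on every node; a host not in the quorum list simply writes nothing.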
scp -rp zookeeper hadoop@hadoop003:/home/hadoop
scp -rp zookeeper hadoop@hadoop004:/home/hadoop
Update the system environment variables:
export ZOOKEEPER_HOME=/home/hadoop/zookeeper/
PATH=$ZOOKEEPER_HOME/bin:$PATH
export PATH
Start and test ZooKeeper
1. On all servers run: zookeeper/bin/zkServer.sh start
2. On all machines run zkServer.sh status
[hadoop@hadoop003 ~]$ zkServer.sh status
JMX enabled by default
Using config: /home/hadoop/zookeeper/bin/../conf/zoo.cfg
Mode: leader
[hadoop@hadoop002 conf]$ zkServer.sh status
JMX enabled by default
Using config: /home/hadoop/zookeeper/bin/../conf/zoo.cfg
Mode: follower
This shows hadoop003 was elected leader.
3. Check with jps
[hadoop@hadoop003 ~]$ jps
5135 QuorumPeerMain
4329 DataNode
5202 Jps
4425 TaskTracker
[hadoop@hadoop003 ~]$
Each server now shows an extra QuorumPeerMain process; this is the ZooKeeper process.
Note: leader election only takes place once all the ZooKeeper servers have started!
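Steps 1 and 2 can be driven from a single node. A sketch (zk_all is a hypothetical helper; it assumes zkServer.sh is on the hadoop user's PATH on every quorum node, and RUN=echo keeps it a dry run):

```shell
#!/bin/sh
# Run a zkServer.sh subcommand on every quorum node over SSH.
RUN=${RUN:-echo}   # set RUN="" to really execute

zk_all() {
    for host in hadoop002 hadoop003 hadoop004; do
        $RUN ssh hadoop@"$host" zkServer.sh "$1"
    done
}

zk_all start
zk_all status   # expect one "Mode: leader" and two "Mode: follower"
```

This relies on the passwordless SSH configured earlier; with RUN cleared, the status pass should show exactly one leader once a quorum is up.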
HBase installation and configuration
Download hbase-0.94.2.tar.gz
cp hbase-0.94.2.tar.gz /home/hadoop
tar -zxvf hbase-0.94.2.tar.gz
mv hbase-0.94.2 hbase
Configure hbase-env.sh
vi /home/hadoop/hbase/conf/hbase-env.sh
# The java implementation to use. Java 1.6 required.
export JAVA_HOME=/usr/java/jdk1.6.0_45
# Extra Java CLASSPATH elements. Optional.
export HBASE_CLASSPATH=/home/hadoop/hadoop-1.0.2/conf
# The maximum amount of heap to use, in MB. Default is 1000.
export HBASE_HEAPSIZE=2048
export HBASE_OPTS="$HBASE_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"
# Tell HBase whether it should manage it's own instance of Zookeeper or not.
export HBASE_MANAGES_ZK=false
Configure hbase-site.xml
[root@hadoop001 conf]# cat hbase-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
* Copyright 2010 The Apache Software Foundation
*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
-->
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop001:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.master</name>
<value>hadoop001:60000</value>
</property>
<property>
<name>hbase.master.port</name>
<value>60000</value>
<description>The port master should bind to.</description>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>hadoop002,hadoop003,hadoop004</value>
</property>
</configuration>
Configure regionservers
[root@hadoop001 conf]# vi regionservers
hadoop002
hadoop003
hadoop004
scp -rp /home/hadoop/hbase hadoop@hadoop002:/home/hadoop/
scp -rp /home/hadoop/hbase hadoop@hadoop003:/home/hadoop/
scp -rp /home/hadoop/hbase hadoop@hadoop004:/home/hadoop/
Update the environment variables on hadoop001 by adding:
export HBASE_HOME=/home/hadoop/hbase
export PATH=$HBASE_HOME/bin:$PATH
cd /home/hadoop/hbase/bin/
./start-hbase.sh
Check with jps.
On hadoop001:
[hadoop@hadoop001 bin]$ jps
4887 Jps
2493 JobTracker
2418 SecondaryNameNode
4620 HMaster
2244 NameNode
On hadoop002, hadoop003, and hadoop004:
[hadoop@hadoop002 ~]$ jps
2326 TaskTracker
3417 QuorumPeerMain
4512 HRegionServer
2246 DataNode
4768 Jps
[hadoop@hadoop002 ~]$
Shut the stack down in the order
HBase → ZooKeeper → Hadoop;
start it up in the reverse order.
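That ordering can be captured in two helper functions. A sketch (cluster_stop/cluster_start are names of mine; the paths match the install locations used above, and RUN=echo keeps it a dry run):

```shell
#!/bin/sh
# Stop: HBase -> ZooKeeper -> Hadoop.  Start: the reverse.
RUN=${RUN:-echo}   # set RUN="" to really execute
ZK_NODES="hadoop002 hadoop003 hadoop004"

cluster_stop() {
    $RUN /home/hadoop/hbase/bin/stop-hbase.sh
    for h in $ZK_NODES; do $RUN ssh hadoop@"$h" zkServer.sh stop; done
    $RUN /home/hadoop/hadoop-1.0.2/bin/stop-all.sh
}

cluster_start() {
    $RUN /home/hadoop/hadoop-1.0.2/bin/start-all.sh
    for h in $ZK_NODES; do $RUN ssh hadoop@"$h" zkServer.sh start; done
    $RUN /home/hadoop/hbase/bin/start-hbase.sh
}

cluster_stop
cluster_start
```

Run from hadoop001 with RUN cleared; the functions simply encode the dependency order so HBase never runs without its ZooKeeper quorum or HDFS underneath it.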