I. JDK installation and configuration (all machines)
Path: /opt/soft/jdk
Environment variables: vi /etc/profile, set JAVA_HOME=/opt/soft/jdk and add it to PATH and CLASSPATH
II. Hadoop cluster installation
HD1211: namenode, jobtracker, HMaster
HD1212: secondarynamenode, datanode, tasktracker, HRegionServer
DB1213-DB1217: datanode, tasktracker, HRegionServer, zookeeper
DB1213-DB1223: datanode, tasktracker, HRegionServer
1. Edit /etc/hosts (all machines):
IP HD1211
IP HD1212
IP DB1213
IP DB1214
IP DB1215
IP DB1216
IP DB1217
IP DB1218
IP DB1219
IP DB1220
IP DB1221
IP DB1222
IP DB1223
scp the hosts file to every machine.
1. Add the cluster user, username hadoop, on every machine (a minimal sketch follows).
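A minimal sketch of the user creation, run as root on each machine (default group and home directory assumed):
useradd -m hadoop
passwd hadoop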
2. SSH setup
su - hadoop
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
scp ~/.ssh/id_dsa.pub hadoop@HD1212:~/master_key
On every machine:
mkdir ~/.ssh
chmod 700 ~/.ssh
mv ~/master_key ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
Then ssh from HD1211 to every machine once to accept the host keys and confirm passwordless login (see the sketch below).
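The key distribution and the final check can also be scripted from HD1211; a rough sketch, assuming the host names match the /etc/hosts entries above and the hadoop password is typed interactively until each key is installed:
# push the public key and install it as authorized_keys on every node
for h in HD1212 DB1213 DB1214 DB1215 DB1216 DB1217 DB1218 DB1219 DB1220 DB1221 DB1222 DB1223; do
  scp ~/.ssh/id_dsa.pub hadoop@$h:~/master_key
  ssh hadoop@$h 'mkdir -p ~/.ssh && chmod 700 ~/.ssh && mv ~/master_key ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys'
done
# afterwards this should print every hostname without asking for a password
for h in HD1212 DB1213 DB1214 DB1215 DB1216 DB1217 DB1218 DB1219 DB1220 DB1221 DB1222 DB1223; do
  ssh hadoop@$h hostname
done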
2. Hadoop installation
Preparation:
1. tar zxvf hadoop-0.20.203.0rc1.tar.gz
2. mv hadoop-0.20.203.0 hadoop (drop the version number to make later upgrades easier)
3. chown -R hadoop:hadoop hadoop/
4. On /data1, /data2 and /data3, create the directories that dfs.name.dir and dfs.data.dir will point to (all machines; a scripted sketch follows below):
dfs/name, dfs/data
chown -R hadoop:hadoop dfs/
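A rough sketch of creating the same layout on every data disk, run as root on each machine (paths match the dfs.name.dir and dfs.data.dir values in hdfs-site.xml below):
for d in /data1 /data2 /data3; do
  mkdir -p $d/dfs/name $d/dfs/data   # dfs.name.dir / dfs.data.dir
  chown -R hadoop:hadoop $d/dfs
done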
5. Create the work/lib directory for third-party libraries:
cd /
mkdir -p work/lib
chown -R hadoop:hadoop work
Currently work/lib contains: dsapapi.jar, dsapstorage.jar
Configure /etc/profile (all machines)
1. Add HADOOP_HOME and extend PATH (a quick check follows after the block)
vi /etc/profile
JAVA_HOME=/opt/soft/jdk
CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
PATH=$PATH:$JAVA_HOME/bin:/opt/dell/srvadmin/sbin:/opt/dell/srvadmin/bin:$JAVA_HOME/jre/bin
HADOOP_HOME=/opt/soft/hadoop
PATH=$PATH:$HADOOP_HOME/bin
export JAVA_HOME HADOOP_HOME CLASSPATH TOMCAT_HOME OMREPORT_BIN PATH
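After saving, the settings can be sanity-checked on each machine (hadoop version only works once the hadoop directory from the preparation step is in place):
source /etc/profile
java -version
hadoop version
echo $HADOOP_HOME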
2. Edit hadoop/conf/core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://HD1211:9000</value>
</property>
</configuration>
3. Edit hadoop/conf/hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/data1/dfs/data,/data2/dfs/data,/data3/dfs/data</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/data1/dfs/name,/data2/dfs/name,/data3/dfs/name</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.block.size</name>
<value>134217728</value>
</property>
4. Edit hadoop/conf/mapred-site.xml
<property>
<name>mapred.job.tracker</name>
<value>HD1211:9001</value>
</property>
<property>
<name>mapred.tasktracker.map.tasks.maximum</name>
<value>8</value>
</property>
<property>
<name>mapred.tasktracker.reduce.tasks.maximum</name>
<value>6</value>
</property>
<property>
<name>io.sort.mb</name>
<value>200</value>
</property>
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx1g</value>
</property>
5. Edit hadoop/conf/hadoop-env.sh
export JAVA_HOME=/opt/soft/jdk
After HBase is installed, the following must also be added:
export HBASE_HOME=/opt/soft/hbase
export HADOOP_CLASSPATH=${HBASE_HOME}/hbase-0.90.3.jar:${HBASE_HOME}/hbase-0.90.3-test.jar:${HBASE_HOME}/conf:${HBASE_HOME}/lib/zookeeper-3.3.2.jar:${HBASE_HOME}/lib/guava-r06.jar
Only then can HBase MapReduce jobs be run.
6. Edit conf/masters to designate the secondary namenode:
HD1212
7. Edit conf/slaves:
HD1212
DB1213
DB1214
DB1215
DB1216
DB1217
DB1218
DB1219
DB1220
DB1221
DB1222
DB1223
8. After HBase is installed, add a symlink to hbase-site.xml under hadoop/conf:
ln -s /opt/soft/hbase/conf/hbase-site.xml hbase-site.xml
chown -R hadoop:hadoop hbase-site.xml
Note: when the conf directory is scp'd to the other cluster machines, scp dereferences the symlink, so the target file itself gets copied in its place.
9. Edit the bin/hadoop script: around line 168, add the third-party jars to the CLASSPATH, pointing at the /work/lib directory created during preparation:
# add third party libs to CLASSPATH
for f in /work/lib/*.jar; do
CLASSPATH=${CLASSPATH}:$f;
done
scp the hadoop directory to every machine (see the sketch below).
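A rough sketch of the distribution from HD1211, assuming rsync is available on all machines (unlike scp -r, rsync -a copies the hbase-site.xml symlink from step 8 as a link instead of dereferencing it):
for h in HD1212 DB1213 DB1214 DB1215 DB1216 DB1217 DB1218 DB1219 DB1220 DB1221 DB1222 DB1223; do
  rsync -a /opt/soft/hadoop hadoop@$h:/opt/soft/
done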
III. ZooKeeper (HD1211)
1. Installation
1. On HD1211:
tar zxvf zookeeper-3.3.4.tar.gz
mv zookeeper-3.3.4 zookeeper (drop the version number)
chown -R hadoop:hadoop zookeeper
2. On DB1213-DB1217 (the five ZooKeeper machines):
cd /
mkdir zookeeperdata
chown -R hadoop:hadoop /zookeeperdata
2. Configuration
In zookeeper/conf: mv zoo_sample.cfg zoo.cfg
vi zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/opt/soft/zookeeperdata
# the port at which the clients will connect
clientPort=2181
server.1=DB1213:2888:3888
server.2=DB1214:2888:3888
server.3=DB1215:2888:3888
server.4=DB1216:2888:3888
server.5=DB1217:2888:3888
3. scp the zookeeper directory to DB1213-DB1217 and set each server's myid: on every machine create a myid file in the ZooKeeper data directory whose content is that server's id (1, 2, 3, 4, 5, matching the server.N entries in zoo.cfg; sketch below). Note that zoo.cfg above sets dataDir=/opt/soft/zookeeperdata while step 1 creates /zookeeperdata (and hbase-site.xml below also uses /zookeeperdata); make sure the paths agree.
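A rough sketch of the myid setup, run from HD1211 (ids follow the server.N entries in zoo.cfg; /zookeeperdata assumed as the data directory):
id=1
for h in DB1213 DB1214 DB1215 DB1216 DB1217; do
  ssh hadoop@$h "echo $id > /zookeeperdata/myid"
  id=$((id+1))
done
Once the ensemble is running, each server can be checked with /opt/soft/zookeeper/bin/zkServer.sh status.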
IV. HBase installation
1. Installation preparation
tar zxvf hbase-0.90.3.tar.gz
mv hbase-0.90.3 hbase (drop the version number)
chown -R hadoop:hadoop hbase
Replace the Hadoop jar under hbase/lib with the cluster's hadoop-core-0.20.203.0.jar, and copy in commons-configuration-1.6.jar:
cp /opt/soft/hadoop/hadoop-core-0.20.203.0.jar /opt/soft/hbase/lib/
rm -rf /opt/soft/hbase/lib/hadoop-core-0.20-append-r1056497.jar
cp /opt/soft/hadoop/lib/commons-configuration-1.6.jar /opt/soft/hbase/lib/
2. Configuration
1. Environment variables: add HBASE_HOME and extend PATH
vi /etc/profile (all machines)
HBASE_HOME=/opt/soft/hbase
PATH=$PATH:$HBASE_HOME/bin
export HBASE_HOME
2. Edit hbase/conf/hbase-env.sh
export JAVA_HOME=/opt/soft/jdk
export HBASE_MANAGES_ZK=false
export HADOOP_HOME=/opt/soft/hadoop
3. Edit hbase/conf/hbase-site.xml
<property>
<name>hbase.rootdir</name>
<value>hdfs://HD1211:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.master</name>
<value>HD1211:60000</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/zookeeperdata</value>
</property>
<property>
<name>hbase.master.port</name>
<value>60000</value>
<description>The port master should bind to.</description>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>DB1213,DB1214,DB1215,DB1216,DB1217</value>
</property>
4. Edit hbase/conf/regionservers:
HD1212
DB1213
DB1214
DB1215
DB1216
DB1217
DB1218
DB1219
DB1220
DB1221
DB1222
DB1223
3. scp the hbase directory to every machine.
V. Startup
Startup order
1. hadoop. On HD1211: /opt/soft/hadoop/bin/start-all.sh
2. zookeeper. On DB1213-DB1217, one after another: /opt/soft/zookeeper/bin/zkServer.sh start
3. hbase. On HD1211: /opt/soft/hbase/bin/start-hbase.sh
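Note: on a brand-new cluster the namenode must be formatted once on HD1211 before step 1 (do not rerun this on a cluster that already holds data; it wipes the HDFS metadata):
su - hadoop
/opt/soft/hadoop/bin/hadoop namenode -format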
Stop order (commands sketched below)
1.hbase
2.zookeeper
3.hadoop
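The corresponding stop commands, assuming the same installation paths as above:
1. /opt/soft/hbase/bin/stop-hbase.sh (on HD1211)
2. /opt/soft/zookeeper/bin/zkServer.sh stop (on each of DB1213-DB1217)
3. /opt/soft/hadoop/bin/stop-all.sh (on HD1211)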
Tips
Starting HBase regionservers separately
Start all regionservers in the cluster:
./hbase-daemons.sh start regionserver
Start a single regionserver (run on that node):
./hbase-daemon.sh start regionserver
Start a master:
./hbase-daemon.sh start master
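Similarly, a single regionserver on the current node can be stopped with:
./hbase-daemon.sh stop regionserver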