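System environment: CentOS Linux release 6.0 (Final)
Two systems run at the same time in virtual machines, to test a distributed installation:
192.168.109.129 nodename (runs the Hadoop master services and the HBase master service; DNS also needs to be installed on it)
192.168.109.130 datanode1 (runs the Hadoop slave services and the HBase slave services; DNS also needs to be installed on it)
Download Hadoop 1.0.0 from the official Hadoop home page.
Add a user named hadoop to the system; any other user name works too (the steps are omitted here; search online if unsure).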
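The omitted user-creation step could look roughly like the sketch below. It assumes it is run as root on both machines; the function name create_hadoop_user is made up for this example and is not part of Hadoop.

```shell
# Hypothetical helper: create the account that will own and run Hadoop.
# Must be run as root; the user name defaults to "hadoop".
create_hadoop_user() {
    user="${1:-hadoop}"
    # Create the user with a home directory unless it already exists.
    id "$user" >/dev/null 2>&1 || useradd -m "$user"
    # Set a password interactively (needed for the first ssh login).
    passwd "$user"
}

# Usage, as root on nodename and on datanode1:
#   create_hadoop_user hadoop
```

Using the same user name and home directory on both machines keeps the paths in the later configuration files identical on each host.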
Hadoop is written in Java, so a Java environment has to be installed first:
yum install java-1.6.0-openjdk
cd /home/hadoop/
wget http://apache.etoak.com/hadoop/common/hadoop-1.0.0/hadoop-1.0.0.tar.gz
tar zxvf hadoop-1.0.0.tar.gz
(The commands above download and unpack the archive.)
vi hadoop-1.0.0/conf/hadoop-env.sh
Set the environment variables that Hadoop loads at startup:
export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre
(If the Java directory cannot be found after installation, try find / -name java; if that fails, search online.)
export HADOOP_CLASSPATH=/home/hadoop/hadoop-1.0.0
export PATH=$PATH:/home/hadoop/hadoop-1.0.0/bin
Save and exit.
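If the JAVA_HOME path shown above does not exist on your machine, one way to derive a candidate is from the location of the java binary itself. This is a minimal sketch; the function name java_home_from_bin is made up for illustration, and the result should be checked by hand before it goes into hadoop-env.sh.

```shell
# Derive a JAVA_HOME candidate from the full path of a java binary
# by stripping the trailing /bin/java component.
java_home_from_bin() {
    printf '%s\n' "${1%/bin/java}"
}

# Typical use: resolve symlinks such as /usr/bin/java first, then strip.
#   java_home_from_bin "$(readlink -f "$(command -v java)")"
```

With the OpenJDK package used here, the resolved path would be expected to end in jre/bin/java, which matches the JAVA_HOME value set above.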
Next, list the slave machines that Hadoop starts along with the master; they provide the distributed storage for file data:
vi hadoop-1.0.0/conf/slaves
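Enter datanode1 (datanode1 is the host name of the slave machine).
Save and exit.
As root, set the host name on each of the two machines with vi /etc/hostname (on CentOS 6 the persistent host name is normally the HOSTNAME= line in /etc/sysconfig/network instead).
The names are nodename and datanode1 respectively.
Save and exit.
Switch back to the hadoop user and run vi hadoop-1.0.0/conf/core-site.xml (this configures the HDFS host name and port, and the storage directories). Enter the following: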
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://nodename:7070/</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/home/hadoop/hadoop-1.0.0/tmpdir/hdfs/name</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop-1.0.0/tmpdir</value>
</property>
</configuration>
Save and exit.
vi hadoop-1.0.0/conf/mapred-site.xml (this configures the MapReduce port). Enter the following:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>nodename:7071</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/home/hadoop/hadoop-1.0.0/tmpdir/mapred/local</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>/home/hadoop/hadoop-1.0.0/tmpdir/mapred/system</value>
</property>
</configuration>
Save and exit.
vi hadoop-1.0.0/conf/hdfs-site.xml (see the official documentation for details). Enter the following:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/hadoop/hadoop-1.0.0/tmpdir/hdfs/data</value>
</property>
</configuration>
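Save and exit.
vi hadoop-1.0.0/conf/masters (the master host name of the Hadoop system; in Hadoop 1.x this file names the host that runs the SecondaryNameNode). Enter the following: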
nodename
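Save and exit.
When Hadoop starts, it uses the host names in the slaves file to log in to the other machines over ssh and start the slave daemons there, so passwordless login needs to be configured.
If unsure how, search for "ssh passwordless login"; there are many tutorials. Without it, the login password has to be typed on every start.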
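For reference, the usual key-based setup can be sketched as follows. It assumes the OpenSSH client tools (ssh-keygen, ssh-copy-id) are installed and that it is run as the hadoop user on nodename; the function name setup_passwordless_ssh is made up for this example.

```shell
# Hypothetical helper: enable key-based ssh login to one target host.
setup_passwordless_ssh() {
    target="$1"    # e.g. hadoop@datanode1
    # Create an RSA key pair with an empty passphrase if none exists yet.
    [ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -N '' -f "$HOME/.ssh/id_rsa"
    # Append the public key to the target's authorized_keys
    # (this asks for the target password one last time).
    ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "$target"
    # Should now print the remote host name without a password prompt.
    ssh "$target" hostname
}

# Usage, as the hadoop user on nodename:
#   setup_passwordless_ssh hadoop@datanode1
#   setup_passwordless_ssh hadoop@nodename
```

The second call covers the case where the start scripts also ssh back into nodename itself, for example for the host listed in conf/masters.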
Run hadoop-1.0.0/bin/start-all.sh:
starting namenode, logging to /home/hadoop/hadoop-1.0.0/libexec/../logs/hadoop-hadoop-namenode-localhost.localdomain.out
datanode1: starting datanode, logging to /home/hadoop/hadoop-1.0.0/libexec/../logs/hadoop-hadoop-datanode-localhost.localdomain.out
starting jobtracker, logging to /home/hadoop/hadoop-1.0.0/libexec/../logs/hadoop-hadoop-jobtracker-localhost.localdomain.out
datanode1: starting tasktracker, logging to /home/hadoop/hadoop-1.0.0/libexec/../logs/hadoop-hadoop-tasktracker-localhost.localdomain.out
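The walkthrough goes straight to start-all.sh. On a fresh installation the NameNode directory normally has to be formatted once beforehand, otherwise the NameNode tends to exit shortly after starting. A minimal sketch, assuming the paths used above; HADOOP_DIR and first_start are names made up for this example.

```shell
# Assumed install location; override by exporting HADOOP_DIR beforehand.
HADOOP_DIR="${HADOOP_DIR:-/home/hadoop/hadoop-1.0.0}"

# Hypothetical helper for the very first start of a new cluster.
first_start() {
    # WARNING: formatting wipes existing HDFS metadata. Do it only once,
    # on a new cluster, as the hadoop user on nodename.
    "$HADOOP_DIR/bin/hadoop" namenode -format
    # Start the HDFS and MapReduce daemons on the master and the slaves.
    "$HADOOP_DIR/bin/start-all.sh"
    # Print a cluster summary, including the number of live datanodes.
    "$HADOOP_DIR/bin/hadoop" dfsadmin -report
}

# Usage:
#   first_start
```

If the report lists no datanodes, the logs under hadoop-1.0.0/logs on datanode1 are the first place to look.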
--------------------------------------Lunch break----------------------------------------------------
Download HBase 0.92.1 from the official HBase home page.
cd /home/hadoop
wget http://labs.mop.com/apache-mirror/hbase/hbase-0.92.1/hbase-0.92.1.tar.gz
tar zxvf hbase-0.92.1.tar.gz
Set the environment variables to be loaded: vi hbase-0.92.1/conf/hbase-env.sh
export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre
export HBASE_HOME=/home/hadoop/hbase-0.92.1
export HBASE_CLASSPATH=${HBASE_HOME}/bin/hbase
export HBASE_MANAGES_ZK=true
Save and exit.
vi hbase-0.92.1/conf/hbase-site.xml and enter the following:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://nodename:7070/hbase</value>
<description>The directory shared by RegionServers.</description>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>The replication count for HLog and HFile storage. Should not be greater than HDFS datanode count.</description>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>datanode1</value>
<description>Comma-separated list of servers in the ZooKeeper quorum.</description>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hadoop/zookeeper</value>
<description>Property from ZooKeeper's config zoo.cfg.The directory where the snapshot is stored.</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are false: standalone and pseudo-distributed setups with managed Zookeeper true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)</description>
</property>
<property>
<name>hbase.master.dns.interface</name>
<value>eth0</value>
<description>The name of the Network Interface from which a master
should report its IP address.
</description>
</property>
<property>
<name>hbase.master.dns.nameserver</name>
<value>192.168.109.130</value>
<description>The host name or IP address of the name server (DNS) which a master should use to determine the host name used for communication and display purposes.
</description>
</property>
<property>
<name>hbase.regionserver.dns.interface</name>
<value>eth0</value>
</property>
<property>
<name>hbase.regionserver.dns.nameserver</name>
<value>192.168.109.130</value>
</property>
</configuration>
(There is too much content here for a screenshot.)
hbase.master.dns.interface
hbase.master.dns.nameserver
hbase.regionserver.dns.interface
hbase.regionserver.dns.nameserver
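These four properties set the network interface and the DNS server used for name lookups. When HBase starts in distributed mode it looks up each host's IP through DNS; without this it defaults to localhost.
Machines also often fail to connect to each other directly, so in a test environment like this one it is best to turn off the firewall.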
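Before starting HBase it can help to confirm that both host names actually resolve. The sketch below is one way to check, assuming getent is available (it is part of glibc on CentOS); the function name resolves is made up for this example, and the firewall commands in the comments are the stock CentOS 6 service commands.

```shell
# Return 0 if the given name resolves through the system resolver
# (DNS or /etc/hosts), non-zero otherwise.
resolves() {
    getent hosts "$1" >/dev/null
}

# Usage, on each machine:
#   for h in nodename datanode1; do
#       resolves "$h" || echo "cannot resolve $h"
#   done
#
# To query the DNS server configured above directly (needs bind-utils):
#   host nodename 192.168.109.130
#   host 192.168.109.129 192.168.109.130    # reverse lookup
#
# To turn off the firewall on CentOS 6, as root, test setups only:
#   service iptables stop
#   chkconfig iptables off
```

Note that getent also consults /etc/hosts, so a name that passes this check is not necessarily served by the DNS server; the host commands test that part.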
vi hbase-0.92.1/conf/regionservers
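Enter datanode1
Save and exit.
Copy the Hadoop and HBase directories to datanode1 (the Hadoop copy has to be in place there before start-all.sh can start the DataNode and TaskTracker on it):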
scp -r /home/hadoop/hadoop-1.0.0 192.168.109.130:/home/hadoop/hadoop-1.0.0
scp -r /home/hadoop/hbase-0.92.1 192.168.109.130:/home/hadoop/hbase-0.92.1
Then HBase can be started:
hbase-0.92.1/bin/start-hbase.sh
datanode1: starting zookeeper, logging to /home/hadoop/hbase-0.92.1/logs/hbase-hadoop-zookeeper-localhost.localdomain.out
starting master, logging to /home/hadoop/hbase-0.92.1/logs/hbase-hadoop-master-localhost.localdomain.out
datanode1: starting regionserver, logging to /home/hadoop/hbase-0.92.1/logs/hbase-hadoop-regionserver-localhost.localdomain.out
Check that everything started:
ps aux|grep java
On nodename the following should be running:
org.apache.hadoop.hdfs.server.namenode.NameNode
org.apache.hadoop.mapred.JobTracker
org.apache.hadoop.hbase.master.HMaster
On datanode1 the following should be running:
org.apache.hadoop.hdfs.server.datanode.DataNode
org.apache.hadoop.mapred.TaskTracker
org.apache.hadoop.hbase.zookeeper.HQuorumPeer
org.apache.hadoop.hbase.regionserver.HRegionServer
If they are all there, the cluster is up and running.
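Scanning the ps output by eye gets tedious, so the check can be scripted. The sketch below reads a process listing on standard input and prints every expected class name that does not appear in it; missing_daemons is a made-up helper name, not a Hadoop or HBase tool.

```shell
# Print each expected class name (given as arguments) that does not
# occur anywhere in the process listing read from standard input.
missing_daemons() {
    listing=$(cat)
    for cls in "$@"; do
        case "$listing" in
            *"$cls"*) ;;            # found, nothing to report
            *) echo "$cls" ;;       # not found, report it
        esac
    done
}

# Usage on nodename:
#   ps aux | missing_daemons \
#       org.apache.hadoop.hdfs.server.namenode.NameNode \
#       org.apache.hadoop.mapred.JobTracker \
#       org.apache.hadoop.hbase.master.HMaster
#
# Usage on datanode1:
#   ps aux | missing_daemons \
#       org.apache.hadoop.hdfs.server.datanode.DataNode \
#       org.apache.hadoop.mapred.TaskTracker \
#       org.apache.hadoop.hbase.zookeeper.HQuorumPeer \
#       org.apache.hadoop.hbase.regionserver.HRegionServer
```

No output means every listed daemon was found; any name that is printed points to the log file to inspect next.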
References:
http://openjdk.java.net/install/
http://www.hadoopor.com/ solutions to many common problems
http://stevenz.blog.hexun.com/15798089_d.html passwordless ssh login
http://www.360doc.com/content/10/1016/15/1317564_61493507.shtml installing and configuring DNS and reverse resolution
http://www.cnblogs.com/ventlam/archive/2011/01/22/HBaseCluster.html
http://linuxjcq.blog.51cto.com/3042600/760634
Using the test tools:
http://hi.baidu.com/cavaran/blog/item/031b883ce817b7d89e3d626f.html