Three Linux virtual machines:
192.169.200.101 cpyftest-1 (hostname or domain name) Namenode (alias)
192.169.200.161 cpyftest-2 (hostname or domain name) Datanode1 (alias)
192.169.200.191 cpyftest-3 (hostname or domain name) Datanode2 (alias)
Known-working Hadoop build:
Hadoop 1.0.2 with its core jar replaced (verified to run), bundled with HBase 0.94.0
>1. Configure /etc/hosts (the same entries on every machine):
192.169.200.101 cpyftest-1 Namenode
192.169.200.161 cpyftest-2 Datanode1
192.169.200.191 cpyftest-3 Datanode2
>2. Set up accounts:
Use the same group name, username, and password on all three VMs.
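For example, a minimal sketch of creating a matching account on each VM (the user and group name "hadoop" below are assumptions; any names work as long as they are identical on all three machines):
# run as root on each of the three VMs
groupadd hadoop              # create the shared group
useradd -m -g hadoop hadoop  # create the user in that group, with a home directory
passwd hadoop                # set the same password everywhere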
>3. Set up passwordless SSH login between the three VMs
Run on each of the three VMs: ssh-keygen -t rsa
Append the three generated id_rsa.pub files into a single authorized_keys file, and give every VM a copy of that file.
Reference command: cat id_rsa.pub >> authorized_keys (appends the key contents)
On 101, run: scp authorized_keys cpyftest-2:~/.ssh/ (if that fails, just open the file and copy-paste the contents)
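Putting the whole step together, a sketch of the commands run from cpyftest-1 (assuming the key pairs were generated with an empty passphrase and ~/.ssh exists on every machine):
# on cpyftest-1, collect all three public keys into one file
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh cpyftest-2 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh cpyftest-3 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# distribute the combined file back to the other two VMs
scp ~/.ssh/authorized_keys cpyftest-2:~/.ssh/
scp ~/.ssh/authorized_keys cpyftest-3:~/.ssh/
# verify: this should print the remote hostname without prompting for a password
ssh cpyftest-2 hostname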
>4. Configure the Hadoop files
hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_07
core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://192.169.200.101:9000</value>
</property>
</configuration>
hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value> <!-- keep 2 replicas of each block -->
</property>
<property>
<name>dfs.data.dir</name>
<value>/root/hadoop/data</value>
</property>
<property>
<name>dfs.web.ugi</name>
<value>username,group</value> <!-- fill in your username and group; grants web UI access, otherwise you get: no id webuser -->
</property>
<property>
<name>dfs.permissions</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions.supergroup</name>
<value>supergroup</value>
</property>
</configuration>
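The dfs.data.dir path above has to be writable on each datanode; creating it up front avoids permission surprises (a sketch using the path from the config above):
# run on each datanode (161 and 191)
mkdir -p /root/hadoop/data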
mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>192.169.200.101:9001</value>
</property>
</configuration>
>5. Configure masters and slaves
In the hadoop/conf folder, find the two files "masters" and "slaves".
masters contains one line:
192.169.200.101 or cpyftest-1 or Namenode (pick any one of the three forms)
slaves lists the other two machines:
Datanode1
Datanode2
>6. Copy the configured Hadoop directory to every machine, keeping the same path, as in the sketch below.
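A sketch of the copy (the installation path /root/hadoop-1.0.2 is an assumption; use your actual path, identical on every machine):
# run on 101 after editing the conf files
scp -r /root/hadoop-1.0.2 cpyftest-2:/root/
scp -r /root/hadoop-1.0.2 cpyftest-3:/root/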
>7. On 101, enter the hadoop/bin directory and run: ./start-all.sh (before the very first start, format HDFS once with: ./hadoop namenode -format)
After it finishes, run: jps to list the Java processes (jps ships with the JDK; if the command is not found, check that your Java environment variables are set, or run it directly from the java/bin directory):
28037 NameNode           (NameNode process; 28037 is the process ID)
28950 Jps
28220 SecondaryNameNode  (SecondaryNameNode process; 28220 is the process ID)
28259 JobTracker         (JobTracker process; 28259 is the process ID)
>8. On 161 and 191, run jps
Each should show the following processes:
DataNode     (HDFS data node)
Jps          (the jps tool itself)
TaskTracker  (MapReduce task tracker)
At this point Hadoop is up and running.
Open http://namenode:50070 to confirm it is healthy, and check the logs for errors.
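A quick HDFS smoke test from the command line (the file and directory names are just examples):
# run on 101 from the hadoop/bin directory
echo hello > /tmp/test.txt
./hadoop fs -mkdir /test
./hadoop fs -put /tmp/test.txt /test/
./hadoop fs -ls /test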
>9. Configure HBase. This build also has its Hadoop core jar replaced and is verified to run. Like Hadoop, it must be copied to the other two machines at the same path, and the clocks on all machines must agree (a time-sync sketch follows).
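Clock sync matters because HBase region servers will refuse to join the cluster if their time drifts too far from the master's. A sketch using ntpdate (the server pool.ntp.org is an assumption; any reachable NTP server works):
# run as root on each of the three VMs
ntpdate pool.ntp.org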
hbase-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_07
The jar entries in the default file are wrong; set the classpath as follows:
# Extra Java CLASSPATH elements. Optional.
export HBASE_CLASSPATH=$HADOOP_CLASSPATH:$HBASE_HOME/hbase-0.94.0.jar:$HBASE_HOME/hbase-0.94.0-tests.jar:$HBASE_HOME/conf:${HBASE_HOME}/lib/zookeeper-3.4.3.jar
export HBASE_MANAGES_ZK=true   # true for a fully distributed cluster using the bundled ZooKeeper
hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://192.169.200.101:9000/hbase</value>
<description>The directory shared by region servers and into
which HBase persists. The URL should be 'fully-qualified'
to include the filesystem scheme. For example, to specify the
HDFS directory '/hbase' where the HDFS instance's namenode is
running at namenode.example.org on port 9000, set this value to:
hdfs://namenode.example.org:9000/hbase. By default HBase writes
into /tmp. Change this configuration else all data will be lost
on machine restart.
</description>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>hdfs://192.169.200.101:9000/tmp</value>
<description>Temporary directory on the local filesystem.
Change this setting to point to a location more permanent
than '/tmp' (The '/tmp' directory is often cleared on
machine restart).
</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false for standalone mode and true for distributed mode. If
false, startup will run all HBase and ZooKeeper daemons together
in the one JVM.
</description>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>192.169.200.101,192.169.200.161,192.169.200.191</value>
<description>Comma separated list of servers in the ZooKeeper Quorum.
For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
By default this is set to localhost for local and pseudo-distributed modes
of operation. For a fully-distributed setup, this should be set to a full
list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
this is the list of servers which we will start/stop ZooKeeper on.
</description>
</property>
</configuration>
Configure regionservers with:
Namenode
Datanode1
Datanode2
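As with Hadoop, copy the configured HBase directory to the other two machines at the same path (the installation path /root/hbase-0.94.0 is an assumption):
# run on 101 after editing the conf files
scp -r /root/hbase-0.94.0 cpyftest-2:/root/
scp -r /root/hbase-0.94.0 cpyftest-3:/root/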
>10. Enter the hbase/bin directory
Run: ./start-hbase.sh
Run: ./hbase shell
In the shell, run "list"; if it completes without errors, HBase is working.
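A slightly fuller smoke test in the shell (the table name 'test' and column family 'cf' are just examples):
create 'test', 'cf'                    # create a table with one column family
put 'test', 'row1', 'cf:a', 'value1'   # write one cell
scan 'test'                            # should print row1 back
disable 'test'
drop 'test'                            # clean up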
Check the Master:       http://192.169.200.101:60010/master.jsp
Check a Region Server:  http://192.169.200.101:60030/regionserver.jsp
Check the ZK tree:      http://192.169.200.101:60010/zk.jsp