Integrating Hadoop, Hive, and HBase

Versions used:
hadoop-0.20.2.zip
zookeeper-3.3.5.tar.gz
hive-0.6.0.tar.gz
hbase-0.20.3.tar.gz
I: Install Hadoop
1: Unpack the archive and edit conf/hadoop-env.sh
Change the export JAVA_HOME value to the JDK install directory on your machine, for example export JAVA_HOME=/cygdrive/d/java/jdk1.6.0/jdk. /cygdrive is the root under which Cygwin mounts the Windows drives (so /cygdrive/d corresponds to the D: drive in Windows XP):
export JAVA_HOME=/cygdrive/d/java/jdk1.7.0_04

2: Install and configure ssh
2.1: Install: run $ ssh-host-config
2.2: Configure
2.2.1: Start the sshd service
net start sshd
2.2.2: Connect to the host
ssh localhost
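
Hadoop's start scripts log into each node (here just localhost) over ssh, so passwordless login should work before continuing. A minimal sketch, assuming the default key paths:

ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa          # generate a key pair with an empty passphrase
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh localhost                                     # should now log in without a password prompt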


3: Configure Hadoop
3.1: Edit conf/core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/nianzai/hadoop</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
</configuration>

3.2: Edit conf/hdfs-site.xml
<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
</property>

3.3: Edit conf/mapred-site.xml
<property>
  <name>mapred.job.tracker</name>
  <value>172.16.13.70:9001</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>

3.4: Configure conf/masters so it lists the master host (master)
3.5: Start Hadoop with the following commands
sh hadoop namenode -format
sh start-all.sh
sh hadoop fs -mkdir input
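
To sanity-check HDFS after start-all.sh, you can copy a file in and list it back (a sketch; the paths assume you are still in the bin directory, as in the commands above):

sh hadoop fs -put ../conf/hadoop-env.sh input     # upload a local file into the input directory
sh hadoop fs -ls input                            # the uploaded file should be listed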

3.6: Verify that the JobTracker started, e.g. via its web UI:
http://localhost:50030/jobtracker.jsp

3.7: Verify the NameNode (possible issue here?)
http://<master hostname>:50070 — the Live Datanodes count should be visible.
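
Besides the web UIs, jps on the master should show all five daemons on a single-node setup (pids will differ):

$ jps
2896 NameNode
3012 SecondaryNameNode
3101 DataNode
3244 JobTracker
3367 TaskTracker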

II: Install ZooKeeper
1: Unpack the archive
2: In the zookeeper-3.3.5/conf directory, rename zoo_sample.cfg to zoo.cfg and edit it:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/home/hadoop/zookeeper-3.3.5/
# the port at which the clients will connect
clientPort=2181

3: Start the ZooKeeper server process
cd zookeeper-3.3.5/
bin/zkServer.sh start

4: Run jps to check whether the ZooKeeper service started; a QuorumPeerMain process should be listed
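
Alternatively, probe the client port directly with ZooKeeper's four-letter ruok command (assuming nc is available in your Cygwin install):

echo ruok | nc localhost 2181     # a healthy server answers: imok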

III: Install HBase
1: Unpack the archive
2: Configure HBase
2.1: Edit hbase-env.sh
export JAVA_HOME=/cygdrive/d/java/jdk1.7.0_04
export HBASE_MANAGES_ZK=false

2.2: In the HBase lib directory, delete hadoop-core-0.20-append-r1056497.jar and replace it with hadoop-0.20.2-core.jar from the Hadoop 0.20.2 install, so HBase runs against the same Hadoop version as the cluster
2.3: Edit the hbase-site.xml file
<property>
    <name>hbase.rootdir</name>
    <value>hdfs://172.16.13.70:9000/hbase</value>
        <description>The directory shared by region servers. Should be fully qualified to include the filesystem to use, e.g. hdfs://NAMENODE_SERVER:PORT/HBASE_ROOTDIR. This must match the NameNode URI in fs.default.name.
   </description>
   </property>
   <property>
    <name>hbase.master</name>
    <value>172.16.13.70:60000</value>
    <description>The host and port that the HBase master runs at.</description>
   </property>

Also edit conf/regionservers so it lists the region server host:
master

2.4: Start HBase
sh start-hbase.sh

2.5: Run jps to check whether HBase started; HMaster (and HRegionServer) should be listed
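
Once HMaster is up, a quick smoke test from the HBase shell (table and column family names are just examples):

sh hbase shell
hbase(main):001:0> create 'test', 'cf'
hbase(main):002:0> put 'test', 'row1', 'cf:a', 'value1'
hbase(main):003:0> scan 'test'
hbase(main):004:0> disable 'test'
hbase(main):005:0> drop 'test'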

IV: Install Hive
1: Edit the hive-default.xml file
<property>
  <name>hive.exec.scratchdir</name>
  <value>/data/work/hive/tmp</value>
  <description>Scratch space for Hive jobs</description>
</property>

<property>
  <name>hive.querylog.location</name>
  <value>/data/work/hive/querylog</value>
  <description>Location of Hive query logs</description>
</property>

<property>
  <name>hive.hwi.listen.host</name>
  <value>0.0.0.0</value>
  <description>This is the host address the Hive Web Interface will listen on</description>
</property>

<property>
  <name>hive.hwi.listen.port</name>
  <value>9999</value>
  <description>This is the port the Hive Web Interface will listen on</description>
</property>

2: Edit bin/hive-config.sh to point at the JDK and at the Hadoop, Hive, and HBase installs
export JAVA_HOME=/cygdrive/d/java/jdk1.7.0_04
export HIVE_HOME=/home/hadoop/hive-0.6.0
export HADOOP_HOME=/home/hadoop/hadoop-0.20.2
export HBASE_HOME=/home/hadoop/hbase-0.20.3

3: Start bin/hive
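
To confirm Hive itself works against this Hadoop setup before wiring in HBase, the classic smoke test from Hive's Getting Started guide:

hive> CREATE TABLE pokes (foo INT, bar STRING);
hive> SHOW TABLES;
hive> DROP TABLE pokes;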

V: Edit /etc/profile
export HADOOP_HOME=/home/hadoop/hadoop-0.20.2
export HIVE_HOME=/home/hadoop/hive-0.6.0
export PATH=$HADOOP_HOME/bin:$HIVE_HOME/bin:$PATH

VI: Edit /usr/lib/hadoop-0.20/conf/hadoop-env.sh
export HBASE_HOME=/cygdrive/c/home/hadoop/hbase-0.20.3

export HADOOP_CLASSPATH="$HBASE_HOME:$HBASE_HOME/lib/hbase-0.20.3.jar:$HADOOP_CLASSPATH"

VII: Apply the changes
. /etc/profile
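
A quick check that the variables took effect:

echo $HADOOP_HOME                 # should print /home/hadoop/hadoop-0.20.2
which hive                        # should resolve to $HIVE_HOME/bin/hive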

VIII: Troubleshooting FAQ
1: When running Hive you may see the following error; it means more heap was requested than the JVM can allocate:
Invalid maximum heap size: -Xmx4096m
The specified size exceeds the maximum representable size.
Could not create the Java virtual machine.

Fix:
In /work/hive/bin/ext/util/execHiveCmd.sh, at line 33, change
HADOOP_HEAPSIZE=4096
to
HADOOP_HEAPSIZE=256

Also, add export HIVE_HOME=/work/hive to /etc/profile.


Connecting to a single-node HBase:
./bin/hive -hiveconf hbase.master=172.16.13.70:60000  
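
With the connection established, you can create a Hive table backed by an HBase table via the HBase storage handler shipped with Hive 0.6 (the table, column family, and HBase table names below follow the example in the Hive wiki; depending on your layout, the handler, hbase, and zookeeper jars may also need to be passed to hive via --auxpath):

hive> CREATE TABLE hbase_table_1(key int, value string)
    > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    > WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
    > TBLPROPERTIES ("hbase.table.name" = "xyz");

Rows written through Hive then appear in the HBase table xyz, and rows put into xyz from the HBase shell become visible to Hive queries.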
