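Background
I have not been using any fancy BI tool and have done everything from the console, so I wanted to see which way of presenting results is clearest from an analyst's and an end user's point of view. I picked Zeppelin, which I had worked with before.
Role assignment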
+---------------------------+----------+----------+
| hadoop (hostname/IP:role) | hive     | zeppelin |
+---------------------------+----------+----------+
| sv000/172.29.6.100:master | server   | server   |
+---------------------------+----------+----------+
| sv001/172.29.6.101:slave  | -------- | -------- |
+---------------------------+----------+----------+
| sv002/172.29.6.102:slave  | -------- | -------- |
+---------------------------+----------+----------+
| sv003/172.29.6.103:slave  | -------- | -------- |
+---------------------------+----------+----------+
Prerequisites
- Hadoop-2.5.2
- Hive-1.2.1
- zeppelin-0.5.6
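Environment setup
I won't go over the Hadoop and Hive setup again here, since earlier posts already covered it. Only a few points to note:
Zeppelin has to connect to the Hive database remotely, so a user name and password must be configured on the Hive side; otherwise the connection attempt from Zeppelin's web UI fails with an error right away.
Append the following to $HIVE_HOME/conf/hive-site.xml: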
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>Username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>
</property>
Start Hive
./bin/hive --service metastore &
./bin/hiveserver2 &
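Before pointing Zeppelin at Hive, it can help to confirm that HiveServer2 accepts a remote connection. The following is only a minimal sketch under assumptions not stated in the post: HiveServer2 listens on its default port 10000 on sv000, the database is `default`, and the root/123456 pair from the hive-site.xml snippet is the one used for the connection. `hive_jdbc_url` is a hypothetical helper introduced here for illustration.

```shell
#!/bin/sh
# Assumed values: host taken from the role table, port 10000 is the HiveServer2 default.
HS2_HOST="${HS2_HOST:-sv000}"
HS2_PORT="${HS2_PORT:-10000}"

# Hypothetical helper: build a HiveServer2 JDBC URL from host, port and database.
hive_jdbc_url() {
    printf 'jdbc:hive2://%s:%s/%s\n' "$1" "$2" "${3:-default}"
}

# Try the connection only when beeline (shipped with Hive) is on the PATH.
if command -v beeline >/dev/null 2>&1; then
    # -u JDBC URL, -n user name, -p password, -e statement to run
    beeline -u "$(hive_jdbc_url "$HS2_HOST" "$HS2_PORT")" \
            -n root -p 123456 -e 'show databases;'
else
    echo "beeline not found; the URL to test would be $(hive_jdbc_url "$HS2_HOST" "$HS2_PORT")"
fi
```

If the query lists the databases, the same JDBC URL and credentials should be usable when configuring the Hive connection in Zeppelin.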