Hive
1. Hive Installation
Hive installation prerequisites:
1) Hadoop;
2) MySQL;
1. Hive only needs to be installed on one node of the cluster; here the hadoop1 node is chosen.
2. Install MariaDB on hadoop1
yum -y install mariadb-server mariadb
3. Start the service and enable it at boot
systemctl start mariadb.service
systemctl enable mariadb.service
4. Set the password. The first login uses an empty password (just press Enter at the prompt); afterwards, set the password with SQL.
mysql -u root -p
After logging in, first check that the databases look normal, then set the password with SQL:
> use mysql;
> update user set password=PASSWORD('123456') where user='root';
Then allow the root user to log in from any host, with access to all databases and tables:
> grant all privileges on *.* to root@'%' identified by '123456';
> grant all privileges on *.* to root@'hadoop1' identified by '123456';
> grant all privileges on *.* to root@'localhost' identified by '123456';
> FLUSH PRIVILEGES;
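To confirm the grants took effect, you can list root's host entries from the same MariaDB session (a quick check; the expected hosts are the three granted above):

```sql
-- Each host granted above should appear as a separate row
SELECT user, host FROM mysql.user WHERE user = 'root';
```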
3) mysql-connector-java, placed in the $HIVE_HOME/lib directory;
Configuration:
-
Edit the configuration file /etc/profile:
export HIVE_HOME=…
export PATH=$PATH:$HIVE_HOME/bin
Then run source /etc/profile to apply the changes.
-
Add the following to hive-env.sh:
export JAVA_HOME=…
export HADOOP_HOME=…
export HIVE_HOME=… (copy directly from /etc/profile)
-
hive-site.xml
1) MySQL connection settings: URL, username, password;
2) log paths;
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://hadoop1:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
<!-- log and scratch directory settings -->
<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive</value>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/tmp/hive/local</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/tmp/hive/resources</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
<name>hive.querylog.location</name>
<value>/tmp/hive/querylog</value>
<description>Location of Hive run time structured log file</description>
</property>
<property>
<name>hive.server2.logging.operation.log.location</name>
<value>/tmp/hive/operation_logs</value>
<description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
</configuration>
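Since the HDFS warehouse directory created below is /usr/hive/warehouse rather than Hive's default of /user/hive/warehouse, the path should also be declared in hive-site.xml. A config fragment matching that path (it goes inside the <configuration> element above):

```xml
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/usr/hive/warehouse</value>
  <description>HDFS location of the Hive warehouse; matches the directory created below</description>
</property>
```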
-
Create HDFS directories for Hive:
hdfs dfs -mkdir /tmp
hdfs dfs -mkdir -p /usr/hive/warehouse    // /usr/hive/warehouse is where Hive data is stored
hdfs dfs -chmod g+w /tmp
hdfs dfs -chmod g+w /usr/hive/warehouse
-
Access methods
1) hive CLI (not recommended)
2) Starting with Hive 2.1, the schematool command must be run to initialize the metastore before starting Hive:
schematool -dbType mysql -initSchema (only needs to be run once, before the first start)
Then start: hiveserver2 // multi-user, secure, recommended
Then run: beeline
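A typical beeline connection against the hiveserver2 started above might look like this; the host hadoop1 and the default HiveServer2 port 10000 are assumptions based on this setup:

```shell
# Connect to HiveServer2 over JDBC as user root (default port 10000)
beeline -u jdbc:hive2://hadoop1:10000 -n root

# Inside the beeline session, a quick smoke test:
#   show databases;
```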
2. Troubleshooting
1) schematool -dbType mysql -initSchema fails with a time-zone error (unrecognized zone such as ET, WDT)
Fix in mysql:
set time_zone="SYSTEM";
set global time_zone="+8:00";
flush privileges;
Problem: User: root is not allowed to impersonate root
The root proxy user must be configured in Hadoop's core-site.xml.
Add the following properties to core-site.xml (on all nodes):
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
Restart the Hadoop cluster.