APACHE HIVE TM
The Apache Hive ™ data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Structure can be projected onto data already in storage. A command line tool and JDBC driver are provided to connect users to Hive.
Comparison of HBase operation components:

1. MySQL installation
yum -y install wget
wget http://dev.mysql.com/get/mysql57-community-release-el7-10.noarch.rpm
yum -y install mysql57-community-release-el7-10.noarch.rpm
yum -y install mysql-community-server
service mysqld start
netstat -nltp | grep 3306
# systemctl restart mysqld.service (restart command)
grep "password" /var/log/mysqld.log    # view the temporary root password
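The grep above prints the whole log line; a minimal Python sketch (assuming the standard MySQL 5.7 first-start log format — the sample line and password below are made up for illustration) that pulls out just the temporary password:

```python
import re

# MySQL 5.7 logs a line like this on first start (sample line;
# the actual password will differ on your machine):
line = ("2023-01-01T00:00:00.000000Z 1 [Note] A temporary password "
        "is generated for root@localhost: Xy#7pQz!abcD")

# The password is the token after "root@localhost: "
match = re.search(r"temporary password is generated for root@localhost: (\S+)", line)
if match:
    print(match.group(1))  # prints the extracted temporary password
```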

To reset the password:
# Stop MySQL
sudo systemctl stop mysqld
# Disable authentication and networking
sudo systemctl set-environment MYSQLD_OPTS="--skip-grant-tables --skip-networking"
# Start MySQL
sudo systemctl start mysqld
# Log in as root and reset the password
mysql -uroot
update mysql.user set authentication_string=PASSWORD("NewPassword") where User='root' AND Host='localhost';
flush privileges;
quit
# Stop MySQL
sudo systemctl stop mysqld
# Restore authentication and networking
sudo systemctl unset-environment MYSQLD_OPTS
# Start MySQL
sudo systemctl start mysqld
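The direct UPDATE above writes the grant table without going through the validate_password plugin, but a later `ALTER USER` will be checked against it. A rough Python approximation of MySQL 5.7's default MEDIUM policy (length ≥ 8, plus at least one digit, one lowercase, one uppercase, and one special character) — a sketch of the rules, not the plugin itself:

```python
def meets_medium_policy(pw: str) -> bool:
    """Approximates MySQL 5.7 validate_password MEDIUM policy."""
    return (
        len(pw) >= 8
        and any(c.isdigit() for c in pw)       # at least one digit
        and any(c.islower() for c in pw)       # at least one lowercase letter
        and any(c.isupper() for c in pw)       # at least one uppercase letter
        and any(not c.isalnum() for c in pw)   # at least one special character
    )

print(meets_medium_policy("NewPassword"))    # fails: no digit or special char
print(meets_medium_policy("N3wPassw0rd!"))   # passes all four checks
```

Note that the example "NewPassword" used above would be rejected by the default policy if set via `ALTER USER`.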
2. Hive installation:
Download apache-hive-3.1.2-bin.tar.gz from
https://mirrors.tuna.tsinghua.edu.cn/apache/hive/
Configuration:
cp hive-env.sh.template hive-env.sh
Edit hive-site.xml:
<configuration>
<!-- Hive metadata is stored in MySQL -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://127.0.0.1:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
</property>
<!-- MySQL JDBC driver -->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<!-- MySQL username and password -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>root</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<property>
<name>hive.exec.scratchdir</name>
<value>/user/hive/tmp</value>
</property>
<!-- Query log directory -->
<property>
<name>hive.querylog.location</name>
<value>/user/hive/log</value>
</property>
<!-- Metastore node address -->
<property>
<name>hive.metastore.uris</name>
<value>thrift://master:9083</value>
</property>
<!-- Port for remote client connections -->
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>0.0.0.0</value>
</property>
<property>
<name>hive.server2.webui.host</name>
<value>0.0.0.0</value>
</property>
<!-- HiveServer2 web UI port -->
<property>
<name>hive.server2.webui.port</name>
<value>10002</value>
</property>
<property>
<name>hive.server2.long.polling.timeout</name>
<value>5000</value>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>true</value>
</property>
<property>
<name>datanucleus.autoCreateSchema</name>
<value>false</value>
</property>
<property>
<name>datanucleus.fixedDatastore</name>
<value>true</value>
</property>
<property>
<name>hive.execution.engine</name>
<value>mr</value>
</property>
</configuration>
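A common mistake in hive-site.xml is writing a bare `&` in the JDBC URL: XML requires the `&amp;` entity, and Hive will fail to parse the file otherwise. A small Python sketch that catches this by parsing the snippet before deploying it:

```python
import xml.etree.ElementTree as ET

# Minimal excerpt of the hive-site.xml above; note the &amp; entity.
snippet = """<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://127.0.0.1:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
  </property>
</configuration>"""

root = ET.fromstring(snippet)  # raises ParseError on a bare '&'
url = root.find("./property/value").text
print(url)  # the '&amp;' entity is decoded back to a plain '&'
```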
Edit core-site.xml in the Hadoop configuration directory. (Note: dfs.permissions.enabled is an HDFS setting and belongs in hdfs-site.xml; the proxyuser properties below go in core-site.xml.)
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
Copy a MySQL JDBC driver jar into Hive's lib directory.
Configure the environment variables:
export JAVA_HOME=/usr/local/java/jdk1.8.0_161
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME=/usr/local/hadoop/hadoop-3.1.2
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HBASE_HOME=/usr/local/hbase-2.2.2
export HIVE_HOME=/usr/local/hive-3.1.2/apache-hive-3.1.2-bin
export PHOENIX_HOME=/usr/local/phoenix-5.1.2/phoenix-hbase-2.2-5.1.2-bin
export PHOENIX_CLASSPATH=$PHOENIX_HOME
export PATH=$HBASE_HOME/bin:$HIVE_HOME/bin:$PHOENIX_HOME/bin:$PATH
source /etc/profile    # apply the configuration
Restart Hadoop.
Initialize Hive's metadata database (run this once, before starting the metastore):
./schematool -initSchema -dbType mysql
Start the metadata service:
hive --service metastore
Start Hive.

Other auxiliary commands:
Note: if the HBase metadata is corrupted:
hbase hbck -metaonly    (check command)
hadoop fs -rm -r /hbase    (delete, then restart)
Delete the ZooKeeper metadata:
1. Find the ZooKeeper installation directory
2. Open the ZooKeeper client and connect to the server: zookeeper-client -server localhost:2181
3. ls / shows the hbase znode
4. rmr /hbase/meta-region-server    (on newer ZooKeeper versions, rmr is replaced by deleteall)
5. quit to exit, then restart HBase
Note:
# Enter the zk client
$ZK_HOME/bin/zkCli.sh
# Clear the hbase znode
rmr /hbase
This article introduced the Apache Hive data warehouse software, including reading and writing large datasets in distributed storage via SQL, and walked through the installation and configuration steps for MySQL and Hive.