Hive Installation, Configuration, and Basic Usage

ybxmCsdn

Article link: https://blog.csdn.net/sjlving123/article/details/108874134

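Deploying the Hive Environment

Installing and Configuring MySQL

Installation: first unpack the MySQL bundle with tar -xvf MySQL-5.6.26-1.linux_glibc2.5.x86_64.rpm-bundle.tar. Then use rpm to install the MySQL server and the command-line client: rpm -ivh MySQL-server-5.6.26-1.linux_glibc2.5.x86_64.rpm and rpm -ivh MySQL-client-5.6.26-1.linux_glibc2.5.x86_64.rpm. You may hit dependency errors during installation; these can be resolved with yum install perl, yum install libaio, and yum -y install autoconf. If rpm reports a "conflicts with" error against an already-installed package, remove the old package with rpm -e mysql-libs-5.1.73-3.el6_5.x86_64 --nodeps (your version may differ, but the fix is the same).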

Configuration: see https://www.cnblogs.com/rmxd/p/11318525.html and https://blog.csdn.net/qq_32786873/article/details/78846008 for reference.

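After installing the MySQL server, run the /usr/bin/mysql_secure_installation script for basic configuration; it lets you change the root password and decide whether root may log in remotely. Alternatively, you can grant root remote login privileges with the following commands:

mysql> grant all privileges on *.* to 'root'@'%' identified by 'your_password' with grant option;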

mysql> flush privileges;

Installing and Configuring Hive

Installing Hive:

Download a Hive release from http://hive.apache.org/downloads.html and unpack it; it is ready to use once extracted. For example:

tar -xzvf apache-hive-1.2.1-bin.tar.gz

Configuring Hive:

1. Set the environment variables in ~/.bashrc, for example export HIVE_HOME=/home/user1/hive and export PATH=$PATH:$HIVE_HOME/bin, then run source ~/.bashrc.

2. Edit the conf/hive-site.xml file under the Hive installation directory:

<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <!-- placeholder: your MySQL user name -->
    <value>your_mysql_user</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <!-- placeholder: your MySQL password -->
    <value>your_password</value>
    <description>password to use against metastore database</description>
  </property>
</configuration>

3. Copy the JDBC driver jar that matches your MySQL version into the lib directory of the Hive installation.

Detailed reference: https://www.cnblogs.com/xiaolan-Lin/p/11358901.html
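A missing value in hive-site.xml is easy to overlook, so it can help to check the file before starting Hive. The following is only a minimal sketch, not part of Hive itself: the class name HiveSiteCheck and its methods are hypothetical, and it merely verifies that the four metastore properties listed above each have both a name and a value element.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not shipped with Hive): sanity-checks hive-site.xml content.
class HiveSiteCheck {
    // The four metastore keys configured in this article.
    static final String[] REQUIRED = {
        "javax.jdo.option.ConnectionURL",
        "javax.jdo.option.ConnectionDriverName",
        "javax.jdo.option.ConnectionUserName",
        "javax.jdo.option.ConnectionPassword"
    };

    // Collects every <property> that has both a <name> and a <value> child.
    static Map<String, String> parse(String xml) {
        Map<String, String> props = new LinkedHashMap<>();
        try {
            NodeList nodes = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element property = (Element) nodes.item(i);
                NodeList names = property.getElementsByTagName("name");
                NodeList values = property.getElementsByTagName("value");
                if (names.getLength() > 0 && values.getLength() > 0) {
                    props.put(names.item(0).getTextContent().trim(),
                              values.item(0).getTextContent().trim());
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("not a well-formed hive-site.xml", e);
        }
        return props;
    }

    // Returns the required keys that have no name/value pair in the file.
    static List<String> missing(Map<String, String> props) {
        List<String> result = new ArrayList<>();
        for (String key : REQUIRED) {
            if (!props.containsKey(key)) {
                result.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Deliberately incomplete sample: the user name property has no <value>.
        String xml = "<configuration>"
                + "<property><name>javax.jdo.option.ConnectionURL</name>"
                + "<value>jdbc:mysql://localhost:3306/hive</value></property>"
                + "<property><name>javax.jdo.option.ConnectionUserName</name></property>"
                + "</configuration>";
        System.out.println("missing: " + missing(parse(xml)));
    }
}
```

In practice you would read the real conf/hive-site.xml into a string and pass it to parse; the inline sample is only there to keep the sketch self-contained.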

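Using the Hive Shell and API

Using the Command Line

1. To enter the Hive shell, simply run: hive

2. Create a .hiverc file in the current user's home directory and add the following settings; the Hive shell will then show the current database in the prompt and print column names with query results: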

set hive.cli.print.header=true;

set hive.cli.print.current.db=true;

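3. Start the Hive server so that clients on other nodes can connect to it: hiveserver2 -hiveconf hive.root.logger=DEBUG,console. You can also run the service in the background: nohup bin/hiveserver2 1>/dev/null 2>&1 &.

Other nodes (which must also have Hive installed) can then connect to the server. Run beeline to enter the Beeline shell, enter !connect jdbc:hive2://localhost:10000, and then supply the user name and password (those of the user that started HDFS). In real use, replace localhost with the Hive server's IP address. Alternatively, connect in one step with beeline -u jdbc:hive2://localhost:10000 -n root.

4. Without entering the Hive shell, you can run a query directly from the Linux command line, for example hive -e "select * from db_order.t_order".

For particularly complex HQL, write the statements to a file such as x.hql and run it with hive -f /home/user/x.hql. The file can contain statements like the following (the syntax is similar to MySQL's):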

select * from db_order.t_order;

select count(1) from db_order.t_user;

5. For inserting, deleting, updating, and querying Hive data from the shell, or for changing table structure, see my other blog post:

https://blog.csdn.net/sjlving123/article/details/108953780
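The hive -e and hive -f invocations from step 4 can also be launched from a Java program, for example from a scheduler. The following is a rough sketch under a few assumptions: the hive executable is on the PATH, the class name HiveCli is made up for illustration, and the --run argument is a convention of this sketch only, not a Hive option.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper around the "hive -e" and "hive -f" commands shown in step 4.
class HiveCli {
    // Command line for running a single inline HQL statement.
    static List<String> inlineQuery(String hql) {
        return Arrays.asList("hive", "-e", hql);
    }

    // Command line for running all statements in an HQL script file.
    static List<String> scriptFile(String path) {
        return Arrays.asList("hive", "-f", path);
    }

    // Starts the command, forwards its output to this process, and returns the exit code.
    static int run(List<String> command) throws IOException, InterruptedException {
        return new ProcessBuilder(command).inheritIO().start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        List<String> command = scriptFile("/home/user/x.hql");
        System.out.println(String.join(" ", command));
        // Only launch Hive when explicitly asked to, so the sketch runs without Hive installed.
        if (args.length > 0 && args[0].equals("--run")) {
            System.exit(run(command));
        }
    }
}
```

Passing the statement as its own list element means ProcessBuilder hands it to hive as a single argument, so no shell quoting is needed around the HQL text.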

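Using the Client API

Using the Hive Java client API is much like issuing MySQL statements from a Java program. First load the Hive driver, org.apache.hive.jdbc.HiveDriver, and open a connection to the Hive server. Then write Hive SQL statements and send them through the connection to the server for execution. Whether you use the Hive shell or the Hive API, data in Hive is always manipulated through Hive SQL. HBase is different: it does not support SQL, its data is accessed through dedicated, pre-packaged functions, and the commands differ somewhat between the shell and the Java API (the HBase shell is not commonly used).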
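The connection step depends on a JDBC URL of the form jdbc:hive2://host:port/database, the same form used with Beeline earlier. As a small illustration, the sketch below assembles such a URL from its parts; the class name HiveJdbcUrl and the build method are hypothetical helpers, and falling back to the database named default is a choice made for this sketch.

```java
// Hypothetical helper that assembles a HiveServer2 JDBC URL from its parts.
class HiveJdbcUrl {
    static String build(String host, int port, String database) {
        if (host == null || host.isEmpty()) {
            throw new IllegalArgumentException("host is required");
        }
        // Fall back to Hive's built-in "default" database when none is given.
        String db = (database == null || database.isEmpty()) ? "default" : database;
        return "jdbc:hive2://" + host + ":" + port + "/" + db;
    }

    public static void main(String[] args) {
        // 10000 is the port used in the beeline examples in this article.
        System.out.println(build("localhost", 10000, "db_order"));
        // prints "jdbc:hive2://localhost:10000/db_order"
    }
}
```

The resulting string is what gets passed to DriverManager.getConnection together with the user name and password.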

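In addition, Hive SQL is a layer over MapReduce: Hive translates HQL statements into MapReduce programs that run over data already stored in HDFS. Unlike MySQL, Hive is not built for row-level inserts, updates, and deletes; it is mainly used for loading and querying data in bulk (row-level UPDATE and DELETE are only available on transactional tables, from Hive 0.14 onward). Because queries run as MapReduce jobs, they are also much slower than queries in MySQL.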

package com.bigdata.hadoop.hive;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;

import java.sql.*;

/**
 * Accessing Hive over JDBC (note: HiveServer2 must be running before JDBC can connect).
 * The tests depend on each other (for example, the table must exist before data is loaded),
 * so run them one at a time in the order they appear.
 */
public class HiveJDBC {

    private static String driverName = "org.apache.hive.jdbc.HiveDriver";
    private static String url = "jdbc:hive2://hdpcomprs:10000/db_comprs";
    private static String user = "hadoop";
    private static String password = "";

    private static Connection conn = null;
    private static Statement stmt = null;
    private static ResultSet rs = null;

    // Load the driver and open the connection
    @Before
    public void init() throws Exception {
        Class.forName(driverName);
        conn = DriverManager.getConnection(url, user, password);
        stmt = conn.createStatement(); // statement object used to send HQL to Hive
    }

    // Create a database
    @Test
    public void createDatabase() throws Exception {
        String sql = "create database hive_jdbc_test";
        System.out.println("Running: " + sql);
        stmt.execute(sql); // run the HQL statement through the statement object
    }

    // List all databases
    @Test
    public void showDatabases() throws Exception {
        String sql = "show databases";
        System.out.println("Running: " + sql);
        rs = stmt.executeQuery(sql);
        while (rs.next()) {
            System.out.println(rs.getString(1));
        }
    }

    // Create a table
    @Test
    public void createTable() throws Exception {
        String sql = "create table emp(\n" +
                "empno int,\n" +
                "ename string,\n" +
                "job string,\n" +
                "mgr int,\n" +
                "hiredate string,\n" +
                "sal double,\n" +
                "comm double,\n" +
                "deptno int\n" +
                ")\n" +
                "row format delimited fields terminated by '\\t'";
        System.out.println("Running: " + sql);
        stmt.execute(sql);
    }

    // List all tables
    @Test
    public void showTables() throws Exception {
        String sql = "show tables";
        System.out.println("Running: " + sql);
        rs = stmt.executeQuery(sql);
        while (rs.next()) {
            System.out.println(rs.getString(1));
        }
    }

    // Show the table structure
    @Test
    public void descTable() throws Exception {
        String sql = "desc emp";
        System.out.println("Running: " + sql);
        rs = stmt.executeQuery(sql);
        while (rs.next()) {
            System.out.println(rs.getString(1) + "\t" + rs.getString(2));
        }
    }

    // Load data (copies a local file into the table's directory in HDFS)
    @Test
    public void loadData() throws Exception {
        String filePath = "/home/hadoop/data/emp.txt";
        String sql = "load data local inpath '" + filePath + "' overwrite into table emp";
        System.out.println("Running: " + sql);
        stmt.execute(sql);
    }

    // Query data
    @Test
    public void selectData() throws Exception {
        String sql = "select * from emp";
        System.out.println("Running: " + sql);
        rs = stmt.executeQuery(sql);
        System.out.println("Employee No." + "\t" + "Employee Name" + "\t" + "Job");
        while (rs.next()) {
            System.out.println(rs.getString("empno") + "\t\t" + rs.getString("ename") + "\t\t" + rs.getString("job"));
        }
    }

    // Aggregate query (runs a MapReduce job)
    @Test
    public void countData() throws Exception {
        String sql = "select count(1) from emp";
        System.out.println("Running: " + sql);
        rs = stmt.executeQuery(sql);
        while (rs.next()) {
            System.out.println(rs.getInt(1));
        }
    }

    // Drop the table
    @Test
    public void dropTable() throws Exception {
        String sql = "drop table if exists emp";
        System.out.println("Running: " + sql);
        stmt.execute(sql);
    }

    // Drop the database
    @Test
    public void dropDatabase() throws Exception {
        String sql = "drop database if exists hive_jdbc_test";
        System.out.println("Running: " + sql);
        stmt.execute(sql);
    }

    // Release resources
    @After
    public void destroy() throws Exception {
        if (rs != null) {
            rs.close();
        }
        if (stmt != null) {
            stmt.close();
        }
        if (conn != null) {
            conn.close();
        }
    }
}
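
The JUnit class above keeps the connection, statement, and result set in static fields and closes them in a separate method. Outside of tests, one possible alternative is try-with-resources, which closes everything even when a query throws. The sketch below is only an illustration: the class name HiveQueryRunner and its methods are hypothetical, it reuses the driver, URL, and user from the class above, and its main method needs a running HiveServer2 plus the Hive JDBC driver on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical standalone runner; not part of the original test class.
class HiveQueryRunner {
    // Joins the column values of one row into a tab-separated line.
    static String formatRow(String... columns) {
        return String.join("\t", columns);
    }

    // Runs a query and prints every row; the statement and result set
    // are closed automatically when the try block exits.
    static void printQuery(Connection conn, String sql) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            int columnCount = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                String[] row = new String[columnCount];
                for (int i = 0; i < columnCount; i++) {
                    row[i] = rs.getString(i + 1); // JDBC columns are 1-based
                }
                System.out.println(formatRow(row));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Same driver, URL, and user as the JUnit class above.
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        String url = "jdbc:hive2://hdpcomprs:10000/db_comprs";
        try (Connection conn = DriverManager.getConnection(url, "hadoop", "")) {
            printQuery(conn, "show tables");
        }
    }
}
```

Because each resource is declared in the try header, no explicit close calls or null checks are needed.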