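Hive Java API
I. Prerequisites
1. JDK version: 1.8+
2. Hadoop version: 2.7.3
3. Hive version: 2.3.3
4. The Hadoop environment is already up and running
II. Start the hiveserver2 service
This service must be running; otherwise the JDBC connection made from IDEA will fail.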
hive --service hiveserver2
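Since the Java client connects to HiveServer2 over TCP, a quick reachability check can rule out the most common failure before any JDBC code runs. The following is a minimal sketch, assuming the host 192.168.163.128 and port 10000 that appear in the JDBC URL later in this tutorial; `HiveServerProbe` and `isReachable` are hypothetical helper names, not part of Hive.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper (not part of Hive): checks whether a TCP port accepts connections.
class HiveServerProbe {

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed values: host and port taken from the JDBC URL used in this tutorial.
        String host = "192.168.163.128";
        int port = 10000;
        System.out.println(isReachable(host, port, 2000)
                ? "Port is open"
                : "Port is not reachable; is hiveserver2 running?");
    }
}
```

An open port only shows that something is listening there; it does not prove the listener is HiveServer2, so treat this as a first sanity check only.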
III. In Hive, create the mydb database and the emp table, then load the data
1. Create a database named mydb
create database mydb comment 'This is My DB';
2. Switch to the mydb database
use mydb;
3. Create the schema of the emp table
create table if not exists emp(
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal int,
comm int,
deptno int
)
row format delimited fields terminated by ','
lines terminated by '\n';
4. Load data into the emp table
1) I uploaded the local emp.csv file to the virtual machine at /files/emp.csv.
The file contents are as follows:
7369,SMITH,CLERK,7902,1980/12/17,800,,20
7499,ALLEN,SALESMAN,7698,1981/2/20,1600,300,30
7521,WARD,SALESMAN,7698,1981/2/22,1250,500,30
7566,JONES,MANAGER,7839,1981/4/2,2975,,20
7654,MARTIN,SALESMAN,7698,1981/9/28,1250,1400,30
7698,BLAKE,MANAGER,7839,1981/5/1,2850,,30
7782,CLARK,MANAGER,7839,1981/6/9,2450,,10
7788,SCOTT,ANALYST,7566,1987/4/19,3000,,20
7839,KING,PRESIDENT,,1981/11/17,5000,,10
7844,TURNER,SALESMAN,7698,1981/9/8,1500,0,30
7876,ADAMS,CLERK,7788,1987/5/23,1100,,20
7900,JAMES,CLERK,7698,1981/12/3,9500,,30
7902,FORD,ANALYST,7566,1981/12/3,3000,,20
7934,MILLER,CLERK,7782,1982/1/23,1300,,10
2) Load the file contents into the emp table of the mydb database
load data local inpath 'file:///files/emp.csv' overwrite into table emp;
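Hive's delimited text format assigns fields to columns by position, so a row with a missing field shifts the later values into the wrong columns and leaves the trailing column NULL. A small pre-load check on the CSV can catch this. The sketch below assumes the eight-column emp schema defined above; `EmpCsvCheck` and `badLines` are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical pre-load check (not part of Hive): flags rows whose field count
// does not match the number of columns in the emp table.
class EmpCsvCheck {

    // empno, ename, job, mgr, hiredate, sal, comm, deptno
    static final int EMP_COLUMNS = 8;

    // Returns the 1-based line numbers of rows that do not have exactly expectedFields fields.
    static List<Integer> badLines(List<String> lines, int expectedFields) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            // A limit of -1 keeps empty fields, so "800,,20" still counts three fields.
            int fields = lines.get(i).split(",", -1).length;
            if (fields != expectedFields) {
                bad.add(i + 1);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "7499,ALLEN,SALESMAN,7698,1981/2/20,1600,300,30", // 8 fields
                "7369,SMITH,CLERK,7902,1980/12/17,800,20",        // 7 fields: comm is missing
                "7369,SMITH,CLERK,7902,1980/12/17,800,,20");      // 8 fields: comm is empty
        System.out.println(badLines(sample, EMP_COLUMNS)); // prints [2]
    }
}
```

An empty field between two commas keeps the columns aligned and is read as NULL for an int column, which is what the commission column needs for employees without a commission.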
IV. Open IDEA and create a new project
1. Add the dependency to pom.xml
<dependencies>
    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-jdbc</artifactId>
        <version>2.3.3</version>
    </dependency>
</dependencies>
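To confirm that the dependency was actually resolved, it can help to check whether the driver class loads before attempting a connection. This is a minimal sketch; `DriverCheck` and `isOnClasspath` are hypothetical helper names, and the only Hive-specific value is the driver class name used later in this tutorial.

```java
// Hypothetical helper: reports whether a class can be loaded from the classpath.
class DriverCheck {

    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The class itself, or something it depends on, could not be loaded.
            return false;
        }
    }

    public static void main(String[] args) {
        String driver = "org.apache.hive.jdbc.HiveDriver";
        System.out.println(isOnClasspath(driver)
                ? driver + " is available"
                : driver + " is missing; check the hive-jdbc dependency in pom.xml");
    }
}
```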
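2. Create a Java class and write the code
- Compute, for each department number, the total of salary plus commission, and sort the departments by that total in descending order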
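import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveDemo {
    public static void main(String[] args) throws Exception {
        // Load the Hive JDBC driver
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Salary plus commission per department; a NULL commission counts as 0
        String sql = "select deptno, sum(sal + coalesce(comm, 0)) sums "
                + "from emp group by deptno order by sums desc";
        // try-with-resources closes the result set, statement, and connection
        try (Connection connection = DriverManager.getConnection("jdbc:hive2://192.168.163.128:10000/mydb");
             Statement statement = connection.createStatement();
             ResultSet resultSet = statement.executeQuery(sql)) {
            while (resultSet.next()) {
                System.out.println(resultSet.getInt(1) + "," + resultSet.getInt(2));
            }
        }
    }
}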
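The intended aggregation can be cross-checked without a cluster by computing it in plain Java on a few rows. The sketch below is not Hive code: it assumes that a missing commission counts as 0, and `PayrollTotals` and `totalsByDept` are hypothetical names. Each row is a `{deptno, sal, comm}` triple taken from emp.csv.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical cross-check (not Hive code): per-department sal + comm totals in memory.
class PayrollTotals {

    // Each row is {deptno, sal, comm}; comm may be null and then counts as 0.
    static List<Map.Entry<Integer, Integer>> totalsByDept(List<Integer[]> rows) {
        Map<Integer, Integer> totals = new LinkedHashMap<>();
        for (Integer[] row : rows) {
            int comm = row[2] == null ? 0 : row[2];
            totals.merge(row[0], row[1] + comm, Integer::sum);
        }
        List<Map.Entry<Integer, Integer>> sorted = new ArrayList<>(totals.entrySet());
        // Sort by total, largest first
        sorted.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        return sorted;
    }

    public static void main(String[] args) {
        List<Integer[]> rows = new ArrayList<>();
        rows.add(new Integer[]{10, 2450, null}); // CLARK, assumed to have no commission
        rows.add(new Integer[]{30, 1600, 300});  // ALLEN
        rows.add(new Integer[]{30, 1250, 500});  // WARD
        for (Map.Entry<Integer, Integer> e : totalsByDept(rows)) {
            System.out.println(e.getKey() + "," + e.getValue());
        }
        // prints:
        // 30,3650
        // 10,2450
    }
}
```

Running the same rows through Hive should give matching totals; a mismatch usually points at how NULL commissions are handled in the query or at misaligned columns in the loaded file.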