Integrating hive-jdbc with Spring Boot in detail

Reposted from https://blog.csdn.net/pengjunlee/article/details/81838480; reading the original article is recommended.


This article gives a brief, example-driven introduction to integrating hive-jdbc into a Spring Boot project. (The original post includes a screenshot of the complete project directory layout, which is omitted here.)

Official documentation: https://cwiki.apache.org/confluence/display/Hive/HiveClient#HiveClient-JDBC

Adding dependencies and configuration

First, add the Maven dependencies required for hive-jdbc to the project's POM file.

 
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>druid-spring-boot-starter</artifactId>
    <version>1.1.1</version>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-jdbc</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.data</groupId>
    <artifactId>spring-data-hadoop</artifactId>
    <version>2.5.0.RELEASE</version>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>2.3.3</version>
    <exclusions>
        <exclusion>
            <groupId>org.eclipse.jetty.aggregate</groupId>
            <artifactId>*</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.apache.tomcat</groupId>
    <artifactId>tomcat-jdbc</artifactId>
</dependency>
<dependency>
    <groupId>jdk.tools</groupId>
    <artifactId>jdk.tools</artifactId>
    <version>1.8</version>
    <scope>system</scope>
    <systemPath>${JAVA_HOME}/lib/tools.jar</systemPath>
</dependency>

Then, add the data source configuration to the core configuration file application.yml.

 
hive:
  url: jdbc:hive2://172.16.250.234:10000/hive
  driver-class-name: org.apache.hive.jdbc.HiveDriver
  type: com.alibaba.druid.pool.DruidDataSource
  user: hadoop
  password: Pure@123
  # Additional connection pool settings below, applied to the data source above
  # Initial, minimum, and maximum pool size
  initialSize: 1
  minIdle: 3
  maxActive: 20
  # Maximum time to wait when obtaining a connection
  maxWait: 60000
  # Interval between runs of the eviction thread that closes idle connections, in milliseconds
  timeBetweenEvictionRunsMillis: 60000
  # Minimum time a connection may remain idle in the pool, in milliseconds
  minEvictableIdleTimeMillis: 30000
  validationQuery: select 1
  testWhileIdle: true
  testOnBorrow: false
  testOnReturn: false
  # Enable PSCache and set its size per connection
  poolPreparedStatements: true
  maxPoolPreparedStatementPerConnectionSize: 20

Configuring the data source and JdbcTemplate

We can use Spring Boot's default org.apache.tomcat.jdbc.pool.DataSource and wire a JdbcTemplate with it.

 
import org.apache.tomcat.jdbc.pool.DataSource;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.env.Environment;
import org.springframework.jdbc.core.JdbcTemplate;

@Configuration
public class HiveJdbcConfig {

    private static final Logger logger = LoggerFactory.getLogger(HiveJdbcConfig.class);

    @Autowired
    private Environment env;

    @Bean(name = "hiveJdbcDataSource")
    @Qualifier("hiveJdbcDataSource")
    public DataSource dataSource() {
        DataSource dataSource = new DataSource();
        dataSource.setUrl(env.getProperty("hive.url"));
        dataSource.setDriverClassName(env.getProperty("hive.driver-class-name"));
        dataSource.setUsername(env.getProperty("hive.user"));
        dataSource.setPassword(env.getProperty("hive.password"));
        logger.debug("Hive DataSource Inject Successfully...");
        return dataSource;
    }

    @Bean(name = "hiveJdbcTemplate")
    public JdbcTemplate hiveJdbcTemplate(@Qualifier("hiveJdbcDataSource") DataSource dataSource) {
        return new JdbcTemplate(dataSource);
    }

}

Alternatively, a third-party data source can be used; this example uses the Druid data source, configured as follows.

 
import javax.sql.DataSource;

import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.core.JdbcTemplate;

import com.alibaba.druid.pool.DruidDataSource;

@Configuration
@ConfigurationProperties(prefix = "hive")
public class HiveDruidConfig {

    private String url;
    private String user;
    private String password;
    private String driverClassName;
    private int initialSize;
    private int minIdle;
    private int maxActive;
    private int maxWait;
    private int timeBetweenEvictionRunsMillis;
    private int minEvictableIdleTimeMillis;
    private String validationQuery;
    private boolean testWhileIdle;
    private boolean testOnBorrow;
    private boolean testOnReturn;
    private boolean poolPreparedStatements;
    private int maxPoolPreparedStatementPerConnectionSize;

    @Bean(name = "hiveDruidDataSource")
    @Qualifier("hiveDruidDataSource")
    public DataSource dataSource() {
        DruidDataSource datasource = new DruidDataSource();
        datasource.setUrl(url);
        datasource.setUsername(user);
        datasource.setPassword(password);
        datasource.setDriverClassName(driverClassName);

        // pool configuration
        datasource.setInitialSize(initialSize);
        datasource.setMinIdle(minIdle);
        datasource.setMaxActive(maxActive);
        datasource.setMaxWait(maxWait);
        datasource.setTimeBetweenEvictionRunsMillis(timeBetweenEvictionRunsMillis);
        datasource.setMinEvictableIdleTimeMillis(minEvictableIdleTimeMillis);
        datasource.setValidationQuery(validationQuery);
        datasource.setTestWhileIdle(testWhileIdle);
        datasource.setTestOnBorrow(testOnBorrow);
        datasource.setTestOnReturn(testOnReturn);
        datasource.setPoolPreparedStatements(poolPreparedStatements);
        datasource.setMaxPoolPreparedStatementPerConnectionSize(maxPoolPreparedStatementPerConnectionSize);
        return datasource;
    }

    // Getters and setters for the properties above are omitted here

    @Bean(name = "hiveDruidTemplate")
    public JdbcTemplate hiveDruidTemplate(@Qualifier("hiveDruidDataSource") DataSource dataSource) {
        return new JdbcTemplate(dataSource);
    }

}

Using a DataSource to work with Hive

 
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

import javax.sql.DataSource;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

/**
 * Working with Hive through a DataSource
 */
@RestController
@RequestMapping("/hive")
public class HiveDataSourceController {

    private static final Logger logger = LoggerFactory.getLogger(HiveDataSourceController.class);

    @Autowired
    @Qualifier("hiveJdbcDataSource")
    org.apache.tomcat.jdbc.pool.DataSource jdbcDataSource;

    @Autowired
    @Qualifier("hiveDruidDataSource")
    DataSource druidDataSource;

    /**
     * List all tables in the current Hive database
     */
    @RequestMapping("/table/list")
    public List<String> listAllTables() throws SQLException {
        List<String> list = new ArrayList<String>();
        // Statement statement = jdbcDataSource.getConnection().createStatement();
        Statement statement = druidDataSource.getConnection().createStatement();
        String sql = "show tables";
        logger.info("Running: " + sql);
        ResultSet res = statement.executeQuery(sql);
        while (res.next()) {
            list.add(res.getString(1));
        }
        return list;
    }

    /**
     * Describe the columns of a table in the Hive database
     */
    @RequestMapping("/table/describe")
    public List<String> describeTable(String tableName) throws SQLException {
        List<String> list = new ArrayList<String>();
        // Statement statement = jdbcDataSource.getConnection().createStatement();
        Statement statement = druidDataSource.getConnection().createStatement();
        String sql = "describe " + tableName;
        logger.info("Running: " + sql);
        ResultSet res = statement.executeQuery(sql);
        while (res.next()) {
            list.add(res.getString(1));
        }
        return list;
    }

    /**
     * Query the data in the table named tableName
     */
    @RequestMapping("/table/select")
    public List<String> selectFromTable(String tableName) throws SQLException {
        // Statement statement = jdbcDataSource.getConnection().createStatement();
        Statement statement = druidDataSource.getConnection().createStatement();
        String sql = "select * from " + tableName;
        logger.info("Running: " + sql);
        ResultSet res = statement.executeQuery(sql);
        List<String> list = new ArrayList<String>();
        int count = res.getMetaData().getColumnCount();
        String str = null;
        while (res.next()) {
            str = "";
            for (int i = 1; i < count; i++) {
                str += res.getString(i) + " ";
            }
            str += res.getString(count);
            logger.info(str);
            list.add(str);
        }
        return list;
    }

}
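One caveat with the controller above: the Connection, Statement, and ResultSet obtained from the pool are never closed, so each request keeps a pooled connection occupied. Below is a minimal sketch of the table-listing logic rewritten with try-with-resources; the method name listAllTablesSafely and the /table/list2 path are illustrative additions, not part of the original project.

    // Illustrative variant of listAllTables() for the controller above.
    // try-with-resources returns the Connection to the pool and closes the
    // Statement and ResultSet even if the query throws.
    @RequestMapping("/table/list2")
    public List<String> listAllTablesSafely() throws SQLException {
        List<String> tables = new ArrayList<String>();
        try (java.sql.Connection conn = druidDataSource.getConnection();
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("show tables")) {
            while (rs.next()) {
                tables.add(rs.getString(1));
            }
        }
        return tables;
    }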

Using JdbcTemplate to work with Hive

 
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.dao.DataAccessException;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

/**
 * Working with Hive through JdbcTemplate
 */
@RestController
@RequestMapping("/hive2")
public class HiveJdbcTemplateController {

    private static final Logger logger = LoggerFactory.getLogger(HiveJdbcTemplateController.class);

    @Autowired
    @Qualifier("hiveDruidTemplate")
    private JdbcTemplate hiveDruidTemplate;

    @Autowired
    @Qualifier("hiveJdbcTemplate")
    private JdbcTemplate hiveJdbcTemplate;

    /**
     * Example: create a new table
     */
    @RequestMapping("/table/create")
    public String createTable() {
        StringBuffer sql = new StringBuffer("CREATE TABLE IF NOT EXISTS ");
        sql.append("user_sample");
        sql.append("(user_num BIGINT, user_name STRING, user_gender STRING, user_age INT)");
        sql.append("ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' "); // define the field and line delimiters
        sql.append("STORED AS TEXTFILE"); // store as plain text

        logger.info("Running: " + sql);
        String result = "Create table successfully...";
        try {
            // hiveJdbcTemplate.execute(sql.toString());
            hiveDruidTemplate.execute(sql.toString());
        } catch (DataAccessException dae) {
            result = "Create table encounter an error: " + dae.getMessage();
            logger.error(result);
        }
        return result;
    }

    /**
     * Example: load data from a local file on the Hive server into a Hive table
     */
    @RequestMapping("/table/load")
    public String loadIntoTable() {
        String filepath = "/home/hadoop/user_sample.txt";
        String sql = "load data local inpath '" + filepath + "' into table user_sample";
        String result = "Load data into table successfully...";
        try {
            // hiveJdbcTemplate.execute(sql);
            hiveDruidTemplate.execute(sql);
        } catch (DataAccessException dae) {
            result = "Load data into table encounter an error: " + dae.getMessage();
            logger.error(result);
        }
        return result;
    }

    /**
     * Example: insert a row into a Hive table
     */
    @RequestMapping("/table/insert")
    public String insertIntoTable() {
        String sql = "INSERT INTO TABLE user_sample(user_num,user_name,user_gender,user_age) VALUES(888,'Plum','M',32)";
        String result = "Insert into table successfully...";
        try {
            // hiveJdbcTemplate.execute(sql);
            hiveDruidTemplate.execute(sql);
        } catch (DataAccessException dae) {
            result = "Insert into table encounter an error: " + dae.getMessage();
            logger.error(result);
        }
        return result;
    }

    /**
     * Example: drop a table
     */
    @RequestMapping("/table/delete")
    public String delete(String tableName) {
        String sql = "DROP TABLE IF EXISTS " + tableName;
        String result = "Drop table successfully...";
        logger.info("Running: " + sql);
        try {
            // hiveJdbcTemplate.execute(sql);
            hiveDruidTemplate.execute(sql);
        } catch (DataAccessException dae) {
            result = "Drop table encounter an error: " + dae.getMessage();
            logger.error(result);
        }
        return result;
    }
}
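The controller above only issues DDL and DML through execute(). For reads, JdbcTemplate can also replace the manual ResultSet handling used in the DataSource controller. Below is a minimal sketch of such an endpoint; the /table/query path and the queryTable method name are illustrative additions, not part of the original project.

    // Illustrative read endpoint using JdbcTemplate instead of raw JDBC.
    // Requires extra imports: java.util.List and java.util.Map.
    // queryForList returns one Map per row, keyed by column name.
    @RequestMapping("/table/query")
    public List<Map<String, Object>> queryTable(String tableName) {
        String sql = "select * from " + tableName;
        logger.info("Running: " + sql);
        return hiveDruidTemplate.queryForList(sql);
    }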

Startup and testing

Start the project by running the main method of the HiveApplication class, then test each example endpoint in turn.
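The HiveApplication class itself is not shown in the original post. A minimal sketch of what such an entry class could look like follows; excluding DataSourceAutoConfiguration is an assumption made because the data sources are created manually in the configuration classes above, so adjust it to the actual project.

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration;

// Sketch of the application entry point; not taken from the original project.
// DataSourceAutoConfiguration is excluded here because the Hive data sources
// are defined manually above and no spring.datasource.* properties are set.
@SpringBootApplication(exclude = { DataSourceAutoConfiguration.class })
public class HiveApplication {

    public static void main(String[] args) {
        SpringApplication.run(HiveApplication.class, args);
    }
}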

Creating a Hive table

Once the project is running, visit http://localhost:8080/hive2/table/create in a browser to create a user_sample test table.

The SQL that creates the user_sample table is equivalent to the following:

 
create table user_sample
(
    user_num bigint,
    user_name string,
    user_gender string,
    user_age int
) row format delimited fields terminated by ',';

Viewing Hive tables

After the test table has been created, visit http://localhost:8080/hive/table/list to see which tables exist in the Hive database.

The response lists the tables in the database; running the show tables command in the Hive client returns the same table names (the screenshots from the original post are omitted here).

Visit http://localhost:8080/hive/table/describe?tableName=user_sample to view the column information of the user_sample table.

The response lists the table's columns, matching what the describe user_sample command shows in the Hive client.

Loading data

Next, test loading data. First create a user_sample.txt file in the /home/hadoop/ directory on the Hive server with the following content:

 
622,Lee,M,25
633,Andy,F,27
644,Chow,M,25
655,Grace,F,24
666,Lily,F,29
677,Angle,F,23

Then visit the following address in a browser to load the contents of /home/hadoop/user_sample.txt into the user_sample table.

http://localhost:8080/hive2/table/load

After the data has been loaded, visit http://localhost:8080/hive/table/select?tableName=user_sample and the six rows just loaded are returned.

Running select * from user_sample in the Hive client shows the same content as the browser.

Inserting data

Next, visit http://localhost:8080/hive2/table/insert to test inserting a single row into the user_sample table.

The Hive client prints the MapReduce execution log while the insert runs.

Visiting http://localhost:8080/hive/table/select?tableName=user_sample again now shows the newly inserted row as well.

The project source code has been uploaded to CSDN: https://download.csdn.net/download/pengjunlee/10613827
