Hive-JDBCli

Latest recommended article published on 2024-08-05 20:59:55

xrl001



Using hive-site.xml to automatically connect to HiveServer2

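As of Hive 2.2.0 (HIVE-14063), BeeLine can use the hive-site.xml present in the classpath to automatically generate a connection URL, based on the configuration properties in hive-site.xml and on an additional user configuration file. Not all URL properties can be derived from hive-site.xml, so to use this feature the user must create a configuration file named "beeline-hs2-connection.xml", which is a Hadoop XML format file. This file supplies user-specific connection properties for the connection URL. BeeLine looks for this configuration file in ${user.home}/.beeline/ (Unix-based operating systems) or in the ${user.home}\beeline\ directory (on Windows). If the file is not found in those locations, BeeLine looks for it in ${HIVE_CONF_DIR} and then in /etc/hive/conf, in that order (see HIVE-16335, which fixes this location from /etc/conf/hive in Hive 2.2.0). Once the file is found, BeeLine uses beeline-hs2-connection.xml together with the hive-site.xml in the classpath to determine the connection URL.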
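The lookup order described above can be sketched in Java. This is only an illustration under stated assumptions, not BeeLine's actual implementation: the class name `ConnectionConfigLocator` and its methods are hypothetical, and the Windows-specific directory is left out for brevity.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of Hive: mimics the documented search order.
class ConnectionConfigLocator {
    static final String FILE_NAME = "beeline-hs2-connection.xml";

    // Candidate directories in the order the text describes (Unix layout only).
    static List<Path> defaultSearchDirs() {
        List<Path> dirs = new ArrayList<>();
        dirs.add(Paths.get(System.getProperty("user.home"), ".beeline"));
        String hiveConfDir = System.getenv("HIVE_CONF_DIR");
        if (hiveConfDir != null && !hiveConfDir.isEmpty()) {
            dirs.add(Paths.get(hiveConfDir));
        }
        dirs.add(Paths.get("/etc/hive/conf"));
        return dirs;
    }

    // Returns the file from the first directory that contains it, if any.
    static Optional<Path> locate(List<Path> searchDirs) {
        for (Path dir : searchDirs) {
            Path candidate = dir.resolve(FILE_NAME);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

Calling `ConnectionConfigLocator.locate(ConnectionConfigLocator.defaultSearchDirs())` returns an empty `Optional` when none of the directories contains the file.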

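The URL connection properties in beeline-hs2-connection.xml must have the prefix "beeline.hs2.connection." followed by the URL property name. For example, to provide the property ssl, the property key in beeline-hs2-connection.xml should be "beeline.hs2.connection.ssl". The sample beeline-hs2-connection.xml below provides the user and password values for the BeeLine connection URL. In this case, the remaining properties, such as the HS2 hostname and port, Kerberos configuration properties, SSL properties, and transport mode, are picked up from the hive-site.xml in the classpath. If the password is empty, remove the beeline.hs2.connection.password property. In most cases, the configuration values below in beeline-hs2-connection.xml, together with a correct hive-site.xml in the classpath, are sufficient to connect to HiveServer2.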

 
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>beeline.hs2.connection.user</name>
    <value>hive</value>
  </property>
  <property>
    <name>beeline.hs2.connection.password</name>
    <value>hive</value>
  </property>
</configuration>
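To make the prefix rule concrete, the following Java sketch strips "beeline.hs2.connection." from property keys to obtain the URL property names. The class `ConnectionPropertyNames` is a hypothetical helper written for illustration; it does not reproduce how BeeLine parses the file.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: maps file property keys to URL property names.
class ConnectionPropertyNames {
    static final String PREFIX = "beeline.hs2.connection.";

    // Keeps only prefixed keys and strips the prefix, preserving insertion order.
    static Map<String, String> toUrlProperties(Map<String, String> fileProperties) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : fileProperties.entrySet()) {
            String key = e.getKey();
            if (key.startsWith(PREFIX) && key.length() > PREFIX.length()) {
                result.put(key.substring(PREFIX.length()), e.getValue());
            }
        }
        return result;
    }
}
```

With this sketch, the key `beeline.hs2.connection.user` maps to the URL property `user`, and keys without the prefix are ignored.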

 
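When a property is present in both beeline-hs2-connection.xml and hive-site.xml, the value derived from beeline-hs2-connection.xml takes precedence. For example, the beeline-hs2-connection.xml file below provides the principal value for a BeeLine connection in a Kerberos-enabled environment. In this case, as far as the connection URL is concerned, the value of beeline.hs2.connection.principal overrides the value of HiveConf.ConfVars.HIVE_SERVER2_KERBEROS_PRINCIPAL from hive-site.xml.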
  

 
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>beeline.hs2.connection.hosts</name>
    <value>localhost:10000</value>
  </property>
  <property>
    <name>beeline.hs2.connection.principal</name>
    <value>hive/dummy-hostname@domain.com</value>
  </property>
</configuration>
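The precedence rule can be illustrated with a small Java sketch that merges two property maps. `ConnectionPropertyMerger` is a hypothetical name, and the maps stand in for values already read from the two files; how BeeLine actually reads and merges them is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: user-file values override values derived from hive-site.xml.
class ConnectionPropertyMerger {
    static Map<String, String> merge(Map<String, String> fromHiveSite,
                                     Map<String, String> fromUserFile) {
        Map<String, String> merged = new LinkedHashMap<>(fromHiveSite);
        // Later puts replace earlier ones, so the user file takes precedence.
        merged.putAll(fromUserFile);
        return merged;
    }
}
```

Properties that appear only in the hive-site.xml map are kept unchanged, so the user file only needs to list the values it wants to override or add.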
     
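For the properties beeline.hs2.connection.hosts, beeline.hs2.connection.hiveconf and beeline.hs2.connection.hivevar, the property value is a comma-separated list of values. For example, the following beeline-hs2-connection.xml provides the hiveconf and hivevar values in comma-separated format.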

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>beeline.hs2.connection.user</name>
<value>hive</value>
</property>
<property>
<name>beeline.hs2.connection.hiveconf</name>
<value>hive.cli.print.current.db=true, hive.cli.print.header=true</value>
</property>
<property>
<name>beeline.hs2.connection.hivevar</name>
<value>testVarName1=value1, testVarName2=value2</value>
</property>
</configuration>
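As an illustration of the comma-separated format, the Java sketch below splits such a value into key/value pairs. `CommaSeparatedPairs` is a hypothetical helper, and trimming the whitespace around each entry is an assumption of this sketch rather than documented BeeLine behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: parses "k1=v1, k2=v2" style values.
class CommaSeparatedPairs {
    // Splits the value on commas and each entry on its first '=' sign.
    static Map<String, String> parse(String value) {
        Map<String, String> pairs = new LinkedHashMap<>();
        for (String entry : value.split(",")) {
            String trimmed = entry.trim(); // assumption: surrounding spaces are not significant
            if (trimmed.isEmpty()) {
                continue;
            }
            int eq = trimmed.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + trimmed);
            }
            pairs.put(trimmed.substring(0, eq).trim(), trimmed.substring(eq + 1).trim());
        }
        return pairs;
    }
}
```

Applied to the hivevar value above, `parse` yields the two entries `testVarName1` and `testVarName2` in their original order.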

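When beeline-hs2-connection.xml is present and no other arguments are provided, BeeLine automatically connects to the URL generated from the configuration files. When connection arguments (-u, -n or -p) are provided, BeeLine uses them and does not use beeline-hs2-connection.xml to connect automatically. Removing or renaming beeline-hs2-connection.xml disables this feature.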
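The rule in the previous paragraph can be written down as a small decision function. This Java sketch is a simplification for illustration only: `AutoConnectRule` is a hypothetical name, and the sketch checks only for the -u, -n and -p flags rather than modeling BeeLine's full argument handling.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: decides whether the generated URL would be used.
class AutoConnectRule {
    private static final Set<String> CONNECTION_FLAGS =
            new HashSet<>(Arrays.asList("-u", "-n", "-p"));

    static boolean shouldAutoConnect(String[] args, boolean configFileFound) {
        if (!configFileFound) {
            return false; // no beeline-hs2-connection.xml, feature disabled
        }
        for (String arg : args) {
            if (CONNECTION_FLAGS.contains(arg)) {
                return false; // explicit connection arguments win
            }
        }
        return true;
    }
}
```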

Steps for using JDBC:

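Load the HiveServer2 JDBC driver. As of 1.2.0, applications no longer need to load the JDBC driver explicitly with Class.forName().

For example:
```
Class.forName("org.apache.hive.jdbc.HiveDriver");
```
Connect to the database by creating a Connection object with the JDBC driver.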

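For example:
```
Connection cnct = DriverManager.getConnection("jdbc:hive2://<host>:<port>", "<user>", "<password>");
```
The default <port> is 10000. In non-secure configurations, specify a <user> for the query to run as. The <password> field value is ignored in non-secure mode.
```
Connection cnct = DriverManager.getConnection("jdbc:hive2://<host>:<port>", "<user>", "");
```
In Kerberos secure mode, the user information is based on the Kerberos credentials.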

Submit SQL to the database by creating a Statement object and using its executeQuery() method.

For example:

Statement stmt = cnct.createStatement();
ResultSet rset = stmt.executeQuery("SELECT foo FROM bar");

Process the result set, if necessary.
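The steps above can be combined into one short Java sketch that also closes the connection, statement and result set with try-with-resources. The class `HiveQuerySketch` and its method names are hypothetical, and the empty password assumes a non-secure configuration as described above. Running `firstColumn` requires a reachable HiveServer2 and the hive-jdbc driver on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that walks through connect, query and result processing.
class HiveQuerySketch {
    // Builds a URL of the form jdbc:hive2://<host>:<port>/<database>.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Runs a query and returns the first column of every row as strings.
    static List<String> firstColumn(String url, String user, String sql) throws SQLException {
        List<String> values = new ArrayList<>();
        // Resources are closed in reverse order when the block exits.
        try (Connection cnct = DriverManager.getConnection(url, user, "");
             Statement stmt = cnct.createStatement();
             ResultSet rset = stmt.executeQuery(sql)) {
            while (rset.next()) {
                values.add(rset.getString(1));
            }
        }
        return values;
    }
}
```

A call might look like `HiveQuerySketch.firstColumn(HiveQuerySketch.jdbcUrl("localhost", 10000, "default"), "hive", "SELECT foo FROM bar")`, where the host, user and table are placeholders.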

Required JARs

If you use Maven, add the following dependencies:

 
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-jdbc</artifactId>
  <version>0.11.0</version>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>2.2.0</version>
</dependency>

 
import java.sql.SQLException;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.sql.DriverManager;

public class HiveJdbcClient {
  private static String driverName = "org.apache.hive.jdbc.HiveDriver";

  /**
   * @param args
   * @throws SQLException
   */
  public static void main(String[] args) throws SQLException {
    try {
      Class.forName(driverName);
    } catch (ClassNotFoundException e) {
      // TODO Auto-generated catch block
      e.printStackTrace();
      System.exit(1);
    }
    // replace "hive" here with the name of the user the queries should run as
    Connection con = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "hive", "");
    Statement stmt = con.createStatement();
    String tableName = "testHiveDriverTable";
    stmt.execute("drop table if exists " + tableName);
    stmt.execute("create table " + tableName + " (key int, value string)");

    // show tables
    String sql = "show tables '" + tableName + "'";
    System.out.println("Running: " + sql);
    ResultSet res = stmt.executeQuery(sql);
    if (res.next()) {
      System.out.println(res.getString(1));
    }

    // describe table
    sql = "describe " + tableName;
    System.out.println("Running: " + sql);
    res = stmt.executeQuery(sql);
    while (res.next()) {
      System.out.println(res.getString(1) + "\t" + res.getString(2));
    }

    // load data into table
    // NOTE: filepath has to be local to the hive server
    // NOTE: /tmp/a.txt is a ctrl-A separated file with two fields per line
    String filepath = "/tmp/a.txt";
    sql = "load data local inpath '" + filepath + "' into table " + tableName;
    System.out.println("Running: " + sql);
    stmt.execute(sql);

    // select * query
    sql = "select * from " + tableName;
    System.out.println("Running: " + sql);
    res = stmt.executeQuery(sql);
    while (res.next()) {
      System.out.println(String.valueOf(res.getInt(1)) + "\t" + res.getString(2));
    }

    // regular hive query
    sql = "select count(1) from " + tableName;
    System.out.println("Running: " + sql);
    res = stmt.executeQuery(sql);
    while (res.next()) {
      System.out.println(res.getString(1));
    }
  }
}

Running the JDBC sample code

 
# Then on the command-line
$ javac HiveJdbcClient.java

# To run the program using remote hiveserver in non-kerberos mode, we need the following jars in the classpath
# from hive/build/dist/lib
#     hive-jdbc*.jar
#     hive-service*.jar
#     libfb303-0.9.0.jar
#     libthrift-0.9.0.jar
#     log4j-1.2.16.jar
#     slf4j-api-1.6.1.jar
#     slf4j-log4j12-1.6.1.jar
#     commons-logging-1.0.4.jar
#
#
# To run the program using kerberos secure mode, we need the following jars in the classpath
#     hive-exec*.jar
#     commons-configuration-1.6.jar (This is not needed with Hadoop 2.6.x and later).
#  and from hadoop
#     hadoop-core*.jar (use hadoop-common*.jar for Hadoop 2.x)
#
# To run the program in embedded mode, we need the following additional jars in the classpath
# from hive/build/dist/lib
#     hive-exec*.jar
#     hive-metastore*.jar
#     antlr-runtime-3.0.1.jar
#     derby.jar
#     jdo2-api-2.1.jar
#     jpox-core-1.2.2.jar
#     jpox-rdbms-1.2.2.jar
# and from hadoop/build
#     hadoop-core*.jar
# as well as hive/build/dist/conf, any HIVE_AUX_JARS_PATH set,
# and hadoop jars necessary to run MR jobs (eg lzo codec)

$ java -cp $CLASSPATH HiveJdbcClient

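Alternatively, you can run the following bash script, which seeds the data file and builds the classpath before invoking the client. The script also adds all the additional jars needed to use HiveServer2 in embedded mode.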

 
#!/bin/bash
HADOOP_HOME=/your/path/to/hadoop
HIVE_HOME=/your/path/to/hive

echo -e '1\x01foo' > /tmp/a.txt
echo -e '2\x01bar' >> /tmp/a.txt

HADOOP_CORE=$(ls $HADOOP_HOME/hadoop-core*.jar)
CLASSPATH=.:$HIVE_HOME/conf:$(hadoop classpath)

for i in ${HIVE_HOME}/lib/*.jar ; do
    CLASSPATH=$CLASSPATH:$i
done

java -cp $CLASSPATH HiveJdbcClient
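If a shell is not available, the same ctrl-A separated data file can be produced from Java. The sketch below is an alternative to the two echo commands, not part of the original script; `SeedDataWriter` is a hypothetical name, and the target path is passed in by the caller.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: writes the two sample rows used by HiveJdbcClient.
class SeedDataWriter {
    // Ctrl-A (0x01) is the field separator used in the sample data file.
    static final char CTRL_A = '\u0001';

    static void write(Path target) throws IOException {
        List<String> lines = Arrays.asList(
                "1" + CTRL_A + "foo",
                "2" + CTRL_A + "bar");
        // Creates the file or overwrites it, one row per line.
        Files.write(target, lines, StandardCharsets.UTF_8);
    }
}
```

Calling `SeedDataWriter.write(java.nio.file.Paths.get("/tmp/a.txt"))` writes the file at the path that the sample client loads.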