Hive Commands Explained

This article describes Hive's commands in detail, including commands executed inside the Hive environment such as quit, exit, set, add FILE, and list FILE, as well as commands executed from the operating system such as --help, --debug, beeline, and cleardanglingscratchdir. Together they cover configuration management, resource addition and removal, query execution, service startup, and more.

Hive commands fall into two groups: commands that can be executed inside the Hive interactive environment, and commands that Hive provides for execution from the operating system.

1. Commands Executed Inside the Hive Environment

quit, exit
    Use quit or exit to leave the interactive shell.

reset
    Reset configuration parameters to their default values. Parameters set with the set command or with -hiveconf on the hive command line are restored to their defaults. For historical reasons, parameters set as set hiveconf:<key>=<value> are not reset.

set <key>=<value>
    Set the value of a configuration parameter. If the key is misspelled, no error is reported.

set
    Print the configuration set by the user and by Hive.

set -v
    Print all Hadoop and Hive configuration.

add FILE[S] <filepath> <filepath>*
add JAR[S] <filepath> <filepath>*
add ARCHIVE[S] <filepath> <filepath>*
    Add one or more files, jars, or archives to the distributed cache. add FILES can add all files under a directory at once by specifying just the directory name; add JARS and add ARCHIVES behave the same way.

add FILE[S] <ivyurl> <ivyurl>*
add JAR[S] <ivyurl> <ivyurl>*
add ARCHIVE[S] <ivyurl> <ivyurl>*
    Same as above, but with resources given as Ivy URLs of the form ivy://group:module:version?query_string. Requires internet access from the server.

list FILE[S]
list JAR[S]
list ARCHIVE[S]
    List the resources that have been added; useful for checking whether a resource is already in the distributed cache.

delete FILE[S] <filepath>*
delete JAR[S] <filepath>*
delete ARCHIVE[S] <filepath>*
    Remove resources from the distributed cache.

delete FILE[S] <ivyurl> <ivyurl>*
delete JAR[S] <ivyurl> <ivyurl>*
delete ARCHIVE[S] <ivyurl> <ivyurl>*
    Remove resources that were added as <ivyurl>.

! <command>
    Execute a shell command from the Hive shell.

dfs <dfs command>
    Execute a dfs command from the Hive shell.

<query string>
    Execute a Hive query and print the results to standard output.

source FILE <filepath>
    Execute a script file inside the CLI.

compile <groovy string> AS GROOVY NAMED <name>
    Compile inline Groovy code as a UDF. Example:

compile `import org.apache.hadoop.hive.ql.exec.UDF \;
public class Madd extends UDF {
  public double evaluate(double a, double b){
    return a+b \;
  }
} ` AS GROOVY NAMED Madd.groovy;
CREATE TEMPORARY FUNCTION Madd as 'Madd';

SELECT Madd(3,4);

DROP TEMPORARY FUNCTION Madd;
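
Putting several of these commands together, a minimal interactive session looks like the following (the jar path and script name are hypothetical):

hive> set hive.cli.print.header=true;
hive> add jar /tmp/my-udfs.jar;
hive> list jars;
hive> !pwd;
hive> dfs -ls /tmp;
hive> source /tmp/init.sql;
hive> delete jar /tmp/my-udfs.jar;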

2. Commands Hive Provides for Execution from the Operating System

2.1 --help

Running hive --help directly prints the following:

[houzhizhen@localhost ~]$ hive --help
Usage ./hive <parameters> --service serviceName <service parameters>
Service List: beeline cleardanglingscratchdir cli fixacidkeyindex help hiveburninclient hiveserver2 hplsql jar lineage llapdump llap llapstatus metastore metatool orcfiledump rcfilecat schemaTool strictmanagedmigration tokentool version 
Parameters parsed:
  --auxpath : Auxiliary jars 
  --config : Hive configuration directory
  --service : Starts specific service/component. cli is default
Parameters used:
  HADOOP_HOME or HADOOP_PREFIX : Hadoop install directory
  HIVE_OPT : Hive options
For help on a particular service:
  ./hive --service serviceName --help
Debug help:  ./hive --debug --help

--help can be appended to any service to show that service's help. For example:

hive --service cli --help            
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/houzhizhen/software/hive/apache-hive-3.1.3-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/houzhizhen/software/hadoop/hadoop-3.2.3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = d644b5ac-8244-4177-a01e-6f8c69ade60f
usage: hive
 -d,--define <key=value>          Variable substitution to apply to Hive
                                  commands. e.g. -d A=B or --define A=B
    --database <databasename>     Specify the database to use
 -e <quoted-query-string>         SQL from command line
 -f <filename>                    SQL from files
 -H,--help                        Print help information
    --hiveconf <property=value>   Use value for given property
    --hivevar <key=value>         Variable substitution to apply to Hive
                                  commands. e.g. --hivevar A=B
 -i <filename>                    Initialization SQL file
 -S,--silent                      Silent mode in interactive shell
 -v,--verbose                     Verbose mode (echo executed SQL to the
                                  console)

2.2 hive --debug

hive --debug --help shows the available debug parameters.

[houzhizhen@localhost ~]$ hive --debug --help

Allows to debug Hive by connecting to it via JDI API

Usage: hive --debug[:comma-separated parameters list]

Parameters:

recursive=<y|n>             Should child JVMs also be started in debug mode. Default: y
port=<port_number>          Port on which main JVM listens for debug connection. Default: 8000
mainSuspend=<y|n>           Should main JVM wait with execution for the debugger to connect. Default: y
childSuspend=<y|n>          Should child JVMs wait with execution for the debugger to connect. Default: n
swapSuspend                 Swaps suspend options between main and child JVMs

--debug can be appended to any service. For example, hive --service beeline --debug starts beeline in debug mode; as shown below, the JVM then waits for a remote debugger to connect.

[houzhizhen@localhost ~]$ hive --service beeline  --debug
Listening for transport dt_socket at address: 8000
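
Per the usage shown above, the debug parameters can be supplied inline after --debug, and any JDI-capable debugger can then attach to the listening port (jdb is used here as one option; the port value is arbitrary):

hive --debug:port=8010,mainSuspend=y,childSuspend=n
jdb -attach 8010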

2.3 beeline

beeline can act as a JDBC client connecting to a remote HiveServer2 or to other databases.
Running the beeline command directly has the same effect as hive --service beeline. In fact, the beeline script contains the following:

#!/usr/bin/env bash

# (copyright header omitted)
bin=`dirname "$0"`
bin=`cd "$bin"; pwd`

. "$bin"/hive --service beeline "$@"

2.4 cleardanglingscratchdir

hive --service cleardanglingscratchdir

The class executed is org.apache.hadoop.hive.ql.session.ClearDanglingScratchDir. The comment below, taken from that class, explains how it works.

/**
 * A tool to remove dangling scratch directory. A scratch directory could be left behind
 * in some cases, such as when vm restarts and leave no chance for Hive to run shutdown hook.
 * The tool will test a scratch directory is use, if not, remove it.
 * We rely on HDFS write lock for to detect if a scratch directory is in use:
 * 1. A HDFS client open HDFS file ($scratchdir/inuse.lck) for write and only close
 *    it at the time the session is closed
 * 2. cleardanglingscratchDir can try to open $scratchdir/inuse.lck for write. If the
 *    corresponding HiveCli/HiveServer2 is still running, we will get exception.
 *    Otherwise, we know the session is dead
 * 3. If the HiveCli/HiveServer2 dies without closing the HDFS file, NN will reclaim the
 *    lease after 10 min, ie, the HDFS file hold by the dead HiveCli/HiveServer2 is writable
 *    again after 10 min. Once it become writable, cleardanglingscratchDir will be able to
 *    remove it
 */
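
Judging from the option parser of this class in Hive 3.x (an assumption; verify against your build with --help), the tool also accepts -r for a dry run, -s <scratchdir> to point at a non-default scratch directory, and -v for verbose output:

hive --service cleardanglingscratchdir -r -s /tmp/hive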

2.5 cli

Running hive --service cli gives the same result as running hive directly: when the hive command is given no service, the default service is cli.

Its help output is identical to the hive --service cli --help output shown in section 2.1 above.

2.6 fixacidkeyindex

It can check one or more ORC files or directories.

hive --service fixacidkeyindex --help
usage ./hive --service fixacidkeyindex [-h] --check-only|--recover [--backup-path <new-path>] <path_to_orc_file_or_directory>

  --check-only                Check acid orc file for valid acid key index and exit without fixing
  --recover                   Fix the acid key index for acid orc file if it requires fixing
  --backup-path <new_path>  Specify a backup path to store the corrupted files (default: /tmp)
  --help (-h)                 Print help message

The class executed is org.apache.hadoop.hive.ql.io.orc.FixAcidKeyIndex.

/**
 * Utility to check and fix the ACID key index of an ORC file if it has been written incorrectly
 * due to HIVE-18817.
 * The condition that will be checked in the ORC file will be if the number of stripes in the
 * acid key index matches the number of stripes in the ORC StripeInformation.
 */
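
Following the usage shown above, a check-only pass and a recovery pass over a hypothetical table directory look like this:

hive --service fixacidkeyindex --check-only /warehouse/tablespace/managed/hive/t_acid
hive --service fixacidkeyindex --recover --backup-path /tmp/orc-backup /warehouse/tablespace/managed/hive/t_acid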

2.7 hiveburninclient

Runs a burn-in test client that executes test queries for a specified number of loops.

2.8 hiveserver2

hive --service hiveserver2 --help
usage: hiveserver2
    --deregister <versionNumber>   Deregister all instances of given
                                   version from dynamic service discovery
    --failover <workerIdentity>    Manually failover Active HS2 instance
                                   to passive standby mode
 -H,--help                         Print help information
    --hiveconf <property=value>    Use value for given property
    --listHAPeers                  List all HS2 instances when running in
                                   Active Passive HA mode
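
For example, to start HiveServer2 in the background on a non-default Thrift port (hive.server2.thrift.port is the standard property; 10000 is its default):

hive --service hiveserver2 --hiveconf hive.server2.thrift.port=10010 &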

2.9 hplsql

hplsql runs HPL/SQL procedural SQL scripts. Create a script a.hql with the following content:

CREATE FUNCTION hello(text STRING)
 RETURNS STRING
BEGIN
 RETURN 'Hello, ' || text || '!';
END;
PRINT hello('world')

Then run it:

hplsql -f a.hql
Hello, world!

2.10 jar

Runs a user application that needs the Hadoop and Hive classpath:

./hive --service jar <yourjar> <yourclass> HIVE_OPTS <your_args>
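
A hypothetical invocation, where myapp.jar and com.example.Main are placeholders for your own jar and main class:

hive --service jar /tmp/myapp.jar com.example.Main arg1 arg2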

2.11 lineage

Prints the input tables of a given query:

 hive --service lineage 'select sr_customer_sk as ctr_customer_sk,sr_store_sk as ctr_store_sk,sum(SR_FEE) as ctr_total_return from tpcds_hdfs_orc_3.store_returns,tpcds_hdfs_orc_3.date_dim where sr_returned_date_sk = d_date_sk and d_year =2000 group by sr_customer_sk ,sr_store_sk'
InputTable=tpcds_hdfs_orc_3.date_dim
InputTable=tpcds_hdfs_orc_3.store_returns

2.12 llapdump

A debugging utility that reads data from a running LLAP daemon via the LLAP input format.

2.13 llap

Generates the LLAP launch package and starts the LLAP daemon cluster on YARN.

2.14 llapstatus

Reports the status of the running LLAP cluster.

2.15 metastore

hive --service metastore --help
usage: hivemetastore
 -h,--help                        Print help information
    --hiveconf <property=value>   Use value for given property
 -p <port>                        Hive Metastore port number, default:9083
 -v,--verbose                     Verbose mode
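
For example, to start the metastore in the background on its default port (-p is documented above):

hive --service metastore -p 9083 &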

2.16 metatool

Running metatool with no arguments prints its usage:

metatool
Initializing HiveMetaTool..
HiveMetaTool:Parsing failed.  Reason: Invalid arguments: 
usage: metatool
 -dryRun                                  Perform a dry run of
                                          updateLocation changes.When run
                                          with the dryRun option
                                          updateLocation changes are
                                          displayed but not persisted.
                                          dryRun is valid only with the
                                          updateLocation option.
 -executeJDOQL <query-string>             execute the given JDOQL query
 -help                                    print this message
 -listFSRoot                              print the current FS root
                                          locations
 -prepareAcidUpgrade <find-compactions>   Generates a set Compaction
                                          commands to run to prepare for
                                          Hive 2.x to 3.0 upgrade
 -serdePropKey <serde-prop-key>           Specify the key for serde
                                          property to be updated.
                                          serdePropKey option is valid
                                          only with updateLocation option.
 -tablePropKey <table-prop-key>           Specify the key for table
                                          property to be updated.
                                          tablePropKey option is valid
                                          only with updateLocation option.
 -updateLocation <new-loc> <old-loc>      Update FS root location in the
                                          metastore to new location.Both
                                          new-loc and old-loc should be
                                          valid URIs with valid host names
                                          and schemes.When run with the
                                          dryRun option changes are
                                          displayed but are not persisted.
                                          When run with the
                                          serdepropKey/tablePropKey option
                                          updateLocation looks for the
                                          serde-prop-key/table-prop-key
                                          that is specified and updates
                                          its value if found.
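
The two most common operations, using the options documented above (the NameNode URIs are placeholders):

hive --service metatool -listFSRoot
hive --service metatool -updateLocation hdfs://new-nn:8020 hdfs://old-nn:8020 -dryRun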

2.17 orcfiledump

Prints the metadata and, optionally, the contents of ORC files.
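
For example (the path is a placeholder; -d additionally dumps the row data):

hive --service orcfiledump /path/to/file.orc
hive --service orcfiledump -d /path/to/file.orc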

2.18 rcfilecat

Prints the contents of RCFile files to standard output.

2.19 schemaTool

Initializes, upgrades, and inspects the Hive metastore schema in the backing database.
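
Typical invocations (schematool also ships as a standalone script; -dbType, -info, and -initSchema are its standard options):

schematool -dbType mysql -info
schematool -dbType mysql -initSchema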

2.20 strictmanagedmigration

Migrates existing tables to Hive 3's strict managed-table semantics, for example converting non-transactional managed tables to external tables.

2.21 tokentool

Manages (lists and deletes) delegation tokens in the metastore token store.

2.22 version

Prints the Hive version.
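
For example:

hive --service version
hive --version            # shorthand for the same service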
