[Hive] Basic usage of the CLI

Latest recommended article published on 2022-12-23 20:21:50

ciedecem

Category: Hadoop

Original Link: http://archive.cloudera.com/cdh/3/hive/language_manual/cli.html

$hive cli --help

usage: hive
 -d,--define <key=value>          Variable substitution to apply to hive
                                  commands. e.g. -d A=B or --define A=B
    --database <databasename>     Specify the database to use
 -e <quoted-query-string>         SQL from command line
 -f <filename>                    SQL from files
 -H,--help                        Print help information
 -h <hostname>                    connecting to Hive Server on remote host
    --hiveconf <property=value>   Use value for given property
    --hivevar <key=value>         Variable substitution to apply to hive
                                  commands. e.g. --hivevar A=B
 -i <filename>                    Initialization SQL file
 -p <port>                        connecting to Hive Server on port number
 -S,--silent                      Silent mode in interactive shell
 -v,--verbose                     Verbose mode (echo executed SQL to the
                                  console)

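Example of running a query from the command line while setting Hive configuration variables

$HIVE_HOME/bin/hive -e 'select a.col from tab1 a' --hiveconf hive.exec.scratchdir=/home/my/hive_scratch --hiveconf mapred.reduce.tasks=32

Example of dumping the results of a query into a file using silent mode

$HIVE_HOME/bin/hive -S -e 'select a.col from tab1 a' > a.txt

Example of running a script non-interactively

$HIVE_HOME/bin/hive -f /home/my/hive-script.sql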
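The --hivevar option from the usage listing can be combined with -f to parameterize a script. The following is a minimal sketch, not a definitive recipe: the script path, the ds column on tab1, and the date value are assumptions made for illustration, and the hive command is only invoked if it is found on the PATH.

```shell
# Hypothetical working directory and script name, for illustration only.
workdir="$(mktemp -d)"
script="$workdir/hive_cli_demo.sql"

# The quoted heredoc delimiter keeps ${hivevar:ds} literal, so that Hive
# (not the shell) performs the variable substitution.
# Assumption: tab1 has a partition column named ds.
cat > "$script" <<'EOF'
select a.col from tab1 a where a.ds = '${hivevar:ds}';
EOF

# Run the script in silent mode only if a hive binary is available.
if command -v hive >/dev/null 2>&1; then
  hive -S --hivevar ds=2009-01-04 -f "$script" > "$workdir/a.txt"
else
  echo "hive not found; would run: hive -S --hivevar ds=2009-01-04 -f $script"
fi
```

Keeping the value outside the script means the same file can be reused for different dates by changing only the command line.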

Example of running an initialization script before entering interactive mode

$HIVE_HOME/bin/hive -i /home/my/hiverc

hiverc file

When invoked without the -i option, the CLI attempts to load $HIVE_HOME/bin/.hiverc and $HOME/.hiverc as initialization files.
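As a rough sketch of what an initialization file might contain, the snippet below writes two set commands to a temporary file and passes it with -i. The file location is hypothetical (a temporary directory is used so that an existing $HOME/.hiverc is not overwritten), and the two settings are simply the ones used in the earlier command-line example.

```shell
# Hypothetical location; a real setup would typically use $HOME/.hiverc.
workdir="$(mktemp -d)"
rc="$workdir/hiverc"

# Each line is an ordinary CLI command terminated by a semicolon.
cat > "$rc" <<'EOF'
set mapred.reduce.tasks=32;
set hive.exec.scratchdir=/home/my/hive_scratch;
EOF

# Load the file with -i, then print one setting back to confirm it applied.
if command -v hive >/dev/null 2>&1; then
  hive -i "$rc" -S -e 'set mapred.reduce.tasks;'
else
  echo "hive not found; would run: hive -i $rc"
fi
```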

Hive interactive shell commands

Command	Description
quit	Use quit or exit to leave the interactive shell.
set key=value	Sets the value of a particular configuration variable. Note that if you misspell the variable name, the CLI will not show an error.
set	Prints the list of configuration variables that are overridden by the user or by Hive.
set -v	Prints all Hadoop and Hive configuration variables.
add FILE [file] [file]*	Adds one or more files to the list of resources.
list FILE	Lists all the files added to the distributed cache.
list FILE [file]*	Checks whether the given resources have already been added to the distributed cache.
! [cmd]	Executes a shell command from the Hive shell.
dfs [dfs cmd]	Executes a dfs command from the Hive shell.
[query]	Executes a Hive query and prints the results to standard output.
source FILE	Executes a script file inside the CLI.

    hive> set mapred.reduce.tasks=32;
    hive> set;
    hive> select a.* from tab1;
    hive> !ls;
    hive> dfs -ls;
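Because a misspelled variable name in set key=value produces no error, one way to guard against typos is to print the variable back right after setting it. The sketch below builds such a statement pair; check_setting is a hypothetical helper name, and it relies on the assumption that set followed by a bare key echoes that key's current value.

```shell
# check_setting is a hypothetical helper: it emits a "set key=value"
# statement followed by "set key", which prints the value back.
check_setting() {
  printf 'set %s=%s; set %s;' "$1" "$2" "$1"
}

stmt="$(check_setting mapred.reduce.tasks 32)"

# Run the statements if hive is available; otherwise just show them.
if command -v hive >/dev/null 2>&1; then
  hive -S -e "$stmt"
else
  echo "$stmt"
fi
```

If the printed key does not match the one you meant to set, the name was probably misspelled.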
  

Hive Resources

Hive can manage resources that must be available to a session at query execution time. Any locally accessible file can be added to the session. Once a file is added, a Hive query can refer to it by name (in map/reduce/transform clauses), and the file is available locally at execution time across the entire Hadoop cluster. Hive uses Hadoop's Distributed Cache to distribute the added files to all the machines in the cluster at query execution time.

    ADD { FILE[S] | JAR[S] | ARCHIVE[S] } <filepath1> [<filepath2>]*
    LIST { FILE[S] | JAR[S] | ARCHIVE[S] } [<filepath1> <filepath2> ..]
    DELETE { FILE[S] | JAR[S] | ARCHIVE[S] } [<filepath1> <filepath2> ..]
  

FILE resources are just added to the distributed cache. Typically, this might be something like a transform script to be executed.
JAR resources are also added to the Java classpath. This is required in order to reference the objects they contain, such as UDFs.
ARCHIVE resources are automatically unarchived as part of distributing them.

Example

    hive> add FILE /tmp/tt.py;
    hive> list FILES;
    /tmp/tt.py
    hive> from networks a MAP a.networkid USING 'python tt.py' as nn where a.ds = '2009-01-04' limit 10;
  

It is not necessary to add files to the session if the files used in a transform script are already available on all machines in the Hadoop cluster under the same path name. For example:

... MAP a.networkid USING 'wc -l' ...: here wc is an executable available on all machines
... MAP a.networkid USING '/home/nfsserv1/hadoopscripts/tt.py' ...: here tt.py may be accessible via an NFS mount point that is configured identically on all the cluster nodes.
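The contents of tt.py are not shown in the source. As an illustration of the general shape of a transform script, here is a hypothetical shell stand-in (tt.sh, with an invented net_ prefix) that reads rows from standard input, one per line with tab-separated columns, and writes one output value per line. The last lines dry-run it locally with sample input, without Hive.

```shell
# Hypothetical stand-in for a transform script such as tt.py.
workdir="$(mktemp -d)"
cat > "$workdir/tt.sh" <<'EOF'
#!/bin/sh
# Rows arrive on stdin, one per line, with columns separated by tabs.
# Emit the first column with an illustrative prefix, one value per line.
while IFS="$(printf '\t')" read -r networkid rest; do
  printf 'net_%s\n' "$networkid"
done
EOF
chmod +x "$workdir/tt.sh"

# Local dry run with two sample rows.
out="$(printf '1\n2\n' | "$workdir/tt.sh")"
echo "$out"

# In a Hive session the script could then be used along these lines:
#   add FILE <path to>/tt.sh;
#   from networks a MAP a.networkid USING 'sh tt.sh' as nn where a.ds = '2009-01-04' limit 10;
```

Testing the script with piped input first makes it easier to separate script bugs from cluster or distribution problems.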