1. Extract apache-hive-0.13.1-bin.tar.gz and rename the directory
tar -zxf apache-hive-0.13.1-bin.tar.gz -C /opt/modules/
mv apache-hive-0.13.1-bin/ hive-0.13.1
2. Create the HDFS directories /user/hive/warehouse and /tmp
/opt/modules/hadoop-2.5.0/bin/hdfs dfs -mkdir -p /user/hive/warehouse
/opt/modules/hadoop-2.5.0/bin/hdfs dfs -mkdir /tmp
/opt/modules/hadoop-2.5.0/bin/hdfs dfs -chmod g+w /tmp
/opt/modules/hadoop-2.5.0/bin/hdfs dfs -chmod g+w /user/hive/warehouse
3. Start HDFS and YARN
/opt/modules/hadoop-2.5.0/sbin/hadoop-daemon.sh start namenode
/opt/modules/hadoop-2.5.0/sbin/hadoop-daemon.sh start datanode
/opt/modules/hadoop-2.5.0/sbin/yarn-daemon.sh start resourcemanager
/opt/modules/hadoop-2.5.0/sbin/yarn-daemon.sh start nodemanager
/opt/modules/hadoop-2.5.0/sbin/mr-jobhistory-daemon.sh start historyserver
4. Edit hive-env.sh to point at the Hadoop installation directory
#Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/opt/modules/hadoop-2.5.0
#Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/opt/modules/hive-0.13.1/conf
5. Use MySQL as the default metastore database
The default Derby database only supports a single running instance, so we use MySQL as the metastore database instead.
5.1 Copy the default configuration file and rename the copy to hive-site.xml
cp hive-default.xml.template hive-site.xml
Then keep only an empty <configuration></configuration> node.
5.2 Install MySQL
Check whether MySQL has already been installed:
rpm -qa | grep mysql
If it is installed, there is no need to install it again. To remove it and reinstall, run:
yum remove mysql*
Or uninstall via rpm:
sudo rpm -e --nodeps mysql-community-server-5.6.33-2.el6.x86_64
sudo rpm -e --nodeps mysql-community-libs-5.6.33-2.el6.x86_64
sudo rpm -e --nodeps mysql-community-client-5.6.33-2.el6.x86_64
sudo rpm -e --nodeps mysql-community-release-el6-5.noarch
Install the MySQL yum repository package:
sudo yum localinstall mysql-community-release-el6-5.noarch.rpm
Install MySQL:
sudo yum install mysql-server
Start the MySQL service:
sudo /etc/init.d/mysqld start
or: sudo service mysqld start
Check the service status:
sudo /etc/init.d/mysqld status
or: sudo service mysqld status
Enable mysqld at boot:
sudo chkconfig mysqld on
Verify that the boot-startup configuration succeeded:
sudo chkconfig --list | grep mysql
If runlevels 2, 3, 4, and 5 are all "on", the configuration succeeded.
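The runlevel check above can also be scripted. Here is an illustrative sketch (not part of the MySQL packages; the helper name and sample line are made up) that parses one line of `chkconfig --list` output and confirms runlevels 2-5 are on:

```python
# Hypothetical helper: parse one line of `chkconfig --list` output and
# verify that runlevels 2 through 5 are all "on".
def runlevels_on(chkconfig_line):
    # Fields after the service name look like "0:off", "1:off", "2:on", ...
    states = dict(f.split(":") for f in chkconfig_line.split()[1:])
    return all(states.get(str(lvl)) == "on" for lvl in (2, 3, 4, 5))

sample = "mysqld  0:off 1:off 2:on 3:on 4:on 5:on 6:off"
print(runlevels_on(sample))  # True
```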
Set the root password:
mysqladmin -u root password '123456'
Log in to MySQL:
mysql -uroot -p
Grant other hosts access to any table of any database on this host:
grant all privileges on *.* to 'root'@'%' identified by '123456' with grant option;
*.*: any database, any table
'root'@'%': % stands for any host, i.e. the root user connecting from any other host
If you want to delete other root users or leftover accounts, you can remove them, for example:
delete from user where user='' and host='localhost';
flush privileges;
5.3 Configure hive-site.xml
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://hadoop09-linux:3306/metastore?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>
</property>
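As a quick sanity check on the file you just assembled, a standalone sketch like the following (illustrative only, not part of Hive; in practice you would read /opt/modules/hive-0.13.1/conf/hive-site.xml instead of the inline sample) can confirm all four metastore properties are present:

```python
# Sketch: verify a hive-site.xml fragment contains the four JDBC metastore properties.
import xml.etree.ElementTree as ET

REQUIRED = {
    "javax.jdo.option.ConnectionURL",
    "javax.jdo.option.ConnectionDriverName",
    "javax.jdo.option.ConnectionUserName",
    "javax.jdo.option.ConnectionPassword",
}

def missing_properties(xml_text):
    # Collect the <name> of every <property> and report which required keys are absent.
    root = ET.fromstring(xml_text)
    present = {p.findtext("name") for p in root.iter("property")}
    return REQUIRED - present

sample = """<configuration>
  <property><name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop09-linux:3306/metastore?createDatabaseIfNotExist=true</value></property>
  <property><name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value></property>
  <property><name>javax.jdo.option.ConnectionUserName</name><value>root</value></property>
  <property><name>javax.jdo.option.ConnectionPassword</name><value>123456</value></property>
</configuration>"""
print(missing_properties(sample))  # set() -> nothing missing
```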
5.4 Copy the MySQL driver jar into /opt/modules/hive-0.13.1/lib
cd /opt/modules/hive-0.13.1/lib/
cp /opt/software/hive/mysql-connector-java-5.1.27-bin.jar ./
5.5 At this point you can check in MySQL: a new metastore database has appeared.
You can inspect the metadata tables there.
6. Reconfigure the log file location
By default the log files go to a temporary directory, i.e. /tmp/hadoop.
We want them under a logs/ folder in the installation directory instead:
6.1 Copy hive-log4j.properties.template to hive-log4j.properties
cp hive-log4j.properties.template hive-log4j.properties
6.2 Open hive-log4j.properties and change the log file location
hive.log.dir=/opt/modules/hive-0.13.1/logs
7. Configure the Hive shell to show which database you are currently in
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
<description>Whether to include the current database in the Hive prompt.</description>
</property>
8. Configure queries to show column names in the output
<property>
<name>hive.cli.print.header</name>
<value>true</value>
<description>Whether to print the names of the columns in query output.</description>
</property>
9. Quick tests of DDL and DML statements
9.1 Databases
9.1.1 Create a database
CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
[COMMENT database_comment]
[LOCATION hdfs_path]
[WITH DBPROPERTIES (property_name=property_value, ...)];
For example:
CREATE DATABASE IF NOT EXISTS ext;
If you want the new database stored at a specific HDFS location:
CREATE DATABASE IF NOT EXISTS ext
LOCATION '/local';
9.1.2 Drop a database
DROP (DATABASE|SCHEMA) [IF EXISTS] database_name [RESTRICT|CASCADE];
For example:
DROP DATABASE IF EXISTS ext;
9.1.3 Alter a database
ALTER (DATABASE|SCHEMA) database_name SET OWNER [USER|ROLE] user_or_role;
Hive 0.13 can apparently only change the owner; 0.14 can also change some properties, like this:
ALTER (DATABASE|SCHEMA) database_name SET DBPROPERTIES (property_name=property_value, ...);
9.1.4 Use a database
USE database_name;
USE ext;
9.2 Tables
9.2.1 Create a table
9.2.1.1 The normal way
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
[(col_name data_type [COMMENT col_comment], ...)]
[COMMENT table_comment]
[PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
[CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC],...)] INTO num_buckets BUCKETS]
[SKEWED BY (col_name, col_name, ...)]
ON ((col_value, col_value, ...), (col_value, col_value, ...), ...)
[STORED AS DIRECTORIES]
[
[ROW FORMAT row_format]
[STORED AS file_format] | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]
]
[LOCATION hdfs_path]
[TBLPROPERTIES (property_name=property_value, ...)]
[AS select_statement];
-- (Note: TEMPORARY available in Hive 0.14.0 and later)
-- (Note: SKEWED BY available in Hive 0.10.0 and later)
-- (Note: AS available in Hive 0.5.0 and later; not supported for external tables)
For example:
CREATE TABLE emp(
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int
)
ROW FORMAT delimited fields terminated by '\t';
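The emp table above expects one tab-separated row per record, matching the '\t' field delimiter. A tiny generator for sample data (illustrative only; the rows are made up) looks like:

```python
# Write a few made-up rows in the emp table's column order:
# empno, ename, job, mgr, hiredate, sal, comm, deptno
rows = [
    (7369, "SMITH", "CLERK", 7902, "1980-12-17", 800.0, 0.0, 20),
    (7499, "ALLEN", "SALESMAN", 7698, "1981-02-20", 1600.0, 300.0, 30),
]
with open("emp.tsv", "w") as f:
    for r in rows:
        f.write("\t".join(str(v) for v in r) + "\n")
```

The resulting emp.tsv can then be loaded with load data local inpath, as shown in section 9.2.6.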
9.2.1.2 Create from a query (CTAS)
create table if not exists stu_as as select name from stu_info;
-- copies both the table structure and the table data
9.2.1.3 Create with LIKE
create table if not exists stu_like like stu_info;
-- copies only the table structure, not the data
9.2.2 Alter a table
Rename:
ALTER TABLE table_name RENAME TO new_table_name;
ALTER TABLE ext RENAME TO ext1;
Change table properties:
ALTER TABLE table_name SET TBLPROPERTIES table_properties;
table_properties:
: (property_name = property_value, property_name = property_value, ... )
Change the table comment:
ALTER TABLE table_name SET TBLPROPERTIES ('comment' = new_comment);
Add SerDe properties (used for deserialization):
ALTER TABLE table_name [PARTITION partition_spec] SET SERDE serde_class_name [WITH SERDEPROPERTIES serde_properties];
ALTER TABLE table_name [PARTITION partition_spec] SET SERDEPROPERTIES serde_properties;
For example:
ALTER TABLE table_name SET SERDEPROPERTIES ('field.delim' = ',');
Change the table's storage properties:
ALTER TABLE table_name
CLUSTERED BY (col_name, col_name, ...)
[SORTED BY (col_name, ...)] INTO num_buckets BUCKETS;
Add table partitions:
ALTER TABLE table_name ADD [IF NOT EXISTS] PARTITION partition_spec
[LOCATION 'location1'] partition_spec [LOCATION 'location2'] ...;
For example:
ALTER TABLE page_view ADD
PARTITION (dt='2008-08-08', country='us') location '/path/to/us/part080808'
PARTITION (dt='2008-08-09', country='us') location '/path/to/us/part080809';
Rename a partition:
ALTER TABLE table_name PARTITION partition_spec RENAME TO PARTITION partition_spec;
Exchange a partition:
ALTER TABLE table_name_1 EXCHANGE PARTITION (partition_spec, partition_spec2, ...) WITH TABLE table_name_2;
Drop a partition:
ALTER TABLE table_name DROP [IF EXISTS] PARTITION partition_spec[, PARTITION partition_spec, ...]
[IGNORE PROTECTION] [PURGE];
Change a column's name/type/position/comment:
ALTER TABLE table_name [PARTITION partition_spec] CHANGE [COLUMN] col_old_name col_new_name column_type
[COMMENT col_comment] [FIRST|AFTER column_name] [CASCADE|RESTRICT];
For example:
CREATE TABLE test_change (a int, b int, c int);
// First change column a's name to a1.
ALTER TABLE test_change CHANGE a a1 INT;
// Next change column a1's name to a2, its data type to string, and put it after column b.
ALTER TABLE test_change CHANGE a1 a2 STRING AFTER b;
// The new table's structure is: b int, a2 string, c int.
// Then change column c's name to c1, and put it as the first column.
ALTER TABLE test_change CHANGE c c1 INT FIRST;
// The new table's structure is: c1 int, b int, a2 string.
// Add a comment to column a1
ALTER TABLE test_change CHANGE a1 a1 INT COMMENT 'this is column a1';
Add and replace columns:
ALTER TABLE table_name
[PARTITION partition_spec]
ADD|REPLACE COLUMNS (col_name data_type [COMMENT col_comment], ...)
[CASCADE|RESTRICT]
-- (Note: Hive 0.14.0 and later)
9.2.3 Drop a table
DROP TABLE [IF EXISTS] table_name;
9.2.4 Truncate a table
TRUNCATE TABLE table_name [PARTITION partition_spec];
9.2.5 Describe a table
desc table_name;            // brief description
desc extended table_name;   // detailed description
desc formatted table_name;  // formatted description
9.2.6 Load data into a table
Loading from the local filesystem copies the file into /user/hive/warehouse/hadoop09.db/student:
load data local inpath '/opt/software/hive/student.tsv' [overwrite] into table stu_info;
Loading from HDFS moves the file into /user/hive/warehouse/hadoop09.db/student:
First upload a file:
dfs -put /opt/software/hive/stu.tsv /;
load data inpath '/stu.tsv' [overwrite] into table stu_info;
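The copy-versus-move distinction above is easy to trip over: after a local load the source file is still there, but after an HDFS load it is gone from its original path. A minimal local-filesystem analogy (illustrative only; all paths here are temporary stand-ins, not real HDFS paths):

```python
# Illustration only: LOAD DATA LOCAL behaves like a copy (the source file
# survives), while LOAD DATA from an HDFS path behaves like a move
# (the source file disappears from its original location).
import os
import shutil
import tempfile

warehouse = tempfile.mkdtemp()  # stands in for /user/hive/warehouse/...
local_src = os.path.join(tempfile.mkdtemp(), "student.tsv")
hdfs_src = os.path.join(tempfile.mkdtemp(), "stu.tsv")
for p in (local_src, hdfs_src):
    with open(p, "w") as f:
        f.write("1\tAlice\n")

shutil.copy(local_src, warehouse)  # like: load data local inpath ...
shutil.move(hdfs_src, warehouse)   # like: load data inpath ...

print(os.path.exists(local_src))   # True  - local source kept
print(os.path.exists(hdfs_src))    # False - HDFS source was moved away
```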