1. Usage
1.2. Process start and stop
1.2.1. Start
Start order: statestore -> catalog -> impalad
As the root user:
On the statestore node:
service impala-state-store start
On the catalog node:
service impala-catalog start
On each impalad node:
service impala-server start
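The start order above can be wrapped in a small script. This is a dry-run sketch, not a definitive implementation: it assumes passwordless root SSH, the hostnames (node-ss, node-cat, node-imp1, node-imp2) are placeholders for your cluster's nodes, and the run helper only echoes each command so the order can be reviewed before anything is executed.

```shell
#!/bin/sh
# Dry-run sketch of the startup order: statestore -> catalog -> impalad.
# Replace the placeholder hostnames with your cluster's actual nodes.
STATESTORE_NODE="node-ss"
CATALOG_NODE="node-cat"
IMPALAD_NODES="node-imp1 node-imp2"

# Echo instead of executing, so the sequence can be inspected first.
# Drop the echo (and keep the ssh) to actually run the commands.
run() { echo "ssh root@$1 service $2 start"; }

run "$STATESTORE_NODE" impala-state-store
run "$CATALOG_NODE"    impala-catalog
for node in $IMPALAD_NODES; do
    run "$node" impala-server
done
```

Reversing the three run calls (and changing start to stop) gives the matching shutdown script.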
1.2.2. Stop
Stop order: the reverse of the start order
As the root user:
On each impalad node:
service impala-server stop
On the catalog node:
service impala-catalog stop
On the statestore node:
service impala-state-store stop
1.2.3. Check status
statestore:
service impala-state-store status
catalog:
service impala-catalog status
impalad:
service impala-server status
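On a multi-node cluster the three status checks are usually run in one pass. A minimal dry-run sketch, again with placeholder hostnames and passwordless root SSH assumed; check only echoes the command it would run:

```shell
#!/bin/sh
# Dry-run sketch: check the status of every Impala daemon in the cluster.
# Hostnames are placeholders; remove the echo to actually run the checks.
check() { echo "ssh root@$1 service $2 status"; }

check node-ss  impala-state-store
check node-cat impala-catalog
for node in node-imp1 node-imp2; do
    check "$node" impala-server
done
```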
1.3. Impala SQL
Impala SQL is highly compatible with the SQL dialect used by Hive (HiveQL), which makes it very convenient for anyone already familiar with writing SQL queries on Hive. At present, Impala SQL can be regarded as a subset of HiveQL statements, data types, and built-in functions.
Supported data types: ARRAY, BIGINT, BOOLEAN, CHAR, DECIMAL, DOUBLE, FLOAT, INT, MAP, REAL, SMALLINT, STRING, STRUCT, TIMESTAMP, TINYINT, VARCHAR.
Supported operators: arithmetic operators, comparison operators, BETWEEN ... AND, IN, LIKE, IS NULL, and so on.
Supports aliases, databases, tables, functions, identifiers, and views.
Supported DDL: creating tables, functions, views, databases, and roles; altering tables and views; granting privileges; and so on.
Supported DML: SELECT, INSERT, LOAD DATA, and so on.
1.4. The impala-shell client
1.4.1. impala-shell command usage
Show impala-shell command help:
$ impala-shell --help
Specify the impalad instance and port to connect to:
$ impala-shell -i e3base01:19005
Currently, direct connections with impala-shell must specify port 19005.
Run a single query:
$ impala-shell -i e3base01:19005 -q "select count(*) from sensitive.events"
Show the impala-shell version:
$ impala-shell -v
Specify the database:
$ impala-shell -d default
Specify the connection user:
$ impala-shell -u e3base
Combined usage:
$ impala-shell -i localhost:19005 -d test -q 'show tables'
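For scripted use, impala-shell also accepts -B (batch mode, which drops the ASCII result table) together with --output_delimiter, making the output easy to pipe into other tools. A hedged sketch: the host and query are illustrative and the command is only echoed, not executed, so no live cluster is needed.

```shell
#!/bin/sh
# Sketch: build a non-interactive, machine-readable impala-shell query.
# -B suppresses the ASCII result table; --output_delimiter sets the field
# separator of the result rows. Host and query below are placeholders.
IMPALAD="e3base01:19005"
QUERY="select client, count(*) from default.events group by client"
CMD="impala-shell -i $IMPALAD -B --output_delimiter=, -q \"$QUERY\""

# Echoed rather than executed, so this sketch needs no live cluster.
echo "$CMD"
```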
1.4.2. Usage after connecting with impala-shell
1. Type help to list all available impala-shell commands.
[e3base01:19005] > help;
Documented commands (type help <topic>):
========================================
compute describe explain profile select shell tip use version
connect exit history quit set show unset values with
Undocumented commands:
======================
alter delete drop insert source summary upsert
create desc help load src update
2. Type help <topic> to see help for a specific command.
[e3base01:19005] > help use;
3. Show databases: show databases;
4. Create a database: create database test1;
5. Use a database: use sensitive;
6. Show tables: show tables;
7. Create a table: create table default.events( ip STRING, country STRING, client STRING, action STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
8. Insert data:
a) insert: insert into default.events select * from sensitive.events;
Here sensitive.events is an existing table with data whose schema matches the newly created table.
insert into default.events (ip,country,client,action) values ("192.168.88.2","China","IOS","Insert");
(Single-row inserts are not recommended.)
b) create table events2 STORED AS PARQUET as select * from sensitive.events;
Alternatively, load the data directly at table-creation time, as above (CTAS).
c) load:
# touch events.csv
Add the data:
10.1.2.3,US,android,createNote
10.200.88.99,FR,windows,updateNote
10.1.2.3,US,android,updateNote
10.200.88.77,FR,ios,createNote
10.1.4.5,US,windows,updateTag
Upload the file to the HDFS root directory, then load it with Impala:
load data inpath '/events.csv' into table default.events;
Note: the user connecting to Impala must have write permission on the HDFS directory containing /events.csv, otherwise an error is reported:
ERROR: AnalysisException: Unable to LOAD DATA from hdfs://drmcluster/events.csv because Impala does not have WRITE permissions on its parent directory hdfs://drmcluster/
Also, if the target table already contains data, run a refresh afterwards.
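The whole load workflow (create the CSV, upload it to HDFS, load it, then refresh) can be sketched end to end. The run helper echoes the cluster-side commands rather than executing them, so this is a dry run; the host e3base01:19005, the table, and the paths are taken from the examples above.

```shell
#!/bin/sh
# Dry-run sketch of the LOAD DATA workflow from section 8c.
# Only the local CSV is actually written; cluster commands are echoed.
run() { echo "$@"; }

# 1. Build the CSV locally (sample rows from the text above).
cat > events.csv <<'EOF'
10.1.2.3,US,android,createNote
10.200.88.99,FR,windows,updateNote
EOF

# 2. Upload to the HDFS root (the connecting user needs write permission
#    on the parent directory, or LOAD DATA fails with AnalysisException).
run hdfs dfs -put events.csv /events.csv

# 3. Load into the table, then refresh its metadata.
run impala-shell -i e3base01:19005 -q \
    "load data inpath '/events.csv' into table default.events"
run impala-shell -i e3base01:19005 -q "refresh default.events"
```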
9. Create a view:
create view default.events_usonly as select * from default.events where country = 'US';
10. Queries:
a) Select all rows:
select * from default.events;
b) Conditional queries (Impala accepts || as a synonym for the logical or operator):
select * from default.events where client = "windows";
select * from default.events where client = "windows" order by ip;
select country from default.events where client = "windows" || client = "android" group by country;
select distinct client from default.events where client = "windows" || client = "android";
select max(ip) from default.events where client = "windows" || client = "android";
c) Multi-table queries (joins):
select events.ip,events.client,events_usonly.country,events_usonly.action from events join events_usonly on events.country = events_usonly.country;
11. Describe a table: describe events;
12. Drop a table: drop table events2;
13. Alter a table: alter table events add columns (id int);