1. Impala Paper
Impala: A Modern, Open-Source SQL Engine for Hadoop
2. Impala 服务部署
IMPALA_CATALOG_SERVICE 和 IMPALA_STATE_STORE 服务都只有一个,安装在 master 节点。
3个 DataNode 节点安装 IMPALA_DAEMONS 服务。
3. 启动命令
3.1 IMPALA_CATALOG_SERVICE
systemctl start impalaCatalog
3.2 IMPALA_STATE_STORE
systemctl start impalaStatestore
3.3 IMPALA_DAEMONS
systemctl start impalad
4. impala-shell
4.1 master节点执行
- 在 master 节点执行 impala-shell,不能连接到服务器。
[root@bmr-master-158e99a-1 impala]# impala-shell
Starting Impala Shell without Kerberos authentication
Error connecting: TTransportException, Could not connect to bmr-master-158e99a-1:21000
***********************************************************************************
Welcome to the Impala shell.
(Impala Shell v3.4.0-RELEASE (9f1c31c) built on Thu May 5 14:20:18 CST 2022)
Press TAB twice to see a list of available commands.
***********************************************************************************
[Not connected] >
- 连接 bmr-core-d9ad745-3
[Not connected] > connect bmr-core-d9ad745-3;
Connection lost, reconnecting...
Error connecting: TTransportException, Could not connect to bmr-master-158e99a-1:21000
Opened TCP connection to bmr-core-d9ad745-3:21000
Connected to bmr-core-d9ad745-3:21000
Server version: impalad version 3.4.0-RELEASE DEBUG (build 9f1c31c597b73662b76eb09a102bdeead58a96ed)
[bmr-core-d9ad745-3:21000] default>
- create table
[bmr-core-d9ad745-3:21000] default> create table t2(c1 string) stored as textfile;
Query: create table t2(c1 string) stored as textfile
+-------------------------+
| summary |
+-------------------------+
| Table has been created. |
+-------------------------+
Fetched 1 row(s) in 0.65s
- show create table
[bmr-core-d9ad745-3:21000] default> show create table t2;
Query: show create table t2
+--------------------------------------------------------------------+
| result |
+--------------------------------------------------------------------+
| CREATE TABLE default.t2 ( |
| c1 STRING |
| ) |
| STORED AS TEXTFILE |
| LOCATION 'hdfs://bmr-cluster/warehouse/tablespace/managed/hive/t2' |
| |
+--------------------------------------------------------------------+
Fetched 1 row(s) in 4.96s
- put file to hdfs
[bmr-core-d9ad745-3:21000] default> ! hadoop fs -put /etc/profile /profile;
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/bmr/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/bmr/tez/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
--------
Executed in 2.67s
- load data
[bmr-core-d9ad745-3:21000] default> load data inpath '/profile' overwrite into table t2;
Query: load data inpath '/profile' overwrite into table t2
+----------------------------------------------------------+
| summary |
+----------------------------------------------------------+
| Loaded 1 file(s). Total files in destination location: 1 |
+----------------------------------------------------------+
Fetched 1 row(s) in 0.17s
- select
[bmr-core-d9ad745-3:21000] default> select count(distinct c1) from t2;
Query: select count(distinct c1) from t2
Query submitted at: 2023-04-24 17:50:14 (Coordinator: http://bmr-core-d9ad745-3:25000)
Query progress can be monitored at: http://bmr-core-d9ad745-3:25000/query_plan?query_id=9547e24902296619:55e38d8100000000
+--------------------+
| count(distinct c1) |
+--------------------+
| 68 |
+--------------------+
Fetched 1 row(s) in 0.64s
MYSQL 查询信息
连接hive metastore 后台 mysql 数据库。
mysql> SELECT DB_ID,TBL_NAME FROM TBLS WHERE TBL_NAME='t2' \G
*************************** 1. row ***************************
DB_ID: 1
TBL_NAME: t2
1 row in set (0.00 sec)
mysql> select * from DBS WHERE DB_ID=1 \G;
*************************** 1. row ***************************
DB_ID: 1
DESC: Default Hive database
DB_LOCATION_URI: hdfs://bmr-cluster/warehouse/tablespace/managed/hive
NAME: default
OWNER_NAME: public
OWNER_TYPE: ROLE
CTLG_NAME: hive
1 row in set (0.00 sec)
看不到 hive 表的解决办法
Impala 创建的表 使用 hive 可以看到,但是在 hive 创建的表,使用 impala 可能看不到,因为 Impala METADATA 使用使用缓存。
在 hive 中创建 t1 后,impala 看不到。
[bmr-core-d9ad745-3:21000] default> show tables;
Query: show tables
+------+
| name |
+------+
| t2 |
+------+
- 使用
INVALIDATE METADATA
使刷新失效
[bmr-core-d9ad745-3:21000] default> INVALIDATE METADATA ;
Query: INVALIDATE METADATA
Query submitted at: 2023-04-24 18:45:51 (Coordinator: http://bmr-core-d9ad745-3:25000)
Query progress can be monitored at: http://bmr-core-d9ad745-3:25000/query_plan?query_id=75424307c619bfa3:321221ee00000000
Fetched 0 row(s) in 2.45s
show tables
可以看到 Hive 的表。
[bmr-core-d9ad745-3:21000] default> show tables;
Query: show tables
+------+
| name |
+------+
| t1 |
| t2 |
+------+
Fetched 2 row(s) in 0.01s