Impala Optimization

Latest recommended article published on 2024-09-03 11:37:17

ITBOY_ITBOX

Category: Impala

Article link: https://blog.csdn.net/m0_37294838/article/details/90384658

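Where possible, deploy StateStore and Catalog together on the same dedicated node so that they can communicate reliably.
Tune the Impala Daemon memory limit (default 256M) and the number of StateStore worker threads to improve Impala's execution efficiency.
Optimize your SQL: inspect the execution plan before running a query.
Choose a suitable file format for storage to improve query efficiency.
Avoid producing many small files (if another program produces small files, load that data into an intermediate table, then use insert…select… to insert the intermediate table's data into the final table).
Use an appropriate partitioning scheme, chosen according to the estimated partition granularity.
Use compute stats to gather table statistics. When the contents of a table or partition change significantly, recompute the statistics for that table or partition, because differences in row counts and in the number of distinct values can lead Impala to choose a different join order for a query.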
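The SQL-side tips above (checking the plan, choosing a file format, compacting small files through an intermediate table, and partitioning) can be sketched in Impala SQL roughly as follows. The table and column names (`student_staging`, `student_parquet`, `id`, `name`, `enroll_year`) are hypothetical and do not come from the original. Parquet is shown only as one example of a columnar format, and the right partition key depends on your own data volume.

```sql
-- Assumption: student_staging is a hypothetical intermediate table that
-- holds the many small files written by another program.
CREATE TABLE student_staging (
  id INT,
  name STRING,
  enroll_year INT
)
STORED AS TEXTFILE;

-- Hypothetical final table: a columnar format (Parquet, as one option)
-- partitioned by a coarse-grained key.
CREATE TABLE student_parquet (
  id INT,
  name STRING
)
PARTITIONED BY (enroll_year INT)
STORED AS PARQUET;

-- Inspect the execution plan before actually running the statement.
EXPLAIN
INSERT INTO student_parquet PARTITION (enroll_year)
SELECT id, name, enroll_year FROM student_staging;

-- Rewrite the staged data into the final table. The partition column
-- must come last in the SELECT list for a dynamic partition insert.
INSERT INTO student_parquet PARTITION (enroll_year)
SELECT id, name, enroll_year FROM student_staging;
```

Going through the intermediate table typically lets Impala write fewer, larger files per partition than the upstream program did. If a partition key would yield many tiny partitions, a coarser key is usually the better choice.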

[hadoop104:21000] > show table stats student;
Query: show table stats student
+-------+--------+------+--------------+-------------------+--------+-------------------+---------------------------------------------------+
| #Rows | #Files | Size | Bytes Cached | Cache Replication | Format | Incremental stats | Location                                          |
+-------+--------+------+--------------+-------------------+--------+-------------------+---------------------------------------------------+
| -1    | 1      | 67B  | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://hadoop102:8020/user/hive/warehouse/student |
+-------+--------+------+--------------+-------------------+--------+-------------------+---------------------------------------------------+

[hadoop104:21000] > compute stats student;
Query: compute stats student
+-----------------------------------------+
| summary                                 |
+-----------------------------------------+
| Updated 1 partition(s) and 2 column(s). |
+-----------------------------------------+

[hadoop104:21000] > show table stats student;
Query: show table stats student
+-------+--------+------+--------------+-------------------+--------+-------------------+---------------------------------------------------+
| #Rows | #Files | Size | Bytes Cached | Cache Replication | Format | Incremental stats | Location                                          |
+-------+--------+------+--------------+-------------------+--------+-------------------+---------------------------------------------------+
| 6     | 1      | 67B  | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://hadoop102:8020/user/hive/warehouse/student |
+-------+--------+------+--------------+-------------------+--------+-------------------+---------------------------------------------------+
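In the session above, a #Rows value of -1 means that no statistics have been gathered yet; after compute stats it shows the real row count (6). As a follow-up, here is a minimal sketch of inspecting column-level statistics and refreshing a single partition. `SHOW COLUMN STATS` and `COMPUTE INCREMENTAL STATS` are standard Impala statements, but the partitioned table `student_parquet` and its `enroll_year` partition are hypothetical and do not appear in the original.

```sql
-- Column-level statistics (distinct values, nulls, sizes) for the
-- student table used in the session above.
SHOW COLUMN STATS student;

-- Hypothetical partitioned table: recompute stats only for the partition
-- whose data changed, instead of rescanning the whole table.
COMPUTE INCREMENTAL STATS student_parquet PARTITION (enroll_year = 2024);

-- Check the per-partition #Rows values after the refresh.
SHOW TABLE STATS student_parquet;
```

Incremental stats are mainly worthwhile for large partitioned tables where only a few partitions change between runs. For a small unpartitioned table like `student`, a plain compute stats is simpler.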