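Impala runs on top of CDH and provides real-time queries over data stored in HDFS and HBase, with a query syntax similar to Hive's.
It consists of several components:
Clients: Hue, ODBC clients, JDBC clients, and the Impala Shell, all of which can be used to query Impala interactively
Hive Metastore: stores the metadata for the data, so that Impala knows the structure of the data and related information
Cloudera Impala: runs on every datanode, coordinates queries, distributes the parallel query tasks, and returns the query results to the client
HBase and HDFS: store the data
Environment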
hadoop-2.0.0-cdh4.1.2
hive-0.9.0-cdh4.1.2
Impala is installed with yum
Add the yum repository
[cloudera-impala]
name=Impala
baseurl=http://archive.cloudera.com/impala/redhat/5/x86_64/impala/1/
gpgkey = http://archive.cloudera.com/impala/redhat/5/x86_64/impala/RPM-GPG-KEY-cloudera
gpgcheck = 1
Save this as a .repo file in the /etc/yum.repos.d directory
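Before copying the stanza into /etc/yum.repos.d, it can help to confirm that it parses and that no key was lost in copy and paste. The following is only a minimal sketch: it uses Python's standard configparser, which is an assumption of convenience (yum has its own parser), and the function name check_repo is hypothetical.

```python
import configparser

# The repository stanza from the text, reproduced verbatim.
REPO_TEXT = """\
[cloudera-impala]
name=Impala
baseurl=http://archive.cloudera.com/impala/redhat/5/x86_64/impala/1/
gpgkey = http://archive.cloudera.com/impala/redhat/5/x86_64/impala/RPM-GPG-KEY-cloudera
gpgcheck = 1
"""


def check_repo(text, section="cloudera-impala"):
    """Parse a .repo stanza and return its settings as a dict.

    Raises ValueError if the section or one of the expected keys is missing.
    """
    # Interpolation is disabled so a literal '%' in a URL would not be interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section(section):
        raise ValueError("missing section [%s]" % section)
    settings = dict(parser[section])
    for key in ("name", "baseurl", "gpgkey", "gpgcheck"):
        if key not in settings:
            raise ValueError("missing key: " + key)
    return settings


if __name__ == "__main__":
    for key, value in check_repo(REPO_TEXT).items():
        print(key, "=", value)
```

The check only covers syntax and the presence of the four keys used above; it does not contact the repository or verify the GPG key.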
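Note that the CDH, Hive, and Impala versions must match; check the Impala website for the exact combinations.
Impala needs a fairly large amount of memory and a 64-bit machine (that is the recommendation; I no longer remember whether 32-bit is supported at all), and only certain Linux versions are supported.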
http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/PDF/Installing-and-Using-Impala.pdf
Install CDH4
http://archive.cloudera.com/cdh4/cdh/4/
Both CDH and Hive can be downloaded from there.
Three machines
master runs: namenode, secondarynamenode, ResourceManager, impala-state-store, impala-shell, hive
slave1 runs: datanode, nodemanager, impala-server, impala-shell
slave2 runs: datanode, nodemanager, impala-server, impala-shell
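The role layout above can be written down as data, which makes it easy to see which Impala packages go on which host. This is a rough sketch: the host and role names come from the text, while the assumption that each impala-* role name is also the name of its yum package, and the helper names impala_packages and yum_command, are mine.

```python
# Role layout as listed in the text (host -> roles).
ROLES = {
    "master": ["namenode", "secondarynamenode", "ResourceManager",
               "impala-state-store", "impala-shell", "hive"],
    "slave1": ["datanode", "nodemanager", "impala-server", "impala-shell"],
    "slave2": ["datanode", "nodemanager", "impala-server", "impala-shell"],
}


def impala_packages(host):
    """Return the roles on a host that are assumed to be Impala yum packages."""
    return [role for role in ROLES[host] if role.startswith("impala-")]


def yum_command(host):
    """Build the yum command line for the Impala packages of one host."""
    return "yum install -y " + " ".join(impala_packages(host))


if __name__ == "__main__":
    for host in ROLES:
        print(host + ": " + yum_command(host))
```

The Hadoop and Hive roles are left out of the generated command on purpose, since the text obtains those from the CDH4 archive rather than from the Impala yum repository.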
Hadoop configuration
Configure the following on the master machine
Add to core-site.xml in $HADOOP_HOME/etc/hadoop:
<property>
<name>io.native.lib.available</name>
<value>true</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
<description>The name of the default file system. Either the literal string "local" or a host:port for NDFS.</description>
<final>true</final>
</property>
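After editing a site file, a quick way to confirm that the properties are well-formed and carry the intended values is to parse the file and print them. The sketch below uses Python's standard xml.etree.ElementTree; the function name read_site_properties is hypothetical, and the sample wraps the two properties in the <configuration> root element that Hadoop site files use.

```python
import xml.etree.ElementTree as ET

# The two core-site.xml properties from the text, inside a <configuration> root.
SAMPLE = """<configuration>
<property>
<name>io.native.lib.available</name>
<value>true</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
<final>true</final>
</property>
</configuration>"""


def read_site_properties(xml_text):
    """Return a dict of name -> value for every <property> in a site file."""
    root = ET.fromstring(xml_text)  # raises ParseError on malformed XML
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


if __name__ == "__main__":
    for name, value in read_site_properties(SAMPLE).items():
        print(name, "=", value)
```

To check a real file, read its contents and pass the text to read_site_properties; a missing closing tag shows up as a parse error instead of being silently ignored.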
Add to hdfs-site.xml in $HADOOP_HOME/etc/hadoop:
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/cloudera/hadoop/dfs/name</value>
<description>Determines where on the local filesystem the DFS namenode should store the name table. If this is a comma-delimited list of directories, then the name table is replicated in all of the directories, for redundancy.</description>
<final>true</final>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop/cloudera/hadoop/dfs/data</value>
<description>Determines where on the local filesystem a DFS datanode should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.
</description>
<final>true</final>
</property>
<property>
<name>dfs.http.address</name>
<value>fca-vm-arch-proxy1:50070</value>