Installing Flink 1.11.1 on Linux and Starting the SQL Client to Read Hive Data
First, download the Flink 1.11.1 .tgz package from the official site; the steps are the same as the first half of the previous article.
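For reference, a typical download-and-unpack sketch (the archive URL and the Scala 2.11 build here are assumptions; pick the build that matches your cluster):

wget https://archive.apache.org/dist/flink/flink-1.11.1/flink-1.11.1-bin-scala_2.11.tgz
tar -zxvf flink-1.11.1-bin-scala_2.11.tgz

Then configure FLINK_HOME/conf/sql-client-defaults.yaml: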
catalogs:
  - name: myhive                    # any name you like
    type: hive
    hive-conf-dir: /etc/hive/conf   # directory containing hive-site.xml
    hive-version: 1.2.1             # your Hive version

execution:
  # select the implementation responsible for planning table programs
  # possible values are 'blink' (used by default) or 'old'
  planner: blink
  # 'batch' or 'streaming' execution
  type: batch                       # either streaming or batch works here
  # allow 'event-time' or only 'processing-time' in sources
  time-characteristic: event-time
  # interval in ms for emitting periodic watermarks
  periodic-watermarks-interval: 200
  # 'changelog' or 'table' presentation of results
  result-mode: table
  # maximum number of maintained rows in 'table' presentation of results
  max-table-result-rows: 1000000
  # parallelism of the program
  parallelism: 1
  # maximum parallelism
  max-parallelism: 128
  # minimum idle state retention in ms
  min-idle-state-retention: 0
  # maximum idle state retention in ms
  max-idle-state-retention: 0
  # current catalog ('default_catalog' by default)
  current-catalog: myhive
  # current database of the current catalog (default database of the catalog by default)
  current-database: secoo_tmp
  # controls how table programs are restarted in case of failures
  restart-strategy:
    # strategy type
    # possible values are "fixed-delay", "failure-rate", "none", or "fallback" (default)
    type: fallback
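With myhive set as the current catalog, Hive tables are visible as soon as the SQL client starts. A minimal sketch of a session, to be run once the YARN session described below is up (the table name is hypothetical):

./bin/sql-client.sh embedded
Flink SQL> show tables;
Flink SQL> select * from my_table limit 10;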
Configure /etc/profile:
export HADOOP_HOME=/usr/hdp/2.4.0.0-169/hadoop
export YARN_CONF_DIR=/etc/hadoop/conf
export HADOOP_CLASSPATH=`hadoop classpath` # very important; without this, flink commands will fail
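Reload the profile so the variables take effect in the current shell, and sanity-check that the classpath resolves:

source /etc/profile
echo $HADOOP_CLASSPATH    # should print the full list of Hadoop jars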
Start yarn-session.sh from the Flink installation directory:
./bin/yarn-session.sh -n 5 -tm 4096 -s 4 -nm <application-name> -qu <queue-name> -d
(note: the queue is set with -qu, not -q; the -d flag ensures the job keeps running on YARN after the client exits instead of being killed immediately)
yarn-session.sh parameters:
-n: number of TaskManagers;
-d: run in detached mode;
-id: attach to a YARN application by its ID;
-j: path to the Flink jar file;
-jm: memory for the JobManager container (default unit: MB);
-nl: YARN node labels for the YARN application;
-nm: custom name for the application on YARN;
-q: display available YARN resources (memory, cores);
-qu: YARN queue to submit to;
-s: number of slots per TaskManager;
-st: start Flink in streaming mode;
-tm: memory per TaskManager container (default unit: MB);
-z: namespace for creating the Zookeeper sub-paths in high-availability mode;
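Putting the flags together, a concrete invocation might look like this (the application name and queue name below are placeholders; adjust memory and slot counts to your cluster):

./bin/yarn-session.sh -n 5 -s 4 -tm 4096 -jm 2048 -nm flink-sql-session -qu default -d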
The Flink session application is now visible on the YARN web UI:
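If you prefer the command line over the web UI, the standard YARN CLI shows the same information:

yarn application -list    # the session appears under the name set with -nm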

Submitting a program then fails with:
org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: Unable to instantiate java compiler
...
Caused by: java.lang.IllegalStateException: Unable to instantiate java compiler
...
Caused by: java.lang.ClassCastException: org.codehaus.janino.CompilerFactory cannot be cast to org.codehaus.commons.compiler.ICompilerFactory
Solution: this is a Janino dependency conflict: hive-exec pulls in its own Janino artifacts, which clash with the ones used by Flink's Blink planner. Exclude them from the hive-exec dependency in your project's pom.xml:
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>1.2.1</version>
    <exclusions>
        <exclusion>
            <groupId>org.codehaus.janino</groupId>
            <artifactId>janino</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.codehaus.janino</groupId>
            <artifactId>commons-compiler</artifactId>
        </exclusion>
    </exclusions>
</dependency>
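After adding the exclusions, you can confirm that hive-exec no longer drags in its own Janino (standard Maven dependency plugin):

mvn dependency:tree -Dincludes=org.codehaus.janino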

This article has detailed the problems you may encounter when installing Flink 1.11.1 on Linux and configuring it to read Hive data, and their solutions, including Flink/Hadoop version conflicts and dependency conflicts, as well as the steps for building the flink-shaded-hadoop-2-uber jar so that Flink can connect to and operate on Hive.