hive bug1:
Exception in thread "org.apache.hadoop.hive.common.JvmPauseMonitor$Monitor@34a2d6e0" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.ArrayList.<init>(ArrayList.java:153)
at sun.management.ManagementFactoryHelper.getGarbageCollectorMXBeans(ManagementFactoryHelper.java:131)
at java.lang.management.ManagementFactory.getGarbageCollectorMXBeans(ManagementFactory.java:416)
at org.apache.hadoop.hive.common.JvmPauseMonitor.getGcTimes(JvmPauseMonitor.java:141)
at org.apache.hadoop.hive.common.JvmPauseMonitor.access$400(JvmPauseMonitor.java:45)
at org.apache.hadoop.hive.common.JvmPauseMonitor$Monitor.run(JvmPauseMonitor.java:186)
at java.lang.Thread.run(Thread.java:748)
Exception in thread "HiveServer2-Handler-Pool: Thread-213" java.lang.OutOfMemoryError: GC overhead limit exceeded
Fix: increase the JVM heap size in hive-env.sh.
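A minimal sketch of the relevant hive-env.sh lines; the variable names are the standard ones from hive-env.sh.template, but the sizes here are illustrative and should be tuned to your workload:

```shell
# hive-env.sh -- example heap settings (values are illustrative, not prescriptive)
# HADOOP_HEAPSIZE sets the max heap (in MB) for JVMs launched by Hive scripts
export HADOOP_HEAPSIZE=4096
# Explicit JVM flags can also be passed through HADOOP_OPTS
export HADOOP_OPTS="$HADOOP_OPTS -Xms2g -Xmx4g"
```

Restart HiveServer2 after the change so the new heap settings take effect.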
hive bug2:
WARN org.apache.flink.runtime.taskmanager.Task [] - PartitionCommitter -> Sink: end (1/1)#205 (349f5d83bf18f490002b6817c89f0d5a) switched from INITIALIZING to FAILED with failure cause:
org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot load user class: org.apache.flink.connectors.hive.HadoopFileSystemFactory
Fix: put the Hive connector jar that matches your versions on the classpath. This is also the most frequently encountered problem.
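In practice the fix usually amounts to dropping the bundled Hive connector jar into Flink's lib/ directory and restarting the cluster. A sketch, with placeholder version numbers that must be replaced by the Flink/Hive/Scala versions actually in use:

```shell
# Copy the Hive connector bundle into Flink's lib directory.
# The version numbers below are placeholders -- substitute the ones that
# match your Flink, Hive, and Scala versions exactly.
cp flink-sql-connector-hive-3.1.2_2.12-1.14.0.jar "$FLINK_HOME/lib/"

# Restart the cluster so TaskManagers pick up the new classpath
"$FLINK_HOME/bin/stop-cluster.sh" && "$FLINK_HOME/bin/start-cluster.sh"
```

Mismatched connector and cluster versions will reproduce the "Cannot load user class" failure above, so verify the versions before restarting.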
hive bug3:
ERROR [a287cd2c-87cf-4572-abe4-cfaa15ec44b1 main] exec.Task (SessionState.java:printError(1250)) - Ended Job = job_1635418137267_0003 with errors
Error during job, obtaining debugging information...
2021-11-01 02:52:13,271 ERROR [Thread-53] exec.Task (SessionState.java:printError(1250)) - Error during job, obtaining debugging information...
2021-11-01 02:52:13,276 INFO [a287cd2c-87cf-4572-abe4-cfaa15ec44b1 main] impl.YarnClientImpl (YarnClientImpl.java:killApplication(497)) - Killed application application_1635418137267_0003
2021-11-01 02:52:13,277 INFO [a287cd2c-87cf-4572-abe4-cfaa15ec44b1 main] reexec.ReOptimizePlugin (ReOptimizePlugin.java:run(70)) - ReOptimization: retryPossible: false
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
2021-11-01 02:52:13,277 ERROR [a287cd2c-87cf-4572-abe4-cfaa15ec44b1 main] ql.Driver (SessionState.java:printError(1250)) - FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
Fix (verified): edit etc/hadoop/mapred-site.xml as below, then restart the Hadoop services for the change to take effect.
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=/hom***01/hadoop3.2</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=/h***1/hadoop3.2</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=/h***1/hadoop3.2</value>
</property>
Also raise the DataNode transfer-thread limit in etc/hadoop/hdfs-site.xml:
<property>
<name>dfs.datanode.max.transfer.threads</name>
<value>8192</value>
</property>
And adjust the following value in the Hadoop configuration file yarn-site.xml:
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>3072</value>
<description>default value is 1024</description>
</property>
hive bug4:
After running some Hive SQL scripts, Hive crashed outright, reporting stack-overflow and out-of-memory errors.
The errors disappeared after bug3 above was fixed and could not be reproduced afterwards.
hive bug5:
The Flink and Hive versions were incompatible; after failing to downgrade Hive, I upgraded Flink instead. Make sure to use mutually compatible (ideally the latest supported) versions of Hive and Flink.
For example, the following error:
Before Hive 3.1.2, getQueryCurrentTimestamp returned java.sql.Timestamp, but in Hive 3.1.2 the reflective getCurrentTSMethod.invoke call returns java.time.Instant. So the cast `(Timestamp) getCurrentTSMethod.invoke(sessionState)` throws a ClassCastException; the caller should handle both return types.
Using the Hive dialect to create a Hive table triggers this error:
Exception in thread "main" java.lang.ClassCastException: java.time.Instant cannot be cast to java.sql.Timestamp
	at org.apache.flink.table.planner.delegation.hive.HiveParser.setCurrentTimestamp(HiveParser.java:365)
	at org.apache.flink.table.planner.delegation.hive.HiveParser.startSessionState(HiveParser.java:350)
	at org.apache.flink.table.planner.delegation.hive.HiveParser.parse(HiveParser.java:218)
	at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:722)
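The incompatibility can be smoothed over by converting whichever type the reflective call returns. A minimal sketch (the class and helper names here are hypothetical, not Flink's actual fix):

```java
import java.sql.Timestamp;
import java.time.Instant;

public class CurrentTimestampCompat {

    // Normalize the reflective result to java.sql.Timestamp, whether the
    // Hive version returned Timestamp (< 3.1.2) or Instant (3.1.2+).
    static Timestamp toSqlTimestamp(Object result) {
        if (result instanceof Timestamp) {
            return (Timestamp) result;               // pre-3.1.2 behaviour
        } else if (result instanceof Instant) {
            return Timestamp.from((Instant) result); // 3.1.2+ behaviour
        }
        throw new IllegalArgumentException(
                "Unexpected current-timestamp type: " + result.getClass());
    }

    public static void main(String[] args) {
        // Simulate the 3.1.2+ case: the reflective call yields an Instant
        Object reflectiveResult = Instant.ofEpochMilli(1_000L);
        Timestamp ts = toSqlTimestamp(reflectiveResult);
        System.out.println(ts.getTime()); // prints 1000
    }
}
```

The point is simply to branch on the runtime type instead of casting unconditionally, so one code path works against both Hive versions.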