Hive Troubleshooting Notes
I. Hive failed; error='Cannot allocate memory' (errno=12)
1. Problem
[root@hadoop_zxy bin]# hive
Logging initialized using configuration in jar:file:/zxy/apps/hive-1.2.1/lib/hive-common-1.2.1.jar!/hive-log4j.properties
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000fec00000, 20971520, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 20971520 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /zxy/apps/hive-1.2.1/bin/hs_err_pid24710.log
2. Solution
The error was os::commit_memory(0x00000000fec00000, 20971520, 0), yet a check showed the machine still had plenty of free memory, so the culprit was judged to be the system's memory-overcommit policy. Make the following change:
[root@hadoop_zxy bin]# vim /etc/sysctl.conf
Add (or update) this line and save:
vm.overcommit_memory = 1
Then reload the settings:
[root@hadoop_zxy bin]# sysctl -p
kernel.sem = 250 64000 100 512
kernel.shmmax = 500000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_forward = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.conf.default.arp_filter = 1
net.core.netdev_max_backlog = 10000
vm.overcommit_memory = 1
vm.max_map_count = 262144
3. Verification
[root@hadoop_zxy bin]# hive --service hiveserver2 &
[1] 583
[root@hadoop_zxy bin]# hive --service metastore &
[2] 818
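Beyond checking the job numbers, one can probe the services' listen ports to confirm they actually came up. This sketch assumes the stock defaults of 10000 for HiveServer2 and 9083 for the metastore; if hive-site.xml overrides them, adjust the ports accordingly:

```python
# Quick TCP reachability check for the two freshly started services.
# Ports 10000 (HiveServer2) and 9083 (metastore) are the stock
# defaults -- an assumption; override them if your config differs.
import socket

def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for name, port in [("hiveserver2", 10000), ("metastore", 9083)]:
    print(name, is_listening("localhost", port))
```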
4. Background
The vm.overcommit_memory setting
Linux answers most memory requests with "yes" so that more programs can run. Since a program that requests memory does not necessarily use it right away, the kernel can promise more than physically exists; this is called overcommit.
vm.overcommit_memory controls the overcommit policy. There are three modes:
0: heuristic overcommit (the default). The kernel estimates whether enough memory is available and refuses obviously excessive requests; this refusal is where the "cannot allocate memory" error comes from.
1: always overcommit. The kernel grants every request until memory actually runs out.
2: never overcommit. The total committed address space may not exceed swap + overcommit_ratio% of RAM; overcommit_ratio defaults to 50 and can be tuned.
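As a rough illustration of mode 2, the commit limit follows from swap size, RAM size, and overcommit_ratio. The function below is our own sketch of that arithmetic; the kernel exposes the real figure as CommitLimit in /proc/meminfo:

```python
# Illustrative arithmetic for vm.overcommit_memory = 2 (sketch only):
# committed memory may not exceed swap + overcommit_ratio% of RAM.
def commit_limit(swap_bytes: int, ram_bytes: int, overcommit_ratio: int = 50) -> int:
    """Total virtual memory the kernel will hand out in mode 2."""
    return swap_bytes + ram_bytes * overcommit_ratio // 100

# Example: 2 GiB swap, 4 GiB RAM, default ratio 50%
GIB = 1024 ** 3
print(commit_limit(2 * GIB, 4 * GIB) // GIB)  # 2 GiB swap + 50% of 4 GiB RAM = 4 GiB
```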
II. Hive - ls: cannot access /opt/apps/spark-2.2.0/lib/spark-assembly-*.jar: No such file or directory
1. Problem
ls: cannot access /opt/apps/spark-2.2.0/lib/spark-assembly-*.jar: No such file or directory
Logging initialized using configuration in jar:file:/opt/apps/hive-1.2.1/lib/hive-common-1.2.1.jar!/hive-log4j.properties
OK
Time taken: 1.544 seconds
(The same ls warning is printed on every hive invocation.)
2. Solution
1. Go to the bin directory of the Hive installation.
2. Edit the hive launcher script:
vim bin/hive
# add Spark assembly jar to the classpath
if [[ -n "$SPARK_HOME" ]]
then
sparkAssemblyPath=`ls ${SPARK_HOME}/lib/spark-assembly-*.jar`
CLASSPATH="${CLASSPATH}:${sparkAssemblyPath}"
fi
Change it to:
# add Spark assembly jar to the classpath
if [[ -n "$SPARK_HOME" ]]
then
sparkAssemblyPath=`ls ${SPARK_HOME}/jars/*.jar`
CLASSPATH="${CLASSPATH}:${sparkAssemblyPath}"
fi
3. Tips:
Since Spark 2.x, the Spark distribution no longer ships a lib directory with a single assembly jar; the individual jars all live under the jars directory, so only the path in the script needs to change.
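The path change can be sanity-checked without a real Spark install. A minimal sketch using a throwaway directory (the jar filename below is made up for illustration) shows why the old pattern matches nothing against a Spark 2.x layout:

```python
# Reproduce the failing lookup from bin/hive against a Spark-2.x-style
# directory layout: lib/spark-assembly-*.jar matches nothing, while
# jars/*.jar finds the individual jars.
import glob
import os
import tempfile

spark_home = tempfile.mkdtemp()                      # stand-in for $SPARK_HOME
os.makedirs(os.path.join(spark_home, "jars"))
# Hypothetical jar name, just so jars/ is non-empty:
open(os.path.join(spark_home, "jars", "spark-core_2.11-2.2.0.jar"), "w").close()

old_matches = glob.glob(os.path.join(spark_home, "lib", "spark-assembly-*.jar"))
new_matches = glob.glob(os.path.join(spark_home, "jars", "*.jar"))
print(old_matches)       # empty list -> this is what makes `ls` complain
print(len(new_matches))
```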
III. Array-length functions in Hive vs. Presto (size & cardinality)
1. Sample data:
123456@qq.com
2. Goal:
Extract the part of the email address after the '@'.
3. Implementation
3.1 Hive usage
SELECT
email.email,
(case when size(split(email.email, '@')) = 2 then split(email.email, '@')[1] else '' end) as email_suffix
FROM `email`
3.2 Presto usage
SELECT
email.email,
(case when cardinality(split(email.email, '@')) = 2 then split(email.email, '@')[2] else '' end) as email_suffix
FROM email
4. Comparison
Hive and Presto use different functions to get an array's length: size in Hive versus cardinality in Presto.
On top of that, array subscripts start at 0 in Hive but at 1 in Presto, which is why the Hive query reads split(...)[1] while the Presto query reads split(...)[2].
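The indexing difference can be mirrored in a short Python snippet; Python, like Hive, is 0-based, so the element Python calls parts[1] is what Presto addresses as [2]:

```python
# Same logic as the queries above: split on '@' and, when the result
# has exactly two parts, take the piece after the '@'.
email = "123456@qq.com"
parts = email.split("@")                       # ["123456", "qq.com"], length 2
suffix = parts[1] if len(parts) == 2 else ""   # Hive: split(...)[1]; Presto: split(...)[2]
print(suffix)  # qq.com
```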