Hive Troubleshooting Notes
I. Hive failed; error='Cannot allocate memory' (errno=12)
1. Problem
[root@hadoop_zxy bin]# hive
Logging initialized using configuration in jar:file:/zxy/apps/hive-1.2.1/lib/hive-common-1.2.1.jar!/hive-log4j.properties
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000fec00000, 20971520, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 20971520 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /zxy/apps/hive-1.2.1/bin/hs_err_pid24710.log
2. Solution
The error was os::commit_memory(0x00000000fec00000, 20971520, 0), yet a check showed the machine still had plenty of free memory, so the culprit was judged to be the system's memory-overcommit policy. Make the following change:
[root@hadoop_zxy bin]# vim /etc/sysctl.conf
Add (or update) this line and save:
vm.overcommit_memory = 1
Then reload the settings:
[root@hadoop_zxy bin]# sysctl -p
kernel.sem = 250 64000 100 512
kernel.shmmax = 500000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_forward = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.conf.default.arp_filter = 1
net.core.netdev_max_backlog = 10000
vm.overcommit_memory = 1
vm.max_map_count = 262144
3. Verification
[root@hadoop_zxy bin]# hive --service hiveserver2 &
[1] 583
[root@hadoop_zxy bin]# hive --service metastore &
[2] 818
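Beyond checking the job numbers, one can probe the services' listen ports to confirm they actually came up. This sketch assumes the stock defaults of 10000 for HiveServer2 and 9083 for the metastore; if hive-site.xml overrides them, adjust the ports accordingly:

```python
# Quick TCP reachability check for the two freshly started services.
# Ports 10000 (HiveServer2) and 9083 (metastore) are the stock
# defaults -- an assumption; override them if your config differs.
import socket

def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for name, port in [("hiveserver2", 10000), ("metastore", 9083)]:
    print(name, is_listening("localhost", port))
```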
4. Background
The vm.overcommit_memory setting
Linux answers most memory requests with "yes" so that more programs can run. Since a program that requests memory does not necessarily use it right away, the kernel can promise more than physically exists; this is called overcommit.
vm.overcommit_memory controls the overcommit policy. There are three modes:
0: heuristic overcommit (the default). The kernel estimates whether enough memory is available and refuses obviously excessive requests; this refusal is where the "cannot allocate memory" error comes from.
1: always overcommit. The kernel grants every request until memory actually runs out.
2: never overcommit. The total committed address space may not exceed swap + overcommit_ratio% of RAM; overcommit_ratio defaults to 50 and can be tuned.
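As a rough illustration of mode 2, the commit limit follows from swap size, RAM size, and overcommit_ratio. The function below is our own sketch of that arithmetic; the kernel exposes the real figure as CommitLimit in /proc/meminfo:

```python
# Illustrative arithmetic for vm.overcommit_memory = 2 (sketch only):
# committed memory may not exceed swap + overcommit_ratio% of RAM.
def commit_limit(swap_bytes: int, ram_bytes: int, overcommit_ratio: int = 50) -> int:
    """Total virtual memory the kernel will hand out in mode 2."""
    return swap_bytes + ram_bytes * overcommit_ratio // 100

# Example: 2 GiB swap, 4 GiB RAM, default ratio 50%
GIB = 1024 ** 3
print(commit_limit(2 * GIB, 4 * GIB) // GIB)  # 2 GiB swap + 50% of 4 GiB RAM = 4 GiB
```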
II. Hive - ls: cannot access /opt/apps/spark-2.2.0/lib/spark-assembly-*.jar: No such file or directory
1. Problem
ls: cannot access /opt/apps/spark-2.2.0/lib/spark-assembly-*.jar: No such file or directory
Logging initialized using configuration in jar:file:/opt/apps/hive-1.2.1/lib/hive-common-1.2.1.jar!/hive-log4j.properties
OK
Time taken: 1.544 seconds
(The same ls warning is printed on every hive invocation.)
2. Solution
1. Go to the bin directory of the Hive installation.
2. Edit the hive launcher script:
vim bin/hive
# add Spark assembly jar to the classpath
if [[ -n "$SPARK_HOME" ]]
then
sparkAssemblyPath=`ls ${SPARK_HOME}/lib/spark-assembly-*.jar`
CLASSPATH="${CLASSPATH}:${sparkAssemblyPath}"
fi
Change it to:
# add Spark assembly jar to the classpath
if [[ -n "$SPARK_HOME" ]]
then
sparkAssemblyPath=`ls ${SPARK_HOME}/jars/*.jar`
CLASSPATH="${CLASSPATH}:${sparkAssemblyPath}"
fi
3. Tips:
Since Spark 2.x, the Spark distribution no longer ships a lib directory with a single assembly jar; the individual jars all live under the jars directory, so only the path in the script needs to change.
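The path change can be sanity-checked without a real Spark install. A minimal sketch using a throwaway directory (the jar filename below is made up for illustration) shows why the old pattern matches nothing against a Spark 2.x layout:

```python
# Reproduce the failing lookup from bin/hive against a Spark-2.x-style
# directory layout: lib/spark-assembly-*.jar matches nothing, while
# jars/*.jar finds the individual jars.
import glob
import os
import tempfile

spark_home = tempfile.mkdtemp()                      # stand-in for $SPARK_HOME
os.makedirs(os.path.join(spark_home, "jars"))
# Hypothetical jar name, just so jars/ is non-empty:
open(os.path.join(spark_home, "jars", "spark-core_2.11-2.2.0.jar"), "w").close()

old_matches = glob.glob(os.path.join(spark_home, "lib", "spark-assembly-*.jar"))
new_matches = glob.glob(os.path.join(spark_home, "jars", "*.jar"))
print(old_matches)       # empty list -> this is what makes `ls` complain
print(len(new_matches))
```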
III. Array-length functions in Hive vs. Presto (size & cardinality)
1. Sample data:
123456@qq.com
2. Goal:
Extract the part of the email address after the '@'.
3. Implementation
3.1 Hive usage
SELECT
email.email,
(case when size(split(email.email, '@')) = 2 then split(email.email, '@')[1] else '' end) as email_suffix
FROM `email`
3.2 Presto usage
SELECT
email.email,
(case when cardinality(split(email.email, '@')) = 2 then split(email.email, '@')[2] else '' end) as email_suffix
FROM email
4. Comparison
Hive and Presto use different functions to get an array's length: size in Hive versus cardinality in Presto.
On top of that, array subscripts start at 0 in Hive but at 1 in Presto, which is why the Hive query reads split(...)[1] while the Presto query reads split(...)[2].
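The indexing difference can be mirrored in a short Python snippet; Python, like Hive, is 0-based, so the element Python calls parts[1] is what Presto addresses as [2]:

```python
# Same logic as the queries above: split on '@' and, when the result
# has exactly two parts, take the piece after the '@'.
email = "123456@qq.com"
parts = email.split("@")                       # ["123456", "qq.com"], length 2
suffix = parts[1] if len(parts) == 2 else ""   # Hive: split(...)[1]; Presto: split(...)[2]
print(suffix)  # qq.com
```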