Learning Hive in Depth: Hive Parameters

hellozhxy

Part 1: Hive Parameters

hive.exec.max.created.files

• Description: The maximum total number of files that all map and reduce tasks of a Hive job may create

• Default: 100000

hive.exec.dynamic.partition

• Description: Whether dynamic partitioning is enabled

• Default: false

hive.mapred.reduce.tasks.speculative.execution

• Description: Whether speculative execution is turned on for reduce tasks

• Default: true

hive.input.format

• Description: The default input format used by Hive

• Default: org.apache.hadoop.hive.ql.io.CombineHiveInputFormat

• If this causes problems, fall back to org.apache.hadoop.hive.ql.io.HiveInputFormat

hive.exec.counters.pull.interval

• Description: The interval at which Hive polls the JobTracker for counter information

• Default: 1000 ms

hive.script.recordreader

• Description: The default record reader class used to read data from a user script

• Default: org.apache.hadoop.hive.ql.exec.TextRecordReader

hive.script.recordwriter

• Description: The default record writer class used to write data to a user script

• Default: org.apache.hadoop.hive.ql.exec.TextRecordWriter

hive.mapjoin.check.memory.rows

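• Description: The number of rows processed between memory usage checks while a map join loads data into memory

• Default: 100000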

hive.mapjoin.smalltable.filesize

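• Description: The file size threshold (in bytes) for the small table of a join; if the table is smaller than this value, Hive tries to convert the common join into a map join

• Default: 25000000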

hive.auto.convert.join

• Description: Whether to convert a common join into a map join based on the size of the input files

• Default: false
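
The interplay between hive.auto.convert.join and hive.mapjoin.smalltable.filesize can be sketched roughly as follows. This is a simplified illustration of the decision described above, not Hive's actual planner code; the function name and its arguments are hypothetical.

```python
def should_convert_to_map_join(small_table_bytes: int,
                               auto_convert: bool = False,
                               threshold_bytes: int = 25_000_000) -> bool:
    """Decide whether a common join would be rewritten as a map join.

    auto_convert mirrors hive.auto.convert.join (default false) and
    threshold_bytes mirrors hive.mapjoin.smalltable.filesize (default 25000000).
    Hypothetical helper for illustration only.
    """
    if not auto_convert:
        # Automatic conversion is switched off, so the common join is kept.
        return False
    # Convert only when the small table is below the size threshold.
    return small_table_bytes < threshold_bytes


if __name__ == "__main__":
    print(should_convert_to_map_join(10_000_000, auto_convert=True))
    print(should_convert_to_map_join(30_000_000, auto_convert=True))
```

With the defaults listed here, no join is converted until hive.auto.convert.join is set to true.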

hive.mapjoin.followby.gby.localtask.max.memory.usage

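• Description: When a map join is followed by a group by, the fraction of memory the local task may use to hold data in memory; if the data is too large to fit within this limit, it is not kept in memory and the local task gives up

• Default: 0.55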

hive.mapjoin.localtask.max.memory.usage

• Description: The fraction of memory that the local task may use

• Default: 0.90
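
As a rough illustration of how hive.mapjoin.check.memory.rows and hive.mapjoin.localtask.max.memory.usage relate, the sketch below loads rows into an in-memory table and checks memory usage every fixed number of rows. It is a simplified model under stated assumptions, not Hive's implementation: the names load_small_table, MapJoinMemoryExhausted, and the used_fraction callback are all hypothetical.

```python
class MapJoinMemoryExhausted(RuntimeError):
    """Raised when the (hypothetical) local task exceeds its memory budget."""


def load_small_table(rows, used_fraction, check_rows=100_000, max_usage=0.90):
    """Build an in-memory hash table from (key, value) rows.

    check_rows mirrors hive.mapjoin.check.memory.rows and max_usage mirrors
    hive.mapjoin.localtask.max.memory.usage. used_fraction is a caller-supplied
    function returning the fraction of memory in use (an assumption made
    for this sketch).
    """
    table = {}
    for count, (key, value) in enumerate(rows, start=1):
        table.setdefault(key, []).append(value)
        # Memory is only inspected every check_rows rows, not on every row.
        if count % check_rows == 0 and used_fraction() > max_usage:
            raise MapJoinMemoryExhausted(
                f"memory usage above {max_usage:.2f} after {count} rows")
    return table


if __name__ == "__main__":
    sample = [("a", 1), ("b", 2), ("a", 3), ("c", 4)]
    print(load_small_table(sample, used_fraction=lambda: 0.5, check_rows=2))
```

A smaller check interval detects memory pressure sooner at the cost of more frequent checks.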

hive.heartbeat.interval

• Description: The interval at which a heartbeat is sent during map join and filter operations

• Default: 1000

hive.merge.size.per.task

• Description: The size (in bytes) of the merged files

• Default: 256000000

hive.mergejob.maponly

• Description: Whether to merge the output files using a map-only job

• Default: true

hive.merge.mapredfiles

• Description: Whether to merge small files at the end of a map-reduce job

• Default: false

hive.merge.mapfiles

• Description: Whether to merge small files at the end of a map-only job

• Default: true

hive.hwi.listen.host

• Description: The default host that the Hive Web Interface (HWI) listens on

• Default: 0.0.0.0

hive.hwi.listen.port

• Description: The port that the Hive Web Interface listens on

• Default: 9999

hive.exec.parallel.thread.number

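• Description: The number of threads Hive uses to run jobs in parallel, that is, the maximum number of jobs executed at the same time

• Default: 8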

hive.exec.parallel

• Description: Whether to submit jobs in parallel

• Default: false
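
The relationship between hive.exec.parallel and hive.exec.parallel.thread.number resembles a bounded thread pool. The sketch below models independent stages as plain Python callables; it is an analogy for illustration, not how Hive schedules jobs internally, and run_stages is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor


def run_stages(stages, parallel=False, thread_number=8):
    """Run independent stages and return their results in submission order.

    parallel mirrors hive.exec.parallel (default false) and thread_number
    mirrors hive.exec.parallel.thread.number (default 8).
    """
    if not parallel:
        # Sequential execution: one stage at a time.
        return [stage() for stage in stages]
    # At most thread_number stages run at the same time.
    with ThreadPoolExecutor(max_workers=thread_number) as pool:
        futures = [pool.submit(stage) for stage in stages]
        return [future.result() for future in futures]


if __name__ == "__main__":
    stages = [lambda i=i: i * i for i in range(5)]
    print(run_stages(stages, parallel=True, thread_number=2))
```

Only stages that do not depend on each other benefit from this kind of parallelism.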

hive.exec.compress.output

• Description: Whether the query output is compressed

• Default: false

hive.mapred.mode

• Description: The restriction mode for MapReduce operations; in nonstrict mode, operations run without restrictions

• Default: nonstrict

hive.join.cache.size

• Description: The number of rows that can be held in memory during a join

• Default: 25000

hive.mapjoin.cache.numrows

• Description: The number of rows a map join keeps cached in memory

• Default: 25000

hive.join.emit.interval

• Description: For a join, the number of rows Hive buffers before emitting the join result

• Default: 1000

hive.optimize.groupby

• Description: Whether to take advantage of bucketed tables when performing group-by aggregation

• Default: true

hive.fileformat.check

• Description: Whether to check the file format of input files

• Default: true

hive.metastore.client.connect.retry.delay

• Description: The delay between retries when a client fails to connect to the metastore

• Default: 1 second

hive.metastore.client.socket.timeout

• Description: The timeout of the metastore client socket

• Default: 20 seconds

mapred.reduce.tasks

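• Default: -1

• Description: The default number of reduce tasks per job; -1 means the number of reducers is determined automatically from the characteristics of the job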

hive.exec.reducers.bytes.per.reducer

• Default: 1000000000 (1 GB)

• Description: The amount of data each reducer handles; for example, if 10 GB of data is sent to the reduce phase, 10 reduce tasks are created

hive.exec.reducers.max

• Default: 999

• Description: The maximum number of reducers
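
The three reducer settings above (mapred.reduce.tasks, hive.exec.reducers.bytes.per.reducer, and hive.exec.reducers.max) combine into a single estimate. The following is a minimal sketch of that arithmetic, assuming the input size is known up front; estimate_reducers is a hypothetical helper, not a Hive API.

```python
def estimate_reducers(input_bytes: int,
                      reduce_tasks: int = -1,
                      bytes_per_reducer: int = 1_000_000_000,
                      max_reducers: int = 999) -> int:
    """Estimate how many reduce tasks a job would get.

    reduce_tasks mirrors mapred.reduce.tasks, bytes_per_reducer mirrors
    hive.exec.reducers.bytes.per.reducer, and max_reducers mirrors
    hive.exec.reducers.max.
    """
    if reduce_tasks > 0:
        # An explicit setting overrides the automatic estimate.
        return reduce_tasks
    # Ceiling division using integers only, to avoid float rounding.
    estimated = (input_bytes + bytes_per_reducer - 1) // bytes_per_reducer
    # Always at least one reducer, never more than the configured maximum.
    return max(1, min(max_reducers, estimated))


if __name__ == "__main__":
    print(estimate_reducers(10_000_000_000))  # 10 GB of input
```

For 10 GB of input and the defaults listed here this yields 10 reducers, matching the example in the description; very large inputs are capped at 999.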

hive.metastore.warehouse.dir

• Default: /user/hive/warehouse

• Description: The default location where databases are stored

hive.default.fileformat

• Default: TextFile

• Description: The default file format

hive.map.aggr

• Default: true

• Description: Whether to aggregate on the map side, which works much like a combiner

hive.exec.max.dynamic.partitions.pernode

• Default: 100

• Description: The maximum number of dynamic partitions that each task node may create

hive.exec.max.dynamic.partitions

• Default: 1000

• Description: The maximum total number of dynamic partitions that may be created
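
The dynamic partition limits can be pictured as two separate checks: one per task node and one across the whole job. The sketch below is a simplified model for illustration, assuming each node reports the set of partitions it created; check_dynamic_partitions and its arguments are hypothetical and do not correspond to a Hive API.

```python
def check_dynamic_partitions(partitions_by_node,
                             enabled=False,
                             max_per_node=100,
                             max_total=1000):
    """Return a list of limit violations (empty when everything is fine).

    enabled mirrors hive.exec.dynamic.partition, max_per_node mirrors
    hive.exec.max.dynamic.partitions.pernode, and max_total mirrors
    hive.exec.max.dynamic.partitions. partitions_by_node is a list of sets,
    one set of partition names per task node.
    """
    if not enabled:
        return ["dynamic partitioning is disabled"]
    problems = []
    for index, partitions in enumerate(partitions_by_node):
        if len(partitions) > max_per_node:
            problems.append(
                f"node {index} created {len(partitions)} partitions "
                f"(limit {max_per_node})")
    # The same partition may be written by several nodes, so count distinct names.
    total = len(set().union(*partitions_by_node))
    if total > max_total:
        problems.append(f"{total} partitions in total (limit {max_total})")
    return problems


if __name__ == "__main__":
    nodes = [{"dt=2023-01-01", "dt=2023-01-02"},
             {"dt=2023-01-02", "dt=2023-01-03"}]
    print(check_dynamic_partitions(nodes, enabled=True, max_total=2))
```

A job can stay under the per-node limit on every node and still exceed the overall limit, which is why both settings exist.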

hive.metastore.server.max.threads

• Default: 100000

• Description: The maximum number of worker threads in the metastore server

hive.metastore.server.min.threads

• Default: 200

• Description: The minimum number of worker threads in the metastore server

When reposting, please credit the source: http://sishuok.com/forum/blogPost/list/0/6225.html
