2024年大数据最新大数据集群搭建之Linux安装hadoop3，2024年最新跟我一起手写EventBus吧

2401_84615578

于 2024-05-14 01:23:27 发布

阅读量310

点赞数 4

分类专栏：程序员文章标签：大数据面试学习

本文链接：https://blog.csdn.net/2401_84615578/article/details/138826994

版权

程序员专栏收录该内容

51 篇文章 1 订阅

订阅专栏

网上学习资料一大堆，但如果学到的知识不成体系，遇到问题时只是浅尝辄止，不再深入研究，那么很难做到真正的技术提升。

需要这份系统化资料的朋友，可以戳这里获取

一个人可以走的很快，但一群人才能走的更远！不论你是正从事IT行业的老鸟或是对IT行业感兴趣的新人，都欢迎加入我们的的圈子（技术交流、学习资源、职场吐槽、大厂内推、面试辅导），让我们一起学习成长！

dfs.namenode.name.dir

/home/cluster/hadoop/data/nn

dfs.datanode.data.dir

/home/cluster/hadoop/data/dn

dfs.journalnode.edits.dir

/home/cluster/hadoop/data/jn

dfs.nameservices

ns1

dfs.ha.namenodes.ns1

hadoop001,hadoop002

dfs.namenode.rpc-address.ns1.hadoop001

hadoop001:8020

dfs.namenode.http-address.ns1.hadoop001

hadoop001:9870

dfs.namenode.rpc-address.ns1.hadoop002

hadoop002:8020

dfs.namenode.http-address.ns1.hadoop002

hadoop002:9870

dfs.ha.automatic-failover.enabled.ns1

true

dfs.client.failover.proxy.provider.ns1

org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider

dfs.permissions.enabled

false

dfs.replication

dfs.blocksize

HDFS blocksize of 128MB for large file-systems

dfs.namenode.handler.count

100

More NameNode server threads to handle RPCs from large number of DataNodes.

dfs.namenode.shared.edits.dir

qjournal://hadoop001:8485;hadoop002:8485;hadoop003:8485/ns1

dfs.ha.fencing.methods

sshfence

dfs.ha.fencing.ssh.private-key-files

/root/.ssh/id_rsa

mapred-site.xml

<?xml version="1.0"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

mapreduce.framework.name

yarn

Execution framework set to Hadoop YARN.

mapreduce.map.memory.mb

4096

Larger resource limit for maps.

mapreduce.map.java.opts

-Xmx4096M

Larger heap-size for child jvms of maps.

mapreduce.reduce.memory.mb

4096

Larger resource limit for reduces.

mapreduce.reduce.java.opts

-Xmx4096M

Larger heap-size for child jvms of reduces.

mapreduce.task.io.sort.mb

2040

Higher memory-limit while sorting data for efficiency.

mapreduce.task.io.sort.factor

400

More streams merged at once while sorting files.

mapreduce.reduce.shuffle.parallelcopies

200

Higher number of parallel copies run by reduces to fetch outputs from very large number of maps.

mapreduce.jobhistory.address

hadoop001:10020

MapReduce JobHistory Server host:port.Default port is 10020

mapreduce.jobhistory.webapp.address

hadoop001:19888

MapReduce JobHistory Server Web UI host:port.Default port is 19888.

mapreduce.jobhistory.intermediate-done-dir

/tmp/mr-history/tmp

Directory where history files are written by MapReduce jobs.

mapreduce.jobhistory.done-dir

/tmp/mr-history/done

Directory where history files are managed by the MR JobHistory Server.

yarn-site.xml

<?xml version="1.0"?>

yarn.resourcemanager.ha.enabled

true

yarn.resourcemanager.ha.automatic-failover.enabled

true

yarn.resourcemanager.ha.automatic-failover.embedded

true

yarn.resourcemanager.cluster-id

yarn-rm-cluster

yarn.resourcemanager.ha.rm-ids

rm1,rm2

yarn.resourcemanager.hostname.rm1

hadoop001

yarn.resourcemanager.hostname.rm2

hadoop002

yarn.resourcemanager.recovery.enabled

true

yarn.resourcemanager.zk.state-store.address

hadoop001:2181,hadoop002:2181,hadoop003:2181

yarn.resourcemanager.zk-address

hadoop001:2181,hadoop002:2181,hadoop003:2181

yarn.resourcemanager.address.rm1

hadoop001:8032

yarn.resourcemanager.address.rm2

hadoop002:8032

yarn.resourcemanager.scheduler.address.rm1

hadoop001:8034

yarn.resourcemanager.webapp.address.rm1

hadoop001:8088

yarn.resourcemanager.scheduler.address.rm2

hadoop002:8034

yarn.resourcemanager.webapp.address.rm2

hadoop002:8088

yarn.acl.enable

true

Enable ACLs? Defaults to false.

yarn.admin.acl

yarn.log-aggregation-enable

false

Configuration to enable or disable log aggregation

yarn.resourcemanager.hostname

hadoop001

host Single hostname that can be set in place of setting all yarn.resourcemanager*address resources. Results in default ports for ResourceManager components.

yarn.scheduler.minimum-allocation-mb

1024

saprk调度时一个container能够申请的最小资源，默认值为1024MB

yarn.scheduler.maximum-allocation-mb

28672

saprk调度时一个container能够申请的最大资源，默认值为8192MB

yarn.nodemanager.resource.memory-mb

28672

nodemanager能够申请的最大内存，默认值为8192MB

yarn.app.mapreduce.am.resource.mb

28672

AM能够申请的最大内存，默认值为1536MB

yarn.nodemanager.log.retain-seconds

10800

yarn.nodemanager.log-dirs

/home/cluster/yarn/log/1,/home/cluster/yarn/log/2,/home/cluster/yarn/log/3

yarn.nodemanager.aux-services

mapreduce_shuffle

Shuffle service that needs to be set for Map Reduce applications.

yarn.log-aggregation.retain-seconds

-1

yarn.log-aggregation.retain-check-interval-seconds

-1

yarn.app.mapreduce.am.staging-dir

hdfs://ns1/tmp/hadoop-yarn/staging

The staging dir used while submitting jobs.

yarn.application.classpath

/usr/local/hadoop/hadoop/etc/hadoop:/usr/local/hadoop/hadoop/share/hadoop/common/lib/:/usr/local/hadoop/hadoop/share/hadoop/common/:/usr/local/hadoop/hadoop/share/hadoop/hdfs:/usr/local/hadoop/hadoop/share/hadoop/hdfs/lib/:/usr/local/hadoop/hadoop/share/hadoop/hdfs/:/usr/local/hadoop/hadoop/share/hadoop/mapreduce/:/usr/local/hadoop/hadoop/share/hadoop/yarn:/usr/local/hadoop/hadoop/share/hadoop/yarn/lib/:/usr/local/hadoop/hadoop/share/hadoop/yarn/*

Linux上打 hadoop classpath 找到的所有路径

五、初始化集群