Summary of Hadoop Configuration Parameters and Default Ports


core-site.xml

http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml
Each entry below lists the parameter, its meaning, the official description, an example, and the official default.

fs.defaultFS
  Meaning: URI for accessing the HDFS distributed file system.
  Description: The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.
  Example:
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://192.168.100.200:9000</value>
    </property>
  Default: file:///

hadoop.tmp.dir
  Meaning: Base directory for other temporary files.
  Description: A base for other temporary directories.
  Example:
    <property>
      <name>hadoop.tmp.dir</name>
      <value>file:///uloc/hadoopdata/hadoop-${user.name}/tmp</value>
    </property>
  Default: /tmp/hadoop-${user.name}

io.file.buffer.size
  Meaning: Buffer size for read and write operations; usually an integer multiple of the hardware page size.
  Description: The size of buffer for use in sequence files. The size of this buffer should probably be a multiple of hardware page size (4096 on Intel x86), and it determines how much data is buffered during read and write operations.
  Example:
    <property>
      <name>io.file.buffer.size</name>
      <value>4096</value>
    </property>
  Default: 4096

hadoop.http.staticuser.user
  Description: The user name to filter as, on static web filters while rendering content. An example use is the HDFS web UI (user to be used for browsing files).
  Example:
    <property>
      <name>hadoop.http.staticuser.user</name>
      <value>bruce</value>
    </property>
  Default: dr.who
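Taken together, the four entries above can be assembled into one minimal core-site.xml. This is a sketch only: the host 192.168.100.200, the /uloc/hadoopdata path, and the user name bruce are illustrative values taken from this document's examples, not requirements.

```xml
<?xml version="1.0"?>
<!-- Minimal example core-site.xml assembled from the parameters above.
     Host, paths, and user name are illustrative values only. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.100.200:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:///uloc/hadoopdata/hadoop-${user.name}/tmp</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>4096</value>
  </property>
  <property>
    <name>hadoop.http.staticuser.user</name>
    <value>bruce</value>
  </property>
</configuration>
```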
Parameter  Default address/port  Description  Notes
hadoop.registry.zk.quorum localhost:2181 List of hostname:port pairs defining the zookeeper quorum binding for the registry
fs.defaultFS file:/// The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem. Note: typically configured as something like hdfs://192.168.100.200:9000, in which case the NameNode starts an ipc.Server listening on port 9000.
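A quick way to verify that the NameNode (or any daemon in the port tables below) is actually listening on its configured port is a plain TCP connect. This is a generic sketch, not a Hadoop-specific tool; the host and port arguments are whatever you configured.

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: check the fs.defaultFS port configured above.
# port_open("192.168.100.200", 9000)
```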
 


hdfs-site.xml:

http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
Each entry below lists the parameter, its meaning, the official description, an example, and the official default.

dfs.namenode.name.dir
  Meaning: Directory where the HDFS NameNode stores the name table (fsimage). If several directories are given, a copy of the name table is kept in each.
  Description: Determines where on the local filesystem the DFS name node should store the name table(fsimage). If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.
  Example:
    <property>
      <name>dfs.namenode.name.dir</name>
      <value>file:///uloc/hadoopdata/hadoop-${user.name}/dfs/name</value>
      <final>true</final>
    </property>
  Default: file://${hadoop.tmp.dir}/dfs/name

dfs.datanode.data.dir
  Meaning: Directories where an HDFS DataNode stores its blocks. Several directories (typically on different devices) may be configured; data is spread across all of them.
  Description: Determines where on the local filesystem an DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. The directories should be tagged with corresponding storage types ([SSD]/[DISK]/[ARCHIVE]/[RAM_DISK]) for HDFS storage policies. The default storage type will be DISK if the directory does not have a storage type tagged explicitly. Directories that do not exist will be created if local filesystem permission allows.
  Example:
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>file:///uloc/hadoopdata/hadoop-${user.name}/dfs/data</value>
      <final>true</final>
    </property>
  Default: file://${hadoop.tmp.dir}/dfs/data

dfs.replication
  Meaning: Number of replicas kept for each data block.
  Description: Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.
  Example:
    <property>
      <name>dfs.replication</name>
      <value>1</value>
    </property>
  Default: 3

dfs.namenode.secondary.http-address
  Meaning: Address and port of the secondary NameNode.
  Description: The secondary namenode http server address and port.
  Example:
    <property>
      <name>dfs.namenode.secondary.http-address</name>
      <value>192.168.100.200:50090</value>
    </property>
  Default: 0.0.0.0:50090

dfs.namenode.checkpoint.dir
  Meaning: Directory where the secondary NameNode stores temporary images to merge. If several directories are configured, a copy of the image is kept in each.
  Description: Determines where on the local filesystem the DFS secondary name node should store the temporary images to merge. If this is a comma-delimited list of directories then the image is replicated in all of the directories for redundancy.
  Example:
    <property>
      <name>dfs.namenode.checkpoint.dir</name>
      <value>file:///uloc/hadoopdata/hadoop-${user.name}/dfs/namesecondary</value>
      <final>true</final>
    </property>
  Default: file://${hadoop.tmp.dir}/dfs/namesecondary

dfs.permissions.enabled
  Meaning: Switch for HDFS file permission checking.
  Description: If "true", enable permission checking in HDFS. If "false", permission checking is turned off, but all other behavior is unchanged. Switching from one parameter value to the other does not change the mode, owner or group of files or directories.
  Example:
    <property>
      <name>dfs.permissions.enabled</name>
      <value>false</value>
    </property>
  Default: true

dfs.datanode.address
  Meaning: Address and port the DataNode uses for data transfer.
  Description: The datanode server address and port for data transfer.
  Example:
    <property>
      <name>dfs.datanode.address</name>
      <value>192.168.100.200:50010</value>
    </property>
  Default: 0.0.0.0:50010

dfs.webhdfs.enabled
  Meaning: Switch for the WebHDFS (REST API) feature.
  Description: Enable WebHDFS (REST API) in Namenodes and Datanodes.
  Example:
    <property>
      <name>dfs.webhdfs.enabled</name>
      <value>true</value>
    </property>
  Default: true

dfs.support.append
  Meaning: Not present as a configuration item in the YARN framework.
  Example:
    <property>
      <name>dfs.support.append</name>
      <value>true</value>
    </property>
  Default: (none)

dfs.permissions.superusergroup
  Description: The name of the group of super-users.
  Example:
    <property>
      <name>dfs.permissions.superusergroup</name>
      <value>oinstall</value>
    </property>
  Default: supergroup

dfs.block.invalidate.limit
  Meaning: Number of blocks to delete at a time; the recommended (default) setting is 1000.
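The entries above can likewise be assembled into one minimal hdfs-site.xml. This is a sketch: the host, the /uloc/hadoopdata paths, and the replication factor of 1 (a single-node test setup) are illustrative values from this document's examples.

```xml
<?xml version="1.0"?>
<!-- Minimal example hdfs-site.xml assembled from the entries above.
     Host, paths, and dfs.replication=1 (single-node test setup)
     are illustrative values only. -->
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///uloc/hadoopdata/hadoop-${user.name}/dfs/name</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///uloc/hadoopdata/hadoop-${user.name}/dfs/data</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>192.168.100.200:50090</value>
  </property>
</configuration>
```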


Parameter  Default address/port  Description
dfs.namenode.rpc-address   RPC address that handles all clients requests. In the case of HA/Federation where multiple namenodes exist, the name service id is added to the name e.g. dfs.namenode.rpc-address.ns1 dfs.namenode.rpc-address.EXAMPLENAMESERVICE The value of this property will take the form of nn-host1:rpc-port.
dfs.namenode.rpc-bind-host   The actual address the RPC server will bind to. If this optional address is set, it overrides only the hostname portion of dfs.namenode.rpc-address. It can also be specified per name node or name service for HA/Federation. This is useful for making the name node listen on all interfaces by setting it to 0.0.0.0.
dfs.namenode.servicerpc-address   RPC address for HDFS Services communication. BackupNode, Datanodes and all other services should be connecting to this address if it is configured. In the case of HA/Federation where multiple namenodes exist, the name service id is added to the name e.g. dfs.namenode.servicerpc-address.ns1 dfs.namenode.rpc-address.EXAMPLENAMESERVICE The value of this property will take the form of nn-host1:rpc-port. If the value of this property is unset the value of dfs.namenode.rpc-address will be used as the default.
dfs.namenode.servicerpc-bind-host   The actual address the service RPC server will bind to. If this optional address is set, it overrides only the hostname portion of dfs.namenode.servicerpc-address. It can also be specified per name node or name service for HA/Federation. This is useful for making the name node listen on all interfaces by setting it to 0.0.0.0.
dfs.namenode.secondary.http-address 0.0.0.0:50090 The secondary namenode http server address and port.
dfs.namenode.secondary.https-address 0.0.0.0:50091 The secondary namenode HTTPS server address and port.
dfs.namenode.http-address 0.0.0.0:50070 The address and the base port where the dfs namenode web ui will listen on.
dfs.namenode.https-address 0.0.0.0:50470 The namenode secure http server address and port.
dfs.namenode.backup.address 0.0.0.0:50100 The backup node server address and port. If the port is 0 then the server will start on a free port.
dfs.namenode.backup.http-address 0.0.0.0:50105 The backup node http server address and port. If the port is 0 then the server will start on a free port.
dfs.datanode.address 0.0.0.0:50010 The datanode server address and port for data transfer.
dfs.datanode.http.address 0.0.0.0:50075 The datanode http server address and port.
dfs.datanode.ipc.address 0.0.0.0:50020 The datanode ipc server address and port.
dfs.datanode.https.address 0.0.0.0:50475 The datanode secure http server address and port.
dfs.journalnode.rpc-address 0.0.0.0:8485 The JournalNode RPC server address and port.
dfs.journalnode.http-address 0.0.0.0:8480 The address and port the JournalNode HTTP server listens on. If the port is 0 then the server will start on a free port.
dfs.journalnode.https-address 0.0.0.0:8481 The address and port the JournalNode HTTPS server listens on. If the port is 0 then the server will start on a free port.



mapred-site.xml:

http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
Each entry below lists the parameter, its meaning, the official description, an example, and the official default.

mapreduce.framework.name
  Meaning: The MapReduce runtime framework to use.
  Description: The runtime framework for executing MapReduce jobs. Can be one of local, classic or yarn.
  Example:
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>
  Default: local

mapreduce.shuffle.port
  Meaning: Port the ShuffleHandler runs on.
  Description: Default port that the ShuffleHandler will run on. ShuffleHandler is a service run at the NodeManager to facilitate transfers of intermediate Map outputs to requesting Reducers.
  Example:
    <property>
      <name>mapreduce.shuffle.port</name>
      <value>13562</value>
    </property>
  Default: 13562

mapred.system.dir
  Meaning: Not supported under YARN.
  Example:
    <property>
      <name>mapred.system.dir</name>
      <value>file:///uloc/hadoopdata/hadoop-${user.name}/mapred/system</value>
      <final>true</final>
    </property>
  Default: (none)

mapred.local.dir
  Meaning: Not supported under YARN.
  Example:
    <property>
      <name>mapred.local.dir</name>
      <value>file:///uloc/hadoopdata/hadoop-${user.name}/mapred/local</value>
      <final>true</final>
    </property>
  Default: (none)

mapred.child.java.opts
  Meaning: Java options for the task processes.
  Description: Java opts for the task processes. The following symbol, if present, will be interpolated: @taskid@ is replaced by current TaskID. Any other occurrences of '@' will go unchanged. For example, to enable verbose gc logging to a file named for the taskid in /tmp and to set the heap maximum to be a gigabyte, pass a 'value' of: -Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc Usage of -Djava.library.path can cause programs to no longer function if hadoop native libraries are used. These values should instead be set as part of LD_LIBRARY_PATH in the map / reduce JVM env using the mapreduce.map.env and mapreduce.reduce.env config settings.
  Example:
    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx3072M</value>
    </property>
  Default: -Xmx200m

mapreduce.reduce.java.opts
  Meaning: Not supported under YARN.
  Example:
    <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>-Xmx1024M</value>
    </property>
  Default: (none)

mapreduce.map.memory.mb
  Description: The amount of memory to request from the scheduler for each map task.
  Example:
    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>1024</value>
    </property>
  Default: 1024

mapreduce.reduce.memory.mb
  Description: The amount of memory to request from the scheduler for each reduce task.
  Example:
    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>1024</value>
    </property>
  Default: 1024

mapreduce.task.io.sort.mb
  Description: The total amount of buffer memory to use while sorting files, in megabytes. By default, gives each merge stream 1MB, which should minimize seeks.
  Example:
    <property>
      <name>mapreduce.task.io.sort.mb</name>
      <value>1024</value>
    </property>
  Default: 100

mapreduce.task.io.sort.factor
  Description: The number of streams to merge at once while sorting files. This determines the number of open file handles.
  Example:
    <property>
      <name>mapreduce.task.io.sort.factor</name>
      <value>100</value>
    </property>
  Default: 10

mapreduce.reduce.shuffle.parallelcopies
  Description: The default number of parallel transfers run by reduce during the copy(shuffle) phase.
  Example:
    <property>
      <name>mapreduce.reduce.shuffle.parallelcopies</name>
      <value>50</value>
    </property>
  Default: 5

mapreduce.jobhistory.address
  Description: MapReduce JobHistory Server IPC host:port
  Example:
    <property>
      <name>mapreduce.jobhistory.address</name>
      <value>192.168.100.200:10020</value>
    </property>
  Default: 0.0.0.0:10020
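A minimal mapred-site.xml from the entries above: only the framework selection is strictly needed to run on YARN; the JobHistory address uses this document's example host and is illustrative.

```xml
<?xml version="1.0"?>
<!-- Minimal example mapred-site.xml. mapreduce.framework.name=yarn
     selects the YARN runtime; the host is an illustrative value. -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>192.168.100.200:10020</value>
  </property>
</configuration>
```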



Parameter  Default address/port  Description
mapreduce.jobtracker.http.address 0.0.0.0:50030 The job tracker http server address and port the server will listen on. If the port is 0 then the server will start on a free port.
mapreduce.tasktracker.report.address 127.0.0.1:0 The interface and port that task tracker server listens on. Since it is only connected to by the tasks, it uses the local interface. EXPERT ONLY. Should only be changed if your host does not have the loopback interface.
mapreduce.tasktracker.http.address 0.0.0.0:50060 The task tracker http server address and port. If the port is 0 then the server will start on a free port.
mapreduce.jobhistory.address 0.0.0.0:10020 MapReduce JobHistory Server IPC host:port
mapreduce.jobhistory.webapp.address 0.0.0.0:19888 MapReduce JobHistory Server Web UI host:port
mapreduce.jobhistory.admin.address 0.0.0.0:10033 The address of the History server admin interface.



yarn-site.xml:

http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-common/yarn-default.xml  
Each entry below lists the parameter, its meaning, the official description, an example, and the official default.

yarn.resourcemanager.address
  Description: The address of the applications manager interface in the RM.
  Example:
    <property>
      <name>yarn.resourcemanager.address</name>
      <value>192.168.100.200:8032</value>
    </property>
  Default: ${yarn.resourcemanager.hostname}:8032

yarn.resourcemanager.scheduler.address
  Description: The address of the scheduler interface.
  Example:
    <property>
      <name>yarn.resourcemanager.scheduler.address</name>
      <value>192.168.100.200:8030</value>
    </property>
  Default: ${yarn.resourcemanager.hostname}:8030

yarn.resourcemanager.resource-tracker.address
  Example:
    <property>
      <name>yarn.resourcemanager.resource-tracker.address</name>
      <value>192.168.100.200:8031</value>
    </property>
  Default: ${yarn.resourcemanager.hostname}:8031

yarn.resourcemanager.admin.address
  Description: The address of the RM admin interface.
  Example:
    <property>
      <name>yarn.resourcemanager.admin.address</name>
      <value>192.168.100.200:8033</value>
    </property>
  Default: ${yarn.resourcemanager.hostname}:8033

yarn.resourcemanager.webapp.address
  Description: The http address of the RM web application.
  Example:
    <property>
      <name>yarn.resourcemanager.webapp.address</name>
      <value>192.168.100.200:8088</value>
    </property>
  Default: ${yarn.resourcemanager.hostname}:8088

yarn.nodemanager.aux-services
  Description: A comma separated list of services where service name should only contain a-zA-Z0-9_ and can not start with numbers.
  Example:
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>
  Note: because service names may only contain a-zA-Z0-9_, the value must be mapreduce_shuffle; mapreduce.shuffle (with a dot) is rejected by current Hadoop releases.
  Default: (none)

yarn.nodemanager.aux-services.mapreduce_shuffle.class
  Meaning: Implementation class for the mapreduce_shuffle auxiliary service.
  Example:
    <property>
      <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
      <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
  Default: (none)

yarn.scheduler.maximum-allocation-mb
  Description: The maximum allocation for every container request at the RM, in MBs. Memory requests higher than this will throw a InvalidResourceRequestException.
  Example:
    <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>10000</value>
    </property>
  Default: 8192

yarn.scheduler.minimum-allocation-mb
  Description: The minimum allocation for every container request at the RM, in MBs. Memory requests lower than this will throw a InvalidResourceRequestException.
  Example:
    <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>1000</value>
    </property>
  Default: 1024

mapreduce.reduce.memory.mb
  Meaning: Not supported under YARN (this is a mapred-site.xml parameter).
  Example:
    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>1000</value>
    </property>
  Default: (none)

yarn.nodemanager.local-dirs
  Description: List of directories to store localized files in. An application's localized file directory will be found in: ${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}. Individual containers' work directories, called container_${contid}, will be subdirectories of this.
  Example:
    <property>
      <name>yarn.nodemanager.local-dirs</name>
      <value>/uloc/hadoopdata/hadoop-${user.name}/yarn/nmlocal</value>
    </property>
  Default: ${hadoop.tmp.dir}/nm-local-dir

yarn.nodemanager.resource.memory-mb
  Description: Amount of physical memory, in MB, that can be allocated for containers.
  Example:
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>4096</value>
    </property>
  Default: 8192
  Note: do not set this too small (e.g., 1024), or jobs can be submitted but never run. In testing, even 2048 was not enough to run a job; 3172 worked.

yarn.nodemanager.remote-app-log-dir
  Description: Where to aggregate logs to.
  Example:
    <property>
      <name>yarn.nodemanager.remote-app-log-dir</name>
      <value>/uloc/hadoopdata/hadoop-${user.name}/yarn/logs</value>
    </property>
  Default: /tmp/logs

yarn.nodemanager.log-dirs
  Description: Where to store container logs. An application's localized log directory will be found in ${yarn.nodemanager.log-dirs}/application_${appid}. Individual containers' log directories will be below this, in directories named container_{$contid}. Each container directory will contain the files stderr, stdin, and syslog generated by that container.
  Example:
    <property>
      <name>yarn.nodemanager.log-dirs</name>
      <value>/uloc/hadoopdata/hadoop-${user.name}/yarn/userlogs</value>
    </property>
  Default: ${yarn.log.dir}/userlogs

yarn.web-proxy.address
  Description: The address for the web proxy as HOST:PORT, if this is not given then the proxy will run as part of the RM.
  Example:
    <property>
      <name>yarn.web-proxy.address</name>
      <value>192.168.100.200:54315</value>
    </property>
  Default: (none)

yarn.resourcemanager.hostname
  Description: The hostname of the RM.
  Example:
    <property>
      <name>yarn.resourcemanager.hostname</name>
      <value>robot123</value>
    </property>
  Default: 0.0.0.0

yarn.nodemanager.address
  Description: The address of the container manager in the NM.
  Example:
    <property>
      <name>yarn.nodemanager.address</name>
      <value>192.168.100.200:11000</value>
    </property>
  Default: ${yarn.nodemanager.hostname}:0
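A minimal yarn-site.xml for running MapReduce on YARN, assembled from the entries above. The hostname robot123 and the 4096 MB memory budget are illustrative values from this document's examples; size the memory budget to your hardware (see the note on yarn.nodemanager.resource.memory-mb above).

```xml
<?xml version="1.0"?>
<!-- Minimal example yarn-site.xml for running MapReduce on YARN.
     Hostname and memory size are illustrative values only. -->
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>robot123</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
  </property>
</configuration>
```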



Parameter  Default address/port  Description
yarn.resourcemanager.hostname 0.0.0.0 The hostname of the RM.
yarn.resourcemanager.address ${yarn.resourcemanager.hostname}:8032 The address of the applications manager interface in the RM.
yarn.resourcemanager.scheduler.address ${yarn.resourcemanager.hostname}:8030 The address of the scheduler interface.
yarn.resourcemanager.webapp.address ${yarn.resourcemanager.hostname}:8088 The http address of the RM web application.
yarn.resourcemanager.webapp.https.address ${yarn.resourcemanager.hostname}:8090 The https address of the RM web application.
yarn.resourcemanager.resource-tracker.address ${yarn.resourcemanager.hostname}:8031  
yarn.resourcemanager.admin.address ${yarn.resourcemanager.hostname}:8033 The address of the RM admin interface.
yarn.nodemanager.hostname 0.0.0.0 The hostname of the NM.
yarn.nodemanager.address ${yarn.nodemanager.hostname}:0 The address of the container manager in the NM.
yarn.nodemanager.localizer.address ${yarn.nodemanager.hostname}:8040 Address where the localizer IPC is.
yarn.nodemanager.webapp.address ${yarn.nodemanager.hostname}:8042 NM Webapp address.
yarn.timeline-service.hostname 0.0.0.0 The hostname of the timeline service web application.
yarn.timeline-service.address ${yarn.timeline-service.hostname}:10200 This is default address for the timeline server to start the RPC server.
yarn.timeline-service.webapp.address ${yarn.timeline-service.hostname}:8188 The http address of the timeline service web application.
yarn.timeline-service.webapp.https.address ${yarn.timeline-service.hostname}:8190 The https address of the timeline service web application.
yarn.sharedcache.admin.address 0.0.0.0:8047 The address of the admin interface in the SCM (shared cache manager)
yarn.sharedcache.webapp.address 0.0.0.0:8788 The address of the web application in the SCM (shared cache manager)
yarn.sharedcache.uploader.server.address 0.0.0.0:8046 The address of the node manager interface in the SCM (shared cache manager)
yarn.sharedcache.client-server.address 0.0.0.0:8045 The address of the client interface in the SCM (shared cache manager)
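Many of the defaults above are written in terms of other properties, e.g. ${yarn.resourcemanager.hostname}:8032. Hadoop's Configuration class expands such ${...} references at read time. A simplified, hypothetical resolver sketching the idea (real Hadoop also caps substitution depth and consults system properties):

```python
import re

def resolve(props: dict, key: str) -> str:
    """Expand ${other.key} references in a property value,
    in the style of Hadoop's Configuration variable substitution
    (simplified: no depth limit, no system-property fallback)."""
    value = props[key]
    pattern = re.compile(r"\$\{([^}]+)\}")
    while True:
        m = pattern.search(value)
        if m is None:
            return value
        # Splice the referenced property's value into place.
        value = value[:m.start()] + props[m.group(1)] + value[m.end():]

props = {
    "yarn.resourcemanager.hostname": "robot123",
    "yarn.resourcemanager.address": "${yarn.resourcemanager.hostname}:8032",
}
print(resolve(props, "yarn.resourcemanager.address"))  # robot123:8032
```

This is why setting only yarn.resourcemanager.hostname is often enough: every RM address default picks up the new host while keeping its standard port.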
