Learning Hadoop (3)

Original post: January 9, 2011, 19:23

Hadoop Core Server Configuration

Default Shared File System URI and NameNode Location for HDFS
The fs.default.name parameter's default value is file:///, which instructs the framework to use the local file system. An example of an HDFS URI is hdfs://NamenodeHost[:8020]/, which informs the framework to use the shared file system (HDFS).
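As a sketch of how a client picks up this setting, the snippet below sets fs.default.name programmatically; the host name master and port 8020 are placeholder assumptions, not values from this article:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class SharedFsExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Override the default file:/// with a shared HDFS instance;
            // "master" and port 8020 are placeholders for your NameNode.
            conf.set("fs.default.name", "hdfs://master:8020/");
            FileSystem fs = FileSystem.get(conf);
            System.out.println("Using file system: " + fs.getUri());
        }
    }
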
JobTracker Host and Port
The URI specified in this parameter (mapred.job.tracker) informs the Hadoop Core framework of the JobTracker's location. The default value is local, which indicates that no JobTracker server is to be run and all tasks will run in a single JVM. JobtrackerHost is the host on which the JobTracker server process will be run. This value may be altered by individual jobs.
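A minimal sketch of pointing a job at a specific JobTracker follows; the port 9001 is an illustrative assumption, not prescribed by this article:

    import org.apache.hadoop.mapred.JobConf;

    public class JobTrackerExample {
        public static void main(String[] args) {
            JobConf conf = new JobConf();
            // Submit to a remote JobTracker instead of the default "local"
            // in-process runner; host and port here are placeholders.
            conf.set("mapred.job.tracker", "JobtrackerHost:9001");
            System.out.println("JobTracker: " + conf.get("mapred.job.tracker"));
        }
    }
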
Maximum Concurrent Map Tasks per TaskTracker
The mapred.tasktracker.map.tasks.maximum parameter sets the maximum number of map tasks that may be run by a TaskTracker server process on a host at one time. Note that while a TaskTracker runs each map task singly, one map task may itself run many threads; this thread count may be set via the following:
    jobConf.setInt("mapred.map.multithreadedrunner.threads", threadCount);
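For context, a fuller sketch using the old-API MultithreadedMapRunner, which is the class that reads this property; the helper class name here is illustrative:

    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.lib.MultithreadedMapRunner;

    public class MultithreadedSetup {
        public static void configure(JobConf jobConf, int threadCount) {
            // Replace the default map runner so one map task drives
            // several mapper threads, useful for I/O-bound mappers.
            jobConf.setMapRunnerClass(MultithreadedMapRunner.class);
            jobConf.setInt("mapred.map.multithreadedrunner.threads", threadCount);
        }
    }
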
Maximum Concurrent Reduce Tasks per TaskTracker
The mapred.tasktracker.reduce.tasks.maximum parameter sets the maximum number of reduce tasks that may be run by a TaskTracker at one time. Reduce tasks tend to be I/O bound, and it is not uncommon to have the per-machine maximum reduce task value set to 1 or 2.
JVM Options for the Task Virtual Machines
During the run phase of a job, there may be up to mapred.tasktracker.map.tasks.maximum map tasks and mapred.tasktracker.reduce.tasks.maximum reduce tasks running simultaneously on each TaskTracker node, as well as the TaskTracker JVM itself. The mapred.child.java.opts parameter supplies the JVM options passed to each of these task JVMs.
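A hedged example of sizing the task JVM heaps per job; the 512 MB figure is an assumption for illustration, not a recommendation from this article:

    import org.apache.hadoop.mapred.JobConf;

    public class TaskJvmOptions {
        public static void main(String[] args) {
            JobConf conf = new JobConf();
            // Each concurrently running map or reduce task gets its own JVM
            // started with these options; budget roughly
            // (map slots + reduce slots) * heap size of memory per node.
            conf.set("mapred.child.java.opts", "-Xmx512m");
        }
    }
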
Enable Job Control Options on the Web Interfaces
Both the JobTracker and the NameNode provide a web interface for monitoring and control. By default, the JobTracker provides web service on http://JobtrackerHost:50030 and the NameNode provides web service on http://NamenodeHost:50070. Setting webinterface.private.actions to true in the configuration enables control actions, such as killing jobs, on these pages.


Hadoop's servers and clients talk to one another through its interprocess communications (IPC) layer, an RPC mechanism built on top of TCP/IP.

Configuration Requirements

Network Requirements
Hadoop Core uses Secure Shell (SSH) to launch the server processes on the slave nodes. Hadoop Core requires that passwordless SSH work between the master machines and all of the slave and secondary machines.
Advanced Networking: Support for Multihomed Machines
dfs.datanode.dns.interface: If set, this parameter is the name of the network interface to be used for HDFS transactions to the DataNode. The IP address of this interface will be advertised by the DataNode as its contact address.
dfs.datanode.dns.nameserver: If set, this parameter is the hostname or IP address of a machine to use to perform a reverse host lookup on the IP address associated with the specified network interface.
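To illustrate how these two settings are used together, the sketch below calls Hadoop's org.apache.hadoop.net.DNS helper, which the DataNode uses to resolve its advertised hostname; the interface name eth1 is a placeholder assumption:

    import org.apache.hadoop.net.DNS;

    public class AdvertisedAddress {
        public static void main(String[] args) throws Exception {
            // Resolve the hostname to advertise for a given interface, as
            // the DataNode does with dfs.datanode.dns.interface and
            // dfs.datanode.dns.nameserver. "eth1" is a placeholder;
            // "default" means fall back to the system resolver.
            String host = DNS.getDefaultHost("eth1", "default");
            System.out.println("Advertised hostname: " + host);
        }
    }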

rsync is the Unix remote synchronization command; it can be used to push the configuration files out to the other nodes.
