0. Environment overview
Environment: VMware 11 + Ubuntu 14.04 + Hadoop 2.6.0 + JDK 1.8.0_45
Ubuntu user: nob  # an ordinary user; switch to the superuser with "sudo su -" only when changing system configuration
Node layout (distributed cluster: master is the primary node, the others are workers):
hostname | ip            | role                      |
master   | 192.168.1.108 | NameNode, ResourceManager |
slave1   | 192.168.1.111 | DataNode, NodeManager     |
slave2   | 192.168.1.112 | DataNode, NodeManager     |
1. Install VMware and Ubuntu 14.04
This step is straightforward; many online guides cover it.
Use bridged networking so the virtual Ubuntu machines sit on the same subnet as the host and can reach the Internet, which makes downloading packages easy.
Installing VMware Tools enables shared folders between the guest and the host.
2. Install the JDK
Switch to the superuser to install JDK 1.8.0_45 and configure the JAVA_HOME environment variable; see reference 1 for details.
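Configuring JAVA_HOME usually means appending a couple of exports to /etc/profile or ~/.bashrc. A minimal sketch; the install path below is an assumption built from this guide's JDK version, so adjust it to wherever your JDK actually lives:

```shell
# Hypothetical JDK install path -- substitute your actual directory.
export JAVA_HOME=/usr/java/jdk1.8.0_45
export PATH="$JAVA_HOME/bin:$PATH"
echo "JAVA_HOME is $JAVA_HOME"
```

After sourcing the file (`source ~/.bashrc`), `java -version` should report 1.8.0_45.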
3. Clone the machines and configure hosts
Use VMware's clone feature to produce the three machines from the configured image.
Set each machine's hostname: sudo vim /etc/hostname
For example, set the master node's hostname to: master
Edit /etc/hosts on every machine in the cluster:
sudo vim /etc/hosts
Add:
192.168.1.108 master
192.168.1.111 slave1
192.168.1.112 slave2
Restart networking to refresh DNS:
sudo /etc/init.d/networking restart
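The three host entries can also be generated in one shot. A sketch that only prints the lines so you can review them, then pipe them through `sudo tee -a /etc/hosts` on each machine:

```shell
# Print the cluster's /etc/hosts entries; once they look right, run
#   ./this-script | sudo tee -a /etc/hosts
for entry in "192.168.1.108 master" "192.168.1.111 slave1" "192.168.1.112 slave2"; do
  echo "$entry"
done
```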
4. Set up passwordless SSH login
All machines may share a single SSH key, or each may use its own (see the approach in reference 2). Either way, the end result is that every machine can log in to every other,
e.g. from master into slave1: ssh slave1
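With a single shared key, the setup boils down to generating one key pair and pushing the public key to every node. A dry-run sketch that only prints the commands (remove the `echo` to execute; note that `ssh-copy-id` will prompt for each node's password):

```shell
# Dry run: print the passwordless-SSH setup commands for this cluster.
echo 'ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa'
for host in master slave1 slave2; do
  echo "ssh-copy-id nob@$host"
done
```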
5. Configure Hadoop
The current stable Hadoop release is 2.6; the official site lists several mirrors, e.g. http://mirror.bit.edu.cn/apache/hadoop/common/stable/
Download and extract:
sudo mkdir /data
sudo chown nob: /data
mkdir /data/server
cd /data/server
cp ~/download/hadoop-2.6.0.tar.gz /data/server/hadoop-2.6.0.tar.gz
tar xzvf hadoop-2.6.0.tar.gz
Configuration notes:
The configuration below follows the latest Hadoop 2.6 official documentation.
It is a minimal configuration: the commented-out properties can be enabled, and anything left unset falls back to Hadoop's defaults.
For configuration differences between versions, see reference 3.
The same configuration also works for pseudo-distributed mode; in that case change the node listed in HADOOP_HOME/etc/hadoop/slaves to master.
The details:
1. Edit hadoop-2.6.0/etc/hadoop/hadoop-env.sh and point it at the JDK:
export JAVA_HOME=/usr/java/jdk1.8.0_45
If you do not know your JDK directory, check it with: echo $JAVA_HOME
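If $JAVA_HOME is empty as well, one common trick is to resolve the `java` binary on the PATH back to its install directory (this assumes a JDK is installed and `java` is on the PATH):

```shell
# Resolve the real JDK directory from the java binary on PATH.
java_bin=$(command -v java || true)
if [ -n "$java_bin" ]; then
  readlink -f "$java_bin" | sed 's:/bin/java$::'   # strip the trailing /bin/java
else
  echo "java not found on PATH"
fi
```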
2. Edit hadoop-2.6.0/etc/hadoop/core-site.xml
First create /home/nob/hadoop_tmp, then:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/nob/hadoop_tmp</value>
<description>Abase for other temporary directories.</description>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
</configuration>
Note: fs.defaultFS was called fs.default.name in older releases.
3. Edit hadoop-2.6.0/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<!--
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/nob/hadoop_tmp/dfs/name</value>
<description>Path on the local filesystem where the NameNode stores the namespace and transactions logs persistently.</description>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/nob/hadoop_tmp/dfs/data</value>
<description>Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
<description>If "true", enable permission checking in HDFS. If "false", permission checking is turned off.</description>
</property>
-->
</configuration>
Note: dfs.namenode.name.dir and dfs.datanode.data.dir were called dfs.name.dir and dfs.data.dir in older releases.
4. Edit hadoop-2.6.0/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Note: this sets MapReduce to run on the YARN framework. The address and port that older MapReduce clients pointed at now correspond to yarn.resourcemanager.scheduler.address in yarn-site.xml, whose default port is 8030 when unset; tools such as the hadoop-eclipse plugin ask for this value.
5. Edit hadoop-2.6.0/etc/hadoop/yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
<description>the valid service name should only contain a-zA-Z0-9_ and can not start with numbers</description>
</property>
</configuration>
6. Edit hadoop-2.6.0/etc/hadoop/masters
List the master node(s), replacing the default localhost:
master
7. Edit hadoop-2.6.0/etc/hadoop/slaves
This file lists all DataNode machines; for this cluster:
slave1
slave2
8. Copy the configured hadoop directory from the master to every slave node.
Taking slave1 as an example:
scp -r /data/server/hadoop-2.6.0 nob@slave1:/data/server/
Once the copy finishes, format HDFS and then start all cluster daemons.
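The copy can be looped over every slave. A dry-run sketch that assumes this guide's /data/server install location and the nob user (remove the `echo` to execute):

```shell
# Dry run: print one scp command per slave node.
for host in slave1 slave2; do
  echo "scp -r /data/server/hadoop-2.6.0 nob@$host:/data/server/"
done
```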
6. Start Hadoop
Run all of the following on the NameNode (master).
1. Format the HDFS filesystem
(best done from inside the hadoop-2.6.0 directory):
cd hadoop-2.6.0    # enter the hadoop-2.6.0 directory
bin/hdfs namenode -format    # format the filesystem
Formatting output:
nob@master:/data/server/hadoop-2.6.0$ bin/hdfs namenode -format
15/06/12 22:24:51 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = master/192.168.1.108
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.6.0
STARTUP_MSG: classpath = ***********
STARTUP_MSG: build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1; compiled by 'jenkins' on 2014-11-13T21:10Z
STARTUP_MSG: java = 1.8.0_45
************************************************************/
15/06/12 22:24:51 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
15/06/12 22:24:51 INFO namenode.NameNode: createNameNode [-format]
15/06/12 22:24:52 WARN common.Util: Path /data/server/hadoop-2.6.0/dfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
15/06/12 22:24:52 WARN common.Util: Path /data/server/hadoop-2.6.0/dfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
Formatting using clusterid: CID-2e804799-d6c9-4b13-a18a-955425221bf2
15/06/12 22:24:53 INFO namenode.FSNamesystem: No KeyProvider found.
15/06/12 22:24:53 INFO namenode.FSNamesystem: fsLock is fair:true
15/06/12 22:24:53 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
15/06/12 22:24:53 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
15/06/12 22:24:53 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
15/06/12 22:24:53 INFO blockmanagement.BlockManager: The block deletion will start around 2015 Jun 12 22:24:53
15/06/12 22:24:53 INFO util.GSet: Computing capacity for map BlocksMap
15/06/12 22:24:53 INFO util.GSet: VM type = 64-bit
15/06/12 22:24:53 INFO util.GSet: 2.0% max memory 966.7 MB = 19.3 MB
15/06/12 22:24:53 INFO util.GSet: capacity = 2^21 = 2097152 entries
15/06/12 22:24:53 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
15/06/12 22:24:53 INFO blockmanagement.BlockManager: defaultReplication = 1
15/06/12 22:24:53 INFO blockmanagement.BlockManager: maxReplication = 512
15/06/12 22:24:53 INFO blockmanagement.BlockManager: minReplication = 1
15/06/12 22:24:53 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
15/06/12 22:24:53 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks = false
15/06/12 22:24:53 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
15/06/12 22:24:53 INFO blockmanagement.BlockManager: encryptDataTransfer = false
15/06/12 22:24:53 INFO blockmanagement.BlockManager: maxNumBlocksToLog = 1000
15/06/12 22:24:53 INFO namenode.FSNamesystem: fsOwner = nob (auth:SIMPLE)
15/06/12 22:24:53 INFO namenode.FSNamesystem: supergroup = supergroup
15/06/12 22:24:53 INFO namenode.FSNamesystem: isPermissionEnabled = true
15/06/12 22:24:53 INFO namenode.FSNamesystem: HA Enabled: false
15/06/12 22:24:53 INFO namenode.FSNamesystem: Append Enabled: true
15/06/12 22:24:53 INFO util.GSet: Computing capacity for map INodeMap
15/06/12 22:24:53 INFO util.GSet: VM type = 64-bit
15/06/12 22:24:53 INFO util.GSet: 1.0% max memory 966.7 MB = 9.7 MB
15/06/12 22:24:53 INFO util.GSet: capacity = 2^20 = 1048576 entries
15/06/12 22:24:54 INFO namenode.NameNode: Caching file names occuring more than 10 times
15/06/12 22:24:54 INFO util.GSet: Computing capacity for map cachedBlocks
15/06/12 22:24:54 INFO util.GSet: VM type = 64-bit
15/06/12 22:24:54 INFO util.GSet: 0.25% max memory 966.7 MB = 2.4 MB
15/06/12 22:24:54 INFO util.GSet: capacity = 2^18 = 262144 entries
15/06/12 22:24:54 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
15/06/12 22:24:54 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
15/06/12 22:24:54 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
15/06/12 22:24:54 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
15/06/12 22:24:54 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
15/06/12 22:24:54 INFO util.GSet: Computing capacity for map NameNodeRetryCache
15/06/12 22:24:54 INFO util.GSet: VM type = 64-bit
15/06/12 22:24:54 INFO util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB
15/06/12 22:24:54 INFO util.GSet: capacity = 2^15 = 32768 entries
15/06/12 22:24:54 INFO namenode.NNConf: ACLs enabled? false
15/06/12 22:24:54 INFO namenode.NNConf: XAttrs enabled? true
15/06/12 22:24:54 INFO namenode.NNConf: Maximum size of an xattr: 16384
15/06/12 22:24:54 INFO namenode.FSImage: Allocated new BlockPoolId: BP-128507559-192.168.1.108-1434119094146
15/06/12 22:24:54 INFO common.Storage: Storage directory /data/server/hadoop-2.6.0/dfs/name has been successfully formatted.
15/06/12 22:24:54 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
15/06/12 22:24:54 INFO util.ExitUtil: Exiting with status 0
15/06/12 22:24:54 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.1.108
************************************************************/
2. Start the Hadoop cluster
Start HDFS with:
sbin/start-dfs.sh    # start the HDFS daemons
On success, jps shows output like this:
nob@master:/data/server/hadoop-2.6.0$ sbin/start-dfs.sh
Starting namenodes on [master]
master: starting namenode, logging to /data/server/hadoop-2.6.0/logs/hadoop-nob-namenode-master.out
slave2: starting datanode, logging to /data/server/hadoop-2.6.0/logs/hadoop-nob-datanode-slave2.out
slave1: starting datanode, logging to /data/server/hadoop-2.6.0/logs/hadoop-nob-datanode-slave1.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /data/server/hadoop-2.6.0/logs/hadoop-nob-secondarynamenode-master.out
nob@master:/data/server/hadoop-2.6.0$ jps
3424 Jps
3108 NameNode
3321 SecondaryNameNode
Run jps on a slave node:
nob@slave1:~$ jps
1995 DataNode
2060 Jps
You can also confirm the installation and configuration through the NameNode web UI at http://master:50070/
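The same check can be scripted headlessly; this sketch assumes curl is installed and that the master hostname resolves, and is printed as a dry run so it is safe to paste anywhere (remove the `echo` to execute):

```shell
# Dry run: print the probe command; HTTP 200 from the real command means the UI is up.
echo "curl -s -o /dev/null -w '%{http_code}' http://master:50070/"
```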
7. Testing
Exercise the HDFS filesystem and MapReduce with Hadoop's bundled wordcount example:
nob@master:/data/server/hadoop-2.6.0$ bin/hadoop fs -mkdir /test
nob@master:/data/server/hadoop-2.6.0$ bin/hadoop fs -ls /
Found 1 items
drwxr-xr-x - nob supergroup 0 2015-06-12 22:43 /test
nob@master:/data/server/hadoop-2.6.0$ bin/hadoop fs -put README.txt /test/readme.txt
nob@master:/data/server/hadoop-2.6.0$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /test/readme.txt /output/readmecount
The run log of the bundled word-count MapReduce example:
15/06/12 22:51:05 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
15/06/12 22:51:05 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
15/06/12 22:51:06 INFO input.FileInputFormat: Total input paths to process : 1
15/06/12 22:51:06 INFO mapreduce.JobSubmitter: number of splits:1
15/06/12 22:51:07 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1313573404_0001
15/06/12 22:51:07 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
15/06/12 22:51:07 INFO mapreduce.Job: Running job: job_local1313573404_0001
15/06/12 22:51:07 INFO mapred.LocalJobRunner: OutputCommitter set in config null
15/06/12 22:51:07 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
15/06/12 22:51:07 INFO mapred.LocalJobRunner: Waiting for map tasks
15/06/12 22:51:07 INFO mapred.LocalJobRunner: Starting task: attempt_local1313573404_0001_m_000000_0
15/06/12 22:51:08 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
15/06/12 22:51:08 INFO mapred.MapTask: Processing split: hdfs://master:9000/test/readme.txt:0+1366
15/06/12 22:51:08 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
15/06/12 22:51:08 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
15/06/12 22:51:08 INFO mapred.MapTask: soft limit at 83886080
15/06/12 22:51:08 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
15/06/12 22:51:08 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
15/06/12 22:51:08 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
15/06/12 22:51:08 INFO mapreduce.Job: Job job_local1313573404_0001 running in uber mode : false
15/06/12 22:51:08 INFO mapreduce.Job: map 0% reduce 0%
15/06/12 22:51:08 INFO mapred.LocalJobRunner:
15/06/12 22:51:08 INFO mapred.MapTask: Starting flush of map output
15/06/12 22:51:08 INFO mapred.MapTask: Spilling map output
15/06/12 22:51:08 INFO mapred.MapTask: bufstart = 0; bufend = 2055; bufvoid = 104857600
15/06/12 22:51:08 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26213684(104854736); length = 713/6553600
15/06/12 22:51:09 INFO mapred.MapTask: Finished spill 0
15/06/12 22:51:09 INFO mapred.Task: Task:attempt_local1313573404_0001_m_000000_0 is done. And is in the process of committing
15/06/12 22:51:09 INFO mapred.LocalJobRunner: map
15/06/12 22:51:09 INFO mapred.Task: Task 'attempt_local1313573404_0001_m_000000_0' done.
15/06/12 22:51:09 INFO mapred.LocalJobRunner: Finishing task: attempt_local1313573404_0001_m_000000_0
15/06/12 22:51:09 INFO mapred.LocalJobRunner: map task executor complete.
15/06/12 22:51:09 INFO mapred.LocalJobRunner: Waiting for reduce tasks
15/06/12 22:51:09 INFO mapred.LocalJobRunner: Starting task: attempt_local1313573404_0001_r_000000_0
15/06/12 22:51:09 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
15/06/12 22:51:09 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@202abc29
15/06/12 22:51:09 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=363285696, maxSingleShuffleLimit=90821424, mergeThreshold=239768576, ioSortFactor=10, memToMemMergeOutputsThreshold=10
15/06/12 22:51:09 INFO reduce.EventFetcher: attempt_local1313573404_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
15/06/12 22:51:09 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1313573404_0001_m_000000_0 decomp: 1832 len: 1836 to MEMORY
15/06/12 22:51:09 INFO reduce.InMemoryMapOutput: Read 1832 bytes from map-output for attempt_local1313573404_0001_m_000000_0
15/06/12 22:51:09 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 1832, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->1832
15/06/12 22:51:09 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
15/06/12 22:51:09 INFO mapred.LocalJobRunner: 1 / 1 copied.
15/06/12 22:51:09 INFO reduce.MergeManagerImpl: finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
15/06/12 22:51:09 INFO mapred.Merger: Merging 1 sorted segments
15/06/12 22:51:09 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 1823 bytes
15/06/12 22:51:09 INFO reduce.MergeManagerImpl: Merged 1 segments, 1832 bytes to disk to satisfy reduce memory limit
15/06/12 22:51:09 INFO reduce.MergeManagerImpl: Merging 1 files, 1836 bytes from disk
15/06/12 22:51:09 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
15/06/12 22:51:09 INFO mapred.Merger: Merging 1 sorted segments
15/06/12 22:51:09 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 1823 bytes
15/06/12 22:51:09 INFO mapred.LocalJobRunner: 1 / 1 copied.
15/06/12 22:51:09 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
15/06/12 22:51:09 INFO mapreduce.Job: map 100% reduce 0%
15/06/12 22:51:10 INFO mapred.Task: Task:attempt_local1313573404_0001_r_000000_0 is done. And is in the process of committing
15/06/12 22:51:10 INFO mapred.LocalJobRunner: 1 / 1 copied.
15/06/12 22:51:10 INFO mapred.Task: Task attempt_local1313573404_0001_r_000000_0 is allowed to commit now
15/06/12 22:51:10 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1313573404_0001_r_000000_0' to hdfs://master:9000/output/readmecount/_temporary/0/task_local1313573404_0001_r_000000
15/06/12 22:51:10 INFO mapred.LocalJobRunner: reduce > reduce
15/06/12 22:51:10 INFO mapred.Task: Task 'attempt_local1313573404_0001_r_000000_0' done.
15/06/12 22:51:10 INFO mapred.LocalJobRunner: Finishing task: attempt_local1313573404_0001_r_000000_0
15/06/12 22:51:10 INFO mapred.LocalJobRunner: reduce task executor complete.
15/06/12 22:51:10 INFO mapreduce.Job: map 100% reduce 100%
15/06/12 22:51:11 INFO mapreduce.Job: Job job_local1313573404_0001 completed successfully
15/06/12 22:51:11 INFO mapreduce.Job: Counters: 38
File System Counters
FILE: Number of bytes read=544652
FILE: Number of bytes written=1067528
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=2732
HDFS: Number of bytes written=1306
HDFS: Number of read operations=13
HDFS: Number of large read operations=0
HDFS: Number of write operations=4
Map-Reduce Framework
Map input records=31
Map output records=179
Map output bytes=2055
Map output materialized bytes=1836
Input split bytes=99
Combine input records=179
Combine output records=131
Reduce input groups=131
Reduce shuffle bytes=1836
Reduce input records=131
Reduce output records=131
Spilled Records=262
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=86
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
Total committed heap usage (bytes)=238436352
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=1366
File Output Format Counters
Bytes Written=1306
nob@master:/data/server/hadoop-2.6.0$
View the output:
nob@master:/data/server/hadoop-2.6.0$ bin/hadoop fs -cat /output/readmecount/*
(BIS), 1
(ECCN) 1
(TSU) 1
# (some lines omitted)
the 8
this 3
to 2
under 1
use, 2
uses 1
using 2
visit 1
website 1
which 2
wiki, 1
with 1
written 1
you 1
your 1
nob@master:/data/server/hadoop-2.6.0$
8. Common problems
1) External access to the HDFS port configured as hdfs://hadoop:9000 is refused?
Cause 1: the port is not actually open. Run jps to verify the HDFS processes started, then run netstat -tanp | grep 9000 to see whether any process is listening on port 9000. If the port is confirmed open, read on.
Cause 2: a firewall is blocking it. If the firewall filters requests to this port, add a rule that allows the port, or disable the firewall.
Cause 3: the port is bound to the loopback address (127.0.0.1). Inspect the listen address with "netstat -anp | grep 9000":
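For cause 2 on Ubuntu, the usual firewall front end is ufw. A dry-run sketch of the two options (printed only; remove the `echo` to execute):

```shell
# Dry run: allow the HDFS port through ufw, or disable the firewall (lab use only).
echo "sudo ufw allow 9000/tcp"
echo "sudo ufw disable"
```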
nob@nobubuntu:/data/server/hadoop-2.6.0$ netstat -anp | grep 9000
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 127.0.1.1:9000 0.0.0.0:* LISTEN 1919/java
tcp 0 0 127.0.1.1:9000 127.0.0.1:52471 ESTABLISHED 1919/java
tcp 0 0 127.0.0.1:52489 127.0.1.1:9000 TIME_WAIT -
tcp 0 0 127.0.0.1:52471 127.0.1.1:9000 ESTABLISHED 2053/java
As the output above shows, port 9000 is listening on a loopback address (127.0.1.1).
Make it listen on the real IP (e.g. 192.168.1.100) by mapping the hostname to the real IP in /etc/hosts:
192.168.1.100 hadoop
Then set fs.defaultFS (fs.default.name on older releases) in core-site.xml to "hdfs://hadoop:9000"; do not use hdfs://localhost:9000.
2) jps shows that the NameNode did not start?
The usual culprit is the tmp directory: the default tmp location is wiped on every reboot, which discards the NameNode's format metadata, so point Hadoop at a persistent tmp directory.
Create /home/nob/hadoop_tmp, then configure core-site.xml:
<property>
<name>hadoop.tmp.dir</name>
<value>/home/nob/hadoop_tmp</value>
<description>Abase for other temporary directories.</description>
</property>
Then reformat the NameNode and restart:
bin/hdfs namenode -format
sbin/start-all.sh    # deprecated in 2.x; prefer sbin/start-dfs.sh plus sbin/start-yarn.sh
nob@master:~/opt/hadoop-2.6.0$ jps
8706 DataNode
9062 ResourceManager
9192 NodeManager
8572 NameNode
10029 Jps
8911 SecondaryNameNode
[References]
1. Installing JDK 1.8.0_25 on Ubuntu 14.04 and configuring environment variables: http://www.linuxidc.com/Linux/2015-01/112030.htm
2. RHadoop in practice, part 1: setting up the Hadoop environment: http://blog.fens.me/rhadoop-hadoop/
3. Hadoop 2.x common ports, how to define them, defaults, and comparison with Hadoop 1.x ports: http://www.aboutyun.com/thread-7513-1-1.html