Setting up a Hadoop pseudo-distributed environment on macOS
2014-10-17 11:17:45  十一郎的blog
http://www.tuicool.com/articles/ANvUnm2
Original: http://www.wangqifox.cn/wordpress/?p=755
Topic: Hadoop
Reference:
http://blog.csdn.net/xbwer/article/details/35614679
Download Hadoop
Download page: http://hadoop.apache.org/releases.html
I chose hadoop-1.2.1.
For the differences between Hadoop 1.x and Hadoop 2.x, see: http://blog.csdn.net/fenglibing/article/details/32916445
Configure macOS itself
In Terminal, enter:
ssh localhost
If an error message appears, the current user does not have Remote Login permission. Change the setting: System Preferences -> Sharing -> check "Remote Login" and allow access for all users.
Set up passwordless login
Enter:
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
ssh-keygen generates a key pair: -t sets the key type, -P supplies the passphrase (empty here), and -f names the output file. This command creates two files under ~/.ssh, id_dsa and id_dsa.pub, which are the SSH private and public key. Next, append the public key to the authorized keys by entering:
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
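At this point ssh localhost should log in without prompting for a password. If it still prompts, the usual culprit is file permissions, since sshd ignores keys that are group- or world-readable. A quick check, assuming the default ~/.ssh layout:

```shell
# sshd refuses keys whose files have loose permissions; tighten them:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys

# Should now log in and print "ok" without asking for a password:
ssh localhost 'echo ok'
```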
Set environment variables
export HADOOP_HOME=/Users/wangqi/hadoop-1.2.1
export PATH=$PATH:$HADOOP_HOME/bin
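These exports only last for the current shell session. To make them permanent, they can be appended to ~/.bash_profile, the file the default macOS login shell reads (the path below is the install location used in this guide):

```shell
# Append the Hadoop variables to the login-shell profile:
cat >> ~/.bash_profile <<'EOF'
export HADOOP_HOME=/Users/wangqi/hadoop-1.2.1
export PATH=$PATH:$HADOOP_HOME/bin
EOF

# Reload the profile in the current session:
source ~/.bash_profile
# "hadoop version" should now work from any directory.
```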
Configure Hadoop
Configure hadoop-env.sh:
export JAVA_HOME=/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home
export HADOOP_HEAPSIZE=2000
export HADOOP_OPTS="-Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"
Hadoop requires Java 1.6. If 1.7 is already installed as well, set JAVA_HOME to the 1.6 path.
Reference: http://guibin.iteye.com/blog/1999238
Enter:
/usr/libexec/java_home -V
This lists the JAVA_HOME of every installed Java version.
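If the 1.6 path on your machine differs from the one hardcoded in hadoop-env.sh above, java_home can also resolve it dynamically (a sketch; java_home is a standard macOS utility):

```shell
# Resolve the Java 1.6 home instead of hardcoding the path; this line
# can be used in hadoop-env.sh in place of the literal export:
export JAVA_HOME=$(/usr/libexec/java_home -v 1.6)
echo $JAVA_HOME
```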
Configure core-site.xml: specify the NameNode host and port
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/Users/wangqi/hadoop-1.2.1/tmp/hadoop-${user.name}</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>
Configure hdfs-site.xml: set the HDFS replication factor. Since everything runs on a single node, the replication factor here is 1.
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
Configure mapred-site.xml: specify the JobTracker host and port
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hdfs://localhost:9001</value>
  </property>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
</configuration>
Format HDFS
With the configuration above in place, HDFS can be initialized by formatting the NameNode:
hadoop namenode -format
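Formatting initializes the NameNode's metadata under the hadoop.tmp.dir set in core-site.xml above. A quick way to confirm it worked (the path assumes the value from this guide; ${user.name} resolves to the current user):

```shell
# The freshly formatted NameNode image lives under hadoop.tmp.dir:
ls /Users/wangqi/hadoop-1.2.1/tmp/hadoop-$(whoami)/dfs/name/current
# Note: re-running "hadoop namenode -format" wipes this metadata and
# orphans any existing HDFS data, so only format once.
```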
Start Hadoop
start-all.sh
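Whether everything came up can be checked with jps, which ships with the JDK; the process names below are the standard Hadoop 1.x daemons. A basic HDFS round trip then serves as a smoke test (paths chosen for illustration):

```shell
# jps lists running JVMs; a healthy pseudo-distributed Hadoop 1.x node
# shows NameNode, SecondaryNameNode, DataNode, JobTracker and TaskTracker:
jps

# Smoke test: create a directory in HDFS, upload a file, list it back.
hadoop fs -mkdir /tmp/smoke
hadoop fs -put $HADOOP_HOME/conf/core-site.xml /tmp/smoke/
hadoop fs -ls /tmp/smoke
```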
Test
Enter:
hadoop jar ./hadoop-examples-1.2.1.jar pi 10 100
Output:
Warning: $HADOOP_HOME is deprecated.
Number of Maps  = 10
Samples per Map = 100
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
14/10/17 11:24:18 INFO mapred.FileInputFormat: Total input paths to process : 10
14/10/17 11:24:19 INFO mapred.JobClient: Running job: job_201410171123_0001
14/10/17 11:24:20 INFO mapred.JobClient:  map 0% reduce 0%
14/10/17 11:24:29 INFO mapred.JobClient:  map 20% reduce 0%
14/10/17 11:24:35 INFO mapred.JobClient:  map 40% reduce 0%
14/10/17 11:24:39 INFO mapred.JobClient:  map 60% reduce 0%
14/10/17 11:24:43 INFO mapred.JobClient:  map 70% reduce 0%
14/10/17 11:24:44 INFO mapred.JobClient:  map 80% reduce 0%
14/10/17 11:24:46 INFO mapred.JobClient:  map 80% reduce 26%
14/10/17 11:24:47 INFO mapred.JobClient:  map 90% reduce 26%
14/10/17 11:24:48 INFO mapred.JobClient:  map 100% reduce 26%
14/10/17 11:24:53 INFO mapred.JobClient:  map 100% reduce 100%
14/10/17 11:24:54 INFO mapred.JobClient: Job complete: job_201410171123_0001
14/10/17 11:24:54 INFO mapred.JobClient: Counters: 27
14/10/17 11:24:54 INFO mapred.JobClient:   Job Counters
14/10/17 11:24:54 INFO mapred.JobClient:     Launched reduce tasks=1
14/10/17 11:24:54 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=48352
14/10/17 11:24:54 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
14/10/17 11:24:54 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
14/10/17 11:24:54 INFO mapred.JobClient:     Launched map tasks=10
14/10/17 11:24:54 INFO mapred.JobClient:     Data-local map tasks=10
14/10/17 11:24:54 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=23343
14/10/17 11:24:54 INFO mapred.JobClient:   File Input Format Counters
14/10/17 11:24:54 INFO mapred.JobClient:     Bytes Read=1180
14/10/17 11:24:54 INFO mapred.JobClient:   File Output Format Counters
14/10/17 11:24:54 INFO mapred.JobClient:     Bytes Written=97
14/10/17 11:24:54 INFO mapred.JobClient:   FileSystemCounters
14/10/17 11:24:54 INFO mapred.JobClient:     FILE_BYTES_READ=226
14/10/17 11:24:54 INFO mapred.JobClient:     HDFS_BYTES_READ=2410
14/10/17 11:24:54 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=695028
14/10/17 11:24:54 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=215
14/10/17 11:24:54 INFO mapred.JobClient:   Map-Reduce Framework
14/10/17 11:24:54 INFO mapred.JobClient:     Map output materialized bytes=280
14/10/17 11:24:54 INFO mapred.JobClient:     Map input records=10
14/10/17 11:24:54 INFO mapred.JobClient:     Reduce shuffle bytes=280
14/10/17 11:24:54 INFO mapred.JobClient:     Spilled Records=40
14/10/17 11:24:54 INFO mapred.JobClient:     Map output bytes=180
14/10/17 11:24:54 INFO mapred.JobClient:     Total committed heap usage (bytes)=1931190272
14/10/17 11:24:54 INFO mapred.JobClient:     Map input bytes=240
14/10/17 11:24:54 INFO mapred.JobClient:     Combine input records=0
14/10/17 11:24:54 INFO mapred.JobClient:     SPLIT_RAW_BYTES=1230
14/10/17 11:24:54 INFO mapred.JobClient:     Reduce input records=20
14/10/17 11:24:54 INFO mapred.JobClient:     Reduce input groups=20
14/10/17 11:24:54 INFO mapred.JobClient:     Combine output records=0
14/10/17 11:24:54 INFO mapred.JobClient:     Reduce output records=0
14/10/17 11:24:54 INFO mapred.JobClient:     Map output records=20
Job Finished in 35.752 seconds
Estimated value of Pi is 3.14800000000000000000
Installation succeeded.
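The running daemons also expose web UIs on the Hadoop 1.x default ports, which are handy for browsing HDFS and watching jobs (open is the standard macOS command to launch the default browser):

```shell
open http://localhost:50070   # NameNode status and HDFS file browser
open http://localhost:50030   # JobTracker status and MapReduce job list
```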