Hadoop Pseudo-Distributed Deployment

Hadoop 0.20.2, JDK 1.6.0_13

 

1. Passwordless SSH login to localhost

Make sure the SSH service on the Linux system is running and that you can log in to the local machine without a password. If you cannot, follow these steps:

1) Open a terminal and run:

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

2) Log in to localhost over SSH:

$ ssh localhost

On the first login, SSH will warn that the authenticity of host 127.0.0.1 cannot be established and ask whether to continue connecting; type yes. A successful passwordless login looks like this:

[root@localhost Hadoop-0.19.2]# ssh localhost

Last login: Sun Aug  1 18:35:37 2010 from 192.168.0.104
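
If ssh localhost still prompts for a password after these steps, a common cause is overly permissive key-file permissions, which sshd rejects by default. A hedged fix:

$ chmod 700 ~/.ssh

$ chmod 600 ~/.ssh/authorized_keys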

2. Hadoop configuration

All of the following files are in the conf folder.

1) Configure JAVA_HOME

In hadoop-env.sh, set JAVA_HOME to the installation path of your JDK (or JRE):
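
A minimal sketch, assuming the JDK named at the top of this article is installed under /usr/java (the path is an assumption; substitute your own):

export JAVA_HOME=/usr/java/jdk1.6.0_13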

2) Configure the core, HDFS, and MapReduce site files

Add the following properties:


conf/core-site.xml:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

 

conf/hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

 

conf/mapred-site.xml:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>


 

3) Configure the Hadoop path, so that hadoop can be run from any directory:

export PATH=$PATH:/path/to/hadoop/bin
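
For example, assuming Hadoop was unpacked to /usr/local/hadoop-0.20.2 (the location is an assumption; use wherever you unpacked it):

export PATH=$PATH:/usr/local/hadoop-0.20.2/bin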

3. Testing

1) Run: hadoop version

The output is:

Hadoop 0.20.2
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707
Compiled by chrisdo on Fri Feb 19 08:07:34 UTC 2010

 

2) Format the NameNode

$ hadoop namenode -format


After formatting, you must run start-all.sh to start the daemons; otherwise nothing takes effect, and any HDFS access fails with an exception like the following:

java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on connection exception: java.net.ConnectException: Connection refused
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:767)
        at org.apache.hadoop.ipc.Client.call(Client.java:743)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy0.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
        at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:207)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:170)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)
        at org.apache.hadoop.examples.Grep.run(Grep.java:87)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.examples.Grep.main(Grep.java:93)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:304)
        at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:860)
        at org.apache.hadoop.ipc.Client.call(Client.java:720)
        ... 27 more
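
A short sketch of the start-and-verify sequence, assuming Hadoop's bin directory is on PATH as configured above:

$ start-all.sh

$ jps

jps should list the five Hadoop daemons: NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker.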

 

 

4. Running a job

 

This example copies the files from the conf directory into the input directory, then searches the contents of those files for matches of the given regular expression; each match is written to the output files, which are placed in the output directory.

$ mkdir input 

$ cp conf/*.xml input 

$ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+' 

$ cat output/*

If the job succeeds, the matched phrases are printed.

 

However, the third command throws an exception:

org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://localhost:9000/user/admin/input

 

 

The reason is that the input directory was never uploaded to HDFS.
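
A sketch of the fix: upload the input files to HDFS first, then re-run the job and read the results directly from HDFS (run from the Hadoop installation directory):

$ bin/hadoop fs -put conf input

$ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'

$ bin/hadoop fs -cat output/*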
