Setting up Hadoop 2.2 in single-node pseudo-distributed mode on Ubuntu Server 12

ontheway110

Category: hadoop

Article link: https://blog.csdn.net/ontheway110/article/details/16878245

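Create the hadoop group: sudo addgroup hadoop
Create the hadoop user in that group: sudo adduser --ingroup hadoop hadoop
Grant the hadoop user sudo rights by editing /etc/sudoers: sudo visudo (Ubuntu Server does not ship gedit by default, and visudo checks the syntax before saving)
Below the line root ALL=(ALL:ALL) ALL, add hadoop ALL=(ALL:ALL) ALL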
Install JDK 7 on Ubuntu
Set up passwordless SSH login to the local machine
First switch to the hadoop user: su - hadoop
Create an SSH key pair, here of type RSA with an empty passphrase: ssh-keygen -t rsa -P ""
Go to ~/.ssh/ and append id_rsa.pub to the authorized_keys file, which does not exist at first:
$ cd ~/.ssh
$ cat id_rsa.pub >> authorized_keys
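Rerunning the append step adds a duplicate entry each time, and with the default StrictModes setting sshd may ignore the key when ~/.ssh or authorized_keys is writable by group or others. The following is a minimal sketch of a rerunnable variant. The function name authorize_key is hypothetical, and it only assumes that id_rsa.pub already exists in the directory passed as the first argument.

```shell
# Hypothetical helper: append id_rsa.pub to authorized_keys only once
# and tighten the permissions that sshd checks.
authorize_key() {
    ssh_dir=$1
    pub_key=$(cat "$ssh_dir/id_rsa.pub") || return 1
    touch "$ssh_dir/authorized_keys"
    # -x: match the whole line, -F: treat the key as a fixed string
    if ! grep -qxF -e "$pub_key" "$ssh_dir/authorized_keys"; then
        printf '%s\n' "$pub_key" >> "$ssh_dir/authorized_keys"
    fi
    chmod 700 "$ssh_dir"
    chmod 600 "$ssh_dir/authorized_keys"
}
```

As the hadoop user it would be called as authorize_key ~/.ssh.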
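Log in to localhost to confirm that no password is requested: ssh localhost
Then leave the session: exit
Install Hadoop
After the download finishes, extract the archive.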
Configure the system environment variables. Become root and append the following lines to the end of /etc/profile:
$ su -
# vi /etc/profile
export HADOOP_PREFIX="/opt/hadoop"
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}
export HADOOP_CONF_DIR="${HADOOP_PREFIX}/etc/hadoop"
Note: I renamed the extracted Hadoop directory to hadoop and placed it under /opt.
Then reload the profile:
$ source /etc/profile
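After reloading the profile it can help to confirm that every variable is visible in the current shell. This is a small sketch, not part of Hadoop itself. The function name check_hadoop_env is hypothetical, and the variable list is copied from the lines added to /etc/profile.

```shell
# Hypothetical helper: report which of the Hadoop variables are set.
# Returns 0 when all are set and 1 when at least one is missing.
check_hadoop_env() {
    missing=0
    for name in HADOOP_PREFIX HADOOP_MAPRED_HOME HADOOP_COMMON_HOME \
                HADOOP_HDFS_HOME YARN_HOME HADOOP_CONF_DIR; do
        # Indirect lookup of the variable whose name is stored in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "not set: $name"
            missing=1
        else
            echo "$name=$value"
        fi
    done
    return "$missing"
}
```

Running check_hadoop_env right after source /etc/profile lists each variable, or names the ones that are still missing.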
$ cd /opt/hadoop/etc/hadoop/
Edit hadoop-env.sh and set JAVA_HOME. The path must be written out literally; referencing ${JAVA_HOME} leads to the runtime error "JAVA_HOME is not set".
export JAVA_HOME=/opt/jdk
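If the JDK was installed somewhere other than /opt/jdk, the literal path can be derived from the java executable. The sketch below assumes GNU readlink (the -f option), as found on Ubuntu. The function name java_home_from_binary is hypothetical.

```shell
# Hypothetical helper: derive a JAVA_HOME candidate from a java executable
# by resolving symlinks and dropping the trailing /bin/java.
java_home_from_binary() {
    real_path=$(readlink -f "$1") || return 1
    dirname "$(dirname "$real_path")"
}
```

A typical call is java_home_from_binary "$(command -v java)". With some JDK packages the java binary sits in a jre/ subdirectory, so the printed path may end in /jre; check the result before copying it into hadoop-env.sh.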
Edit core-site.xml. Note: create the directory /tmp/hadoop/hadoop-hadoop first.
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop/hadoop-hadoop</value>
  </property>
</configuration>
Edit hdfs-site.xml

The paths /home/hadoop/dfs/name and /home/hadoop/dfs/data are directories on the local filesystem and must be created beforehand.
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop/dfs/name</value>
    <description>Determines where on the local filesystem the DFS name node
      should store the name table. If this is a comma-delimited list
      of directories then the name table is replicated in all of the
      directories, for redundancy.</description>
    <final>true</final>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoop/dfs/data</value>
    <description>Determines where on the local filesystem a DFS data node
      should store its blocks. If this is a comma-delimited
      list of directories, then data will be stored in all named
      directories, typically on different devices.
      Directories that do not exist are ignored.
    </description>
    <final>true</final>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
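The directories named in core-site.xml and hdfs-site.xml can be created in one step. This is a minimal sketch. The function name make_hadoop_dirs is hypothetical, and both roots are parameters so that the paths can be changed without editing the function.

```shell
# Hypothetical helper: create the directories referenced by
# hadoop.tmp.dir, dfs.namenode.name.dir and dfs.datanode.data.dir.
make_hadoop_dirs() {
    tmp_root=$1    # /tmp in this article
    home_root=$2   # /home/hadoop in this article
    mkdir -p "$tmp_root/hadoop/hadoop-hadoop" \
             "$home_root/dfs/name" \
             "$home_root/dfs/data"
}
```

Run as the hadoop user, make_hadoop_dirs /tmp /home/hadoop yields the layout used in this article and leaves the directories owned by that user.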
Edit mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>file:/home/hadoop/mapred/system</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>file:/home/hadoop/mapred/local</value>
    <final>true</final>
  </property>
</configuration>
Note: create mapred-site.xml by copying mapred-site.xml.template.
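The copy can be guarded so that an already edited mapred-site.xml is never overwritten. This is a small sketch. The function name ensure_mapred_site is hypothetical, and the configuration directory is passed as an argument.

```shell
# Hypothetical helper: create mapred-site.xml from the shipped template
# unless the file already exists.
ensure_mapred_site() {
    conf_dir=$1
    if [ ! -f "$conf_dir/mapred-site.xml" ]; then
        cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
    fi
}
```

With the layout used here the call would be ensure_mapred_site /opt/hadoop/etc/hadoop.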
Edit yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>localhost:8081</value>
    <description>host is the hostname of the resource manager and
      port is the port on which the NodeManagers contact the Resource Manager.
    </description>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>localhost:8082</value>
    <description>host is the hostname of the resourcemanager and port is the port
      on which the Applications in the cluster talk to the Resource Manager.
    </description>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
    <description>In case you do not want to use the default scheduler</description>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>localhost:8083</value>
    <description>the host is the hostname of the ResourceManager and the port is the port on
      which the clients can talk to the Resource Manager.</description>
  </property>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value></value>
    <description>the local directories used by the nodemanager</description>
  </property>
  <property>
    <name>yarn.nodemanager.address</name>
    <value>0.0.0.0:port</value>
    <description>the nodemanagers bind to this port</description>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>10240</value>
    <description>the amount of memory on the NodeManager in MB</description>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/app-logs</value>
    <description>directory on hdfs where the application logs are moved to</description>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value></value>
    <description>the directories used by Nodemanagers as log directories</description>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    <description>shuffle service that needs to be set for Map Reduce to run</description>
  </property>
</configuration>
Start HDFS and YARN
Once the configuration above is complete, you can check whether it works.
First format the NameNode:
$ hdfs namenode -format
Then start HDFS:
$ start-dfs.sh
or
$ hadoop-daemon.sh start namenode
$ hadoop-daemon.sh start datanode
Next start the YARN daemons:
$ start-yarn.sh
or
$ yarn-daemon.sh start resourcemanager
$ yarn-daemon.sh start nodemanager
After startup, open http://localhost:50070/dfshealth.jsp to check the DFS status.
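Besides the web page, the JDK tool jps lists the running Java processes, which gives a quick way to see whether all four daemons came up. The sketch below reads jps output on standard input and prints the expected daemons that are absent. The function name missing_daemons is hypothetical, and the daemon list is taken from the start commands above.

```shell
# Hypothetical helper: read "PID ClassName" lines (the format printed by jps)
# and print every expected daemon that does not appear.
missing_daemons() {
    running=$(awk '{print $2}')
    for daemon in NameNode DataNode ResourceManager NodeManager; do
        # -x: whole-line match, so SecondaryNameNode does not count as NameNode
        if ! printf '%s\n' "$running" | grep -qx "$daemon"; then
            echo "$daemon"
        fi
    done
}
```

It would be used as jps | missing_daemons; empty output means all four daemons are running, and any name printed points at the log file to inspect under the Hadoop logs directory.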