Hadoop 1.0.3 Installation and Configuration

Environment: CentOS 5.5, 64-bit

Three nodes; the planned deployment is as follows:

IP              Hostname    Role
172.16.48.201   sg201       NameNode
172.16.48.202   sg202       DataNode
172.16.48.203   sg203       DataNode

The /etc/hosts file on all three nodes is set as follows:

172.16.48.203  sg203
172.16.48.202  sg202
172.16.48.201  sg201
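
A quick sanity check (not in the original steps) is to confirm that each node can resolve and reach the others by hostname, for example:

ping -c 1 sg201
ping -c 1 sg202
ping -c 1 sg203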

Turn off the firewall on all three nodes: service iptables stop
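
Note that service iptables stop only turns the firewall off until the next reboot; to keep it off permanently you can additionally run:

chkconfig iptables off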

I. Install basic tools

1. Install the JDK on all three nodes and set the environment variables: http://blog.csdn.net/chenxingzhen001/article/details/7732692

2. Install the SSH service
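
A quick way to confirm both prerequisites are in place on each node (assuming the JDK and the SSH server were installed as above):

java -version
service sshd status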

II. Configure passwordless SSH login between the three nodes

Reference: http://blog.csdn.net/chenxingzhen001/article/details/7740357
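
If the referenced post is unavailable, a minimal sketch of the usual key-based setup, run as root on sg201, looks like the following (ssh-copy-id may be absent on a minimal install, in which case the public key can be appended to ~/.ssh/authorized_keys on each node by hand):

ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
ssh-copy-id root@sg201    # authorize login to the local node too; start-dfs.sh uses SSH even for local daemons
ssh-copy-id root@sg202
ssh-copy-id root@sg203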

III. Install and configure Hadoop 1.0.3

1. Download hadoop-1.0.3.tar.gz from the Hadoop website and extract it under /opt:

cd /opt

tar zxf hadoop-1.0.3.tar.gz

2. Edit the conf/hadoop-env.sh configuration

 export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_05   # set JAVA_HOME to the actual JDK install path on your nodes

 export HADOOP_HOME_WARN_SUPPRESS=1          # suppress the "$HADOOP_HOME is deprecated" warning

3. Configure environment variables

vi /etc/profile

JAVA_HOME=/usr/java/jdk1.7.0_05
ANT_HOME=/usr/local/apache-ant-1.8.3
HADOOP_HOME=/opt/hadoop-1.0.3
CLASSPATH=.:$JAVA_HOME/lib/tools.jar
PATH=$JAVA_HOME/bin:$ANT_HOME/bin:$HADOOP_HOME/bin:$PATH
export JAVA_HOME ANT_HOME HADOOP_HOME  CLASSPATH PATH
Apply the environment variables:

source /etc/profile
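
To verify that the variables took effect, check that the hadoop command is now on the PATH:

hadoop version    # should report 1.0.3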

4. Set the contents of conf/masters

vi conf/masters

172.16.48.201 # SecondaryNameNode

5. Set the contents of conf/slaves, listing the slave (DataNode) nodes

vi conf/slaves
172.16.48.202  #datanode
172.16.48.203  #datanode
6. Configuration file: conf/hdfs-site.xml

vim conf/hdfs-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>dfs.name.dir</name>
        <value>/opt/hadoop-1.0.3/name</value>
    </property>

    <property>
        <name>dfs.data.dir</name>
        <value>/opt/hadoop-1.0.3/data</value>
    </property>

    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>

</configuration>
Configuration parameter notes:
dfs.name.dir: local filesystem path(s) on the NameNode used to persist the namespace image and edit log. The value may be a comma-separated list of directories; the metadata is written to every listed directory as a redundant copy.

dfs.data.dir: local filesystem path(s) on each DataNode used to store file data blocks. The value may be a comma-separated list of directories; blocks are spread across the listed directories, which are usually configured on different physical devices.

Note: do not create the name and data directories in advance; Hadoop creates them automatically when the filesystem is formatted, and pre-creating them can cause problems.

dfs.replication: the number of replicas kept for each data block; the default is 3.
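
For example, a hypothetical configuration with redundant metadata directories and two data disks could look like this (the /data1 and /data2 paths are only illustrative, not part of this setup):

    <property>
        <name>dfs.name.dir</name>
        <value>/opt/hadoop-1.0.3/name,/data1/hdfs/name</value>
    </property>

    <property>
        <name>dfs.data.dir</name>
        <value>/data1/hdfs/data,/data2/hdfs/data</value>
    </property>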


7. Configuration file: conf/mapred-site.xml

vim conf/mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>172.16.48.201:9001</value>
    </property>
</configuration>

Configuration parameter notes:
mapred.job.tracker: host (or IP) and port of the JobTracker, which assigns and manages tasks

mapred.local.dir: comma-separated list of local filesystem paths where intermediate MapReduce data is stored
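
mapred.local.dir is left at its default in this walkthrough (a directory under hadoop.tmp.dir); if you wanted to spread intermediate data over several disks, a hypothetical entry would look like:

    <property>
        <name>mapred.local.dir</name>
        <value>/data1/mapred/local,/data2/mapred/local</value>
    </property>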

8. Configuration file: conf/core-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
      <name>fs.default.name</name>
      <value>hdfs://172.16.48.201:9000</value>
  </property>

  <property>
     <name>fs.checkpoint.period</name>
     <value>3600</value>
  </property>

  <property>
      <name>fs.checkpoint.size</name>
      <value>67108864</value>
  </property> 
            
  <property>
      <name>hadoop.tmp.dir</name> 
      <value>/opt/hadoop-1.0.3/tmp</value>
  </property>
</configuration>
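
Configuration parameter notes:
fs.default.name: URI of the default filesystem, i.e. the NameNode address and port that HDFS clients connect to.

fs.checkpoint.period: interval in seconds between checkpoints taken by the SecondaryNameNode (3600 = once per hour).

fs.checkpoint.size: size in bytes of the edits log that triggers a checkpoint even if the period has not elapsed (67108864 = 64 MB).

hadoop.tmp.dir: base directory for Hadoop's temporary files; several other paths default to subdirectories of it.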

9. Distribute the installation
Use scp to copy the prepared hadoop-1.0.3 directory to the same path on each DataNode:
scp -r /opt/hadoop-1.0.3 root@sg202:/opt/hadoop-1.0.3
scp -r /opt/hadoop-1.0.3 root@sg203:/opt/hadoop-1.0.3

IV. Start Hadoop
1. Format a new distributed filesystem

[root@sg201 hadoop-1.0.3]# bin/hadoop namenode -format
12/07/13 11:08:58 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = sg201/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.0.3
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1335192; compiled by 

'hortonfo' on Tue May  8 20:31:25 UTC 2012
************************************************************/
12/07/13 11:08:59 INFO util.GSet: VM type       = 64-bit
12/07/13 11:08:59 INFO util.GSet: 2% max memory = 17.77875 MB
12/07/13 11:08:59 INFO util.GSet: capacity      = 2^21 = 2097152 entries
12/07/13 11:08:59 INFO util.GSet: recommended=2097152, actual=2097152
12/07/13 11:08:59 INFO namenode.FSNamesystem: fsOwner=root
12/07/13 11:08:59 INFO namenode.FSNamesystem: supergroup=supergroup
12/07/13 11:08:59 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/07/13 11:08:59 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
12/07/13 11:08:59 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), 

accessTokenLifetime=0 min(s)
12/07/13 11:08:59 INFO namenode.NameNode: Caching file names occuring more than 10 times 
12/07/13 11:09:00 INFO common.Storage: Image file of size 110 saved in 0 seconds.
12/07/13 11:09:00 INFO common.Storage: Storage directory /opt/hadoop-1.0.3/name has been successfully formatted.
12/07/13 11:09:00 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at sg201/127.0.0.1
************************************************************/

2. Start the HDFS daemons

[root@sg201 hadoop-1.0.3]# bin/start-dfs.sh
starting namenode, logging to /opt/hadoop-1.0.3/libexec/../logs/hadoop-root-namenode-sg201.out
172.16.48.202: starting datanode, logging to /opt/hadoop-1.0.3/libexec/../logs/hadoop-root-datanode-sg202.out
172.16.48.203: starting datanode, logging to /opt/hadoop-1.0.3/libexec/../logs/hadoop-root-datanode-sg203.out
root@172.16.48.201's password: 
172.16.48.201: starting secondarynamenode, logging to /opt/hadoop-1.0.3/libexec/../logs/hadoop-root-secondarynamenode-sg201.out
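
The password prompt for 172.16.48.201 in the output above means the NameNode could not log in to itself without a password (start-dfs.sh starts the SecondaryNameNode over SSH even on the local host). Appending the NameNode's own public key to its authorized_keys removes the prompt, e.g.:

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys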

3. Start the MapReduce daemons

[root@sg201 hadoop-1.0.3]# bin/start-mapred.sh
starting jobtracker, logging to /opt/hadoop-1.0.3/libexec/../logs/hadoop-root-jobtracker-sg201.out
172.16.48.203: starting tasktracker, logging to /opt/hadoop-1.0.3/libexec/../logs/hadoop-root-tasktracker-sg203.out
172.16.48.202: starting tasktracker, logging to /opt/hadoop-1.0.3/libexec/../logs/hadoop-root-tasktracker-sg202.out

V. Verify that the installation succeeded

Use the jps command to see which daemons are running:

[root@sg201 conf]# jps
12560 NameNode
17688 Jps
12861 JobTracker
12755 SecondaryNameNode
5855 MyEclipse

[root@sg203 conf]# jps
11732 DataNode
14336 Jps
11856 TaskTracker

The web interfaces can also be checked in a browser:

NameNode            http://172.16.48.201:50070/

JobTracker            http://172.16.48.201:50030/


Check the cluster status on the NameNode:

[root@sg201 conf]# hadoop dfsadmin -report
Configured Capacity: 1380625408000 (1.26 TB)
Present Capacity: 1174286331904 (1.07 TB)
DFS Remaining: 1174286249984 (1.07 TB)
DFS Used: 81920 (80 KB)
DFS Used%: 0%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 2 (2 total, 0 dead)

Name: 172.16.48.203:50010
Decommission Status : Normal
Configured Capacity: 545281376256 (507.83 GB)
DFS Used: 40960 (40 KB)
Non DFS Used: 157135388672 (146.34 GB)
DFS Remaining: 388145946624(361.49 GB)
DFS Used%: 0%
DFS Remaining%: 71.18%
Last contact: Tue Jul 17 10:25:12 CST 2012


Name: 172.16.48.202:50010
Decommission Status : Normal
Configured Capacity: 835344031744 (777.97 GB)
DFS Used: 40960 (40 KB)
Non DFS Used: 49203687424 (45.82 GB)
DFS Remaining: 786140303360(732.15 GB)
DFS Used%: 0%
DFS Remaining%: 94.11%
Last contact: Tue Jul 17 10:25:14 CST 2012
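
As an additional smoke test (not part of the original steps), one of the example jobs bundled with the release can be run to confirm that HDFS and MapReduce work end to end, e.g. the pi estimator in hadoop-examples-1.0.3.jar:

bin/hadoop jar hadoop-examples-1.0.3.jar pi 2 10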

VI. Stop Hadoop

Stop the HDFS daemons:

[root@sg201 hadoop-1.0.3]# bin/stop-dfs.sh
no namenode to stop
172.16.48.203: no datanode to stop
172.16.48.202: no datanode to stop
root@172.16.48.201's password: 
172.16.48.201: stopping secondarynamenode
Stop the MapReduce daemons:

[root@sg201 hadoop-1.0.3]# bin/stop-mapred.sh
stopping jobtracker
172.16.48.203: no tasktracker to stop
172.16.48.202: no tasktracker to stop
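
The "no namenode to stop" / "no datanode to stop" / "no tasktracker to stop" messages mean the stop scripts could not find a PID file for those daemons (the PID files live under /tmp by default), typically because the daemons were already down or their PID files had been removed; if a daemon is in fact still running, it can be stopped individually with bin/hadoop-daemon.sh stop <daemon name>.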


Appendix: related commands

$HADOOP_HOME/bin/hadoop job -list                 # list the currently running jobs

$HADOOP_HOME/bin/hadoop job -kill jobid           # kill a running job

hadoop job -status job_id                         # check the status of a job
bin/hadoop-daemon.sh start datanode               # restart a dead DataNode
bin/hadoop-daemon.sh start jobtracker             # restart a dead JobTracker
bin/hadoop-daemon.sh start tasktracker            # restart a dead TaskTracker

# The following two commands add a node to the cluster dynamically

bin/hadoop-daemon.sh --config ./conf start datanode

bin/hadoop-daemon.sh --config ./conf start tasktracker

 
