Installing Hadoop 1.2

1. Introduction to Hadoop

    Hadoop is an open-source distributed computing platform under the Apache Software Foundation. Built around the Hadoop Distributed File System (HDFS) and MapReduce (an open-source implementation of Google's MapReduce), Hadoop provides users with a distributed infrastructure whose low-level details stay transparent.
    A Hadoop cluster has two broad classes of roles: master and slave. An HDFS cluster consists of one NameNode and a number of DataNodes. The NameNode acts as the master server, managing the file system namespace and client access to the file system, while the DataNodes manage the data stored on their nodes. The MapReduce framework consists of a single JobTracker running on the master node and a TaskTracker running on each slave node. The master schedules all the tasks that make up a job, distributes them across the slave nodes, monitors their execution, and re-executes any tasks that fail; the slaves only run the tasks assigned by the master. When a job is submitted, the JobTracker receives the job and its configuration, distributes the configuration to the slave nodes, schedules the tasks, and monitors the TaskTrackers as they execute.
    As this overview shows, HDFS and MapReduce together form the core of Hadoop's distributed architecture. HDFS provides the distributed file system across the cluster, and MapReduce provides distributed computation and task processing on top of it. HDFS supplies file storage and I/O during MapReduce processing, while MapReduce handles task distribution, tracking, execution, and result collection on top of HDFS; together they carry out the main work of a Hadoop cluster.
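As a rough illustration of how a client touches both layers once the cluster built below is running (localfile.txt, some-job.jar and MyJob are placeholder names, not part of this guide):
hadoop fs -put localfile.txt /                      # HDFS: the client asks the NameNode where to store blocks, then streams them to DataNodes
hadoop jar some-job.jar MyJob /localfile.txt /out   # MapReduce: the job goes to the JobTracker, which schedules tasks on the TaskTrackers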

2. Software preparation

My test environment runs in virtual machines, each with 32-bit CentOS 6.5 (x86). Prepare the following software: hadoop-1.2.1 and jdk-6u45-linux-i586.

Download links:
http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz

http://download.oracle.com/otn/java/jdk/6u45-b06/jdk-6u45-linux-i586.bin
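For reference, a sketch of fetching both packages with wget (the mirror URL may have moved since this was written, and the Oracle link may require a login):
wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz
wget http://download.oracle.com/otn/java/jdk/6u45-b06/jdk-6u45-linux-i586.bin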

3. Cluster deployment

3.1 Environment description

    The cluster consists of 3 nodes: 1 master and 2 slaves, connected over a LAN and able to ping each other. The node IP addresses are:

Hostname    IP address
hadoop1     192.168.227.31
hadoop2     192.168.227.32
hadoop3     192.168.227.33

All three nodes run CentOS 6.5 and have an identical user named hadoop. The master machine runs the NameNode and JobTracker roles, managing the distributed data and splitting up and scheduling tasks; the two slave machines run the DataNode and TaskTracker roles, handling distributed data storage and task execution. Strictly speaking there should also be a standby master that can take over immediately if the master goes down; a standby master will be added later, once more experience has been accumulated.

3.2 Environment configuration

Upload hadoop-1.2.1.tar.gz and jdk-6u45-linux-i586.bin to the hadoop1 machine.

3.2.1 Configure Java environment variables

chmod 755 jdk-6u45-linux-i586.bin
./jdk-6u45-linux-i586.bin
mv ./jdk1.6.0_45 /usr/local

vi /etc/profile, then append the following:
export JAVA_HOME=/usr/local/jdk1.6.0_45
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
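Apply the changes and verify the JDK is picked up:
source /etc/profile
java -version      # should report 1.6.0_45
echo $JAVA_HOME    # should print /usr/local/jdk1.6.0_45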

3.2.2 Check whether ssh and rsync are installed

rpm -qa|grep openssh
rpm -qa|grep rsync
If they are not installed, install them with yum (see below).
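A typical install looks like this (assumes the node can reach a yum repository):
yum install -y openssh-server openssh-clients rsync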

3.2.3 Generate SSH keys

ssh-keygen -t rsa -P ''
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
sudo vim /etc/ssh/sshd_config
Set the following:
RSAAuthentication yes # enable RSA authentication
PubkeyAuthentication yes # enable public/private key authentication
AuthorizedKeysFile .ssh/authorized_keys # path to the public key file (the file generated above)
sudo service sshd restart
Copy the public key to the other machines to enable passwordless login:
scp ~/.ssh/id_rsa.pub hadoop@192.168.227.32:~/
scp ~/.ssh/id_rsa.pub hadoop@192.168.227.33:~/

Then, on each of hadoop2 and hadoop3, append the copied key (create ~/.ssh first if it does not exist):
mkdir -p ~/.ssh && chmod 700 ~/.ssh
cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
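A quick check from hadoop1 that passwordless login works (neither command should prompt for a password):
ssh 192.168.227.32 date
ssh 192.168.227.33 date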

3.2.4 Configure /etc/hosts

vi /etc/hosts
192.168.227.31 hadoop1
192.168.227.32 hadoop2
192.168.227.33 hadoop3
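The same three entries must be present in /etc/hosts on every node. A quick check that the names resolve:
ping -c 1 hadoop2
ping -c 1 hadoop3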

3.2.5 Configure the Hadoop cluster

Extract the archive: tar -zxvf hadoop-1.2.1.tar.gz -C /usr/local
1. Set HADOOP_HOME
vi /etc/profile
# set hadoop path
export HADOOP_HOME=/usr/local/hadoop-1.2.1
export PATH=$HADOOP_HOME/bin:$PATH
source /etc/profile
2. Configure hadoop-env.sh
vi /usr/local/hadoop-1.2.1/conf/hadoop-env.sh
# set java environment
export JAVA_HOME=/usr/local/jdk1.6.0_45
3. core-site.xml (fs.default.name points the cluster at the NameNode on hadoop1; hadoop.tmp.dir is the base directory for Hadoop's working files, see the note after this block)
vi /usr/local/hadoop-1.2.1/conf/core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop1:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/tmp/hadoop</value>
  </property>
</configuration>
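Hadoop will normally create hadoop.tmp.dir itself, but pre-creating it avoids permission surprises if the hadoop user cannot write to the parent directory (an optional step, run as root):
mkdir -p /var/tmp/hadoop
chown -R hadoop:hadoop /var/tmp/hadoop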
4. mapred-site.xml (mapred.job.tracker points at the JobTracker on hadoop1; the java.opts properties cap task JVM heap at 512 MB)
vi /usr/local/hadoop-1.2.1/conf/mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hadoop1:9001</value>
  </property>
  <property>
    <name>mapred.map.child.java.opts</name>
    <value>-Xmx512m</value>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx512m</value>
  </property>
</configuration>
5. hdfs-site.xml
vi /usr/local/hadoop-1.2.1/conf/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>
</configuration>
6. masters (in Hadoop 1.x this file lists the host that runs the SecondaryNameNode)
vi /usr/local/hadoop-1.2.1/conf/masters
hadoop1
7. slaves (hosts that run a DataNode and TaskTracker; hadoop1 is listed here too, so the master node also stores data and runs tasks)
vi /usr/local/hadoop-1.2.1/conf/slaves
hadoop1
hadoop2
hadoop3
8. Disable the firewall (you could instead open only the required ports; to keep this test environment simple, the firewall is turned off)
1) Takes effect after reboot:
enable: chkconfig iptables on
disable: chkconfig iptables off
2) Takes effect immediately, lost after reboot:
start: service iptables start
stop: service iptables stop
Note that other Linux services can be started and stopped with the same style of commands.
If you keep the firewall on, open the required ports by adding lines like the following to /etc/sysconfig/iptables (ports 80 and 22 are shown as examples; for Hadoop you would also need to open the ports used above, e.g. 9000, 9001, 50030 and 50070):
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 80 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
Check and stop the firewall:
/etc/init.d/iptables status
/etc/init.d/iptables stop

Disable permanently:

chkconfig --level 35 iptables off


9. Copy Hadoop to the other machines

scp -r /usr/local/hadoop-1.2.1/ root@192.168.227.32:/usr/local/hadoop-1.2.1
scp -r /usr/local/hadoop-1.2.1/ root@192.168.227.33:/usr/local/hadoop-1.2.1

On the other machines, hand the directory over to the hadoop user:

sudo chown -R hadoop:hadoop /usr/local/hadoop-1.2.1/


Every machine must be configured the same way, so repeat the same steps (JDK, /etc/profile, /etc/hosts) on each node; a sketch of pushing the common pieces from hadoop1 follows.
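A minimal sketch of distributing the common files from hadoop1 to the slaves (assumes root ssh access; adjust to your own layout):
scp -r /usr/local/jdk1.6.0_45 root@192.168.227.32:/usr/local/
scp -r /usr/local/jdk1.6.0_45 root@192.168.227.33:/usr/local/
scp /etc/profile /etc/hosts root@192.168.227.32:/etc/
scp /etc/profile /etc/hosts root@192.168.227.33:/etc/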

4. Test the environment
hadoop namenode -format   # initialize (format) the NameNode on first use only
start-all.sh              # start the HDFS and MapReduce daemons on all nodes
jps                       # check which daemons are running (expected output sketched below)
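With the masters/slaves files above, jps should roughly show the following (PIDs omitted; this is an illustrative listing, not captured from a real run):
on hadoop1: NameNode, SecondaryNameNode, JobTracker, DataNode, TaskTracker, Jps
on hadoop2 and hadoop3: DataNode, TaskTracker, Jps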


hadoop dfs -ls /
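To exercise HDFS and MapReduce end to end, you can run the bundled WordCount example (a hedged sketch; the examples jar name assumes the stock hadoop-1.2.1 layout):
hadoop fs -mkdir /input
hadoop fs -put /usr/local/hadoop-1.2.1/conf/*.xml /input
hadoop jar /usr/local/hadoop-1.2.1/hadoop-examples-1.2.1.jar wordcount /input /output
hadoop fs -cat /output/part-r-00000
The running job can also be watched in the JobTracker web UI listed below.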


http://hadoop1:50030/jobtracker.jsp # view job execution status
http://hadoop1:50070/dfshealth.jsp # browse the HDFS file system

