CentOS 7 Installation (Building a Hadoop Platform)


To get a better handle on big data I need a learning environment to experiment with. Corrections are welcome wherever I have gotten something wrong.

First, the hardware I am working with:

2 VMs:

namenode 192.168.1.10

datanode 192.168.1.11

===============================================================================

Install VMware 10.0

Install CentOS 7 as the guest OS in VMware

================================================================================

After logging in to the system

Check the OS version:

cat /etc/redhat-release

cat /etc/centos-release

===========================================================================

Preparation

========================================================================

List the packages installed on the system:

rpm -qa

e.g.   rpm -qa | grep -i mysql

If a package needs to be removed:

yum remove <package-name>


Configure the yum download mirror.

I use the NetEase (163) CentOS mirror:  http://mirrors.163.com/centos


>cp /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.bk

>vi /etc/yum.repos.d/CentOS-Base.repo


mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os&infra=$infra
#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/

Replace with:

#mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os&infra=$infra
baseurl=http://mirrors.163.com/centos/$releasever/os/$basearch/

Refresh the yum cache and update:

>yum clean all && yum clean metadata && yum clean dbcache && yum makecache && yum update
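To confirm the 163 mirror is actually in use, listing the enabled repositories is a quick check:

>yum repolist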


>yum search "abc"

>yum -y install wget.x86_64

>yum -y install gcc.x86_64

>yum -y install net-tools

>whereis wget

>which wget


Setting the IP

Check the current IP:

>ifconfig

Set the IP with the text-based UI:

>yum install NetworkManager-tui

>nmtui

>nmtui-edit eno16777736      (edit the NIC configuration)
>nmtui-connect eno16777736
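
After switching the profile to manual addressing, the config file nmtui writes under /etc/sysconfig/network-scripts should end up roughly like the sketch below (the interface name comes from above; the /24 prefix is an assumption about this network; values shown are for the namenode host):

/etc/sysconfig/network-scripts/ifcfg-eno16777736

TYPE=Ethernet
BOOTPROTO=none          # manual/static addressing instead of dhcp (see the DHCP note further down)
NAME=eno16777736
DEVICE=eno16777736
ONBOOT=yes              # bring the interface up at boot
IPADDR=192.168.1.10     # 192.168.1.11 on the datanode
PREFIX=24
GATEWAY=192.168.1.1
DNS1=192.168.1.1        # same as the gateway, per the note below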


Here you can set the IP, gateway and DNS. (If you are not sure what DNS value to use, it is safest to make it the same as the gateway; otherwise you may hit the "datanode cannot connect to namenode" problem described further down.)

Also, unless you run your own DNS server, you need to map the hostnames locally.

The same entries are needed on every node:

vi /etc/hosts

192.168.1.10 namenode

192.168.1.11 datanode

With these entries in place, the Hadoop configuration files can refer to the hosts by name (e.g. namenode) instead of by IP.
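
A quick sanity check on each node that the names resolve (getent consults /etc/hosts as well as DNS):

>getent hosts namenode

>ping -c 1 datanode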


Restart the network
System services are managed with systemctl:
>systemctl  restart network
>systemctl  status network

or

service network restart


Check the status of the NetworkManager service:

$ systemctl status NetworkManager.service

Check which network interfaces are managed by NetworkManager:

$ nmcli dev status

======================================================================

Install Java

=================================================================

Download the installer jdk-8u72-linux-x64.rpm from:

http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

rpm -ivh jdk-8u72-linux-x64.rpm

It installs into /usr/java/jdk1.8.0_72
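
The Oracle JDK RPM also maintains a /usr/java/latest symlink pointing at the newest installed JDK, which is what the profile below relies on. A quick check:

>java -version

>ls -l /usr/java/latest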

==================================================================

Install Hadoop

===============================================================

Download Hadoop 2.6:

http://www.apache.org/dyn/closer.cgi/hadoop/common/

http://mirrors.hust.edu.cn/apache/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz

Unpack it, move it into place, and create a "latest" symlink; future upgrades only need to repoint this link.

tar zxvf hadoop-2.6.0.tar.gz

mkdir /usr/hadoop

mv hadoop-2.6.0/ /usr/hadoop/

cd /usr/hadoop

ln -s hadoop-2.6.0/ latest

Tip: to delete the symlink:

rm -rf latest      (note: "latest", not "latest/")


vi /etc/profile

export JAVA_HOME=/usr/java/latest
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/jdbc.jar
export PATH=$PATH:$JAVA_HOME/bin

export HADOOP_HOME=/usr/hadoop/latest
export HADOOP_PREFIX=$HADOOP_HOME

export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop 

export HIVE_HOME=/usr/hive/latest
export PATH=$HIVE_HOME/bin:$PATH

:wq

source /etc/profile
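
A quick check that the environment took effect:

>echo $HADOOP_HOME

>hadoop version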


cd $HADOOP_HOME


vi libexec/hadoop-config.sh

Add:

export JAVA_HOME=/usr/java/latest


vi etc/hadoop/core-site.xml      (all properties below go inside the <configuration> element; the same applies to the other *-site.xml files)

<property>
<name>fs.defaultFS</name>
<value>hdfs://namenode:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>


vi etc/hadoop/hdfs-site.xml

<property>
  <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
  <value>false</value>
</property>

<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/hadoop/latest/dfs/name</value>
</property>
<property>
<name>dfs.blocksize</name>
<value>268435456</value>
</property>
<property>
<name>dfs.namenode.handler.count</name>
<value>100</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/hadoop/latest/dfs/data</value>
</property>



vi etc/hadoop/yarn-site.xml

 <!-- Resource Manager Configs -->
  <property>
    <description>The hostname of the RM.</description>
    <name>yarn.resourcemanager.hostname</name>
    <value>namenode</value>
  </property>    
  
  <property>
    <description>The address of the applications manager interface in the RM.</description>
    <name>yarn.resourcemanager.address</name>
    <value>${yarn.resourcemanager.hostname}:8032</value>
  </property>
  
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>${yarn.resourcemanager.hostname}:8031</value>
  </property>
  
  <property>
    <description>The address of the scheduler interface.</description>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>${yarn.resourcemanager.hostname}:8030</value>
  </property>  
  
  <property>
    <description>The http address of the RM web application.</description>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>${yarn.resourcemanager.hostname}:8088</value>
  </property>
  
  <property>
    <description>The class to use as the resource scheduler.</description>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  </property>  
  
  <property>
    <description>The minimum allocation for every container request at the RM,
    in MBs. Memory requests lower than this will throw a
    InvalidResourceRequestException.</description>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>512</value>
  </property>

  <property>
    <description>The maximum allocation for every container request at the RM,
    in MBs. Memory requests higher than this will throw a
    InvalidResourceRequestException.</description>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
  </property>  

  <property>
    <description>Amount of physical memory, in MB, that can be allocated 
    for containers.</description>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
  </property>

  <property>
    <description>Ratio between virtual memory to physical memory when
    setting memory limits for containers. Container allocations are
    expressed in terms of physical memory, and virtual memory usage
    is allowed to exceed this allocation by this ratio.
    </description>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
  </property>  

  <property>
    <description>List of directories to store localized files in. An 
      application's localized file directory will be found in:
      ${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}.
      Individual containers' work directories, called container_${contid}, will
      be subdirectories of this.
   </description>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/tmp/hadoop/nm_tempfile/</value>
  </property> 

<property>
 <name>yarn.nodemanager.aux-services</name>
 <value>mapreduce_shuffle</value>
</property>


vi etc/hadoop/mapred-site.xml      (Hadoop 2.6 ships only mapred-site.xml.template; copy it to mapred-site.xml first if the file does not exist)


<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
  <description>The runtime framework for executing MapReduce jobs.
  Can be one of local, classic or yarn.
  </description>
</property>


<property>
  <name>mapreduce.map.memory.mb</name>
  <value>1024</value> 
</property>


<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx820m</value>
</property>


<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>1024</value>
  <description>The amount of memory to request from the scheduler for each
  reduce task.
  </description>
</property>
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>namenode:10020</value>
  <description>MapReduce JobHistory Server IPC host:port</description>
</property>

<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>namenode:19888</value>
  <description>MapReduce JobHistory Server Web UI host:port</description>
</property>


SSH configuration
Host namenode 192.168.1.10
Host datanode 192.168.1.11
All hosts need passwordless SSH login to namenode.
First make sure the firewall is disabled on every host.
On namenode, run:
 1. $cd ~/.ssh
 2. $ssh-keygen -t rsa        (keep pressing Enter; the key is saved to .ssh/id_rsa with the default options)
 3. $cp id_rsa.pub authorized_keys
Once this is done you should be able to log in to the local machine without a password, i.e. ssh localhost asks for nothing.


On each datanode:


 1. $cd ~/.ssh
 2. $ssh-keygen -t rsa        (keep pressing Enter; the key is saved to .ssh/id_rsa with the default options)
 3. $scp id_rsa.pub root@192.168.1.10:~/.ssh/      (copy the newly generated public key over to namenode)
 4. $cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys      (run on namenode: append the copied public key to authorized_keys)

Then, in namenode's .ssh directory, fix the permissions of authorized_keys:
 5. $chmod 644 authorized_keys


Once every datanode has done this, push the authorized_keys file on namenode (which now contains every machine's public key) back out to all datanodes:

$scp authorized_keys root@datanode:~/.ssh/ 
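
To verify, passwordless login should now work in both directions (the very first connection to a host will still ask you to accept its host key):

$ssh datanode      (from namenode, logs in without a password)
$ssh namenode      (from datanode, logs in without a password)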


=========================================================

Formatting and starting the cluster

$HADOOP_PREFIX/bin/hdfs namenode -format <cluster_name> 
$HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode 
$HADOOP_PREFIX/sbin/hadoop-daemons.sh --config $HADOOP_CONF_DIR --script hdfs start datanode

$HADOOP_PREFIX/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager
$HADOOP_PREFIX/sbin/yarn-daemons.sh --config $HADOOP_CONF_DIR start nodemanager
$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh --config $HADOOP_CONF_DIR start historyserver

Or run the cluster scripts directly (these rely on etc/hadoop/slaves; see the sketch below):
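
A minimal etc/hadoop/slaves for this two-node setup (assuming only the datanode host runs the DataNode/NodeManager daemons) just lists the worker hostnames, one per line:

datanode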

$HADOOP_PREFIX/sbin/start-dfs.sh
$HADOOP_PREFIX/sbin/start-yarn.sh

$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh --config $HADOOP_CONF_DIR start historyserver
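
Once everything is running, a quick way to confirm it is to check the daemons on each node with jps (it ships with the JDK) and to open the web UIs; 8088 comes from the yarn-site.xml above, and 50070 is the stock Hadoop 2.6 NameNode HTTP port:

>jps      (expect NameNode, ResourceManager, JobHistoryServer on namenode; DataNode, NodeManager on datanode)

HDFS web UI:  http://namenode:50070
YARN web UI:  http://namenode:8088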


If SSH gets in the way here, log in once using the hostname so the host key gets accepted, then run the startup again.


e.g.   ssh namenode rather than ssh localhost


During the installation I ran into the following problem:

hdfs datanode denied communication with namenode because hostname cannot be resolved

Initialization failed for Block pool BP-232943349-192.168.1.10-1417116665984
(Datanode Uuid null) service to namenode/192.168.1.10:8022
Datanode denied communication with namenode because hostname cannot be resolved
(ip=192.168.1.1, hostname=192.168.1.1): DatanodeRegistration(192.168.1.11,
datanodeUuid=49a6dc47-c988-4cb8-bd84-9fabf87807bf, infoPort=50075, ipcPort=50020,
storageInfo=lv=-56;cid=cluster24;nsid=11020533;c=0)

The fix drew on these references:

http://stackoverflow.com/questions/27195466/hdfs-datanode-denied-communication-with-namenode-because-hostname-cannot-be-reso

http://log.rowanto.com/why-datanode-is-denied-communication-with-namenode/

1) Open nmtui, remove the DNS entry, then restart the network.

2) Add the following to $HADOOP_HOME/etc/hadoop/hdfs-site.xml:

<property>
  <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
  <value>false</value>
</property>

3) Check /etc/resolv.conf and make sure no DNS server is listed there any more; delete it if there is. That basically solves the problem, but without DNS you can no longer reach the outside network, so DNS has to be set again.

4) Open nmtui and set the DNS to 192.168.1.1 (the value that caused the trouble before was 202.96.128.86).


Addendum: I later found the real root cause!!!! I had set the IP with nmtui but had not changed the addressing mode from dhcp to manual. For some reason my static IP still worked, but DHCP kept handing out new addresses, so the datanode could never settle on a final IP.

The fix:

Open nmtui and change Automatic to Manual.


======================================================================================

Install Hive and MySQL

====================================================================================

Unpack, rename, and set the environment variables.
Download Hive from http://apache.fayea.com/hive/stable/
Unpack it to /usr/hive/hive-<version>
Create the symlink latest => hive-<version> (see the sketch below)
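
A sketch of those steps, mirroring the Hadoop layout above (the archive name depends on which version you download; apache-hive-x.y.z-bin is used here as a placeholder):

tar zxvf apache-hive-x.y.z-bin.tar.gz
mkdir /usr/hive
mv apache-hive-x.y.z-bin/ /usr/hive/
cd /usr/hive
ln -s apache-hive-x.y.z-bin/ latest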


$HADOOP_HOME/bin/hadoop fs -mkdir       /tmp
$HADOOP_HOME/bin/hadoop fs -mkdir -p    /user/hive/warehouse
$HADOOP_HOME/bin/hadoop fs -chmod g+w   /tmp
$HADOOP_HOME/bin/hadoop fs -chmod g+w   /user/hive/warehouse
  
In the directory $HIVE_HOME/conf/, run:
  cp hive-default.xml.template  hive-site.xml
  cp hive-env.sh.template  hive-env.sh
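
In hive-env.sh it is worth pointing Hive at the Hadoop install from above (a minimal sketch; the HADOOP_HOME line is present but commented out in the template):

export HADOOP_HOME=/usr/hadoop/latest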


MySQL installation
Recent Linux releases ship MariaDB by default instead of MySQL!
Installing from the stock repos is straightforward:
yum install mariadb mariadb-server
systemctl start mariadb ==> start mariadb
systemctl enable mariadb ==> start it at boot
mysql_secure_installation ==> set the root password etc. (just press Enter at the first prompt, since the initial root password is empty)
mysql -uroot -p123456 ==> test the login!
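
One possible gotcha, not part of the original walkthrough: hive-site.xml below connects as root through the hostname namenode rather than localhost, and a fresh MariaDB install may only allow root logins from localhost. If the metastore connection is refused, granting access along these lines (using the password configured in hive-site.xml) should help:

mysql -uroot -p -e "GRANT ALL PRIVILEGES ON hive.* TO 'root'@'namenode' IDENTIFIED BY 'Test1234'; FLUSH PRIVILEGES;"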


Using MySQL as the Hive metastore
Download the latest JDBC driver:
wget http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.38.tar.gz
tar zxvf mysql-connector-java-5.1.38.tar.gz
cp mysql-connector-java-5.1.38-bin.jar $JAVA_HOME/lib/

cd $JAVA_HOME/lib/
ln -s mysql-connector-java-5.1.38-bin.jar jdbc.jar

cd $HIVE_HOME/lib/

ln -s $JAVA_HOME/lib/mysql-connector-java-5.1.38-bin.jar jdbc.jar


Add the MySQL JDBC driver to $CLASSPATH
vi /etc/profile
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/jdbc.jar
:wq


source /etc/profile      (to make the change take effect)

Modify hive-site.xml as follows:


<property>
 <name>javax.jdo.option.ConnectionURL</name>
 <value>jdbc:mysql://namenode:3306/hive?createDatabaseIfNotExist=true</value>
</property>


<property>
 <name>javax.jdo.option.ConnectionDriverName</name>
 <value>com.mysql.jdbc.Driver</value>
</property>
<property>
 <name>javax.jdo.option.ConnectionUserName</name>
 <value>root</value>
</property>
<property>
 <name>javax.jdo.option.ConnectionPassword</name>
 <value>Test1234</value>
</property>


Problem:
[ERROR] Terminal initialization failed; falling back to unsupported
java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected
        at jline.TerminalFactory.create(TerminalFactory.java:101)
        at jline.TerminalFactory.get(TerminalFactory.java:158)
        at jline.console.ConsoleReader.<init>(ConsoleReader.java:229)
        at jline.console.ConsoleReader.<init>(ConsoleReader.java:221)
        at jline.console.ConsoleReader.<init>(ConsoleReader.java:209)
        at org.apache.hadoop.hive.cli.CliDriver.getConsoleReader(CliDriver.java:773)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:715)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)


Cause:

An old version of jline ships inside the hadoop tree:
/hadoop-2.6.0/share/hadoop/yarn/lib:
-rw-r--r-- 1 root root  87325 Mar 10 18:10 jline-0.9.94.jar

Fix:
cp /usr/hive/latest/lib/jline-2.12.jar /usr/hadoop/latest/share/hadoop/yarn/lib/
rm -f /usr/hadoop/latest/share/hadoop/yarn/lib/jline-0.9.94.jar



Exception in thread "main" java.lang.RuntimeException: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:444)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:672)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
        at org.apache.hadoop.fs.Path.initialize(Path.java:148)
        at org.apache.hadoop.fs.Path.<init>(Path.java:126)
        at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:487)
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:430)
        ... 7 more
Caused by: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
        at java.net.URI.checkPath(URI.java:1804)
        at java.net.URI.<init>(URI.java:752)
        at org.apache.hadoop.fs.Path.initialize(Path.java:145)
        ... 10 more


Solution:
Look through hive-site.xml; you will find properties whose values contain "${system:java.io.tmpdir}/${system:user.name}".
Change those values to /tmp/hive.
Start hive again, and it works!
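
For example, one of the affected properties (property names vary a little between Hive versions; this one is just an illustration) ends up looking like:

<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/tmp/hive</value>
</property>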



Recommended Linux performance monitoring tools: nmon, htop, glances

Reference: http://os.51cto.com/art/201412/460698_all.htm




Mounting a USB drive

1. Connect the USB drive to the VM in VMware.

2. Find the drive's device name, e.g. /dev/sdb1:

> fdisk -l 

3. >mkdir /mnt/usb

>mount /dev/sdb1 /mnt/usb

>cd /mnt/usb

OK!

To unmount:

cd out of any directory on the USB drive first

>cd /mnt

>umount /mnt/usb


