Hadoop 0.20.2 Pseudo-Distributed Installation Notes

1. Prepare the environment

Virtual machine (Red Hat Enterprise Linux 6.5)

jdk-8u92-linux-x64.tar.gz

hadoop-0.20.2.tar.gz

2. Disable the VM's firewall and SELinux, and configure passwordless SSH login

[root@sishen ~]# vim /etc/sysconfig/selinux

(screenshot: editing /etc/sysconfig/selinux)
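The screenshot showed the edit itself; the standard change (a sketch, since the image is lost) is to set the SELINUX line in /etc/sysconfig/selinux to:

SELINUX=disabled

This file is only read at boot, so the change takes effect after a reboot. To stop enforcement immediately in the running system, SELinux can also be switched to permissive mode:

[root@sishen ~]# setenforce 0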

[root@sishen ~]# iptables -F
[root@sishen ~]# service iptables save
iptables: Saving firewall rules to /etc/sysconfig/iptables:[  OK  ]
[root@sishen ~]# service iptables stop
iptables: Setting chains to policy ACCEPT: filter          [  OK  ]
iptables: Flushing firewall rules:                         [  OK  ]
iptables: Unloading modules:                               [  OK  ]
[root@sishen ~]# chkconfig iptables off
[root@sishen ~]# chkconfig iptables --list
iptables           0:off    1:off    2:off    3:off    4:off    5:off    6:off

[root@sishen ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):                  # just press Enter here
Enter same passphrase again:                                 # just press Enter here
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
64:d2:fa:ee:61:ef:29:b0:c8:55:1e:6b:a3:6d:1b:d4 root@sishen.161.cn
The key's randomart image is:
+--[ RSA 2048]----+
|                 |
|       .         |
|      . +        |
|       =o.       |
|      .oSoE      |
|      oo=        |
|   . o *=.       |
|    o oo++ .     |
|       o+++      |
+-----------------+
[root@sishen ~]# ssh-copy-id localhost             # localhost or the hostname both work
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is a5:c4:4e:54:ea:2d:72:3f:9e:65:a2:ac:cd:41:ce:ca.
Are you sure you want to continue connecting (yes/no)? yes             # type yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
root@localhost's password:                                 # enter the root password
Now try logging into the machine, with "ssh 'localhost'", and check in:

  .ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.

Test the login:

[root@sishen ~]# ssh localhost
Last login: Sat Oct  8 17:16:27 2016 from sishen.161.cn
[root@sishen ~]# exit
logout
Connection to localhost closed.

Success!
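If you prefer to script this step, the interactive session above is equivalent to the following sketch (the chmod matters: sshd ignores authorized_keys files with loose permissions):

[root@sishen ~]# ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa        # empty passphrase, no prompts
[root@sishen ~]# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[root@sishen ~]# chmod 600 ~/.ssh/authorized_keys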

3. Configure the environment

First, extract the JDK:

[root@sishen ~]# tar -xf jdk-8u92-linux-x64.tar.gz -C /usr/src/hadoop/
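The Hadoop tarball has to be unpacked the same way; the HADOOP_HOME set below assumes this location:

[root@sishen ~]# tar -xf hadoop-0.20.2.tar.gz -C /usr/src/hadoop/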

Then edit the /etc/profile file:

[root@sishen ~]# vim /etc/profile

Append the following at the end:

export JAVA_HOME=/usr/src/hadoop/jdk1.8.0_92
export HADOOP_HOME=/usr/src/hadoop/hadoop-0.20.2
export PATH=$JAVA_HOME/bin:$PATH:$HADOOP_HOME/bin

Save and exit.
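The current shell won't see the new variables until the profile is re-read:

[root@sishen ~]# source /etc/profile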

Test:

[root@sishen ~]# java -version
java version "1.8.0_92"
Java(TM) SE Runtime Environment (build 1.8.0_92-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.92-b14, mixed mode)

Success!
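Hadoop itself can be checked the same way; hadoop version should print "Hadoop 0.20.2" followed by the same build information that appears in the format log below:

[root@sishen ~]# hadoop version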

Next, configure Hadoop's configuration files.

[root@sishen hadoop-0.20.2]# cd /usr/src/hadoop/hadoop-0.20.2/conf/
[root@sishen conf]# ls
capacity-scheduler.xml     hadoop-policy.xml  slaves
configuration.xsl          hdfs-site.xml      ssl-client.xml.example
core-site.xml              log4j.properties   ssl-server.xml.example
hadoop-env.sh              mapred-site.xml
hadoop-metrics.properties  masters

The files we need to edit are hadoop-env.sh, core-site.xml, hdfs-site.xml, and mapred-site.xml.

First edit hadoop-env.sh. Open it with vim, find the line # export JAVA_HOME=/usr/lib/j2sdk1.5-sun (around line 9), and add the following below it:

export JAVA_HOME=/usr/src/hadoop/jdk1.8.0_92

Save and exit.

Then edit core-site.xml. Open it with vim, find the <configuration> element, and change it to the following. fs.default.name is the default filesystem URI clients use to reach the NameNode (RPC on port 9000):

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

Edit hdfs-site.xml and add the following inside the <configuration> element. dfs.data.dir is where the DataNode stores its blocks, and dfs.replication is 1 because there is only a single DataNode:

<property>
  <name>dfs.data.dir</name>
  <value>/usr/src/hadoop/hadoop-0.20.2/data</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
Edit mapred-site.xml and add the following inside the <configuration> element. mapred.job.tracker is the address the JobTracker listens on:

<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value>
</property>

After saving and exiting, format the NameNode (note the flag is a plain hyphen, -format):

[root@sishen ~]# hadoop namenode -format

16/10/09 11:42:11 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = sishen.161.cn/192.168.186.161
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
16/10/09 11:42:12 INFO namenode.FSNamesystem: fsOwner=root,root
16/10/09 11:42:12 INFO namenode.FSNamesystem: supergroup=supergroup
16/10/09 11:42:12 INFO namenode.FSNamesystem: isPermissionEnabled=true
16/10/09 11:42:12 INFO common.Storage: Image file of size 94 saved in 0 seconds.
16/10/09 11:42:12 INFO common.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted. // this line means the format succeeded
16/10/09 11:42:12 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at sishen.161.cn/192.168.186.161
************************************************************/
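Note the storage directory in the output: since dfs.name.dir was never set, the NameNode metadata landed under /tmp, which is typically cleared on reboot. For anything longer-lived than a quick test, it is safer to pin it in hdfs-site.xml next to dfs.data.dir (the path below is just an example) and format again:

<property>
  <name>dfs.name.dir</name>
  <value>/usr/src/hadoop/hadoop-0.20.2/name</value>
</property>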

Start the Hadoop cluster:

[root@sishen ~]# start-all.sh

Check with jps:

[root@sishen ~]# jps
3058 TaskTracker
2898 SecondaryNameNode
2694 NameNode
2966 JobTracker
2790 DataNode
3111 Jps
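With all five daemons up, a quick smoke test confirms HDFS actually accepts files (the paths here are arbitrary examples); the NameNode web UI at http://localhost:50070 and the JobTracker UI at http://localhost:50030 should also respond:

[root@sishen ~]# hadoop fs -mkdir /test
[root@sishen ~]# hadoop fs -put /etc/profile /test       # upload a small local file
[root@sishen ~]# hadoop fs -ls /test                     # should show /test/profile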

Stop the Hadoop cluster:

[root@sishen ~]# stop-all.sh
stopping jobtracker
localhost: stopping tasktracker
stopping namenode
localhost: stopping datanode
localhost: stopping secondarynamenode
[root@sishen ~]# jps
3426 Jps

This completes the pseudo-distributed installation of Hadoop 0.20.2!

posted on 2016-10-09 11:50 by Lucky_7
