Pseudo-Distributed Cluster Setup on a Tencent Cloud Server
0. Change the hostname
vi /etc/hostname
sudo hostname <new-name>   (takes effect immediately, but only until reboot)
reboot
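For the hdfs://node1:8020 address used below to resolve, the hostname also needs an entry in /etc/hosts. A minimal sketch, where 10.0.0.2 is a placeholder for the Tencent server's internal IP:

```bash
echo "10.0.0.2 node1" >> /etc/hosts   # placeholder IP: use this machine's internal IP
hostname                              # verify the new name took effect
```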
1. Install JDK 1.7
tar -zxvf ... -C ...
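For example (the tarball name is assumed from JDK 1.7.0_80; use the file you downloaded):

```bash
tar -zxvf jdk-7u80-linux-x64.tar.gz -C /opt/modules/
```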
Configure environment variables (/etc/profile):
export JAVA_HOME=/opt/modules/jdk1.7.0_80
export PATH=$PATH:$JAVA_HOME/bin
source /etc/profile
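A quick check that the JDK is picked up:

```bash
java -version     # should report java version "1.7.0_80"
echo $JAVA_HOME   # should print /opt/modules/jdk1.7.0_80
```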
2. Install Hadoop 2.5
tar -zxvf ... -C ...
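For example (tarball name assumed from the hadoop-2.5.1 paths used below):

```bash
tar -zxvf hadoop-2.5.1.tar.gz -C /opt/modules/
```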
Edit the Hadoop configuration files (under etc/hadoop):
**hadoop-env.sh**
export JAVA_HOME=/opt/modules/jdk1.7.0_80
**core-site.xml**
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://node1:8020</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/modules/hadoop-2.5.1/data/tmp</value>
</property>
<property>
    <name>fs.trash.interval</name>
    <value>604800</value>
</property>
(Note: fs.trash.interval is measured in minutes, so 604800 keeps trash for 420 days; use 10080 for one week.)
**hdfs-site.xml**
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
**yarn-site.xml**
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>node1</value>
</property>
**slaves**
node1   (this machine's hostname)
**mapred-env.sh**
export JAVA_HOME=/opt/modules/jdk1.7.0_80
**mapred-site.xml**
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
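Note that Hadoop 2.5 ships only a template for this file; create mapred-site.xml from it first:

```bash
cd /opt/modules/hadoop-2.5.1
cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
```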
3. Start the daemons
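On the very first start, format the NameNode (a one-time step; rerunning it wipes all HDFS data):

```bash
bin/hdfs namenode -format
```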
sbin/hadoop-daemon.sh start namenode
sbin/hadoop-daemon.sh start datanode
sbin/yarn-daemon.sh start resourcemanager
sbin/yarn-daemon.sh start nodemanager
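Verify with jps that all four daemons came up:

```bash
jps
# expect NameNode, DataNode, ResourceManager, NodeManager (plus Jps itself)
```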
mkdir ..
touch wc.input
vi wc.input
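For example, a couple of lines of arbitrary test input:

```bash
cat > wc.input <<'EOF'
hadoop spark scala
hadoop hdfs yarn
EOF
```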
bin/hdfs dfs -mkdir -p /user/lg/tmp/wordcount/input
bin/hdfs dfs -put tmp/wc.input /user/lg/tmp/wordcount/input
bin/hdfs dfs -ls /user/lg/tmp/wordcount/input
bin/hdfs dfs -cat /user/lg/tmp/wordcount/input/wc.input
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar wordcount /user/lg/tmp/wordcount/input /tmp/output
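The output directory must not exist before the job runs; afterwards the word counts land in part files:

```bash
bin/hdfs dfs -cat /tmp/output/part*
```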
NameNode web UI: http://134.175.166.121:50070/
YARN ResourceManager UI: http://134.175.166.121:8088/
4. Upload the Spark environment
Scala 2.10 and Spark 1.6
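For example, from the local machine (the tarball names and the target directory /opt/softwares are assumptions; use the files you actually downloaded):

```bash
scp scala-2.10.4.tgz spark-1.6.0-bin-hadoop2.4.tgz root@134.175.166.121:/opt/softwares/
```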
5. Extract them and configure environment variables (/etc/profile):
export SCALA_HOME=.....
export PATH=$PATH:$SCALA_HOME/bin
source /etc/profile
6. Test Scala
sbin/hadoop-daemon.sh stop namenode
sbin/hadoop-daemon.sh stop datanode
sbin/yarn-daemon.sh stop resourcemanager
sbin/yarn-daemon.sh stop nodemanager
scala -version
scala
...
7. Spark in local mode
Change into the extracted Spark directory:
bin/spark-shell
Spark application UI: http://134.175.166.121:4040/
val tmp = sc.textFile("README.md").count
8. Standalone mode
Prerequisites: Java, Hadoop, Scala, and Spark (all installed above)
Edit the configuration files (under conf/):
**log4j.properties**
Rename log4j.properties.template to log4j.properties (see the sketch below).
**slaves**
node1
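Both log4j.properties and slaves ship as templates in conf/; a minimal sketch of preparing them, run from the Spark directory:

```bash
cp conf/log4j.properties.template conf/log4j.properties
cp conf/slaves.template conf/slaves
vi conf/slaves   # replace the default "localhost" with node1
```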
**spark-env.sh**
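The notes don't record the spark-env.sh contents; a minimal sketch for Spark 1.6 standalone on this single node, with values assumed from the earlier steps:

```bash
export JAVA_HOME=/opt/modules/jdk1.7.0_80
export SCALA_HOME=/opt/modules/scala-2.10.4   # assumption: adjust to your Scala install path
export SPARK_MASTER_IP=node1                  # master host setting for Spark 1.x standalone
export SPARK_WORKER_MEMORY=1g                 # assumption: keep it small on a small cloud server
```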
Then start the master and workers:
sbin/start-master.sh
sbin/start-slaves.sh
Spark Master web UI: http://134.175.166.121:8080/
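To verify standalone mode, check jps and attach a shell to the master (7077 is the default master port):

```bash
jps   # expect Master and Worker processes
bin/spark-shell --master spark://node1:7077
```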