Running Spark and Hadoop on Windows with a Simple Pseudo-Cluster

Prerequisites

  • hadoop-3.2.2

  • spark-3.2.1

  • java 1.8.0_291

  • scala 2.12.10

Installation Steps

1. Install Java 1.8

2. Install Scala

Download and run the scala.msi installer, then verify it:

scala -version

=> Scala code runner version 2.12.10

3. Install Hadoop

Extract the archive with administrator privileges, then configure the environment variables:

HADOOP_HOME X:/hadoop (use whatever location you extracted to)

Add the following to PATH:

X:/hadoop/bin
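The two settings above can also be applied from an elevated command prompt with setx (a sketch; substitute your actual Hadoop path for X:\hadoop):

```bat
:: Run from an elevated cmd prompt; /M writes to the system (machine) environment
setx /M HADOOP_HOME "X:\hadoop"

:: Append the Hadoop bin directory to the system PATH.
:: Caution: setx truncates values longer than 1024 characters; for a long
:: PATH, edit it through the System Properties dialog instead.
setx /M PATH "%PATH%;X:\hadoop\bin"
```

Note that values set with setx only take effect in consoles opened afterwards, not in the current one.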

4. Install Spark

Extract the archive, then add the following to PATH:

X:/spark/bin

Patches

1. winutils

Run:

git clone https://github.com/cdarlint/winutils.git

Then copy everything from the repository's hadoop-3.2.2/bin directory into the Hadoop bin folder from step 3.
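As a sketch, the clone-and-copy step looks like this in cmd (assumes HADOOP_HOME is already set as in step 3):

```bat
:: Fetch the prebuilt Windows binaries for Hadoop
git clone https://github.com/cdarlint/winutils.git

:: Copy the Hadoop 3.2.2 binaries (winutils.exe, hadoop.dll, ...) into HADOOP_HOME\bin
xcopy /Y winutils\hadoop-3.2.2\bin\* "%HADOOP_HOME%\bin\"

:: Sanity check: winutils.exe is now on PATH and should print its usage text
winutils.exe
```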

2. Spark settings

Go to the Spark configuration directory:

X:\spark\conf

Rename spark-defaults.conf.template to spark-defaults.conf and edit it as follows:

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Default system properties included when running spark-submit.
# This is useful for setting default environmental settings.

# Example:
# spark.master                     spark://master:7077
spark.eventLog.enabled           true
spark.eventLog.dir               file:///X:/spark/log
# spark.serializer                 org.apache.spark.serializer.KryoSerializer
# spark.driver.memory              5g
# spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
spark.history.fs.logDirectory    file:///X:/spark/log

Note: X:/spark here is the Spark installation path; create the log folder in advance (e.g. mkdir X:\spark\log), since Spark expects it to exist.

3. Startup scripts

Note: the placeholder ip in the scripts below must be replaced with your own address. To find it, run the following in cmd:

ipconfig

and use the IPv4 address of your active adapter wherever ip appears.

Create each of the following files in the Spark bin directory, X:\spark\bin:

spark-start.bat (if non-ASCII console output comes out garbled, save the file in the local ANSI code page rather than UTF-8)

echo Checking whether the required ports are free
netstat -ano|findstr "8080"
netstat -ano|findstr "7077"
netstat -ano|findstr "18080"

echo Starting master
start "master" cmd /k call master.bat
timeout /t 8
echo Starting slave1
start "slave1" cmd /k call slave1.bat
timeout /t 2

echo Starting slave2
start "slave2" cmd /k call slave2.bat
timeout /t 2

echo Starting history server
start "historyserver" cmd /k call historyserver.bat
timeout /t 5

echo Opening web UIs
start http://ip:8080
timeout /t 1
start http://ip:18080
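Note that the netstat lines at the top of spark-start.bat only report whether the ports are occupied; they do not free them. If a port is already taken, the owning process can be terminated by PID (a sketch; 1234 is a placeholder for the PID printed by netstat):

```bat
:: The last column of the netstat output is the PID holding the port
netstat -ano | findstr "8080"

:: Forcibly terminate that process (substitute the actual PID)
taskkill /PID 1234 /F
```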

master.bat

spark-class org.apache.spark.deploy.master.Master

slave1.bat

spark-class org.apache.spark.deploy.worker.Worker spark://ip:7077

slave2.bat

spark-class org.apache.spark.deploy.worker.Worker spark://ip:7077

historyserver.bat

spark-class org.apache.spark.deploy.history.HistoryServer

Verifying the Installation

Open cmd and run:

spark-start

If two browser pages open (the master UI on port 8080 and the history server UI on port 18080) and both load normally, the installation succeeded; the master UI should list both workers.
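As a further check, you can attach an interactive shell to the standalone master and run a trivial job (a sketch; replace ip with your IPv4 address):

```bat
:: Connect a spark-shell to the running master; it should appear as an
:: application named "Spark shell" on the master UI at http://ip:8080
spark-shell --master spark://ip:7077

:: Inside the shell, a one-liner that exercises the workers:
::   scala> sc.parallelize(1 to 100).reduce(_ + _)
:: which returns 5050
```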

Submitting a Job

spark-submit --class <main-class> --master spark://ip:7077 <path-to-jar>
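For a concrete smoke test you can submit the bundled SparkPi example (a sketch; the jar name assumes the Spark 3.2.1 / Scala 2.12 distribution used above, and ip is your IPv4 address):

```bat
:: ^ is cmd's line-continuation character
spark-submit --class org.apache.spark.examples.SparkPi ^
  --master spark://ip:7077 ^
  X:\spark\examples\jars\spark-examples_2.12-3.2.1.jar 100
```

The trailing 100 is the number of tasks; on success the driver output contains a line like "Pi is roughly 3.14", and the finished application shows up in the history server UI.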