ELK Study Notes: 1 - Deploying Spark 2.4.5 + Miniconda3 on a Single Virtual Machine (2023-02-08)

1 - Deploying Spark 2.4.5 on a Single Virtual Machine

Miniconda deployment

  • Installer (BFSU mirror): https://mirrors.bfsu.edu.cn/anaconda/miniconda/
sh /software/package/Miniconda3-py37_4.9.2-Linux-x86_64.sh
# press Enter
# Space to page through the license
# yes
# install path: /software/server/miniconda3
========== wait a moment ==========
# yes
  • Disconnect and reconnect; the shell prompt should now start with (base)
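A quick sanity check after reconnecting (install path as chosen above; these are standard commands, not from the original notes):
conda --version
which python   # should resolve to /software/server/miniconda3/bin/python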
  • Switch to a domestic mirror by creating ~/.condarc
vim ~/.condarc
============== empty file; add the following content =================
channels:
  - defaults
show_channel_urls: true
default_channels:
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2
custom_channels:
  conda-forge: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  msys2: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  bioconda: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  menpo: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  pytorch: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  simpleitk: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
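Optionally flush the cached package index and confirm the channel switch took effect (standard conda subcommands, added here as a sanity check):
conda clean -i -y          # drop the index cached from the old channels
conda config --show-sources   # prints the values read from ~/.condarc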
  • Install the other data-analysis-related Python packages
pip install jupyter pandas scikit-learn pyspark==2.4.5 -i https://pypi.tuna.tsinghua.edu.cn/simple
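A one-liner to confirm the installs landed in the conda environment (the version attributes used here are standard):
python -c "import pyspark, pandas, sklearn; print(pyspark.__version__, pandas.__version__, sklearn.__version__)"
# the first value printed should be 2.4.5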
  • Configure Jupyter for access from a local browser
jupyter notebook --generate-config
# Writing default config to: /root/.jupyter/jupyter_notebook_config.py
jupyter notebook password
# [NotebookPasswordApp] Wrote hashed password to /root/.jupyter/jupyter_notebook_config.json
vim /root/.jupyter/jupyter_notebook_config.py
============== locate each option (vim: / to search, yy to copy, p to paste, i to insert), remove the leading '#', and edit as follows =================
c.NotebookApp.allow_remote_access = True
c.NotebookApp.open_browser = False
c.NotebookApp.ip = '*'
c.NotebookApp.allow_root = True
c.NotebookApp.port = 8888  # the port can be changed
=====================================
# jupyter notebook now serves on port 8888; check with ss -ntl
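If the server should outlive the SSH session, one option is to background it with nohup (the log path here is just an example):
nohup jupyter notebook > /root/jupyter.log 2>&1 &
ss -ntl | grep 8888   # confirm the port is listening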

Spark deployment

  • JDK 1.8
# 1 Remove the preinstalled OpenJDK
# run rpm -qa | grep java to find packages like the following; adjust if the version numbers differ
rpm -e --nodeps java-1.8.0-openjdk-1.8.0.262.b10-1.el7.x86_64
rpm -e --nodeps java-1.7.0-openjdk-1.7.0.261-2.6.22.2.el7_8.x86_64
rpm -e --nodeps java-1.7.0-openjdk-headless-1.7.0.261-2.6.22.2.el7_8.x86_64
rpm -e --nodeps java-1.8.0-openjdk-headless-1.8.0.262.b10-1.el7.x86_64
# 2 Install
mkdir /software/server/java
tar -zxvf /software/package/jdk-8u221-linux-x64.tar.gz -C /software/server/java
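Optionally verify the unpacked JDK by its full path before the environment variables are set (directory name comes from the tarball):
/software/server/java/jdk1.8.0_221/bin/java -version
# expect: java version "1.8.0_221"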
  • Hadoop 2.7.7
mkdir /software/server/hadoop
tar -zxvf /software/package/hadoop-2.7.7.tar.gz -C /software/server/hadoop
  • Spark 2.4.5
mkdir /software/server/spark
tar -zxvf /software/package/spark-2.4.5-bin-hadoop2.7.tgz -C /software/server/spark
  • Configure environment variables
vi /etc/profile
============== append the following at the end =================
export JAVA_HOME=/software/server/java/jdk1.8.0_221
export HADOOP_HOME=/software/server/hadoop/hadoop-2.7.7
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SPARK_HOME=/software/server/spark/spark-2.4.5-bin-hadoop2.7
export PYSPARK_PYTHON=/software/server/miniconda3/bin/python3.7
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$JAVA_HOME/bin
==========================
source /etc/profile
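A few quick checks that the variables resolved (standard version commands, added here as a sanity check):
java -version        # java version "1.8.0_221"
hadoop version       # Hadoop 2.7.7
echo $SPARK_HOME $PYSPARK_PYTHON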
  • Command-line test
cd /software/server/spark/spark-2.4.5-bin-hadoop2.7/bin
./pyspark --master local[*]
============= wait until the ">>>" prompt appears, then enter the following =================
sc.parallelize([1,2,3,4,5]).map(lambda x:x+10).collect()
# output [11, 12, 13, 14, 15] means success
# exit()  to quit
  • Jupyter test
mkdir ~/test-2023-2
cd ~/test-2023-2
echo "hello world" > word.txt  # 创建测试文件
jupyter notebook
# open 192.168.174.131:8888 in a browser on your local machine (substitute your VM's IP)
from pyspark import SparkContext, SparkConf
conf = SparkConf().setMaster("local[*]").setAppName("test01")
sc = SparkContext(conf=conf)
sc.parallelize([1,2,3,4,5]).map(lambda x:x+10).collect()
# [11, 12, 13, 14, 15]
sc.textFile("./word.txt").flatMap(lambda x: x.split(" ")).collect()
# ['hello', 'world']
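The same test can also run non-interactively with spark-submit; a minimal sketch (the script name wordcount.py is hypothetical, paths assume the root user as above):
cat > ~/test-2023-2/wordcount.py <<'EOF'
# hypothetical script mirroring the Jupyter cells above
from pyspark import SparkContext, SparkConf
conf = SparkConf().setMaster("local[*]").setAppName("wordcount")
sc = SparkContext(conf=conf)
print(sc.textFile("/root/test-2023-2/word.txt").flatMap(lambda x: x.split(" ")).collect())
sc.stop()
EOF
$SPARK_HOME/bin/spark-submit ~/test-2023-2/wordcount.py
# expect ['hello', 'world'] in the driver output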