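To set up PySpark on a Mac, you need to install:
- Java JDK
- Scala
- apache-spark
- Hadoop
- PySpark
There are two ways to install them:
1. Download the packages from the official sites, unpack them, and configure the environment variables
2. Install with brew. If Homebrew itself is not installed yet, install it first with the Gitee-hosted installer script: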
/bin/zsh -c "$(curl -fsSL https://gitee.com/cunkai/HomebrewCN/raw/master/Homebrew.sh)"
2-1. Install the required packages with brew
brew install scala
brew install apache-spark
brew install hadoop
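2-2. After installation, you can optionally configure environment variables
Open the shell profile with vim ~/.bash_profile to set them (if your login shell is zsh, the macOS default since Catalina, edit ~/.zshrc instead)
The contents are as follows: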
# HomeBrew
export HOMEBREW_BOTTLE_DOMAIN=https://mirrors.tuna.tsinghua.edu.cn/homebrew-bottles
export PATH="/usr/local/bin:$PATH"
export PATH="/usr/local/sbin:$PATH"
# HomeBrew END
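# Scala (adjust if brew installed it elsewhere; `brew --prefix scala` prints the location)
export SCALA_HOME=/usr/local/scala
export PATH=$PATH:$SCALA_HOME/bin
# Scala END
# Hadoop (adjust as above; see `brew --prefix hadoop`)
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
# Hadoop END
# Spark (the version directory must match the installed apache-spark version)
export SPARK_PATH="/usr/local/Cellar/apache-spark/3.0.1"
export PATH="$SPARK_PATH/bin:$PATH"
# Spark END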
2-3. Apply the environment variables
source ~/.bash_profile
2-4. Install PySpark
pip install pyspark
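Before launching the shell, it can help to confirm that the tools installed above are actually reachable from PATH. The following is a minimal sketch using only the Python standard library; the command names in TOOLS are an assumption about what each package provides, and find_tools is a hypothetical helper name, not part of any of these tools.

```python
import shutil

# Command names assumed to be provided by the packages installed above.
TOOLS = ["java", "scala", "spark-submit", "hadoop", "pyspark"]


def find_tools(names):
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Print one line per tool so missing entries are easy to spot.
    for name, path in find_tools(TOOLS).items():
        print(f"{name:14} {path or 'NOT FOUND'}")
```

Any tool reported as NOT FOUND usually points to a PATH entry in the profile that does not match the actual install location.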
Run PySpark
Type pyspark in Terminal. If the PySpark shell starts, showing the Spark banner and a >>> prompt, the setup is complete!
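Beyond the interactive shell, a short script can confirm that Python is able to start a local Spark session and run a job. This is only a sketch under assumptions: it presumes the pip-installed pyspark package and a working Java installation, and the function name smoke_test, the app name, and the local[1] master setting are illustrative choices rather than anything the steps above require.

```python
def smoke_test():
    """Start a local Spark session, sum the integers 0..9 on it, and stop it."""
    # Imported inside the function so this file can be loaded without PySpark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[1]")        # run Spark in-process with a single worker thread
        .appName("smoke-test")     # illustrative name only
        .getOrCreate()
    )
    try:
        # Distribute a small range and add it up on the Spark side.
        total = spark.sparkContext.parallelize(range(10)).sum()
    finally:
        spark.stop()
    return total
```

Calling smoke_test() from a Python session should return 45 once everything is installed correctly; an ImportError or a Java-related startup error indicates which piece of the setup still needs attention.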