1. Install Java 8 and set JAVA_HOME.
2. Install Spark and set SPARK_HOME.
Download page:
https://spark.apache.org/downloads.html
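Before moving on, it can help to confirm that both variables from steps 1 and 2 are visible to Python. The following is a minimal sketch, not part of the original notes: the helper name check_spark_env is hypothetical, and it only checks that each variable is set and points to an existing directory, not that the installation itself is valid.

```python
import os

def check_spark_env(env=None):
    """Return a dict mapping each problematic variable to a short reason."""
    # Hypothetical helper: defaults to the current process environment.
    env = os.environ if env is None else env
    problems = {}
    for name in ("JAVA_HOME", "SPARK_HOME"):
        value = env.get(name)
        if not value:
            problems[name] = "not set"
        elif not os.path.isdir(value):
            problems[name] = "not a directory: " + value
    return problems

# An empty dict means both variables are set and point to directories.
print(check_spark_env() or "JAVA_HOME and SPARK_HOME look fine")
```

If a variable is reported as not set right after you defined it, restarting the terminal or Jupyter is usually needed, since a running process keeps the environment it was started with.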
3. Install the required packages in Anaconda:
pip install py4j
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple -U pyspark
pip install findspark
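To confirm that the three packages landed in the environment Jupyter actually uses, a quick check can be run from a notebook cell. This is a sketch under the assumption that the top-level module names match the package names (py4j, pyspark, findspark); missing_packages is a hypothetical helper name.

```python
from importlib.util import find_spec

def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if find_spec(name) is None]

missing = missing_packages(["py4j", "pyspark", "findspark"])
print("missing packages:", missing or "none")
```

If a package shows up as missing even though pip reported success, pip may have installed it into a different Python environment than the one the notebook kernel runs in.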
4. Open Jupyter Notebook and write the test code:
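import findspark
findspark.init()  # locates Spark (via SPARK_HOME) and adds its Python libraries to sys.path; call before importing pyspark

import pyspark
from pyspark import SparkContext, SparkConf

# Run locally with 4 worker threads
conf = SparkConf().setAppName("test").setMaster("local[4]")
sc = SparkContext(conf=conf)
print("spark version:", pyspark.__version__)

# Distribute a small list and collect it back to the driver
rdd = sc.parallelize(["hello", "spark"])
print(rdd.collect())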
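Once the test code runs, a slightly larger check is a word count, which exercises a shuffle (reduceByKey) rather than only parallelize and collect. The sketch below is an illustration, not part of the original notes: tokenize and word_count are hypothetical helper names, and word_count assumes an active SparkContext such as the sc created above is passed in.

```python
from operator import add

def tokenize(line):
    """Split a line into lowercase, whitespace-separated words."""
    return line.lower().split()

def word_count(sc, lines, num_slices=4):
    """Count word occurrences with the RDD API; `sc` is an active SparkContext."""
    pairs = (
        sc.parallelize(lines, num_slices)  # distribute the input lines
        .flatMap(tokenize)                 # one record per word
        .map(lambda word: (word, 1))       # pair each word with a count of 1
        .reduceByKey(add)                  # sum the counts per word
        .collect()                         # bring the results back to the driver
    )
    return dict(pairs)

# In the notebook, with the `sc` from the test code:
# counts = word_count(sc, ["hello spark", "hello jupyter"])
```

The functions take sc as an argument instead of creating a new context, because only one SparkContext can be active in a notebook kernel at a time.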