How to use PySpark after installing Python 3 on Windows: installing and running PySpark in Jupyter Notebook

This post walks you through installing and running PySpark in Jupyter Notebook on Windows. You will need a Spark distribution, Python and Jupyter Notebook, winutils.exe, the findspark module, and Java. After setting the environment variables, extract the Spark archive and place winutils.exe in the right location. Then run a short piece of code in Jupyter Notebook to start PySpark; if everything works, you will see its output.
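As a rough sketch of the environment-variable step (the paths below are hypothetical and depend on where you extract Spark and where you place winutils.exe), the same variables that the guide sets through Windows system settings can also be set from inside Python before Spark is initialized:

import os

# Hypothetical paths: point SPARK_HOME at the extracted Spark folder
# and HADOOP_HOME at a folder whose bin subfolder contains winutils.exe.
os.environ["SPARK_HOME"] = r"C:\spark\spark-2.4.0-bin-hadoop2.7"
os.environ["HADOOP_HOME"] = r"C:\spark\hadoop"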

When I write PySpark code, I use Jupyter notebook to test my code before submitting a job on the cluster. In this post, I will show you how to install and run PySpark locally in Jupyter Notebook on Windows. I’ve tested this guide on a dozen Windows 7 and 10 PCs in different languages.

A. Items needed

Spark distribution from spark.apache.org

Python and Jupyter Notebook. You can get both by installing the Python 3.x version of Anaconda distribution.

winutils.exe, a Hadoop binary for Windows, from Steve Loughran’s GitHub repo. Open the folder that matches the Hadoop version of your Spark distribution and download winutils.exe from its /bin directory, for example https://github.com/steveloughran/winutils/blob/master/hadoop-2.7.1/bin/winutils.exe.

The findspark Python module, which can be installed by running python -m pip install findspark from a command prompt; a notebook cell that uses it to start PySpark is sketched after this list.

Java. Spark runs on the JVM, so a Java installation (for example, JDK 8) is also required, as noted in the summary above.
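With the items above in place, a minimal notebook cell to start PySpark looks roughly like the sketch below (the app name and the small sum are only illustrative; findspark locates Spark through SPARK_HOME):

import findspark
findspark.init()  # adds pyspark to sys.path using SPARK_HOME

from pyspark.sql import SparkSession

# Start a local Spark session; "local[*]" uses every core on this machine.
spark = SparkSession.builder.master("local[*]").appName("jupyter-test").getOrCreate()

# Quick sanity check: distribute a small range of numbers and sum it.
rdd = spark.sparkContext.parallelize(range(100))
print(rdd.sum())  # prints 4950 if the setup works

spark.stop()

If the cell prints the number, Spark started successfully inside the notebook.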
