1. Download Livy from https://livy.incubator.apache.org/
Unpack the archive, change into the livy directory, and run bin/livy-server
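Before moving on, it can help to confirm that the server actually started. The sketch below is one possible check, not part of the original setup: it assumes Livy is listening on its default address http://localhost:8998 and simply asks the REST endpoint /sessions for a JSON reply. The helper names sessions_url and livy_is_up are made up for this example.

```python
import json
import urllib.error
import urllib.request

# Assumption: Livy runs locally on its default port 8998.
LIVY_URL = "http://localhost:8998"


def sessions_url(base_url):
    """Build the URL of Livy's REST endpoint that lists sessions."""
    return base_url.rstrip("/") + "/sessions"


def livy_is_up(base_url=LIVY_URL, timeout=5):
    """Return True if GET /sessions answers with valid JSON, else False."""
    try:
        with urllib.request.urlopen(sessions_url(base_url), timeout=timeout) as resp:
            json.load(resp)  # raises ValueError if the body is not JSON
            return True
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON reply all count as "not up".
        return False


if __name__ == "__main__":
    print("Livy reachable:", livy_is_up())
```

If this prints False, check the terminal where bin/livy-server is running, and adjust LIVY_URL if your server listens on a different host or port.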
2. Install sparkmagic
pip install sparkmagic
jupyter nbextension enable --py --sys-prefix widgetsnbextension
The next part is optional. Running pip show sparkmagic
prints output similar to the following:
Name: sparkmagic
Version: 0.12.5
Summary: SparkMagic: Spark execution via Livy
Home-page: https://github.com/jupyter-incubator/sparkmagic
Author: Jupyter Development Team
Author-email: jupyter@googlegroups.org
License: BSD 3-clause
Location: /Users/xiligey/anaconda3/lib/python3.6/site-packages
Requires: tornado, notebook, autovizwidget, hdijupyterutils, nose, ipykernel, ipywidgets, pandas, requests-kerberos, numpy, mock, ipython, requests
Required-by:
Change into the Location directory shown above and run:
jupyter-kernelspec install sparkmagic/kernels/sparkkernel
jupyter-kernelspec install sparkmagic/kernels/pysparkkernel
jupyter-kernelspec install sparkmagic/kernels/pyspark3kernel
jupyter-kernelspec install sparkmagic/kernels/sparkrkernel
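If you would rather not change directories by hand, the same four commands can be generated with absolute paths. This is only a rough sketch: find_site_packages and kernelspec_commands are hypothetical helpers written for this post, and the script assumes it runs under the same Python environment that sparkmagic was installed into. It prints the commands and does not run them.

```python
import importlib.util
import os

# Kernel directory names, taken from the four commands above.
KERNELS = ["sparkkernel", "pysparkkernel", "pyspark3kernel", "sparkrkernel"]


def kernelspec_commands(site_packages):
    """Return one 'jupyter-kernelspec install' command per kernel directory."""
    return [
        "jupyter-kernelspec install "
        + os.path.join(site_packages, "sparkmagic", "kernels", name)
        for name in KERNELS
    ]


def find_site_packages():
    """Return the directory that contains the sparkmagic package, or None."""
    spec = importlib.util.find_spec("sparkmagic")
    if spec is None or spec.origin is None:
        return None
    # spec.origin points at .../site-packages/sparkmagic/__init__.py
    return os.path.dirname(os.path.dirname(spec.origin))


if __name__ == "__main__":
    location = find_site_packages()
    if location is None:
        print("sparkmagic is not installed in this environment")
    else:
        for command in kernelspec_commands(location):
            print(command)
```

The printed lines can be pasted into a terminal from any working directory.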
3. Start Jupyter Notebook: jupyter notebook
Then create a new Spark notebook.
4. Problems encountered
Creating a new Spark notebook failed with this error:
The code failed because of a fatal error:
Failed to register auto viz for notebook.
Exception details:
"cannot import name 'DataError'".
Some things to try:
a) Make sure Spark has enough available resources for Jupyter to create a Spark context.
b) Contact your Jupyter administrator to make sure the Spark magics library is configured correctly.
c) Restart the kernel.
The cause is that, at the time of writing, the latest sparkmagic release is incompatible with the latest pandas release.
Downgrading pandas resolves the error: pip install pandas==0.22.0
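To see quickly whether an environment is affected, you can compare the installed pandas version with the 0.22.0 pinned above. The sketch below does only that: version_tuple and newer_than_known_good are hypothetical helpers, and "newer than 0.22.0" is just a hint based on this post, not proof that a given pandas release breaks sparkmagic.

```python
# Version taken from the fix above: pip install pandas==0.22.0
KNOWN_GOOD = (0, 22, 0)


def version_tuple(version):
    """Turn '0.22.0' into (0, 22, 0); parsing stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def newer_than_known_good(version):
    """Return True if the given version is newer than the pinned 0.22.0."""
    return version_tuple(version) > KNOWN_GOOD


if __name__ == "__main__":
    try:
        import pandas
    except ImportError:
        print("pandas is not installed")
    else:
        if newer_than_known_good(pandas.__version__):
            print("pandas", pandas.__version__, "is newer than 0.22.0; the DataError import may fail")
        else:
            print("pandas", pandas.__version__, "is not newer than 0.22.0")
```

After a downgrade, restart the notebook kernel so the older pandas is the one that actually gets imported.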
The final interface:
Now you can happily write code in the interactive Spark interface :)