Python 3.6 is not supported by PySpark (at least not by Spark 1.6), but fortunately Anaconda makes it trivial to switch between Python versions. Since my Spark is 1.6, Python 2.7 should work.
First, create a Python 2.7 environment and activate it:

conda create -n py27 python=2.7 anaconda
source activate py27
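To confirm that the new environment's interpreter is the one actually running, a quick check from inside Python helps (a minimal sketch; the expected executable path assumes the py27 env created above):

import sys
print(sys.version)     # expect something like 2.7.x
print(sys.executable)  # expect .../anaconda3/envs/py27/bin/python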
Running conda install python=2.7 also switches the current Python environment to 2.7 (in fact, PySpark runs fine even if you skip this step). Then edit
/usr/local/share/jupyter/kernels/pyspark/kernel.json
{
  "display_name": "PySpark",
  "language": "python",
  "argv": [
    "/home/.../anaconda3/envs/py27/bin/python",
    "-m", "ipykernel",
    "-f", "{connection_file}"
  ],
  "env": {
    "SPARK_HOME": "/.../spark/spark-1.6.0-bin-hadoop2.6/",
    "PYSPARK_PYTHON": "/.../anaconda3/envs/py27/bin/python",
    "PYSPARK_DRIVER_PYTHON": "ipython2",
    "PYTHONPATH": "/.../spark/spark-1.6.0-bin-hadoop2.6/python/:/.../spark/spark-1.6.0-bin-hadoop2.6/python/lib/py4j-0.9-src.zip"
  }
}