1.问题现象:
Caused by: org.apache.spark.SparkException:
Error from python worker:
/usr/bin/python: No module named pyspark
PYTHONPATH was:
/root/hadoop/tmp/nm-local-dir/usercache/root/filecache/22/__spark_libs__297119794367585949.zip/spark-core_2.11-2.4.0.jar
org.apache.spark.SparkException: No port number in pyspark.daemon's stdout
at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:204)
at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:122)
at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:95)
at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:117)
at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:108)
at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:65)
at org.apache.spark.rdd.RDD.computeOrRead