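Sigh, how did this turn into black magic... I guess I'm just too much of a noob;
Baidu says: downgrade pyarrow to 0.14.0. The weird part is, as soon as I installed 0.14.0, my program couldn't find pyarrow at all...
So in a fit of rage I went straight back to the latest pyarrow, version 7.0.0;
Then, following the hints from Baidu, set up spark_session: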
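A quick sketch of why the newer pyarrow needs extra setup at all: as far as I understand the Spark docs, Arrow changed its binary IPC format in 0.15.0, and PySpark 2.3.x/2.4.x needs the ARROW_PRE_0_15_IPC_FORMAT environment variable to keep talking to it. The helper name `needs_legacy_ipc_flag` is made up for illustration, and it assumes you are on one of those Spark versions (this post does not say which Spark version is in use).

```python
def needs_legacy_ipc_flag(pyarrow_version):
    """Return True if this pyarrow version uses the post-0.15 IPC format.

    Hypothetical helper: it only compares the major/minor numbers and
    assumes Spark 2.3.x/2.4.x, where the compatibility flag matters.
    """
    major, minor = (int(part) for part in pyarrow_version.split(".")[:2])
    return (major, minor) >= (0, 15)
```

With pyarrow installed, you could call it as `needs_legacy_ipc_flag(pyarrow.__version__)`; for 7.0.0 it says the flag is needed, for 0.14.0 it says it is not.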
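from pyspark.sql import SparkSession

# Ask both the YARN application master and the executors to use the pre-0.15 Arrow IPC format
spark_session = SparkSession.builder \
    .master("yarn") \
    .config('spark.yarn.appMasterEnv.ARROW_PRE_0_15_IPC_FORMAT', 1) \
    .config('spark.executorEnv.ARROW_PRE_0_15_IPC_FORMAT', 1)
spark = spark_session.getOrCreate()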
If you're running on yarn, just add:
.config('spark.yarn.appMasterEnv.ARROW_PRE_0_15_IPC_FORMAT',1)\
.config('spark.executorEnv.ARROW_PRE_0_15_IPC_FORMAT',1)
If you're running on local, change it to:
.config('spark.local.appMasterEnv.ARROW_PRE_0_15_IPC_FORMAT',1)\
.config('spark.executorEnv.ARROW_PRE_0_15_IPC_FORMAT',1)
Solved!
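To keep the yarn/local switch in one place, the settings above can be wrapped in a small helper. This is only a sketch: the names `arrow_compat_conf` and `build_session` are made up for illustration, and the config keys are simply the ones from this post (the `spark.local.appMasterEnv` key is copied as-is from above and I have not checked it against the Spark docs).

```python
def arrow_compat_conf(master):
    """Return the config pairs from this post for the given master URL."""
    # The first key's prefix follows the post: "yarn" on YARN, "local" otherwise.
    prefix = "yarn" if master.startswith("yarn") else "local"
    return {
        # Key as written in this post; not verified against the Spark docs for local mode.
        f"spark.{prefix}.appMasterEnv.ARROW_PRE_0_15_IPC_FORMAT": "1",
        "spark.executorEnv.ARROW_PRE_0_15_IPC_FORMAT": "1",
    }


def build_session(master):
    """Create a SparkSession with the compatibility settings applied."""
    # Imported here so arrow_compat_conf can be used without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.master(master)
    for key, value in arrow_compat_conf(master).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Usage would be `spark = build_session("yarn")` or `spark = build_session("local[*]")`. The values are passed as the string "1" here, since environment variables end up as strings anyway.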