Spark Notes
迷茫猿小明
Plain Java programmer -> AI algorithm and data mining engineer
py4j.protocol.Py4JJavaError: An error...(configuring pyspark to read data from AWS S3)
Bug message: py4j.protocol.Py4JJavaError: An error occurred while calling o29.csv. java.lang.IllegalAccessError: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong. Cause: with pyspark installed and configured, reading...
Original · 2018-12-12 22:45:08 · 8808 views · 0 comments
Can't assign requested address: Service 'sparkDriver' failed (pyspark fails to start)
Bug description: py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext. java.net.BindException: Can't assign requested address: Service 'sparkDriver' faile...
Original · 2018-12-12 22:59:46 · 4213 views · 1 comment
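A small pitfall in pyspark read.csv (the escape character turns out to be ")
1. Bug description: the following code normally reads a local CSV file without problems:

    from pyspark.sql import SparkSession
    spark = SparkSession.builder.getOrCreate()
    df = spark.read.csv('my_test.csv', header=True)
    print(df)

But recently, while working with the GA database, after converting the SQL query results to CSV, using the code above to read the...
Original · 2018-12-12 23:33:55 · 4807 views · 3 comments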
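The excerpt above is cut off before the fix, so the following is only a hedged illustration of the pitfall named in the title, not the author's own code. It uses Python's standard `csv` module (not pyspark) to show what a typical CSV export looks like when a field contains a double quote: the quote is escaped by doubling it. The sample row is hypothetical. On the pyspark side, `spark.read.csv` accepts an `escape` option whose default is a backslash, so passing `escape='"'` is the usual way to read such files; that this matches the post's solution is an assumption.

```python
import csv
import io

# Hypothetical sample row: one field contains both a comma and double quotes.
rows = [["id", "comment"], ["1", 'say "hi", then leave']]

# Write the rows the way most CSV exporters do (quote when needed,
# escape an embedded double quote by doubling it).
buf = io.StringIO()
csv.writer(buf).writerows(rows)
raw_text = buf.getvalue()
print(raw_text)

# Reading it back with the same convention recovers the original fields.
parsed = list(csv.reader(io.StringIO(raw_text)))
print(parsed)
```

A reader that expects a backslash as the escape character will not treat the doubled quote as an escaped quote, which is why a file like this can come out with shifted or broken columns.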