1. Build an RDD
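## path specifies where the file is located. First, a path without a scheme is resolved against HDFS by default, so the hdfs://hostname:8020/ prefix can be omitted (a relative path then resolves under the user's HDFS home directory, as the first error below shows). Second, for a file on the local Linux filesystem, write file:// followed by the file's absolute path.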
val textFile = sc.textFile("README.md")
org.apache.hadoop.mapred.InvalidInputException:
Input path does not exist: hdfs://spark.ibeifeng.com:8020/user/ibeifeng/README.md
val textFile = sc.textFile("file:///README.md")
org.apache.hadoop.mapred.InvalidInputException:
Input path does not exist: file:/README.md
val textFile = sc.textFile("file:///opt/modules/cdh-5.
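The path resolution seen in the errors above can be sketched in plain Scala. This is only a simplified illustration, not Hadoop's or Spark's actual code: the helper name qualify and its parameters defaultFs and user are hypothetical, and it assumes the default filesystem is HDFS and that relative paths land under /user/<user>.

```scala
import java.net.URI

// Hypothetical helper (not a Spark or Hadoop API): mimics, in simplified form,
// how a path string is qualified before the input file is looked up.
def qualify(path: String, defaultFs: String, user: String): String = {
  val scheme = new URI(path).getScheme
  if (scheme != null) path                      // explicit scheme such as file:// or hdfs://: used as given
  else if (path.startsWith("/")) s"$defaultFs$path" // absolute path: default filesystem is prepended
  else s"$defaultFs/user/$user/$path"           // relative path: resolved under the user's home directory
}

// Values taken from the error messages in the transcript above.
val fs = "hdfs://spark.ibeifeng.com:8020"
val relative = qualify("README.md", fs, "ibeifeng")
val local = qualify("file:///README.md", fs, "ibeifeng")
println(relative)
println(local)
```

Under these assumptions, "README.md" is looked up in the user's HDFS home directory, while "file:///README.md" points at the root of the local filesystem, which is why both calls fail unless the file really exists at that exact location.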