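from pyspark import SparkContext

sc = SparkContext.getOrCreate()  # reuses an existing context (e.g. in the pyspark shell)
text_file = sc.textFile("hdfs://...")  # read the input as an RDD of lines
# Split each line into words, pair every word with 1, then sum the counts per word
counts = text_file.flatMap(lambda x: x.split(" ")) \
    .map(lambda x: (x, 1)) \
    .reduceByKey(lambda a, b: a + b)
counts.saveAsTextFile("hdfs://...")  # write the (word, count) pairs back to HDFS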
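
To see what the three transformations above compute without a cluster, here is a minimal plain-Python sketch of the same logic. It does not use Spark at all. The function name `word_count` and the sample input are assumptions made for illustration, and the in-memory dictionary only stands in for what `reduceByKey` does across partitions.

```python
from collections import defaultdict
from itertools import chain


def word_count(lines):
    """Hypothetical local stand-in for the flatMap -> map -> reduceByKey pipeline."""
    # flatMap: split every line on single spaces and flatten into one stream of words
    words = chain.from_iterable(line.split(" ") for line in lines)
    # map: pair each word with an initial count of 1
    pairs = ((word, 1) for word in words)
    # reduceByKey: add up the counts that share the same word
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


if __name__ == "__main__":
    print(word_count(["to be or", "not to be"]))
```

Note that `split(" ")` splits on every single space, so consecutive spaces produce empty strings that are counted as a word of their own. Calling `split()` with no argument would avoid that, if it matters for the input.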
Spark word count