How to Read a JSON File or Text File in Spark

Spark is currently very popular, so I want to learn it. It is not only faster than Hadoop MapReduce for many data processing workloads, but it also comes with libraries such as MLlib and GraphX. Spark also supports Python, so I can use PySpark to study it.

Today I try reading a text file with PySpark. You can also refer to https://spark.apache.org/docs/latest/sql-programming-guide.html#creating-dataframes, which is the Spark page I studied from.

1. Reading a Text File with PySpark

Step1. I created a new text file called people01.txt in /home/cindy/file. The file content is as below:

 

Step2. The details are shown in the picture below. The code comes from https://spark.apache.org/docs/latest/sql-programming-guide.html#creating-dataframes, which you can refer to:
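Since the steps here are only shown as pictures, here is a minimal sketch of how a text file can be loaded into a DataFrame. It assumes that each line of people01.txt has the form `name, age` (modeled on the people.txt example in the Spark guide; the real content of my file may differ), and that it is run in the pyspark shell, where `spark` is already defined. The function names `parse_person` and `read_people` are only for illustration.

```python
def parse_person(line):
    """Split a 'name, age' line into a (name, age) tuple (assumed file format)."""
    name, age = line.split(",")
    return name.strip(), int(age)


def read_people(spark, path):
    """Load a text file of 'name, age' lines into a DataFrame with two columns."""
    # textFile gives an RDD with one string per line of the file
    lines = spark.sparkContext.textFile(path)
    # turn every line into a (name, age) tuple
    people = lines.map(parse_person)
    # the column names "name" and "age" are an assumption for this sketch
    return spark.createDataFrame(people, ["name", "age"])


# In the pyspark shell:
# df = read_people(spark, "/home/cindy/file/people01.txt")
# df.show()
```

The text file has no schema of its own, so we have to split each line and name the columns ourselves.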

 

2. Reading a JSON File with PySpark

Reading a JSON file is easier than reading a text file in Spark, because Spark can infer the column names and types from the JSON data itself.

Step1. Create a JSON file as below:

Step2. Read the JSON file and print the result:
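The two steps can be sketched roughly as below. By default `spark.read.json` expects one JSON object per line (JSON Lines), so the sketch writes the file in that form. The sample records, the helper names and the path are all hypothetical, and `spark` is assumed to be the session that the pyspark shell provides.

```python
import json


def to_json_lines(records):
    """Serialize a list of dicts as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(record) + "\n" for record in records)


def save_json_lines(records, path):
    """Write the records to `path` in the line-per-object form Spark reads by default."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(to_json_lines(records))


def show_json(spark, path):
    """Read a JSON Lines file into a DataFrame and print its schema and rows."""
    df = spark.read.json(path)  # the schema is inferred from the data
    df.printSchema()
    df.show()
    return df


# In the pyspark shell (hypothetical sample data and path):
# save_json_lines([{"name": "Alice", "age": 15}], "/tmp/people.json")
# show_json(spark, "/tmp/people.json")
```

If the file holds one pretty-printed JSON document that spans several lines, the `multiLine` option of the reader is needed, for example `spark.read.option("multiLine", True).json(path)`.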

By the way, we can also type >>> data = [('Alice', 15), ('Bob', 20)] and pass it to spark.createDataFrame(data) to create a simple DataFrame to study with.
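A small sketch of this shortcut, for the pyspark shell where `spark` already exists. The column names in `COLUMNS` are my assumption, since the list itself only holds values, and `make_people_df` is just an illustrative helper name.

```python
# Sample rows: each tuple becomes one row of the DataFrame
data = [("Alice", 15), ("Bob", 20)]

# Column names are an assumption; without them Spark names the columns _1 and _2
COLUMNS = ["name", "age"]


def make_people_df(spark, rows=data, columns=COLUMNS):
    """Build a small DataFrame from an in-memory list of tuples."""
    return spark.createDataFrame(rows, columns)


# In the pyspark shell:
# df = make_people_df(spark)
# df.show()
```

This way there is no need to prepare any file before trying out DataFrame operations.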
