I. What is pandas?
pandas is a tool built on NumPy, created for data analysis tasks. DataFrame is its main data type.
II. JSON --> DataFrame
1. Using pandas directly
1.1 orient='split': columns, index, data
e.g.: {"columns":["name","values","describe"], "index":[0,1,2],
"data":[["aa",123,"sgsggfsgsdfh"],["bb",135,"shsdhdghdgh\u3002"],["cc",146,"sjglasgs"]]}
df --> json: data_json = data.to_json(orient='split')
json --> df: df1 = pd.read_json(data_json, orient='split')
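The snippets in this section use a DataFrame named `data` that is never defined. The following is a minimal sketch that rebuilds it from the example values and round-trips it through the 'split' layout. The construction of `data` is an assumption based on the sample JSON, and the string is wrapped in `StringIO` because recent pandas releases deprecate passing a literal JSON string to `read_json`.

```python
import json
from io import StringIO

import pandas as pd

# Assumed reconstruction of the `data` frame behind the sample JSON.
data = pd.DataFrame(
    {
        "name": ["aa", "bb", "cc"],
        "values": [123, 135, 146],
        "describe": ["sgsggfsgsdfh", "shsdhdghdgh\u3002", "sjglasgs"],
    }
)

# df --> json: one object holding the column labels, index labels and rows.
data_json = data.to_json(orient="split")
split_keys = sorted(json.loads(data_json))

# json --> df: StringIO makes the string look like a file to read_json.
df1 = pd.read_json(StringIO(data_json), orient="split")
print(split_keys)
print(df1)
```

The same `StringIO` wrapping applies to the other `orient` values used in this section.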
1.2 orient='index': one entry per index label
e.g.: {"0":{"name":"aa","values":123,"describe":"sgsggfsgsdfh"},
"1":{"name":"bb","values":135,"describe":"shsdhdghdgh\u3002"},
"2":{"name":"cc","values":146,"describe":"sjglasgs"}}
df --> json: data_json2 = data.to_json(orient='index')
json --> df: df2 = pd.read_json(data_json2, orient='index')
1.3 orient='records'
e.g.: [{"name":"aa","values":123,"describe":"sgsggfsgsdfh"},
{"name":"bb","values":135,"describe":"shsdhdghdgh\u3002"},
{"name":"cc","values":146,"describe":"sjglasgs"}]
df --> json: data_json3 = data.to_json(orient='records')
json --> df: df3 = pd.read_json(data_json3, orient='records')
1.4 orient='columns'
e.g.: {"name":{"0":"aa","1":"bb","2":"cc"},
"values":{"0":123,"1":135,"2":146},
"describe":{"0":"sgsggfsgsdfh","1":"shsdhdghdgh\u3002","2":"sjglasgs"}}
df --> json: data_json4 = data.to_json(orient='columns')
json --> df: df4 = pd.read_json(data_json4, orient='columns')
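One practical difference between the four layouts is whether the index labels survive a round trip. The sketch below checks this with a small frame that has string index labels. The frame and its labels are made up for illustration and are not from the article.

```python
from io import StringIO

import pandas as pd

# Illustrative frame with non-default index labels (hypothetical values).
frame = pd.DataFrame(
    {"name": ["aa", "bb", "cc"], "values": [123, 135, 146]},
    index=["alpha", "beta", "gamma"],
)

index_survives = {}
for orient in ("split", "index", "records", "columns"):
    text = frame.to_json(orient=orient)
    restored = pd.read_json(StringIO(text), orient=orient)
    # Compare the restored index labels with the original ones.
    index_survives[orient] = list(restored.index) == list(frame.index)

print(index_survives)
```

The 'records' layout is a plain list of row objects, so it carries no index labels and the restored frame falls back to a default 0..n-1 index. The other three layouts keep the labels.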
Note: JSON strings always use double quotes. To convert between double and single quotes, use replace, e.g.
data_json5 = data_json4.replace('"', "'")  # double quotes to single quotes
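A plain character swap breaks as soon as a value itself contains a quote character. As an alternative to replace (a different technique, not the one used above), a single-quoted string that is a valid Python literal can be parsed with `ast.literal_eval` and re-serialized with `json.dumps`. The sample string below is hypothetical.

```python
import ast
import json

# Hypothetical single-quoted input; note the apostrophe inside a value.
single_quoted = "{'name': \"it's\", 'values': 123}"

# Parse it as a Python literal, then emit standards-compliant JSON.
as_dict = ast.literal_eval(single_quoted)
fixed_json = json.dumps(as_dict)

print(fixed_json)
```

This only works when the text is a Python literal (dicts, lists, strings, numbers, True/False/None). It is not a general repair tool for malformed JSON.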
2. json_normalize
The code is as follows:
import json
import pandas as pd

data = '{"a":"value1","b":"value1"}'
parsed = json.loads(data)  # parse the JSON string into a dict
df = pd.json_normalize(parsed)  # convert the parsed JSON into a DataFrame

# Reading from a file:
with open("test.json", "r") as f:
    temp = json.load(f)
temp_df = pd.json_normalize(temp)
print(temp_df.T)  # the data needs to be transposed
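`json_normalize` is most useful when the JSON is nested, which the flat example above does not show. The sketch below uses made-up nested records to illustrate how nested keys become dotted column names, and why a single flat object ends up as a one-row frame that reads more easily when transposed.

```python
import pandas as pd

# Hypothetical nested records; the "info" sub-object is for illustration only.
records = [
    {"name": "aa", "info": {"values": 123, "describe": "sgsggfsgsdfh"}},
    {"name": "bb", "info": {"values": 135, "describe": "shsdhdghdgh"}},
]

# Nested keys are flattened into columns such as "info.values".
flat = pd.json_normalize(records)
print(flat)

# A single flat object becomes one row; .T turns the fields into rows.
single = pd.json_normalize({"a": "value1", "b": "value1"})
print(single.T)
```

The separator between levels defaults to a dot and can be changed with the `sep` parameter.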
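Recommended for json/dict --> DataFrame: df = pd.DataFrame(dict0). The dict needs a fixed layout, mapping each column name to a list of values: dict0 = {'a':[1,2,3,4],'b':['a','b','c','d']}. Parsed JSON can be passed the same way, df = pd.DataFrame(json0), but only the 'records' and 'columns' layouts above map directly onto rows and columns; the 'index' layout comes out transposed, and the 'split' layout has to be unpacked as keyword arguments.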
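How `pd.DataFrame` interprets a dict of dicts is easy to get wrong, so here is a small sketch using a shortened, index-oriented dict (the variable names are illustrative). The outer keys are treated as column labels by the constructor, while `DataFrame.from_dict` with `orient="index"` treats them as row labels.

```python
import pandas as pd

# Index-oriented layout: outer keys are row labels, inner keys are columns.
json0 = {
    "0": {"name": "aa", "values": 123},
    "1": {"name": "bb", "values": 135},
}

# The constructor reads outer keys as columns, so the result is transposed.
wide = pd.DataFrame(json0)

# from_dict(orient="index") reads outer keys as row labels instead.
by_row = pd.DataFrame.from_dict(json0, orient="index")

print(wide)
print(by_row)
```

Calling `pd.DataFrame(json0).T` produces a similar table, although transposing mixed-type columns leaves every column with the object dtype.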
Code example 1:
import pandas as pd

# dict --> df
dict0 = {'a': [1, 2, 3, 4], 'b': ['a', 'b', 'c', 'd']}
df = pd.DataFrame(dict0)
df
json0 = {
"0":{
"name":"aa","values":123,"describe":"sgsggfsgsdfh"},
"1":{
"name":"bb","values":135