Introduction
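DataFrame is a pandas data type; ndarray is a NumPy data type; list and dict are built-in Python data types.
Series is also a pandas data type. A Series behaves like a fixed-length, ordered dictionary, because it maps index labels to values.
The examples below show how each of these types represents the same data.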
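As a quick orientation before the conversions, here is a minimal sketch that builds one object of each type and shows the dictionary-like behaviour of a Series. The variable names and sample values are illustrative only and are not taken from the examples that follow.

```python
import numpy as np
import pandas as pd

# One small example of each container discussed above.
py_list = [1.5, 1.7, 3.6]
py_dict = {'a': 1.5, 'b': 1.7, 'c': 3.6}
arr = np.array(py_list)
ser = pd.Series(py_dict)           # dict keys become the index
df = pd.DataFrame({'pop': ser})    # one column built from the Series

for obj in (py_list, py_dict, arr, ser, df):
    print(type(obj).__name__)

# A Series supports dict-style lookup by label and keeps insertion order.
print(ser['b'])
print(list(ser.index))
```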
1. list to others
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
# list
data = [[2000, 'Ohino', 1.5],
[2001, 'Ohino', 1.7],
[2002, 'Ohino', 3.6],
[2001, 'Nevada', 2.4],
[2002, 'Nevada', 2.9]] # type(data) is list
# list to series
ser = Series(data, index = ['one', 'two', 'three', 'four', 'five'])
# list to dataframe
df = DataFrame(data, index = ['one', 'two', 'three', 'four', 'five'], columns = ['year', 'state', 'pop'])
# list to array
ndarray = np.array(data)
Output:
# Series
one [2000, Ohino, 1.5]
two [2001, Ohino, 1.7]
three [2002, Ohino, 3.6]
four [2001, Nevada, 2.4]
five [2002, Nevada, 2.9]
dtype: object
# dataframe
year state pop
one 2000 Ohino 1.5
two 2001 Ohino 1.7
three 2002 Ohino 3.6
four 2001 Nevada 2.4
five 2002 Nevada 2.9
# ndarray
[['2000' 'Ohino' '1.5']
['2001' 'Ohino' '1.7']
['2002' 'Ohino' '3.6']
['2001' 'Nevada' '2.4']
['2002' 'Nevada' '2.9']]
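Note that every value in the ndarray output is quoted: a NumPy array has a single dtype, so a list that mixes integers, strings and floats is coerced to strings. The sketch below, using a shortened two-row copy of the data, shows this and one possible way to keep the original Python objects by passing dtype=object.

```python
import numpy as np

rows = [[2000, 'Ohino', 1.5],
        [2001, 'Nevada', 2.4]]

as_str = np.array(rows)                # common dtype is a Unicode string type
as_obj = np.array(rows, dtype=object)  # each cell stays a Python object

print(as_str.dtype.kind)               # 'U' stands for Unicode string
print(type(as_str[0, 0]).__name__)
print(type(as_obj[0, 0]).__name__, type(as_obj[0, 2]).__name__)
```

An object array gives up most of NumPy's speed advantages, so it is mainly useful as a container when the mixed types have to survive the conversion.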
2. ndarray to others
# ndarray to DataFrame
# (bind the result to df, not pd, so the pandas module alias is not overwritten)
df = DataFrame(ndarray, index = ['one', 'two', 'three', 'four', 'five'],
               columns = ['year', 'state', 'pop'])
# ndarray to list
mylist = ndarray.tolist()
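Because the ndarray built from the mixed list holds strings, a DataFrame or list created from it holds strings as well. The following sketch, again on a shortened two-row copy of the data, shows one way to restore the numeric columns with astype; the name frame is illustrative.

```python
import numpy as np
import pandas as pd

arr = np.array([[2000, 'Ohino', 1.5],
                [2001, 'Nevada', 2.4]])      # every cell becomes a string

frame = pd.DataFrame(arr, columns=['year', 'state', 'pop'])
print(frame.dtypes)                          # all columns hold strings

# Convert the numeric columns back explicitly.
frame = frame.astype({'year': int, 'pop': float})
print(frame.dtypes)

# tolist() returns nested Python lists; the cells are still strings here.
print(arr.tolist())
```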
3. dict to others
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
# dict
data = { 'name': ['Li', 'Zhang', 'Wang'],
'year': [2000, 2001, 2002]} # type(data) is dict
# dict to series
# if index is not given, the keys of data become the index of the Series
ser = Series(data)
print('ser\n', ser)
# dict to dataframe
# if columns is not given, the keys of data become the columns of the DataFrame
df = DataFrame(data)
print('df\n', df)
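The comments above describe what happens when index or columns is omitted. As a complement, this sketch shows what passing them explicitly does for dict input: the given labels select and order the keys, and a label missing from the dict yields NaN. The dict contents are made up for illustration.

```python
import pandas as pd

first_year = {'Li': 2000, 'Zhang': 2001}

ser = pd.Series(first_year)
print(ser)

# An explicit index selects and orders keys; labels absent from the dict become NaN.
picked = pd.Series(first_year, index=['Zhang', 'Wang'])
print(picked)

# For a DataFrame, columns= selects and orders the dict keys in the same way.
df = pd.DataFrame({'name': ['Li', 'Zhang'], 'year': [2000, 2001]},
                  columns=['year', 'name'])
print(df)
```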
4. Series to others
Selecting a single column from a DataFrame returns a Series.
# Series to NumPy array
# to_numpy() requires pandas 0.24 or later
arr = ser.to_numpy()
# or
arr = np.array(ser)
# Series to dict
dt = ser.to_dict()
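A short sketch of the column-selection point, together with two further conversions that the list above does not cover (Series to list and Series back to DataFrame). The small DataFrame is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Li', 'Zhang', 'Wang'],
                   'year': [2000, 2001, 2002]})

col = df['year']                 # single-column selection returns a Series
print(type(col).__name__)

values = col.tolist()            # Series to list
mapping = col.to_dict()          # Series to dict: {index label: value}
frame = col.to_frame()           # Series back to a one-column DataFrame

print(values)
print(mapping)
print(frame.columns.tolist())
```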
5. DataFrame to others
# dataframe
data = DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]), columns=['a', 'b', 'c'])
print(data)
# dataframe to array
arr = data.values
print('arr\n', arr)
print(type(arr))
# dataframe to dict
# named d rather than dict, to avoid shadowing the built-in dict type
d = data.to_dict()
print(d)
DataFrame.to_dict(self, orient='dict', into=<class 'dict'>)
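The orient parameter controls the shape of the result, so the DataFrame can also be converted to lists, Series, and other layouts: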
orient : str {‘dict’, ‘list’, ‘series’, ‘split’, ‘records’, ‘index’}
Determines the type of the values of the dictionary.
‘dict’ (default) : dict like {column -> {index -> value}}
‘list’ : dict like {column -> [values]}
‘series’ : dict like {column -> Series(values)}
‘split’ : dict like {‘index’ -> [index], ‘columns’ -> [columns], ‘data’ -> [values]}
‘records’ : list like [{column -> value}, … , {column -> value}]
‘index’ : dict like {index -> {column -> value}}
Abbreviations are allowed. s indicates series and sp indicates split.
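To make the orient options concrete, here is a small sketch that applies several of them to a two-row DataFrame (the data is illustrative). The orient names are spelled out in full, which should work whether or not a given pandas version accepts the abbreviations.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 4], 'b': [2, 5]}, index=['x', 'y'])

print(df.to_dict())                   # {column: {index: value}}
print(df.to_dict(orient='list'))      # {column: [values]}
print(df.to_dict(orient='records'))   # [{column: value}, ...]
print(df.to_dict(orient='index'))     # {index: {column: value}}
print(df.to_dict(orient='split'))     # index, columns and data as separate lists

# A plain list of rows is available through the underlying array.
print(df.values.tolist())
```

The 'records' layout is often the convenient one when rows are to be serialized one at a time, while 'list' keeps the column-oriented shape of the original dict input.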