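Pandas provides three basic data structures:
1. Series: a labeled one-dimensional array
2. DataFrame: a labeled two-dimensional array (i.e. a table)
3. Panel: a labeled three-dimensional array (deprecated in pandas 0.20 and removed in 0.25)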
pd.__version__
Out[10]: '0.20.1'
s2 = Series([10, 20, 30], index=['a', 'b', 'c'])
s2
This creates a one-dimensional array whose elements can be accessed either by label or by positional index.
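A minimal sketch of both access styles, using the same values as s2. The explicit `.loc` (label) and `.iloc` (position) accessors are not shown in the original session; they are used here on the assumption that being explicit avoids ambiguity when the labels are themselves integers.

```python
import pandas as pd

# Same data as s2 in the session above
s2 = pd.Series([10, 20, 30], index=['a', 'b', 'c'])

by_label = s2['b']             # label-based access with plain brackets
by_label_loc = s2.loc['b']     # explicit label-based access
by_position = s2.iloc[1]       # explicit position-based access (0-indexed)

label_slice = s2.loc['a':'b']  # label slices include both endpoints
pos_slice = s2.iloc[0:1]       # positional slices exclude the end

print(by_label, by_label_loc, by_position)
print(label_slice.tolist(), pos_slice.tolist())
```

Note the difference between the two slices: slicing by label keeps the end label, while slicing by position follows the usual Python half-open convention.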
s1.head()
Out[30]:
0 10
1 20
2 30
dtype: int64
s9.head()
Out[31]:
0 NaN
1 1.0
2 0.0
3 2.0
dtype: float64
s2.head()
Out[32]:
a 10
b 20
c 30
dtype: int64
s2.value_counts()
Out[33]:
10 1
30 1
20 1
dtype: int64
s2.index.value_counts()
Out[34]:
b 1
a 1
c 1
dtype: int64
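value_counts() counts how often each distinct value occurs. In s2 every value and every label is unique, so all counts are 1 and the order among the ties should not be relied on. A small sketch with a hypothetical Series `s` (not part of the original session) that contains repeats makes the behaviour easier to see:

```python
import pandas as pd

# Hypothetical data with repeated values
s = pd.Series([10, 20, 10, 30, 10, 20])

# Result is a Series indexed by the distinct values,
# sorted by count in descending order by default
counts = s.value_counts()
print(counts)

most_common = counts.index[0]  # the value with the highest count
```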
frame = DataFrame(np.arange(12).reshape(3, 4), index=['a', 'b', 'c'], columns=['c1', 'c2', 'c3', 'c4'])
frame
Out[40]:
c1 c2 c3 c4
a 0 1 2 3
b 4 5 6 7
c 8 9 10 11
frame.index
Out[45]: Index(['a', 'b', 'c'], dtype='object')
frame.loc['a']
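Square brackets on a DataFrame select columns, while rows are selected by label with `.loc` or by position with `.iloc`. Since 'a' is a row label in this frame and not a column name, looking it up with plain brackets raises a KeyError. A sketch using the same frame:

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(12).reshape(3, 4),
                     index=['a', 'b', 'c'],
                     columns=['c1', 'c2', 'c3', 'c4'])

col = frame['c1']            # column by name -> Series indexed by a, b, c
row = frame.loc['a']         # row by label -> Series indexed by c1..c4
row_pos = frame.iloc[0]      # the same row, selected by position
cell = frame.loc['b', 'c3']  # single value at row 'b', column 'c3'

# Plain brackets look for a column, and there is no column named 'a'
try:
    frame['a']
    raised = False
except KeyError:
    raised = True

print(col.tolist(), row.tolist(), cell, raised)
```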
Get summary statistics for the data:
frame.describe()
Out[47]:
c1 c2 c3 c4
count 3.0 3.0 3.0 3.0
mean 4.0 5.0 6.0 7.0
std 4.0 4.0 4.0 4.0
min 0.0 1.0 2.0 3.0
25% 2.0 3.0 4.0 5.0
50% 4.0 5.0 6.0 7.0
75% 6.0 7.0 8.0 9.0
max 8.0 9.0 10.0 11.0
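The figures in this table can be checked by hand. The sketch below recomputes a few entries for column c1; it relies on `std` being the sample standard deviation (ddof=1) and on quantiles using linear interpolation, which are the pandas defaults. For c1 the values are 0, 4, 8, so the mean is 4 and the standard deviation is sqrt((16 + 0 + 16) / 2) = 4.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(12).reshape(3, 4),
                     index=['a', 'b', 'c'],
                     columns=['c1', 'c2', 'c3', 'c4'])

summary = frame.describe()

# Recompute a few entries for column c1 directly
c1 = frame['c1']
mean_c1 = c1.mean()         # (0 + 4 + 8) / 3
std_c1 = c1.std()           # sample standard deviation, ddof=1
q25_c1 = c1.quantile(0.25)  # halfway between 0 and 4 by linear interpolation

print(summary.loc[['mean', 'std', '25%'], 'c1'])
```

Every column has the same std because the columns differ only by a constant offset, which does not change the spread.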