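The two main data structures in pandas are Series and DataFrame.
A Series is a one-dimensional, array-like object made up of a sequence of values and an associated sequence of data labels (the index).
Ways to create a Series object:
① Pass a list of values directly. This creates a Series with a default integer index running from 0 to N-1, where N is the length of the data.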
In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: pd.Series([1,3,5,np.nan,6,8])
Out[3]:
0 1.0
1 3.0
2 5.0
3 NaN
4 6.0
5 8.0
dtype: float64
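As a quick check of the default index, here is a minimal sketch (the variable name `s` is only illustrative) confirming that a list passed without an explicit index gets the labels 0 to N-1, and that the presence of `np.nan` causes the integers to be stored as float64.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 3, 5, np.nan, 6, 8])

# The default index covers the integers 0 .. N-1.
index_labels = list(s.index)
print(index_labels)

# np.nan is a float, so the integer values are upcast to float64.
print(s.dtype)
```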
② Create a Series with explicitly specified data labels by passing the index argument.
In [8]: pd.Series([1,3,5,np.nan,6,8],index=['A','B','C','D','E','F'])
Out[8]:
A 1.0
B 3.0
C 5.0
D NaN
E 6.0
F 8.0
dtype: float64
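Once a Series has custom labels, values can be looked up by label. The following is a small sketch of that idea; the names `value_b` and `subset` are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 3, 5, np.nan, 6, 8], index=['A', 'B', 'C', 'D', 'E', 'F'])

# A single label returns a scalar value.
value_b = s['B']
print(value_b)

# A list of labels returns a smaller Series with just those labels.
subset = s[['A', 'F']]
print(subset)
```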
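③ Create a Series from a dict. The dict keys become the index of the Series. Older pandas versions sort the keys, as in the output below; pandas 0.23 and later on Python 3.6+ keep the dict's insertion order instead (A, B, D, C for this data).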
In [13]: data = {'A':2,'B':4,'D':8,'C':16}
In [14]: s2 = pd.Series(data)
In [15]: s2
Out[15]:
A 2
B 4
C 16
D 8
dtype: int64
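If label-sorted output is wanted regardless of the pandas version, one option is to call `sort_index()` explicitly. A minimal sketch, assuming a reasonably recent pandas (0.23 or later) where the constructor keeps insertion order; `s2_sorted` is a name made up for this example.

```python
import pandas as pd

data = {'A': 2, 'B': 4, 'D': 8, 'C': 16}
s2 = pd.Series(data)

# On recent pandas the index follows the dict's insertion order.
print(list(s2.index))

# sort_index() returns a new Series ordered by label; s2 itself is unchanged.
s2_sorted = s2.sort_index()
print(s2_sorted)
```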
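④ Create a Series from a dict together with a specified index. Each index label that matches a dict key takes that key's value; labels with no match get NaN. Dict keys that are not in the index (here 'C') are dropped.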
In [16]: idx = ['A','B','D','G']
In [17]: s3 = pd.Series(data,index=idx)
In [18]: s3
Out[18]:
A 2.0
B 4.0
D 8.0
G NaN
dtype: float64
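The NaN entry is also why the result is float64 rather than int64. As a rough sketch of how the unmatched labels might be found and removed afterwards (the names `missing_labels` and `matched` are illustrative, not part of the original session):

```python
import pandas as pd

data = {'A': 2, 'B': 4, 'D': 8, 'C': 16}
idx = ['A', 'B', 'D', 'G']
s3 = pd.Series(data, index=idx)

# isna() marks the labels that had no matching key in the dict.
missing_labels = s3[s3.isna()].index.tolist()
print(missing_labels)

# Dropping the NaN rows makes it safe to convert back to an integer dtype.
matched = s3.dropna().astype('int64')
print(matched)
```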
Series object attributes:
① The shape attribute returns the shape of the Series as a tuple.
In [9]: s = pd.Series([1,3,5,np.nan,6,8])
In [10]: s.shape
Out[10]: (6,)
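Note that shape counts every element, including NaN. The sketch below contrasts it with `len()` and with `count()`, which skips missing values; it is only a small illustration using the same data as above.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 3, 5, np.nan, 6, 8])

shape = s.shape        # one-element tuple, since a Series is one-dimensional
length = len(s)        # number of elements, NaN included
non_null = s.count()   # number of elements that are not NaN

print(shape, length, non_null)
```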
② The dtype attribute returns the data type of the values in the Series.
In [12]: s = pd.S