Data type: DataFrame
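- A DataFrame is a tabular data type made up of several Series columns; all of the columns share one common row index
- It has both a row index and a column index
- Row index: identifies the rows; called index; axis 0 (axis=0), which runs down the rows
- Column index: identifies the columns; called columns; axis 1 (axis=1), which runs across the columns
- A DataFrame can be viewed as a two-dimensional labeled array
- Each column can hold a different data type
- Basic operations resemble those of Series and work through the row and column indexes
- Usually used for two-dimensional data, but it can also represent higher-dimensional data (by nesting DataFrames, which is rarely done)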
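The properties in the list above can be checked directly on a small frame. This is only a minimal sketch: the frame `demo` and its column names `name` and `score` are made up for illustration and do not appear elsewhere in these notes.

```python
import pandas as pd

# Hypothetical example data; the names are illustrative only.
demo = pd.DataFrame(
    {"name": ["x", "y", "z"], "score": [1.5, 2.0, 3.5]},
    index=["a", "b", "c"],
)

row_labels = list(demo.index)      # row index, axis=0
col_labels = list(demo.columns)    # column index, axis=1

# axis=0 collapses the rows, giving one result per column.
col_sums = demo[["score"]].sum(axis=0)
# axis=1 collapses the columns, giving one result per row.
row_counts = demo.count(axis=1)

# Each column is a Series that shares the frame's row index
# and keeps its own dtype.
one_column = demo["score"]
print(demo.dtypes)
```

Selecting a single column returns a Series, which is why most Series operations carry over to DataFrame columns.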
Creating a DataFrame
Creating a DataFrame from a Python list
import pandas as pd
df = pd.DataFrame([True, 1, 2.3, 'a', '你好'])
df
|   | 0 |
| --- | --- |
| 0 | True |
| 1 | 1 |
| 2 | 2.3 |
| 3 | a |
| 4 | 你好 |
df = pd.DataFrame([[True,1,2.3,'a','你好'],[1,2,3,4,5]])
df
|   | 0 | 1 | 2 | 3 | 4 |
| --- | --- | --- | --- | --- | --- |
| 0 | True | 1 | 2.3 | a | 你好 |
| 1 | 1 | 2 | 3.0 | 4 | 5 |
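The `3.0` in column 2 appears because each column is stored with a single dtype, so 2.3 and 3 end up together as floats. A minimal sketch for inspecting this with the same data (the variable name `mixed` is introduced here only for illustration):

```python
import pandas as pd

mixed = pd.DataFrame([[True, 1, 2.3, 'a', '你好'], [1, 2, 3, 4, 5]])

# Column 1 holds only integers, so it keeps an integer dtype.
col1_dtype = mixed[1].dtype
# Column 2 holds a float and an int, so both are stored as floats
# and the 3 is displayed as 3.0.
col2_dtype = mixed[2].dtype

print(mixed.dtypes)
```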
df = pd.DataFrame([[[True,1,2.3,'a','你好'],
[1,2,3,4,5]],
[[True,1,2.3,'a','你好'],
[1,2,3,4,5]]
])
df
|   | 0 | 1 |
| --- | --- | --- |
| 0 | [True, 1, 2.3, a, 你好] | [1, 2, 3, 4, 5] |
| 1 | [True, 1, 2.3, a, 你好] | [1, 2, 3, 4, 5] |
Creating a DataFrame from a Python dict
df = pd.DataFrame({
'one':[1,2,3,4],
'two':[9,8,7,6]})
df
|   | one | two |
| --- | --- | --- |
| 0 | 1 | 9 |
| 1 | 2 | 8 |
| 2 | 3 | 7 |
| 3 | 4 | 6 |
df = pd.DataFrame({
'one':[1,2,3,4],
'two':[9,8,7,6]},index = ['a','b','c','d'])
df
|   | one | two |
| --- | --- | --- |
| a | 1 | 9 |
| b | 2 | 8 |
| c | 3 | 7 |
| d | 4 | 6 |
df = pd.DataFrame({
'A' : 1,
'B' : 2.3,
'C' : ['x','y',5]
})
df
|   | A | B | C |
| --- | --- | --- | --- |
| 0 | 1 | 2.3 | x |
| 1 | 1 | 2.3 | y |
| 2 | 1 | 2.3 | 5 |
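In the example above the scalars for A and B are repeated to match the length of the list in C. A minimal sketch of that behaviour, including the case where every value is a scalar (the names `df_scalar`, `raised` and `df_indexed` are introduced here for illustration):

```python
import pandas as pd

# Scalars are repeated to match the length of the list-valued column.
df_scalar = pd.DataFrame({'A': 1, 'B': 2.3, 'C': ['x', 'y', 5]})

# With only scalars there is no length to infer, so pandas raises
# ValueError unless an index is passed.
try:
    pd.DataFrame({'A': 1, 'B': 2.3})
    raised = False
except ValueError:
    raised = True

# Passing an index tells pandas how many rows to create.
df_indexed = pd.DataFrame({'A': 1, 'B': 2.3}, index=[0, 1, 2])
print(df_indexed)
```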
dt = {
'one' : pd.Series([1,2,3],index=['a','b','c']),
'two' : pd.Series([9,8,7,6],index=['a','b','c','d',])
}
dt
{'one': a    1
 b    2
 c    3
 dtype: int64, 'two': a    9
 b    8
 c    7
 d    6
 dtype: int64}
d = pd.DataFrame(dt)
d
|   | one | two |
| --- | --- | --- |
| a | 1.0 | 9 |
| b | 2.0 | 8 |
| c | 3.0 | 7 |
| d | NaN | 6 |
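The NaN in row d comes from index alignment: the row labels of the frame are the union of the two Series indexes, and a column with no value for a label is filled with NaN. A minimal sketch that checks this with the same data (the helper names `labels` and `missing` are introduced here for illustration):

```python
import pandas as pd

dt = {
    'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
    'two': pd.Series([9, 8, 7, 6], index=['a', 'b', 'c', 'd']),
}
d = pd.DataFrame(dt)

# Row labels are the union of both Series indexes.
labels = list(d.index)

# 'one' has no entry for 'd', so that cell is NaN. NaN is a float,
# which is why 'one' is shown as 1.0, 2.0, 3.0 while 'two' stays integer.
missing = d['one'].isna()
print(d.dtypes)
```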
d_2 = pd.DataFrame(dt,index=['b','c','d'],columns=['two','three'])
d_2
|   | two | three |
| --- | --- | --- |
| b | 8 | NaN |
| c | 7 | NaN |
| d | 6 | NaN |
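Here the `index` and `columns` arguments act as a selection over the dict: only the requested row labels and keys are kept, and a requested column with no matching key is filled with NaN. A minimal sketch that verifies this with the same data (the names `kept_cols` and `all_missing` are introduced here for illustration):

```python
import pandas as pd

dt = {
    'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
    'two': pd.Series([9, 8, 7, 6], index=['a', 'b', 'c', 'd']),
}
d_2 = pd.DataFrame(dt, index=['b', 'c', 'd'], columns=['two', 'three'])

# 'one' is not requested, so it is dropped; row 'a' is dropped as well.
kept_cols = list(d_2.columns)
kept_rows = list(d_2.index)

# dt has no 'three' key, so every value in that column is missing.
all_missing = d_2['three'].isna().all()
```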
Creating a DataFrame from an ndarray
import numpy as np
df = pd.DataFrame(np.arange(10).reshape(2,5))
df
|   | 0 | 1 | 2 | 3 | 4 |
| --- | --- | --- | --- | --- | --- |
| 0 | 0 | 1 | 2 | 3 | 4 |
| 1 | 5 | 6 | 7 | 8 | 9 |
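Without labels, both axes default to integer positions. As with the dict examples, `index` and `columns` can be passed to label the rows and columns of an ndarray. A minimal sketch, where the labels `r0`, `r1` and `a` to `e` are hypothetical and chosen only for illustration:

```python
import numpy as np
import pandas as pd

arr = np.arange(10).reshape(2, 5)

# Hypothetical labels, chosen only for illustration.
labeled = pd.DataFrame(arr, index=['r0', 'r1'], columns=list('abcde'))

# Label-based lookup: row 'r1', column 'c' corresponds to arr[1, 2].
value = labeled.loc['r1', 'c']
print(labeled)
```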