import numpy as np
import pandas as pd
data = [[1,None],[4,5][None,None],[8,9],[3,4]]
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    data = [[1,None],[4,5][None,None],[8,9],[3,4]]
TypeError: list indices must be integers or slices, not tuple
data = [[1,None],[4,5],[None,None],[8,9],[3,4]]  # fixed: the comma after [4,5] was missing
df = pd.DataFrame(data,columns=['a','b'])
df
     a    b
0  1.0  NaN
1  4.0  5.0
2  NaN  NaN
3  8.0  9.0
4  3.0  4.0
df.head()  # shows the first 5 rows by default
     a    b
0  1.0  NaN
1  4.0  5.0
2  NaN  NaN
3  8.0  9.0
4  3.0  4.0
df.tail()  # shows the last 5 rows by default
     a    b
0  1.0  NaN
1  4.0  5.0
2  NaN  NaN
3  8.0  9.0
4  3.0  4.0
df.info()  # basic information about the DataFrame
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 2 columns):
a    4 non-null float64
b    3 non-null float64
dtypes: float64(2)
memory usage: 120.0 bytes
df.describe()  # summary statistics
             a         b
count  4.00000  3.000000
mean   4.00000  6.000000
std    2.94392  2.645751
min    1.00000  4.000000
25%    2.50000  4.500000
50%    3.50000  5.000000
75%    5.00000  7.000000
max    8.00000  9.000000
df.count()  # number of non-NaN values per column
a    4
b    3
dtype: int64
df.mean()
a    4.0
b    6.0
dtype: float64
df.sum()
a    16.0
b    18.0
dtype: float64
df.sum(axis=1)  # row-wise sum; NaN is skipped, so the all-NaN row gives 0.0
0     1.0
1     9.0
2     0.0
3    17.0
4     7.0
dtype: float64
df
     a    b
0  1.0  NaN
1  4.0  5.0
2  NaN  NaN
3  8.0  9.0
4  3.0  4.0
df.cumsum()  # cumulative sum
      a     b
0   1.0   NaN
1   5.0   5.0
2   NaN   NaN
3  13.0  14.0
4  16.0  18.0
df.std()  # standard deviation
a    2.943920
b    2.645751
dtype: float64
df.var()  # variance
a    8.666667
b    7.000000
dtype: float64
df.max()
a    8.0
b    9.0
dtype: float64
df.min()
a    1.0
b    4.0
dtype: float64
df.quantile(0.5)  # median; change the argument to get other quantiles
a    3.5
b    5.0
Name: 0.5, dtype: float64
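
In the session above, `df.sum(axis=1)` returns 0.0 for row 2 even though both of its values are missing, because pandas reductions skip NaN by default. The following is a minimal sketch of how that behaviour can be adjusted, assuming a reasonably recent pandas in which `sum` accepts `min_count`. The variable names (`row_sum_default`, `row_sum_min1`, and so on) are only illustrative and do not come from the original session.

```python
import pandas as pd

# Same data as in the session above
data = [[1, None], [4, 5], [None, None], [8, 9], [3, 4]]
df = pd.DataFrame(data, columns=['a', 'b'])

# Default behaviour: NaN is skipped, so an all-NaN row sums to 0.0
row_sum_default = df.sum(axis=1)

# min_count=1 requires at least one non-NaN value per row;
# rows with none of them yield NaN instead of 0.0
row_sum_min1 = df.sum(axis=1, min_count=1)

# skipna=False propagates NaN: a single missing value makes the result NaN
col_mean_strict = df.mean(skipna=False)

# median() gives the same result as quantile(0.5)
col_median = df.median()

print(row_sum_default)
print(row_sum_min1)
print(col_mean_strict)
print(col_median)
```

Whether an all-NaN row should count as 0 or as missing depends on the analysis; `min_count=1` is one way to keep "no data" distinct from "a sum of zero".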
Pandas Data Summarization