Task1

Latest recommended article published on 2024-09-26 11:06:39

勇敢的阿甘



Article tags: mathematical modeling

Article link: https://blog.csdn.net/m0_74453171/article/details/140705166


# (1) Create a DataFrame in Python:
import pandas as pd
import numpy as np
data = {'animal': ['cat', 'cat', 'snake', 'dog', 'dog', 'cat', 'snake', 'cat', 'dog', 'dog'],
        'age': [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
        'visits': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
        'priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']}

labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

df = pd.DataFrame(data)
df


# (2) Display basic information about the DataFrame and its data:
df.describe()
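
As a complement to step (2), the sketch below contrasts `describe()` with `info()`. It rebuilds the same data so that it runs on its own. The variable names `numeric_summary`, `full_summary`, and `info_text` are illustrative and not part of the original exercise.

```python
import io

import numpy as np
import pandas as pd

# Same data as in step (1)
data = {'animal': ['cat', 'cat', 'snake', 'dog', 'dog', 'cat', 'snake', 'cat', 'dog', 'dog'],
        'age': [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
        'visits': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
        'priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']}
df = pd.DataFrame(data)

# describe() summarizes only the numeric columns by default (age and visits)
numeric_summary = df.describe()

# include='all' also covers the text columns (count, unique, top, freq)
full_summary = df.describe(include='all')

# info() prints dtypes and non-null counts; here it is captured into a string
buf = io.StringIO()
df.info(buf=buf)
info_text = buf.getvalue()
print(info_text)
```

`describe()` answers "what do the values look like", while `info()` answers "what types are the columns and how many values are missing", so the two are often used together.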


# (3) Return the first 5 rows of DataFrame df:
df.head(5)


# (4) Select the columns labeled animal and age from DataFrame df
df[['animal', 'age']]


# (5) Select the data in rows [3, 4, 8] for columns ['animal', 'age']
df.loc[[3, 4, 8], ['animal', 'age']]
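
Step (5) works with `loc` only because the default row labels happen to be the integers 0 to 9. The sketch below, on a reduced copy of the data, shows how label-based `loc` and position-based `iloc` diverge once string labels are assigned. The helper names (`by_label`, `by_position`, `loc_failed`) are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'animal': ['cat', 'cat', 'snake', 'dog', 'dog', 'cat', 'snake', 'cat', 'dog', 'dog'],
    'age': [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
})

# With the default index, labels and positions coincide
by_label = df.loc[[3, 4, 8], ['animal', 'age']]
by_position = df.iloc[[3, 4, 8], [0, 1]]
same_before = by_label.equals(by_position)

# After assigning string labels, the integer labels no longer exist ...
df.index = list('abcdefghij')
try:
    df.loc[[3, 4, 8], ['animal', 'age']]
    loc_failed = False
except KeyError:
    loc_failed = True

# ... but positional selection still works
still_by_position = df.iloc[[3, 4, 8], [0, 1]]
print(still_by_position)
```

This matters for the order of the steps: step (5) has to run before the index is replaced in step (10), or be rewritten with `iloc`.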


# (6) Select the rows where visits equals 3 (the : here selects all columns)
df.loc[df['visits']==3, :]


# (7) Select the rows where age is missing
df.loc[df['age'].isna(), :]


# (8) Select the rows where animal is cat and age is less than 3
df.loc[(df['animal'] == 'cat') & (df['age'] < 3), :]
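
The parentheses in step (8) are required because `&` binds more tightly than `==` and `<`. As one possible alternative, `DataFrame.query` expresses the same filter as a string. The sketch below uses a reduced copy of the data, and the names `cat_young_mask` and `cat_young_query` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'animal': ['cat', 'cat', 'snake', 'dog', 'dog', 'cat', 'snake', 'cat', 'dog', 'dog'],
    'age': [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
})

# Boolean-mask form, as in step (8): each comparison is wrapped in parentheses
cat_young_mask = df.loc[(df['animal'] == 'cat') & (df['age'] < 3), :]

# Equivalent filter written with query(); missing ages compare as False
cat_young_query = df.query("animal == 'cat' and age < 3")

print(cat_young_query)
```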


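# (9) Select the rows where age is between 2 and 4 (inclusive)
df.loc[(df['age']>=2)&(df['age']<=4), :]      # a chained comparison such as 2 <= df['age'] <= 4 raises an error on a Series, so the two conditions are written separately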
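
To illustrate the note on step (9), the sketch below shows that a chained comparison on a Series raises `ValueError`, and that `Series.between` is a compact alternative to the two split conditions. It uses only the `age` column, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'age': [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3]})

# Python expands a <= x <= b into (a <= x) and (x <= b); "and" needs a single
# True/False value, which a Series cannot provide, hence the ValueError
try:
    2 <= df['age'] <= 4
    chained_ok = True
except ValueError:
    chained_ok = False

# Split form, as in step (9)
mask_split = (df['age'] >= 2) & (df['age'] <= 4)

# between() includes both endpoints by default; missing values yield False
mask_between = df['age'].between(2, 4)

rows = df.loc[mask_between, :]
print(rows)
```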


# (10) Change the age in row 'f' to 1.5
df.index = labels             # assign custom row labels; a DataFrame has a default integer RangeIndex (0 to 9), so label 'f' does not exist until the index is set
df.loc['f', 'age'] = 1.5
print(df)
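
A DataFrame always has a row index. When none is given, it is a `RangeIndex` starting at 0. The sketch below shows two ways to reach the sixth row under that assumption: by its default integer label, or by passing `index=labels` when the frame is constructed. Only the `age` column is used, and the names `a` and `b` are illustrative.

```python
import numpy as np
import pandas as pd

labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
ages = [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3]

# Option A: keep the default index and address the row by integer label 5
a = pd.DataFrame({'age': ages})
is_range_index = isinstance(a.index, pd.RangeIndex)
a.loc[5, 'age'] = 1.5

# Option B: attach the labels at construction time, then use the label 'f'
b = pd.DataFrame({'age': ages}, index=labels)
b.loc['f', 'age'] = 1.5

print(a.loc[5, 'age'], b.loc['f', 'age'])
```

Passing `index=labels` up front avoids the separate `df.index = labels` assignment, at the cost of making integer labels such as 3, 4, and 8 unavailable to `loc` from the start.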


# (11) Sum the values in the visits column
df['visits'].sum()


# (12) Compute the mean age for each animal
df.groupby(['animal'])['age'].mean()
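
The group means in step (12) ignore missing ages rather than treating them as zero. The sketch below makes that visible by reporting the non-null count next to the mean. It uses the original ages from step (1), so the cat mean differs from a run in which row 'f' has already been changed to 1.5. The names `mean_age` and `stats` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'animal': ['cat', 'cat', 'snake', 'dog', 'dog', 'cat', 'snake', 'cat', 'dog', 'dog'],
    'age': [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
})

# mean() skips NaN by default, so each mean is taken over known ages only
mean_age = df.groupby('animal')['age'].mean()

# agg() with several functions shows how many ages went into each mean
stats = df.groupby('animal')['age'].agg(['count', 'mean'])

print(stats)
```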