Euro Cup data notes: basic pandas exercises

MaeveShi

Category: basic pandas exercises

Article URL: https://blog.csdn.net/MaeveShi/article/details/107658329

Article link

Sort the data by Red Cards first, then by Yellow Cards:

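import pandas as pd
import numpy as np

# Load the Euro 2012 statistics (the path is the author's local path)
euroData = pd.read_csv('/Users/Desktop/十套python练习/exercise_data/Euro2012_stats.csv')
# Sort by Red Cards, then by Yellow Cards, both in descending order
euroData.sort_values(by=['Red Cards', 'Yellow Cards'], ascending=False)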

pandas.DataFrame.sort_values(by, axis=0, ascending=True, inplace=False)

by: str or list of str
Name or list of names to sort by.
If axis is 0 or ‘index’ then by may contain index levels and/or column labels.
If axis is 1 or ‘columns’ then by may contain column levels and/or index labels.
Changed in version 0.23.0: Allow specifying index or column level names.
Note: when by is a list, the data is sorted by its names in the order given, so the first name is the primary sort key.
axis: {0 or ‘index’, 1 or ‘columns’}, default 0
Axis to be sorted.
ascending: bool or list of bool, default True
Sort ascending vs. descending. Specify list for multiple sort orders. If this is a list of bools, must match the length of the by.
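
To illustrate the list form of ascending described above, here is a minimal sketch. The DataFrame `cards` and its values are made up for illustration and are not taken from Euro2012_stats.csv.

```python
import pandas as pd

# Hypothetical sample data; team names and numbers are invented.
cards = pd.DataFrame({
    "Team": ["A", "B", "C", "D"],
    "Red Cards": [0, 1, 1, 0],
    "Yellow Cards": [9, 4, 7, 5],
})

# One bool per name in `by`: Red Cards descending,
# ties broken by Yellow Cards ascending.
ranked = cards.sort_values(
    by=["Red Cards", "Yellow Cards"],
    ascending=[False, True],
)
print(ranked)
```

Because inplace defaults to False, `cards` itself keeps its original row order and the sorted result is returned as a new DataFrame.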

Official documentation link

Select the rows whose team name starts with the letter G

euroData[euroData.Team.str.startswith('G')]

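Accessing data through .str: 1. it works on a Series or an Index (a DataFrame has no .str accessor, so select a column first); 2. missing values (NaN) are skipped automatically and stay missing in the result

(1) lower, upper, startswith, endswith, len
(2) strip
(3) replace
(4) split, rsplit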
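
The methods listed above can be tried on a small Series. This is only a sketch: the Series `teams` and `labels` are invented for illustration, and `na=False` is passed explicitly so that missing values count as False when the result is used as a filter.

```python
import numpy as np
import pandas as pd

# Hypothetical team names with stray whitespace and one missing value.
teams = pd.Series(["  Germany ", "greece", np.nan, "Italy"])

# strip removes surrounding whitespace; the missing entry stays missing.
cleaned = teams.str.strip()

# lower + startswith; na=False makes the missing entry False in the mask.
mask = cleaned.str.lower().str.startswith("g", na=False)
print(cleaned[mask].tolist())

# len, replace and split on another small, made-up Series.
labels = pd.Series(["Red Cards", "Yellow Cards"])
lengths = labels.str.len()
replaced = labels.str.replace(" ", "_", regex=False)
parts = labels.str.split(" ")
```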

Find the shooting accuracy of England, Italy, and Russia

euroData.loc[euroData.Team.isin(['England', 'Italy', 'Russia']), ['Team', 'Shooting Accuracy']]

pandas.DataFrame.isin(values)

values: iterable, Series, DataFrame or dict
When values is a list, each element of the DataFrame is checked for membership in the list;
when it is a dict, the check is done per column, with the keys naming the columns;
when it is a Series or DataFrame, its index and columns must line up with those of the original DataFrame.
The result will only be true at a location if all the labels match. If values is a Series, that’s the index. If values is a dict, the keys must be the column names, which must match. If values is a DataFrame, then both the index and column labels must match.
Returns: DataFrame
DataFrame of booleans showing whether each element in the DataFrame is contained in values.

Examples

# values is a list: each element is checked for membership in the list
df = pd.DataFrame({'A': [1, 2, 3],
                   'B': ['a', 'b', 'f']})
df.isin([1, 3, 12, 'a'])

output:
       A      B
0   True   True
1  False  False
2   True  False

# values is a dict: each key is matched to a column name; a column with no matching key is all False
df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [1, 4, 7]})
df.isin({
    'A': [1, 3],
    'B': [4, 2]
})

output:
       A      B
0   True  False
1  False   True
2   True  False
If 'B': [4, 2] in the isin call is changed to 'D': [4, 2], column B of the output is all False.

# values is a DataFrame: True only where the column name exists and the element at the same index label matches
df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [1, 4, 7]})
df1 = pd.DataFrame({
    'A': [1, 3],
    'B': [4, 2]
})
df.isin(df1)

output:
       A      B
0   True  False
1  False  False
2  False  False

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [1, 4, 7]})
df1 = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 2, 7]
})
df.isin(df1)

output:
      A      B
0  True  False
1  True  False
2  True   True
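
The quoted documentation also says that when values is a Series, matching is done on the index. A minimal sketch of that case, using the same df as in the examples; the Series `values` is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [1, 4, 7]})

# Hypothetical Series whose index (0, 1, 2) lines up with df's index.
# Each row of df is compared with the Series value at the same index label.
values = pd.Series([1, 4, 9])

result = df.isin(values)
print(result)
```

Row 0 is compared with 1, row 1 with 4, and row 2 with 9, so a cell is True only when it equals the Series value on its own row, not when it appears anywhere in the Series.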

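pandas.DataFrame.loc

.loc can select rows and columns at the same time, and it also accepts boolean arrays. Note that a label slice [start:stop] includes both endpoints.
The official documentation has detailed examples; see: tutorials_df.loc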
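
A short sketch of these .loc behaviors follows. The DataFrame `stats`, its index labels, and its numbers are invented for illustration and are not taken from the Euro 2012 file.

```python
import pandas as pd

# Hypothetical data with string index labels.
stats = pd.DataFrame(
    {"Team": ["England", "Italy", "Russia"],
     "Goals": [5, 6, 5],
     "Shots": [40, 60, 30]},
    index=["a", "b", "c"],
)

# Label slices include both endpoints: rows a..b and columns Team..Goals.
by_label = stats.loc["a":"b", "Team":"Goals"]

# For contrast, positional slicing with iloc excludes the stop position.
by_position = stats.iloc[0:1]

# A boolean mask selects rows; a list of labels selects columns.
high_scorers = stats.loc[stats["Goals"] > 5, ["Team"]]

print(by_label)
print(by_position)
print(high_scorers)
```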
