49. [Row/column reshaping + grouped Top-N] Pandas practice for Python data analysis: LeetCode 1194. Tournament Winners

Article link: https://blog.csdn.net/qq_55006020/article/details/142772066

Learn: a first encounter with the material
Review: revisiting what you already know
Practice: putting the knowledge to work

1. Original LeetCode problem link

LeetCode (力扣), problem 1194. Tournament Winners

2. Problem statement

Players table

+-------------+-------+
| Column Name | Type  |
+-------------+-------+
| player_id   | int   |
| group_id    | int   |
+-------------+-------+
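player_id is the primary key (a column with unique values) of this table.
Each row of this table gives the group that one player belongs to.

Matches table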

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| match_id      | int     |
| first_player  | int     |
| second_player | int     | 
| first_score   | int     |
| second_score  | int     |
+---------------+---------+
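match_id is the primary key (a column with unique values) of this table.
Each row is the record of one match; first_player and second_player are the IDs of the players in that match.
first_score and second_score are the points scored by first_player and second_player respectively.
You may assume that in every match both players belong to the same group.

The winner of each group is the player with the highest total score within the group. In case of a tie, the player with the smallest player_id wins.

Write a solution to find the winner of each group.

The result table may be returned in any order.

The result format is shown in the following example.

Example 1:

Input:
Players table: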
+-----------+------------+
| player_id | group_id   |
+-----------+------------+
| 15        | 1          |
| 25        | 1          |
| 30        | 1          |
| 45        | 1          |
| 10        | 2          |
| 35        | 2          |
| 50        | 2          |
| 20        | 3          |
| 40        | 3          |
+-----------+------------+
Matches table:
+------------+--------------+---------------+-------------+--------------+
| match_id   | first_player | second_player | first_score | second_score |
+------------+--------------+---------------+-------------+--------------+
| 1          | 15           | 45            | 3           | 0            |
| 2          | 30           | 25            | 1           | 2            |
| 3          | 30           | 15            | 2           | 0            |
| 4          | 40           | 20            | 5           | 2            |
| 5          | 35           | 50            | 1           | 1            |
+------------+--------------+---------------+-------------+--------------+
Output:
+-----------+------------+
| group_id  | player_id  |
+-----------+------------+ 
| 1         | 15         |
| 2         | 35         |
| 3         | 40         |
+-----------+------------+

3. Table-creation code

import pandas as pd
data = [[10, 2], [15, 1], [20, 3], [25, 1], [30, 1], [35, 2], [40, 3], [45, 1], [50, 2]]
players = pd.DataFrame(data, columns=['player_id', 'group_id']).astype({'player_id':'Int64', 'group_id':'Int64'})
data = [[1, 15, 45, 3, 0], [2, 30, 25, 1, 2], [3, 30, 15, 2, 0], [4, 40, 20, 5, 2], [5, 35, 50, 1, 1]]
matches = pd.DataFrame(data, columns=['match_id', 'first_player', 'second_player', 'first_score', 'second_score']).astype({'match_id':'Int64', 'first_player':'Int64', 'second_player':'Int64', 'first_score':'Int64', 'second_score':'Int64'})

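4. Analysis

The table-by-table approach

Step 1: Split the matches table into two frames: the first player's ID and score, and the second player's ID and score.

Step 2: Stack the two frames vertically into a single table.

Step 3: Inner-join this table with the players table.

Step 4: Group by group_id and player_id and sum the scores.

Step 5: Within each group_id, rank the totals in descending order, allowing ties.

Step 6: Keep the rows tied for first place, then rank them by player_id within each group_id.

Step 7: Keep rank 1, select the required columns, and output the result.

Solution walkthrough

Implementing the approach above in code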

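Step 1: Split the matches table into two frames: the first player's ID and score, and the second player's ID and score.

In pandas, select each pair of columns with matches[[...]] and call rename(columns=...) so that both frames share the column names player_id and score.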
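
As a minimal sketch of this step, the block below rebuilds the sample Matches data from the example (without the Int64 cast, which is omitted here for brevity) and splits it the same way the solution does. The names df1 and df2 follow the solution code.

```python
import pandas as pd

# Sample Matches data from Example 1 (plain int64 columns, an assumption for brevity).
matches = pd.DataFrame(
    [[1, 15, 45, 3, 0], [2, 30, 25, 1, 2], [3, 30, 15, 2, 0],
     [4, 40, 20, 5, 2], [5, 35, 50, 1, 1]],
    columns=['match_id', 'first_player', 'second_player',
             'first_score', 'second_score'],
)

# One frame per side of the match, both renamed to the same column names.
df1 = matches[['first_player', 'first_score']].rename(
    columns={'first_player': 'player_id', 'first_score': 'score'})
df2 = matches[['second_player', 'second_score']].rename(
    columns={'second_player': 'player_id', 'second_score': 'score'})

print(df1)
print(df2)
```

Giving both frames identical column names is what lets the next step stack them without producing extra columns.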

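Step 2: Stack the two frames vertically into a single table.

In pandas, pd.concat([df1, df2], axis=0) appends the rows of one frame below the other.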
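
A small sketch of the vertical stack, with df1 and df2 typed in by hand from the sample data so the block runs by itself:

```python
import pandas as pd

# The two per-side frames from step 1, written out literally from the sample data.
df1 = pd.DataFrame({'player_id': [15, 30, 30, 40, 35], 'score': [3, 1, 2, 5, 1]})
df2 = pd.DataFrame({'player_id': [45, 25, 15, 20, 50], 'score': [0, 2, 0, 2, 1]})

# axis=0 stacks rows; each frame keeps its own index labels (0..4 appears twice).
df3 = pd.concat([df1, df2], axis=0)

print(df3)
print(len(df3))
```

The repeated index labels do no harm here because later steps merge and group by column values. Passing ignore_index=True to pd.concat would renumber the rows if a clean index is preferred.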

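Step 3: Inner-join this table with the players table.

In pandas, pd.merge(df3, players, how='inner', on='player_id') attaches each player's group_id to every score row.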
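
The join can be sketched as follows, again with the inputs typed in from the sample data. Note that player 10 appears in Players but in no match, so the inner join leaves that player out.

```python
import pandas as pd

# Sample Players data from Example 1.
players = pd.DataFrame(
    [[10, 2], [15, 1], [20, 3], [25, 1], [30, 1], [35, 2], [40, 3], [45, 1], [50, 2]],
    columns=['player_id', 'group_id'],
)

# The stacked score rows from step 2, written out literally.
df3 = pd.DataFrame({
    'player_id': [15, 30, 30, 40, 35, 45, 25, 15, 20, 50],
    'score':     [3, 1, 2, 5, 1, 0, 2, 0, 2, 1],
})

# Inner join on player_id: every score row gains the player's group_id.
df4 = pd.merge(df3, players, how='inner', on='player_id')

print(df4)
```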

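Step 4: Group by group_id and player_id and sum the scores.

In pandas, groupby(['group_id', 'player_id'])['score'].sum() gives each player's total, and reset_index() turns the group keys back into ordinary columns.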
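
A sketch of the aggregation on the joined sample rows (df4 is typed in literally here so the block is self-contained):

```python
import pandas as pd

# Joined score rows from step 3, written out literally from the sample data.
df4 = pd.DataFrame({
    'player_id': [15, 30, 30, 40, 35, 45, 25, 15, 20, 50],
    'score':     [3, 1, 2, 5, 1, 0, 2, 0, 2, 1],
    'group_id':  [1, 1, 1, 3, 2, 1, 1, 1, 3, 2],
})

# Total score per player within each group; reset_index() restores flat columns.
res = df4.groupby(['group_id', 'player_id'])['score'].sum().reset_index()

print(res)
```

In this sample, players 15 and 30 both end up with 3 points in group 1, which is the tie that the later steps have to break.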

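Step 5: Within each group_id, rank the totals in descending order, allowing ties.

In pandas, a grouped rank(method='min', ascending=False) behaves like SQL's RANK() window function: tied totals share the same rank.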
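
The grouped rank can be sketched on the per-player totals like this (res is typed in from the sample data):

```python
import pandas as pd

# Per-player totals from step 4, written out literally.
res = pd.DataFrame({
    'group_id':  [1, 1, 1, 1, 2, 2, 3, 3],
    'player_id': [15, 25, 30, 45, 35, 50, 20, 40],
    'score':     [3, 2, 3, 0, 1, 1, 2, 5],
})

# method='min' gives tied scores the same (lowest) rank, like SQL RANK().
res['rn'] = res.groupby(['group_id'])['score'].rank(method='min', ascending=False)

print(res)
```

The rank column comes back as floats, so 1.0 rather than 1; comparing it with == 1 still works as expected.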

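Step 6: Keep the rows tied for first place, then rank them by player_id within each group_id.

In pandas, filter on rn == 1 and apply a second grouped rank, this time on player_id in ascending order.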
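
A sketch of the tie-breaking step, starting from the ranked totals of the sample data (typed in literally):

```python
import pandas as pd

# Ranked totals from step 5, written out literally.
res = pd.DataFrame({
    'group_id':  [1, 1, 1, 1, 2, 2, 3, 3],
    'player_id': [15, 25, 30, 45, 35, 50, 20, 40],
    'score':     [3, 2, 3, 0, 1, 1, 2, 5],
    'rn':        [1.0, 3.0, 1.0, 4.0, 1.0, 1.0, 2.0, 1.0],
})

# Keep everyone tied for first place in their group.
res1 = res[res['rn'] == 1].reset_index(drop=True)

# Break ties: the smallest player_id in each group gets rank 1.
res1['rn1'] = res1.groupby(['group_id'])['player_id'].rank(method='min', ascending=True)

print(res1)
```

Because player_id is unique, this second rank never produces ties, so exactly one row per group ends up with rn1 equal to 1.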

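Step 7: Keep rank 1, select the required columns, and output the result.

In pandas, filter on rn1 == 1 and select the group_id and player_id columns.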
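
The final filter and column selection, sketched on the tie-broken sample rows (typed in literally):

```python
import pandas as pd

# First-place candidates with their tie-break rank from step 6.
res1 = pd.DataFrame({
    'group_id':  [1, 1, 2, 2, 3],
    'player_id': [15, 30, 35, 50, 40],
    'score':     [3, 3, 1, 1, 5],
    'rn1':       [1.0, 2.0, 1.0, 2.0, 1.0],
})

# Keep one winner per group, then keep only the columns the problem asks for.
res2 = res1[res1['rn1'] == 1].reset_index(drop=True)
res3 = res2[['group_id', 'player_id']]

print(res3)
```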

5. Pandas solution

import pandas as pd

def tournament_winners(players: pd.DataFrame, matches: pd.DataFrame) -> pd.DataFrame:
    # Split the matches table: one frame per side, with unified column names
    df1 = matches[['first_player', 'first_score']].rename(columns={'first_player': 'player_id', 'first_score': 'score'})
    df2 = matches[['second_player', 'second_score']].rename(columns={'second_player': 'player_id', 'second_score': 'score'})
    # Stack the two frames vertically
    df3 = pd.concat([df1, df2], axis=0)
    # Inner-join with the players table to attach group_id
    df4 = pd.merge(df3, players, how='inner', on='player_id')
    # Group by group_id and player_id and sum the scores
    res = df4.groupby(['group_id', 'player_id'])['score'].sum().reset_index()
    # Within each group, rank the totals in descending order (ties share a rank)
    res['rn'] = res.groupby(['group_id'])['score'].rank(method='min', ascending=False)
    # Keep the players ranked first (ties allowed)
    res1 = res[res['rn'] == 1].reset_index(drop=True)
    # Within each group, rank the remaining players by player_id in ascending order
    res1['rn1'] = res1.groupby(['group_id'])['player_id'].rank(method='min', ascending=True)
    # Keep the smallest player_id in each group
    res2 = res1[res1['rn1'] == 1].reset_index(drop=True)
    # Select the required columns and return them
    res3 = res2[['group_id', 'player_id']]
    return res3
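
For comparison, here is a shorter variant that swaps the two rank passes for a single sort followed by drop_duplicates. This is not the approach used above, only an alternative sketch; the function name tournament_winners_sorted is made up for illustration.

```python
import pandas as pd

def tournament_winners_sorted(players: pd.DataFrame, matches: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical alternative: same split, stack, join, and sum as the solution above.
    scores = pd.concat([
        matches[['first_player', 'first_score']].rename(
            columns={'first_player': 'player_id', 'first_score': 'score'}),
        matches[['second_player', 'second_score']].rename(
            columns={'second_player': 'player_id', 'second_score': 'score'}),
    ], ignore_index=True)
    totals = (scores.merge(players, how='inner', on='player_id')
                    .groupby(['group_id', 'player_id'], as_index=False)['score'].sum())
    # Sort so the best row of each group comes first: highest score, then smallest id.
    ordered = totals.sort_values(['group_id', 'score', 'player_id'],
                                 ascending=[True, False, True])
    # drop_duplicates keeps the first row per group_id, which is the winner.
    winners = ordered.drop_duplicates('group_id')
    return winners[['group_id', 'player_id']].reset_index(drop=True)

players = pd.DataFrame(
    [[10, 2], [15, 1], [20, 3], [25, 1], [30, 1], [35, 2], [40, 3], [45, 1], [50, 2]],
    columns=['player_id', 'group_id'])
matches = pd.DataFrame(
    [[1, 15, 45, 3, 0], [2, 30, 25, 1, 2], [3, 30, 15, 2, 0],
     [4, 40, 20, 5, 2], [5, 35, 50, 1, 1]],
    columns=['match_id', 'first_player', 'second_player', 'first_score', 'second_score'])

result = tournament_winners_sorted(players, matches)
print(result)
```

The rank-based version mirrors the SQL window-function style more closely, while the sort-based version is shorter; both encode the same rule of highest total first and smallest player_id on ties.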

6. Verification
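
One way to check the solution locally is to run it on the Example 1 data and compare against the expected output. The sketch below restates the solution function in compact form so that it runs by itself, and uses plain integer columns instead of the Int64 cast (an assumption made for brevity).

```python
import pandas as pd

def tournament_winners(players: pd.DataFrame, matches: pd.DataFrame) -> pd.DataFrame:
    # Compact restatement of the solution steps so this block is self-contained.
    df1 = matches[['first_player', 'first_score']].rename(
        columns={'first_player': 'player_id', 'first_score': 'score'})
    df2 = matches[['second_player', 'second_score']].rename(
        columns={'second_player': 'player_id', 'second_score': 'score'})
    df4 = pd.merge(pd.concat([df1, df2], axis=0), players, how='inner', on='player_id')
    res = df4.groupby(['group_id', 'player_id'])['score'].sum().reset_index()
    res['rn'] = res.groupby(['group_id'])['score'].rank(method='min', ascending=False)
    res1 = res[res['rn'] == 1].reset_index(drop=True)
    res1['rn1'] = res1.groupby(['group_id'])['player_id'].rank(method='min', ascending=True)
    return res1[res1['rn1'] == 1].reset_index(drop=True)[['group_id', 'player_id']]

players = pd.DataFrame(
    [[10, 2], [15, 1], [20, 3], [25, 1], [30, 1], [35, 2], [40, 3], [45, 1], [50, 2]],
    columns=['player_id', 'group_id'])
matches = pd.DataFrame(
    [[1, 15, 45, 3, 0], [2, 30, 25, 1, 2], [3, 30, 15, 2, 0],
     [4, 40, 20, 5, 2], [5, 35, 50, 1, 1]],
    columns=['match_id', 'first_player', 'second_player', 'first_score', 'second_score'])

result = tournament_winners(players, matches)

# The problem allows any row order, so compare as a sorted list of pairs.
winners = sorted(zip(result['group_id'].tolist(), result['player_id'].tolist()))
expected = [(1, 15), (2, 35), (3, 40)]

print(result)
print('matches the expected output:', winners == expected)
```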

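7. Summary of key points

Using the rename API in pandas
Selecting specific columns in pandas
Vertically concatenating two tables in pandas
Using an inner join in pandas to merge two tables horizontally
Multi-column grouping and aggregation in pandas
Ranking with ties in pandas, the counterpart of SQL's RANK window function
Conditional filtering in pandas
Ranking without ties in pandas
Resetting the index in pandas
Using Python functions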
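
To make the two ranking items above concrete, the sketch below compares rank methods on the group 1 totals from the example. The variable names are chosen here for illustration only.

```python
import pandas as pd

# Total scores of players 15, 25, 30, 45 in group 1 of the example.
scores = pd.Series([3, 2, 3, 0])

# 'min': tied values share the lowest rank and the next rank is skipped (SQL RANK).
rank_min = scores.rank(method='min', ascending=False)

# 'dense': tied values share a rank and no rank is skipped (SQL DENSE_RANK).
rank_dense = scores.rank(method='dense', ascending=False)

# 'first': every row gets a distinct rank, so no ties remain (like SQL ROW_NUMBER).
rank_first = scores.rank(method='first', ascending=False)

print(rank_min.tolist())
print(rank_dense.tolist())
print(rank_first.tolist())
```

The solution above only needs method='min': the second rank is taken over player_id, which is unique, so it cannot produce ties in the first place.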