35. Pandas Practice for Data Analysis in Python: LeetCode 1082. Sales Analysis I

凡梦_leo

Published 2024-09-30 15:17:34

Article link: https://blog.csdn.net/qq_55006020/article/details/142657157

Learn: a first encounter with knowledge
Review: revisiting what you already know
Practice: putting knowledge to use

1. Original LeetCode problem link

1082. Sales Analysis I - LeetCode

2. Problem statement

Table: Product

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| product_id   | int     |
| product_name | varchar |
| unit_price   | int     |
+--------------+---------+
product_id is the primary key (a column with unique values) of this table.
Each row of this table shows the name and price of one product.

Table: Sales

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| seller_id   | int     |
| product_id  | int     |
| buyer_id    | int     |
| sale_date   | date    |
| quantity    | int     |
| price       | int     |
+-------------+---------+
This table can contain duplicate rows.
product_id is a foreign key (reference column) to the Product table.
Each row of this table contains some information about one sale.

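Write a solution that finds the seller(s) with the highest total sales amount. If there is a tie, report all of them.

Return the result table in any order.

The result format is shown in the following example.

Example 1:

Input:
Product table: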
+------------+--------------+------------+
| product_id | product_name | unit_price |
+------------+--------------+------------+
| 1          | S8           | 1000       |
| 2          | G4           | 800        |
| 3          | iPhone       | 1400       |
+------------+--------------+------------+
Sales table:
+-----------+------------+----------+------------+----------+-------+
| seller_id | product_id | buyer_id | sale_date  | quantity | price |
+-----------+------------+----------+------------+----------+-------+
| 1         | 1          | 1        | 2019-01-21 | 2        | 2000  |
| 1         | 2          | 2        | 2019-02-17 | 1        | 800   |
| 2         | 2          | 3        | 2019-06-02 | 1        | 800   |
| 3         | 3          | 4        | 2019-05-13 | 2        | 2800  |
+-----------+------------+----------+------------+----------+-------+
Output:
+-------------+
| seller_id   |
+-------------+
| 1           |
| 3           |
+-------------+
Explanation: Sellers with id 1 and 3 both have the highest total sales amount, 2800.

3. Table setup code

import pandas as pd
data = [[1, 'S8', 1000], [2, 'G4', 800], [3, 'iPhone', 1400]]
product = pd.DataFrame(data, columns=['product_id', 'product_name', 'unit_price']).astype({'product_id':'Int64', 'product_name':'object', 'unit_price':'Int64'})
data = [[1, 1, 1, '2019-01-21', 2, 2000], [1, 2, 2, '2019-02-17', 1, 800], [2, 2, 3, '2019-06-02', 1, 800], [3, 3, 4, '2019-05-13', 2, 2800]]
sales = pd.DataFrame(data, columns=['seller_id', 'product_id', 'buyer_id', 'sale_date', 'quantity', 'price']).astype({'seller_id':'Int64', 'product_id':'Int64', 'buyer_id':'Int64', 'sale_date':'datetime64[ns]', 'quantity':'Int64', 'price':'Int64'})

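4. Analysis

Work on the table directly: group and aggregate, then use rank to find every seller tied for first place.

Step 1: group by seller and sum the price column

Step 2: rank the summed prices so that sellers tied for first place share rank 1

Step 3: keep the rows where rn = 1, then select the required output column

Step 1: group by seller and sum the price column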

df = sales.groupby('seller_id')['price'].sum().reset_index()
df

Step 2: rank the summed prices so that sellers tied for first place share rank 1

df['rn'] = df['price'].rank(method='min', ascending=False)
df
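
To make the tie handling in this step concrete, here is a minimal sketch of how `rank` treats equal values. The totals are hypothetical values typed in by hand to mirror the sample data, and the `rn_dense` column is added only for comparison; it is not part of the original solution.

```python
import pandas as pd

# Hypothetical per-seller totals that mirror the sample data.
totals = pd.DataFrame({'seller_id': [1, 2, 3], 'price': [2800, 800, 2800]})

# method='min' gives every tied value the lowest rank in its group,
# so both 2800 rows get rank 1 and the 800 row gets rank 3.
totals['rn'] = totals['price'].rank(method='min', ascending=False)

# method='dense' also gives the tied rows rank 1, but leaves no gap
# afterwards, so the 800 row gets rank 2.
totals['rn_dense'] = totals['price'].rank(method='dense', ascending=False)

print(totals)
```

Either method works for this problem, since only the rows ranked first are kept.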

Step 3: keep the rows where rn = 1, then select the required output column

res = df[df['rn'] == 1].reset_index(drop=True)
res1 = res['seller_id'].to_frame()
res1

5. Pandas solution

import pandas as pd

def sales_analysis(product: pd.DataFrame, sales: pd.DataFrame) -> pd.DataFrame:
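    # Total sales amount per seller
    df = sales.groupby('seller_id')['price'].sum().reset_index()
    # Rank totals in descending order; tied totals share the lowest rank
    df['rn'] = df['price'].rank(method='min', ascending=False)
    # Keep every seller ranked first; drop=True avoids a leftover 'index' column
    res = df[df['rn'] == 1].reset_index(drop=True)
    # Convert the seller_id Series into a one-column DataFrame
    res1 = res['seller_id'].to_frame()
    return res1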
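
As a point of comparison, the same result can probably be obtained without `rank` by comparing each total against the maximum. This is an alternative sketch, not the approach used in this article, and the function name `sales_analysis_max` is made up for illustration.

```python
import pandas as pd

def sales_analysis_max(product: pd.DataFrame, sales: pd.DataFrame) -> pd.DataFrame:
    # Total price per seller, as a Series indexed by seller_id.
    totals = sales.groupby('seller_id')['price'].sum()
    # Keep the sellers whose total equals the maximum; ties all survive.
    top = totals[totals == totals.max()]
    # Move seller_id out of the index and return it as a one-column DataFrame.
    return top.reset_index()[['seller_id']]
```

The rank version generalizes more easily to "top N" questions, while the max comparison is shorter when only first place matters.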

6. Verification
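
One way to check the solution locally is to run it on the sample data from Example 1. The sketch below repeats the solution function so that it runs on its own, and it skips the `astype` conversions from the setup code for brevity, which is an assumption that should not affect the result here.

```python
import pandas as pd

def sales_analysis(product: pd.DataFrame, sales: pd.DataFrame) -> pd.DataFrame:
    df = sales.groupby('seller_id')['price'].sum().reset_index()
    df['rn'] = df['price'].rank(method='min', ascending=False)
    res = df[df['rn'] == 1].reset_index(drop=True)
    return res['seller_id'].to_frame()

# Sample data from Example 1.
product = pd.DataFrame(
    [[1, 'S8', 1000], [2, 'G4', 800], [3, 'iPhone', 1400]],
    columns=['product_id', 'product_name', 'unit_price'],
)
sales = pd.DataFrame(
    [[1, 1, 1, '2019-01-21', 2, 2000],
     [1, 2, 2, '2019-02-17', 1, 800],
     [2, 2, 3, '2019-06-02', 1, 800],
     [3, 3, 4, '2019-05-13', 2, 2800]],
    columns=['seller_id', 'product_id', 'buyer_id', 'sale_date', 'quantity', 'price'],
)

result = sales_analysis(product, sales)
print(result)
```

The printed frame should contain seller_id 1 and 3, matching the expected output in the problem statement.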

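7. Key takeaways

Grouped aggregation in Pandas. API: groupby(...).sum()
Resetting the index in Pandas. API: reset_index()
Finding ties for first place in Pandas, similar to rank in SQL: rank(method=..., ascending=...)
Converting a Pandas Series to a DataFrame: to_frame()
Defining and using functions in Python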
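
The Series and DataFrame conversions listed above can be seen side by side in a small sketch. The three-row `sales` frame here is made-up data for illustration only.

```python
import pandas as pd

# Made-up data for illustration.
sales = pd.DataFrame({'seller_id': [1, 1, 2], 'price': [2000, 800, 800]})

# Aggregating a single column after groupby returns a Series indexed by seller_id.
totals = sales.groupby('seller_id')['price'].sum()

# reset_index() moves seller_id back into a regular column, giving a DataFrame.
totals_df = totals.reset_index()

# Selecting one column yields a Series; to_frame() wraps it in a one-column DataFrame.
seller_frame = totals_df['seller_id'].to_frame()

print(type(totals).__name__, type(totals_df).__name__, type(seller_frame).__name__)
```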