2329. Product Sales Analysis V

SQL架构

Table: Sales

+-------------+-------+
| Column Name | Type  |
+-------------+-------+
| sale_id     | int   |
| product_id  | int   |
| user_id     | int   |
| quantity    | int   |
+-------------+-------+
sale_id is the primary key of this table.
product_id is a foreign key to Product table.
Each row of this table shows the ID of the product and the quantity purchased by a user.

Table: Product

+-------------+------+
| Column Name | Type |
+-------------+------+
| product_id  | int  |
| price       | int  |
+-------------+------+
product_id is the primary key of this table.
Each row of this table indicates the price of each product.

Write an SQL query that reports the spending of each user.

Return the resulting table ordered by spending in descending order. In case of a tie, order them by user_id.

The query result format is in the following example.

Example 1:

Input: 
Sales table:
+---------+------------+---------+----------+
| sale_id | product_id | user_id | quantity |
+---------+------------+---------+----------+
| 1       | 1          | 101     | 10       |
| 2       | 2          | 101     | 1        |
| 3       | 3          | 102     | 3        |
| 4       | 3          | 102     | 2        |
| 5       | 2          | 103     | 3        |
+---------+------------+---------+----------+
Product table:
+------------+-------+
| product_id | price |
+------------+-------+
| 1          | 10    |
| 2          | 25    |
| 3          | 15    |
+------------+-------+
Output: 
+---------+----------+
| user_id | spending |
+---------+----------+
| 101     | 125      |
| 102     | 75       |
| 103     | 75       |
+---------+----------+
Explanation: 
User 101 spent 10 * 10 + 1 * 25 = 125.
User 102 spent 3 * 15 + 2 * 15 = 75.
User 103 spent 3 * 25 = 75.
Users 102 and 103 spent the same amount and we break the tie by their ID while user 101 is on the top.
# Write your MySQL query statement below
select
s.user_id,sum(s.quantity * p.price) spending
from
Sales s left join Product  p 
using(product_id) 
group by s.user_id
order by spending desc, user_id

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
```scala import org.apache.spark.SparkContext import org.apache.spark.SparkConf import org.apache.spark.rdd.RDD import org.apache.spark.sql._ import org.apache.spark.sql.functions._ class Sale { } object Sale { case class Sale(cardId:String,productId:Int,quantity:Int,money:Double) def main(args: Array[String]): Unit = { val conf = new SparkConf().setAppName("Pharmacy Sales Analysis").setMaster("local") val sc = new SparkContext(conf) sc.setLogLevel("ERROR") val sqlContext = new SQLContext(sc) // load data val data = sc.textFile(args(0)) // parse data val salesRDD: RDD[Sale] = data.map(line => line.split(" ")).map(arr => Sale(arr(0), arr(1).toInt, arr(2).toInt, arr(3).toDouble)) import sqlContext.implicits._ val df = salesRDD.toDF() // select columns and show top 5 rows val nameMoneyDF = df.select("productId","money","quantity") nameMoneyDF.show(5) // group by productId and sum the quantity, then show top 5 products with highest sales val totalSales = nameMoneyDF.groupBy("productId").sum("quantity") val topSales = totalSales.select(totalSales("productId"),totalSales("sum(quantity)").as("Totalquantity")).orderBy(desc("Totalquantity")) topSales.show(5) // group by cardId and sum the quantity, then show top 5 customers with highest purchase quantity val totalbuy = df.groupBy("cardId").sum("quantity") val topbuy = totalbuy.select(totalbuy("cardId"),totalbuy("sum(quantity)").as("Totalquantity")).orderBy(desc("Totalquantity")) topbuy.show(5) // select rows for one specific product (productId = 230010) and show the result val Oneproduct:Dataset[Row]= topSales.select("productId","Totalquantity").filter(row=>row.getAs[String]("productId") == 230010) Oneproduct.show() sc.stop() } } ``` 注释如下: ```scala // 导入Spark相关的库 import org.apache.spark.SparkContext import org.apache.spark.SparkConf import org.apache.spark.rdd.RDD import org.apache.spark.sql._ import org.apache.spark.sql.functions._ // 定义Sale类 class Sale { } // Sale伴生对象 object Sale { // 定义Sale类的属性 case class Sale(cardId:String,productId:Int,quantity:Int,money:Double) def main(args: Array[String]): Unit = { // 设置Spark的配置信息,并创建SparkContext val conf = new SparkConf().setAppName("Pharmacy Sales Analysis").setMaster("local") val sc = new SparkContext(conf) sc.setLogLevel("ERROR") val sqlContext = new SQLContext(sc) // 加载数据 val data = sc.textFile(args(0)) // 解析数据并创建RDD val salesRDD: RDD[Sale] = data.map(line => line.split(" ")).map(arr => Sale(arr(0), arr(1).toInt, arr(2).toInt, arr(3).toDouble)) // 将RDD转换为DataFrame import sqlContext.implicits._ val df = salesRDD.toDF() // 选择需要的列并展示前5行 val nameMoneyDF = df.select("productId","money","quantity") nameMoneyDF.show(5) // 按productId分组并计算销售量,然后展示销售量最高的5个商品 val totalSales = nameMoneyDF.groupBy("productId").sum("quantity") val topSales = totalSales.select(totalSales("productId"),totalSales("sum(quantity)").as("Totalquantity")).orderBy(desc("Totalquantity")) topSales.show(5) // 按cardId分组并计算购买量,然后展示购买量最高的5个客户 val totalbuy = df.groupBy("cardId").sum("quantity") val topbuy = totalbuy.select(totalbuy("cardId"),totalbuy("sum(quantity)").as("Totalquantity")).orderBy(desc("Totalquantity")) topbuy.show(5) // 选择productId为230010的行并展示结果 val Oneproduct:Dataset[Row]= topSales.select("productId","Totalquantity").filter(row=>row.getAs[String]("productId") == 230010) Oneproduct.show() // 停止SparkContext sc.stop() } } ```

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值