一 返回的是一个table:
问题:find the average number of events for each day for each channel.
The first table will provide us the number of events for each day and channel, and then we will need to average these values together using a second query.
1 find the number of events that occur for each day for each channel
2 create a subquery that simply provides all of the data from your first query
3 find the average number of events for each channel. Since you broke out by day earlier, this is giving you an average per day.
SELECT channel, AVG(num_events) avg_events
FROM (SELECT DATE_PART('day', occurred_at) AS day, channel, COUNT(*) num_events
FROM web_events /*day 前面的 AS不能少*/
GROUP BY 1, 2) AS sub
GROUP BY 1
ORDER BY 2 DESC;
二 返回的是一个值:
问题:返回与首个订单相同月份的订单
SELECT *
FROM orders
WHERE DATE_TRUNC('month', occurred_at) =
(SELECT DATE_TRUNC('month', MIN(occurred_at))
FROM orders);
1 The average amount of standard/gloss/poster paper sold on the first month that any order was placed in the orders table (in terms of quantity).
SELECT AVG(standard_qty) avg_std, AVG(gloss_qty) avg_gls, AVG(poster_qty) avg_pst
FROM orders
WHERE DATE_TRUNC('month', occurred_at) =
(SELECT DATE_TRUNC('month', MIN(occurred_at)) FROM orders);
2 The total amount spent on all orders on the first month that any order was placed in the orderstable (in terms of usd).
SELECT SUM(total_amt_usd)
FROM orders
WHERE DATE_TRUNC('month', occurred_at) =
(SELECT DATE_TRUNC('month', MIN(occurred_at)) FROM orders);
解决方案:
- 提供每个区域拥有最高销售额 (total_amt_usd) 的销售代表的姓名。
首先,我要算出与每个销售代表相关的总销售额 (total_amt_usd),并且要得出他们所在的区域。以下查询提供了这一信息。
接着,得出每个区域的最高销售额,然后使用该信息从最终结果中获取这些行。SELECT s.name rep_name, r.name region_name, SUM(o.total_amt_usd) total_amt FROM sales_reps s JOIN accounts a ON a.sales_rep_id = s.id JOIN orders o ON o.account_id = a.id JOIN region r ON r.id = s.region_id GROUP BY 1,2 ORDER BY 3 DESC;
本质上,这是两个表格的连接,其中区域和销售额相匹配。SELECT region_name, MAX(total_amt) total_amt FROM(SELECT s.name rep_name, r.name region_name, SUM(o.total_amt_usd) total_amt FROM sales_reps s JOIN accounts a ON a.sales_rep_id = s.id JOIN orders o ON o.account_id = a.id JOIN region r ON r.id = s.region_id GROUP BY 1, 2) t1 GROUP BY 1;
SELECT t3.rep_name, t3.region_name, t3.total_amt FROM(SELECT region_name, MAX(total_amt) total_amt FROM(SELECT s.name rep_name, r.name region_name, SUM(o.total_amt_usd) total_amt FROM sales_reps s JOIN accounts a ON a.sales_rep_id = s.id JOIN orders o ON o.account_id = a.id JOIN region r ON r.id = s.region_id GROUP BY 1, 2) t1 GROUP BY 1) t2 JOIN (SELECT s.name rep_name, r.name region_name, SUM(o.total_amt_usd) total_amt FROM sales_reps s JOIN accounts a ON a.sales_rep_id = s.id JOIN orders o ON o.account_id = a.id JOIN region r ON r.id = s.region_id GROUP BY 1,2 ORDER BY 3 DESC) t3 ON t3.region_name = t2.region_name AND t3.total_amt = t2.total_amt;
- 对于具有最高销售额 (total_amt_usd) 的区域,总共下了多少个订单?
我写的第一个查询是获取每个区域的 total_amt_usd。
然后,我们仅从该表格中获取销售额最高的区域。可以通过两种方法来获取,一种是使用子查询后的最大值,另一种是按降序排序,然后获取最高值。SELECT r.name region_name, SUM(o.total_amt_usd) total_amt FROM sales_reps s JOIN accounts a ON a.sales_rep_id = s.id JOIN orders o ON o.account_id = a.id JOIN region r ON r.id = s.region_id GROUP BY r.name;
最终,我们要获取具有该区域销售额的总订单量:SELECT MAX(total_amt) FROM (SELECT r.name region_name, SUM(o.total_amt_usd) total_amt FROM sales_reps s JOIN accounts a ON a.sales_rep_id = s.id JOIN orders o ON o.account_id = a.id JOIN region r ON r.id = s.region_id GROUP BY r.name) sub;
结果就是 Northeast,总订单为 1230378 个。SELECT r.name, SUM(o.total) total_orders FROM sales_reps s JOIN accounts a ON a.sales_rep_id = s.id JOIN orders o ON o.account_id = a.id JOIN region r ON r.id = s.region_id GROUP BY r.name HAVING SUM(o.total_amt_usd) = ( SELECT MAX(total_amt) FROM (SELECT r.name region_name, SUM(o.total_amt_usd) total_amt FROM sales_reps s JOIN accounts a ON a.sales_rep_id = s.id JOIN orders o ON o.account_id = a.id JOIN region r ON r.id = s.region_id GROUP BY r.name) sub);
-
对于购买标准纸张数量 (standard_qty) 最多的客户(在作为客户的整个时期内),有多少客户的购买总数依然更多?(问题可以表述为:有多少客户的总消费额大于购买标准纸张数量最多的客户的总消费额?)
首先,我们要得出购买标准纸张数量 (standard_qty) 最多的客户。以下查询获取了该客户,以及总消费:SELECT a.name account_name, SUM(o.standard_qty) total_std, SUM(o.total) total FROM accounts a JOIN orders o ON o.account_id = a.id GROUP BY 1 ORDER BY 2 DESC LIMIT 1;
现在,我将使用上述信息获取总消费更高的所有客户:
SELECT a.name FROM orders o JOIN accounts a ON a.id = o.account_id GROUP BY 1 HAVING SUM(o.total) > (SELECT total FROM (SELECT a.name act_name, SUM(o.standard_qty) tot_std, SUM(o.total) total FROM accounts a JOIN orders o ON o.account_id = a.id GROUP BY 1 ORDER BY 2 DESC LIMIT 1) sub);
上述查询列出了具有更多订单的客户列表。我们还可以使用另一个简单的子查询获取数量。
SELECT COUNT(*) FROM (SELECT a.name FROM orders o JOIN accounts a ON a.id = o.account_id GROUP BY 1 HAVING SUM(o.total) > (SELECT total FROM (SELECT a.name act_name, SUM(o.standard_qty) tot_std, SUM(o.total) total FROM accounts a JOIN orders o ON o.account_id = a.id GROUP BY 1 ORDER BY 2 DESC LIMIT 1) inner_tab) ) counter_tab;
-
对于(在作为客户的整个时期内)总消费 (total_amt_usd) 最多的客户,他们在每个渠道上有多少 web_events?
我们首先需要获取在整个客户时期内消费最多的客户。SELECT a.id, a.name, SUM(o.total_amt_usd) tot_spent FROM orders o JOIN accounts a ON a.id = o.account_id GROUP BY a.id, a.name ORDER BY 3 DESC LIMIT 1;
现在,我们要获取该企业(可以使用 id 进行匹配)在每个渠道上的事件数。
SELECT a.name, w.channel, COUNT(*) FROM accounts a JOIN web_events w ON a.id = w.account_id AND a.id = (SELECT id FROM (SELECT a.id, a.name, SUM(o.total_amt_usd) tot_spent FROM orders o JOIN accounts a ON a.id = o.account_id GROUP BY a.id, a.name ORDER BY 3 DESC LIMIT 1) inner_table) GROUP BY 1, 2 ORDER BY 3 DESC;
我添加了 ORDER BY,其实并没特别的理由,并添加了客户名称,确保仅从一个客户那获取数据。
- 对于总消费前十名的客户,他们的平均消费 (total_amt_usd) 是多少?
首先,我们需要找出总消费 (total_amt_usd) 在前十名的客户。
现在计算这十个客户的平均消费。SELECT a.id, a.name, SUM(o.total_amt_usd) tot_spent FROM orders o JOIN accounts a ON a.id = o.account_id GROUP BY a.id, a.name ORDER BY 3 DESC LIMIT 10;
SELECT AVG(tot_spent) FROM (SELECT a.id, a.name, SUM(o.total_amt_usd) tot_spent FROM orders o JOIN accounts a ON a.id = o.account_id GROUP BY a.id, a.name ORDER BY 3 DESC LIMIT 10) temp;
- 比所有客户的平均消费高的企业平均终身消费 (total_amt_usd) 是多少?
首先,算出所有客户的总消费 (total_amt_usd) 平均值:
然后,只获取高于这一平均值的客户。SELECT AVG(o.total_amt_usd) avg_all FROM orders o JOIN accounts a ON a.id = o.account_id;
最后,算出这些值的平均值。SELECT o.account_id, AVG(o.total_amt_usd) FROM orders o GROUP BY 1 HAVING AVG(o.total_amt_usd) > (SELECT AVG(o.total_amt_usd) avg_all FROM orders o JOIN accounts a ON a.id = o.account_id);
SELECT AVG(avg_amt) FROM (SELECT o.account_id, AVG(o.total_amt_usd) avg_amt FROM orders o GROUP BY 1 HAVING AVG(o.total_amt_usd) > (SELECT AVG(o.total_amt_usd) avg_all FROM orders o JOIN accounts a ON a.id = o.account_id)) temp_table;