群成员表mysql,如何使用mysql创建每周同类群组分析表?

Let's say you have a user table that has at least the date the user signed up and an id.

Now let's say you have a separate table that tracks an action like a payment that can happen at any point in the user's lifetime. (Say like an in-app purchase.) In that table we track the userId, payment date, and an id for the payment.

So we have something that looks like this to get our schema set up:

CREATE TABLE users (

UserId INT,

AddedDate DATETIME

);

CREATE TABLE payments (

PaymentId INT,

UserId INT,

PaymentDate Datetime

);

Now you want a table that shows weekly cohorts. A table that looks something like this:

Week size w1 w2 w3 w4 w5 w6 w7

2017-08-28 1 0 0 0 1 0 0 0

2017-09-04 3 1 0 2 0 1 1 2

2017-09-11 2 0 0 1 0 0 0 1

2017-09-18 6 3 1 4 3 1 1 2

2017-09-25 2 1 1 1 0 1 2 0

2017-10-02 7 5 2 3 4 3 1 0

2017-10-09 7 4 5 1 2 5 0 0

2017-10-16 2 1 2 1 1 0 0 0

2017-10-23 7 5 4 4 3 0 0 0

2017-10-30 8 8 7 0 0 0 0 0

2017-11-06 5 5 2 0 0 0 0 0

So the first column has the week, the second has number of people that signed up that week. Say we look at week 2017-09-18. 6 people signed up that week. The 3 under the w1 column means that 3 people out of that 6 made a purchase the week they signed up. The 1 under w2 means 1 person out of that 6 made a purchase the second week they were signed up, and so on.

What query would I use to get a table that looks like that?

解决方案

This query is modified from the one I wrote here: Cohort analysis in SQL

Here's the final query:

SELECT

STR_TO_DATE(CONCAT(tb.cohort, ' Monday'), '%X-%V %W') as date,

size,

w1,

w2,

w3,

w4,

w5,

w6,

w7

FROM (

SELECT u.cohort,

IFNULL(SUM(s.Offset = 0), 0) w1,

IFNULL(SUM(s.Offset = 1), 0) w2,

IFNULL(SUM(s.Offset = 2), 0) w3,

IFNULL(SUM(s.Offset = 3), 0) w4,

IFNULL(SUM(s.Offset = 4), 0) w5,

IFNULL(SUM(s.Offset = 5), 0) w6,

IFNULL(SUM(s.Offset = 6), 0) w7

FROM (

SELECT

UserId,

DATE_FORMAT(AddedDate, "%Y-%u") AS cohort

FROM users

) as u

LEFT JOIN (

SELECT DISTINCT

payments.UserId,

FLOOR(DATEDIFF(payments.PaymentDate, users.AddedDate)/7) AS Offset

FROM payments

LEFT JOIN users ON (users.UserId = payments.UserId)

) as s ON s.UserId = u.UserId

GROUP BY u.cohort

) as tb

LEFT JOIN (

SELECT DATE_FORMAT(AddedDate, "%Y-%u") dt, COUNT(*) size FROM users GROUP BY dt

) size ON tb.cohort = size.dt

So the core of this is we grab the users and the date they signed up and format the date by year-week number, since we are doing a weekly cohort.

SELECT

UserId,

DATE_FORMAT(AddedDate, "%Y-%u") AS cohort

FROM users

Since we want to group by the cohort we have to put this in a subquery in the FROM part of the query.

Then we want join the payment information on the users.

SELECT DISTINCT

payments.UserId,

FLOOR(DATEDIFF(payments.PaymentDate, users.AddedDate)/7) AS Offset

FROM payments

LEFT JOIN users ON (users.UserId = payments.UserId)

This will get unique weekly payment events per user by the numbers of weeks they have been a user. We use distinct because if a user made 2 purchase in one week, we don't want to count that as two users.

We don't just use the payments table, because some users may sign up and not have payments. So we select from the users table and join on the payments table.

You then group by the week - u.cohort. Then you aggregate on the week numbers to find out how many people made payments the weeks after they signed up.

The version of mysql I used had sql_mode set to only_full_group_by. So to get the cohort size I put the bulk of the query in subquery so I could join on the users to get the size of the cohort.

Further considerations:

Filter by weeks is simple. tb.cohort > start date and tb.cohort < end date where start and end date are formatted with "%Y-%u". To make the query more efficient you'll probably want to filter out payment events that don't fall within the date range as well so you're not joining on data you don't need.

You may want to consider using a calender table to cover cases where there are no user sign ups during the week.

Here's a fiddle with everything working: http://sqlfiddle.com/#!9/172dbe/1

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值