MySQL
Lets say there is a credit card processing company. Every time a credit card is used a row gets inserted into a table.
create table tran(
id int,
tran_dt datetime,
card_id int,
merchant_id int,
amount int
);
One wants to know what cards have been used 3+ times in any 15 minute window at the same merchant.
My attempt:
select card_id, date(tran_dt), hour(tran_dt), merchant_id, count(*)
from tran
group by card_id, date(tran_dt), hour(tran_dt), merchant_id
having count(*)>=3
The first problem is that would give excessive transactions per hour, not per a 15 minute window. The second problem is that would not catch transactions that cross the hour mark ie at 1:59pm and 2:01pm.
To make this simpler, it would ok to split up the hour into 5 minute increments. So we would not have to check 1:00-1:15pm, 1:01-1:16pm, etc. It would be ok to check 1:00-1:15pm, 1:05-1:20pm, etc., if that is easier.
Any ideas how to fix the sql? I have a feeling maybe I need sql window functions, that are not yet available in MySQL. Or write a stored procedure that can look at each 15 block.
解决方案
You can convert the date/time to seconds and do arithmetic on the seconds to get the value within a 15 minute clock interval:
select card_id, min(date(tran_dt)) as first_charge_time, merchant_id, count(*)
from tran
group by card_id, floor(to_seconds(tran_dt) / (60 * 15)), merchant_id
having count(*) >= 3;
The above uses to_seconds(). In earlier versions of MySQL, you can use unix_timestamp().
Getting any 15 minute interval is more challenging. You can express the query as:
select t1.*, count(*) as numTransactions
from tran t1 join
tran t2
on t1.merchant_id = t2.merchanti_d and
t1.card_id = t2.card_id and
t2.tran_dt >= t1.tran_dt and
t2.tran_dt < t1.tran_dt + interval 15 minute
group by t1.id
having numTransactions >= 3;
Performance of this query might be problematic. An index on trans(card_id, merchant_id, tran_dt) should help a lot.