MySQL GROUP BY DateTime +/- 3 seconds

Suppose I have a table with 3 columns:

id (PK, int)

timestamp (datetime)

title (text)

I have the following records:

1, 2010-01-01 15:00:00, Some Title

2, 2010-01-01 15:00:02, Some Title

3, 2010-01-02 15:00:00, Some Title

I need to do a GROUP BY records that are within 3 seconds of each other. For this table, rows 1 and 2 would be grouped together.

There is a similar question here: Mysql DateTime group by 15 mins

I don't know how to convert these methods into something that will work for seconds. The trouble with the method on the SO question is that it seems to me that it would only work for records falling within a bin of time that starts at a known point. For instance, if I were to get FLOOR() to work with seconds, at an interval of 5 seconds, a time of 15:00:04 would be grouped with 15:00:01, but not grouped with 15:00:06.
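To make the fixed-bin problem concrete, here is a small Python sketch. The helper `bin_index` is hypothetical and simply mimics a FLOOR()-style bucket over seconds since midnight; it is not taken from the linked question.

```python
from datetime import datetime

def bin_index(ts: datetime, width_seconds: int = 5) -> int:
    """Hypothetical helper: fixed-width bucket number, counted from midnight."""
    seconds_since_midnight = ts.hour * 3600 + ts.minute * 60 + ts.second
    return seconds_since_midnight // width_seconds

t1 = datetime(2010, 1, 1, 15, 0, 1)
t4 = datetime(2010, 1, 1, 15, 0, 4)
t6 = datetime(2010, 1, 1, 15, 0, 6)

# 15:00:01 and 15:00:04 share a bucket, but 15:00:04 and 15:00:06 do not,
# even though the second pair is only 2 seconds apart.
print(bin_index(t1) == bin_index(t4))  # True
print(bin_index(t4) == bin_index(t6))  # False
```

The bucket boundaries are fixed in advance, so two rows that sit on either side of a boundary are split no matter how close together they are.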

Does this make sense? Please let me know if further clarification is needed.

EDIT: For the set of numbers {1, 2, 3, 4, 5, 6, 7, 50, 51, 60}, it seems it might be best to group them as {1, 2, 3, 4, 5, 6, 7}, {50, 51}, {60}, so that each row joins a group depending on whether it is within 3 seconds of the previous row. I know this changes things a bit; I'm sorry for being wishy-washy on this.
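The grouping rule in the EDIT can be sketched outside of SQL as follows. This is only an illustration of the intended result; `group_by_gap` and `max_gap` are names made up for this example.

```python
def group_by_gap(values, max_gap=3):
    """Start a new group whenever the gap to the previous value exceeds max_gap."""
    groups = []
    for v in sorted(values):
        if groups and v - groups[-1][-1] <= max_gap:
            groups[-1].append(v)   # close enough to the previous value: same chain
        else:
            groups.append([v])     # gap too large (or first value): new chain
    return groups

chains = group_by_gap([1, 2, 3, 4, 5, 6, 7, 50, 51, 60])
print(chains)  # [[1, 2, 3, 4, 5, 6, 7], [50, 51], [60]]
```

Note that a chain can span more than 3 seconds in total (1 through 7 here), because each element is only compared with its immediate predecessor.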

I am trying to fuzzy-match logs from different servers. Server #1 may log an item, "Item #1", and Server #2 will log that same item, "Item #1", within a few seconds of server #1. I need to do some aggregate functions on both log lines. Unfortunately, I only have title to go on, due to the nature of the server software.

Solution

I'm using Tom H.'s excellent idea but doing it a little differently here:

Instead of finding all the rows that are the beginnings of chains, we can find all times that are the beginnings of chains, then go back and find the rows that match those times.

Query #1 here should tell you which times are the beginnings of chains, by finding the times that have no earlier time within the preceding 3 seconds:

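-- `Table` is a placeholder for your table name; it is quoted because TABLE is a reserved word
SELECT DISTINCT a.Timestamp
FROM `Table` a
LEFT JOIN `Table` b
ON (b.Timestamp >= a.Timestamp - INTERVAL 3 SECOND
AND b.Timestamp < a.Timestamp)
WHERE b.Timestamp IS NULL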

And then for each row, we can find the largest chain-starting timestamp that is less than or equal to that row's timestamp with Query #2:

SELECT `Table`.id, MAX(StartOfChains.Timestamp) AS ChainStartTime
FROM `Table`
JOIN ([query #1]) StartOfChains
ON `Table`.Timestamp >= StartOfChains.Timestamp
GROUP BY `Table`.id

Once we have that, we can GROUP BY it as you wanted.

SELECT COUNT(*) -- or whatever aggregate you need
FROM `Table`
JOIN ([query #2]) GroupingQuery
ON `Table`.id = GroupingQuery.id
GROUP BY GroupingQuery.ChainStartTime
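The three queries can be tried end to end with the sketch below. It makes several assumptions: it uses SQLite through Python's standard `sqlite3` module rather than MySQL, so `datetime(ts, '-3 seconds')` stands in for MySQL's interval arithmetic; the table name `logs` and the column name `ts` are hypothetical; and the data is the three sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, ts TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?)",
    [
        (1, "2010-01-01 15:00:00", "Some Title"),
        (2, "2010-01-01 15:00:02", "Some Title"),
        (3, "2010-01-02 15:00:00", "Some Title"),
    ],
)

# Query #1: timestamps with no earlier timestamp in the preceding 3 seconds.
chain_starts = """
    SELECT DISTINCT a.ts AS ts
    FROM logs a
    LEFT JOIN logs b
      ON b.ts >= datetime(a.ts, '-3 seconds') AND b.ts < a.ts
    WHERE b.ts IS NULL
"""

# Query #2: for each row, the latest chain start at or before its timestamp.
grouping = f"""
    SELECT logs.id AS id, MAX(s.ts) AS chain_start
    FROM logs
    JOIN ({chain_starts}) s ON logs.ts >= s.ts
    GROUP BY logs.id
"""

# Query #3: aggregate per chain.
final = f"""
    SELECT g.chain_start, COUNT(*)
    FROM logs
    JOIN ({grouping}) g ON logs.id = g.id
    GROUP BY g.chain_start
    ORDER BY g.chain_start
"""

rows = conn.execute(final).fetchall()
print(rows)  # [('2010-01-01 15:00:00', 2), ('2010-01-02 15:00:00', 1)]
```

Rows 1 and 2 end up in the chain starting at 15:00:00 on January 1, and row 3 forms its own chain. For the fuzzy log-matching case, the title would presumably also need to be part of the join and grouping conditions, which this sketch leaves out.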

I'm not entirely sure this is distinct enough from Tom H's answer to be posted separately, but it sounded like you were having trouble with implementation, and I was thinking about it, so I thought I'd post again. Good luck!
