mysql优化分组查询,如何优化MySQL查询(分组和顺序)

Hey all, I've got a query in need of optimizing. It works but its a dog, performance wise.

It reads like this:

SELECT *

FROM (

SELECT *

FROM views

WHERE user_id = '1'

ORDER BY

page DESC

) v

GROUP BY

v.session

I'm tracking views to different pages, and I want to know the highest page per session, in order to know how far they've clicked through (they're required to view every page all the way to the end) in any given session.

Basically what I'm trying to do is ORDER the results before the GROUP. Which the above achieves, at significant cost.

Anyone who can slap me over the head with how to do this? Thanks guys!

Update:

The Explain:

"1" "PRIMARY" "" "ALL" \N \N \N \N "3545" "Using temporary; Using filesort"

"2" "DERIVED" "views" "index" \N "page" "5" \N "196168" "Using where"

The schema:

ID int(8) unsigned (NULL) NO PRI (NULL) auto_increment select,insert,update,references

page int(8) (NULL) YES MUL (NULL) select,insert,update,references

user_id int(8) (NULL) YES (NULL) select,insert,update,references

session int(8) (NULL) YES (NULL) select,insert,update,references

created datetime (NULL) NO select,insert,update,references

Index Info:

views 0 PRIMARY 1 ID A 196008 (NULL) (NULL) BTREE

views 1 page 1 page A 259 (NULL) (NULL) YES BTREE

解决方案I'm tracking views to different pages, and I want to know the highest page per session, in order to know how far they've clicked through (they're required to view every page all the way to the end) in any given session.

Ordering before grouping is a highly unreliable way to do this.

MySQL extends GROUP BY syntax: you can use ungrouped and unaggregated fields in SELECT and ORDER BY clauses.

In this case, a random value of page is output per each session.

Documentation explicitly states that you should never make any assumptions on which value exactly will it be:

Do not use this feature if the columns you omit from the GROUP BY part are not constant in the group. The server is free to return any value from the group, so the results are indeterminate unless all values are the same.

However, in practice, the values from the first row scanned are returned.

Since you are using an ORDER BY page DESC in your subquery, this row happens to be the rows with a maximal page per session.

You shouldn't rely on it, since this behaviour is undocumented and if some other row will be returned in next version, it will not be considered a bug.

But you don't even have to do such nasty tricks.

Just use aggregate functions:

SELECT MAX(page)

FROM views

WHERE user_id = '1'

GROUP BY

session

This is documented and clean way to do what you want.

Create a composite index on (user_id, session, page) for the query to run faster.

If you need all columns from your table, not only the aggregated ones, use this syntax:

SELECT v.*

FROM (

SELECT DISTINCT user_id, session

FROM views

) vo

JOIN views v

ON v.id =

(

SELECT id

FROM views vi

WHERE vi.user_id = vo.user_id

AND vi.session = vo.session

ORDER BY

page DESC

LIMIT 1

)

This assumes that id is a PRIMARY KEY on views.

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值