mysql 成绩排名 字段,提前查询。在mysql中排名最相关的字段

Let's assume we have a database like this:

Project_tbl:

-----------------

id | Project_name

-----------------

1 | A

2 | B

3 | C

-----------------

personel_project_tbl:

--------------------

user_id | Project_id

--------------------

1 | 1

2 | 2

3 | 1

3 | 2

2 | 3

--------------------

instrument_project_tbl:

--------------------------

instrument_id | Project_id

--------------------------

1 | 1

1 | 2

2 | 2

2 | 1

1 | 3

--------------------------

Now, I need to sort the list of projects and rank them with regard to their similarity to the project A.

For example:

A and B have 1 users in common over the 3 users and 2 instruments over the 2 instrument so their similarity ranking is (1/2 + 2/2) / 2 = 75%

A and C have no user in common but have 1 over 2 instruments so it will be (1/2)/2 = 25%

So B is more similar than be and output should be

--------------

Project | Rank

--------------

2 | 75

3 | 25

That's the first solution came to my mind...

If I did it in PHP and MySQL, it would be something like:

for all tables as table_x

for all projects (except A) as prj_y

unique = (Select distinct count(items) from table_x where project is A)

count += (Select distinct count(items) from table_x

where project is prj_x and items are in

(select distinct items from table_x where project is a)

)/unique

So the complexity would be O(n2) and with indexing the select also would cost O(log n) which wouldn't be affordable.

Do you have any idea to do it totally in MySQL or do it in a better and faster way?

******** More information and notes:**

I'm limited to PHP and MySQL.

This is just an example, in my real project the tables are more than 20 tables so the solution should have high performance.

this question is the supplementary question for this one : Get the most repeated similar fields in MySQL database if yr solution can be used or applied in a way for both of them (somehow) It would be more than great.

I want to multiply the value of related projects with the similarity of items to get the best option...

In conclusion, these two questions will : get the most related projects, get the similar items of all projects and find the most similar item for current project where the project is also similar to the current one! yo

Thanks for your intellectual answers, its really appreciated if you could shed some light on the situations

解决方案

You could do it this way:

SET @Aid = (SELECT id

FROM Project_tbl

WHERE Project_name = 'A');

SELECT P.id

, (IFNULL(personel.prop, 0) +

IFNULL(instrument.prop, 0)

)/2*100 Rank

, personel.prop AS personell

, instrument.prop AS instrument

FROM Project_tbl P

LEFT JOIN

( SELECT B.Project_id pid, COUNT(*)/C.ref prop

FROM personel_project_tbl A,

personel_project_tbl B,

(SELECT COUNT(*) AS ref

FROM personel_project_tbl

WHERE Project_id = @Aid

) AS C

WHERE A.user_id = B.user_id

AND A.Project_id = @Aid

GROUP BY B.Project_id

) personel ON P.id = personel.pid

LEFT JOIN

( SELECT B.Project_id pid, COUNT(*)/C.ref prop

FROM instrument_project_tbl A,

instrument_project_tbl B,

(SELECT COUNT(*) AS ref

FROM instrument_project_tbl

WHERE Project_id = @Aid

) AS C

WHERE A.instrument_id = B.instrument_id

AND A.Project_id = @Aid

GROUP BY B.Project_id

) instrument ON P.id = instrument.pid

WHERE P.id <> @Aid

ORDER BY Rank DESC

The idea is to have one subquery for each table, and each of these subqueries maps project id to correspondence ratio for a given table.

I'm saying nothing at all about performance. You'll have to try and see whether it is fast enough for your needs, but as I see it there is no way to beat the O(n2) complexity you mention, as you have to inspect all the data.

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值