sphinx mysql连表查询,Sphinx vs. MySql - 搜索朋友列表(效率/速度)

I'm porting my application searches over to Sphinx from MySQL and am having a hard time figuring this one out, or if it even needs to be ported at all (I really want to know if it's worth using sphinx for this specific case for efficiency/speed):

users

uid uname

1 alex

2 barry

3 david

friends

uid | fid

1 2

2 1

1 3

3 1

Details are:

- InnoDB

- users: index on uid, index on uname

- friends: combined index on uid,fid

Normally, to search all of alex's friends with mysql:

$uid = 1

$searchstr = "%$friendSearch%";

$query = "SELECT f.fid, u.uname FROM friends f

JOIN users u ON f.fid=u.uid

WHERE f.uid=:uid AND u.uname LIKE :friendSearch";

$friends = $dbh->prepare($query);

$friends->bindParam(':uid', $uid, PDO::PARAM_INT);

$friends->bindParam(':friendSearch', $searchstr, PDO::PARAM_STR);

$friends->execute();

Is it any more efficient to find alex's friends with sphinx vs mysql or would that be an overkill?

If sphinx would be faster for this as the list hits thousands of people,

what would the indexing query look like? How would I delete a friendship that no longer exists with sphinx as well, can I have a detailed example in this case? Should I change this query to use Sphinx?

解决方案

Ok this is how I see this working.

I have the exact same problem with MongoDB. MongoDB "offers" searching capabilities but just like MySQL you should never use them unless you wanna be choked with IO, CPU and memory problems and be forced to use a lot more servers to cope with your index than you normally would.

The whole idea if using Sphinx (or another search tech) is to lower cost per server by having a performant index searcher.

Sphinx however is not a storage engine. It is not as simple to query exact relationships across tables, they have remmedied this a little with SphinxQL but due to the nature of the full text index it still doesn't do an integral join like you would get in MySQL.

Instead I would store the relationships within MySQL but have an index of "users" within Sphinx.

In my website I personally have 2 indexes:

main (houses users,videos,channels and playlists)

help (help system search)

These are delta updated once every minute. Since realtime indexes are still bit experimental at times and I personally have seen problems with high insertion/deletion rates I keep to delta updates. So I would use a delta index to update the main searchable objects of my site since this is less resource intensive and more performant than realtime indexes (from my own tests).

Do note inorder to process deletions and what not your Sphinx collection through delta you will need a killlist and certain filters for your delta index. Here is an example from my index:

source main_delta : main

{

sql_query_pre = SET NAMES utf8

sql_query_pre =

sql_query = \

SELECT id, deleted, _id, uid, listing, title, description, category, tags, author_name, duration, rating, views, type, adult, videos, UNIX_TIMESTAMP(date_uploaded) AS date_uploaded \

FROM documents \

WHERE id>( SELECT max_doc_id FROM sph_counter WHERE counter_id=1 ) OR update_time >( SELECT last_index_time FROM sph_counter WHERE counter_id=1 )

sql_query_killlist = SELECT id FROM documents WHERE update_time>=( SELECT last_index_time FROM sph_counter WHERE counter_id=1 ) OR deleted = 1

}

This processes deletions and additions once every minute which is pretty much realtime for a real web app.

So now we know how to store our indexes. I need to talk about the relationships. Sphinx (even though it has SphinxQL) won't do integral joins across data so I would personally recommend doing the relationship outside of Sphinx, not only that but as I said this relationship table will get high load so this is something that could impact the Sphinx index.

I would do a query to pick out all ids and using that set of ids use the "filter" method on the sphinx API to filter the main index down to specific document ids. Once this is done you can search in Sphinx as normal. This is the most performant method I have found to date of dealing with this.

The key thing to remember at all times is that Sphinx is a search tech while MySQL is a storage tech. Keep that in mind and you should be ok.

Edit

As @N.B said (which I overlooked in my answer) Sphinx does have SphinxSE. Although primative and still in sort of testing stage of its development (same as realtime indexes) it does provide an actual MyISAM/InnoDB type storage to Sphinx. This is awesome. However there are caveats (as with anything):

The language is primative

The joins are primative

However it can/could do the job your looking for so be sure to look into it.

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值