Looking up synonyms in MySQL: the best way to store and retrieve synonyms in a MySQL database

I am building a synonym list that I will store in a database and look up before running a full-text search.

When a user enters a word, such as: word1

I need to look this word up in my synonyms table. If the word is found, I SELECT all of its synonyms and use them in the full-text search in the next query, which I construct like this:

MATCH (columnname) AGAINST ('word1a word1b word1c' IN BOOLEAN MODE)
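To make the query construction concrete, here is a minimal sketch in Python of how the synonym list could be turned into the single search string that AGAINST expects. The helper name `build_boolean_query` and the table name `articles` are made up for illustration, and the `%s` placeholder assumes a MySQL driver such as PyMySQL or mysql-connector-python. Words without operators are optional terms in boolean mode, so a row matches if it contains any of them.

```python
import re

def build_boolean_query(words):
    """Join synonyms into one search string for MATCH ... AGAINST.

    Hypothetical helper: with no operators, each term is optional in
    boolean mode, so a row matches when it contains any synonym.
    """
    cleaned = []
    for w in words:
        # Replace characters that act as boolean-mode operators
        # (+ - > < ( ) ~ * " @) so a stored word cannot change the query.
        w = re.sub(r'[+\-><()~*"@]', " ", w).strip()
        if w:
            cleaned.append(w)
    return " ".join(cleaned)

# The search string is passed as a bound parameter, not pasted into the SQL.
# "articles" is an assumed table name.
sql = ("SELECT * FROM articles "
       "WHERE MATCH (columnname) AGAINST (%s IN BOOLEAN MODE)")
params = (build_boolean_query(["word1a", "word1b", "word1c"]),)
print(sql, params)
```

Passing the string as a parameter keeps quoting out of your hands, which matters once the synonyms come from a table that an admin can edit.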

So how do I store the synonyms in a table? I found 2 choices:

using key and word columns like

val | keyword
----+--------
1   | word1a
1   | word1b
1   | word1c
2   | word2a
2   | word2b
3   | word3a

etc.

Then I can find an exact match for the entered word in one query and get its ID. In the next SELECT I get all the words with that ID and concatenate them by looping over the recordset in the server-side language. I can then construct the real search against the main table where I need to look for the words.
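The two lookups described above can also be folded into one statement by joining the table to itself on the group ID. The sketch below uses Python's built-in sqlite3 module only so that it runs without a server; the SELECT itself is plain SQL that should work the same way in MySQL. The table name `synonyms` and the function name `synonyms_of` are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE synonyms (val INTEGER NOT NULL, keyword TEXT NOT NULL)")
# Index both columns: keyword for the exact-match lookup, val for the join.
conn.execute("CREATE INDEX idx_keyword ON synonyms (keyword)")
conn.execute("CREATE INDEX idx_val ON synonyms (val)")
conn.executemany(
    "INSERT INTO synonyms (val, keyword) VALUES (?, ?)",
    [(1, "word1a"), (1, "word1b"), (1, "word1c"),
     (2, "word2a"), (2, "word2b"), (3, "word3a")],
)

def synonyms_of(conn, word):
    """Return every keyword sharing a group with `word`, including `word`."""
    rows = conn.execute(
        "SELECT s2.keyword "
        "FROM synonyms AS s1 "
        "JOIN synonyms AS s2 ON s2.val = s1.val "  # same group as the hit
        "WHERE s1.keyword = ? "                    # exact match, index-friendly
        "ORDER BY s2.keyword",
        (word,),
    ).fetchall()
    return [r[0] for r in rows]

print(synonyms_of(conn, "word1b"))
```

An unknown word gives an empty list, so the caller can fall back to searching for the entered word alone.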

using only word columns like

word1a|word1b|word1c

word2a|word2b|word2c

word3a

Now I do a SELECT to check whether my word is inside any record. If it is, I fetch the whole record and explode it at |, and I have my words again, ready to use.

This second approach looks easier to maintain for whoever builds the synonym database, but I see 2 problems:

a) How do I find out in MySQL whether a word is inside the string? I cannot use a plain LIKE '%word1a%', because synonyms can look very much alike: word1a could be "strawberry", "strawberries" could be birds, and word2a could be "berry". Obviously I need an exact match, so how can a LIKE statement match a whole word exactly inside a string?

b) I see a speed problem: I guess LIKE would take more MySQL time than the "=" of the first approach, where I match a word exactly. On the other hand, the first option needs 2 statements, one to get the ID of the word and a second to get all the words with this ID.
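For problem a), one common trick is to wrap both the stored list and the search term in the delimiter, so that only whole entries can match. The sketch below shows the false positive and the fix side by side. It runs on Python's sqlite3 with invented sample rows; the table name `synonym_lists` and both function names are assumptions. SQLite concatenates with `||`, whereas in MySQL the same expression would normally be written `CONCAT('|', words, '|')`, because `||` means OR there by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE synonym_lists (words TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO synonym_lists (words) VALUES (?)",
    [("strawberry|strawberries",), ("berry|fruit",)],  # invented sample data
)

def naive_like(conn, word):
    """Substring match: also hits rows where `word` is part of a longer word."""
    rows = conn.execute(
        "SELECT words FROM synonym_lists WHERE words LIKE ?",
        (f"%{word}%",),
    ).fetchall()
    return [r[0] for r in rows]

def exact_in_list(conn, word):
    """Whole-entry match: delimiters on both sides rule out partial words.

    A `word` containing % or _ would still need escaping for LIKE.
    """
    rows = conn.execute(
        "SELECT words FROM synonym_lists WHERE '|' || words || '|' LIKE ?",
        (f"%|{word}|%",),
    ).fetchall()
    return [r[0].split("|") for r in rows]

print(len(naive_like(conn, "berry")))   # counts the strawberry row as well
print(exact_in_list(conn, "berry"))
```

This does nothing for problem b): a pattern with a leading wildcard cannot use an ordinary index, so every row is scanned on every lookup.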

How would you solve this problem, or rather this dilemma of which approach to take? Is there a third way I don't see that is easy for an admin to add and edit synonyms and at the same time fast? OK, I know there is usually no single best way ;-)

UPDATE: The suggestion to use two tables, one for the master word and a second for its synonyms, will not work in my case, because there is no MASTER word that the user types into the search field. He can type any of the synonyms, so I am still wondering how to set up these tables: I have no master words whose IDs would go in one table, with the synonyms carrying the master's ID in a second table.

Solution

Don't use a single string to store several entries.

In other words: Build a word table (word_ID,word) and a synonym table (word_ID,synonym_ID) then add the word to the word table and one entry per synonym to the synonyms table.

UPDATE (added 3rd synonym)

Your word table must contain every word (ALL of them); your synonym table only holds pointers to synonyms (not a single word!).

If you had three words, A, B and C, that are synonyms, your DB would be

WORD_TABLE      SYNONYM_TABLE
ID | WORD       W_ID | S_ID
---+-----       -----+-----
 1 | A             1 | 2
 2 | B             2 | 1
 3 | C             1 | 3
                   3 | 1
                   2 | 3
                   3 | 2

Don't be afraid of the many entries in the SYNONYM_TABLE, they will be managed by the computer and are needed to reflect the existing relations between the words.
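Here is a rough sketch of how the computer can manage those entries, again using Python's sqlite3 so it runs as-is. The column names follow the tables above; the function names `add_synonym_group` and `synonyms_of` are invented for the example. `INSERT OR IGNORE` is SQLite syntax, and MySQL spells it `INSERT IGNORE`. The admin only supplies a group of words, and the code writes every ordered pair, so no master word is needed and any word in the group finds the others.

```python
import sqlite3
from itertools import permutations

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE word_table (id INTEGER PRIMARY KEY, word TEXT NOT NULL UNIQUE)"
)
conn.execute(
    "CREATE TABLE synonym_table ("
    " w_id INTEGER NOT NULL REFERENCES word_table (id),"
    " s_id INTEGER NOT NULL REFERENCES word_table (id),"
    " PRIMARY KEY (w_id, s_id))"
)

def add_synonym_group(conn, words):
    """Store each word once, then link every word to every other word."""
    ids = []
    for w in words:
        conn.execute("INSERT OR IGNORE INTO word_table (word) VALUES (?)", (w,))
        row = conn.execute("SELECT id FROM word_table WHERE word = ?", (w,)).fetchone()
        ids.append(row[0])
    # permutations(ids, 2) yields both (a, b) and (b, a) for each pair.
    conn.executemany(
        "INSERT OR IGNORE INTO synonym_table (w_id, s_id) VALUES (?, ?)",
        permutations(ids, 2),
    )

def synonyms_of(conn, word):
    """Look up the synonyms of any word in a single query."""
    rows = conn.execute(
        "SELECT s.word "
        "FROM word_table AS w "
        "JOIN synonym_table AS st ON st.w_id = w.id "
        "JOIN word_table AS s ON s.id = st.s_id "
        "WHERE w.word = ? ORDER BY s.word",
        (word,),
    ).fetchall()
    return [r[0] for r in rows]

add_synonym_group(conn, ["A", "B", "C"])
print(synonyms_of(conn, "B"))
```

The result leaves out the searched word itself, so add it back before building the full-text query.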

2nd approach

You might also be tempted (I don't think you should!) to go with one table that has separate fields for the word and a list of synonyms (or IDs): (word_id, word, synonym_list). Beware that this runs contrary to the way a relational DB works (one field, one fact).
