Sphinx-for-chinese (Chinese full-text search) installation steps

可克

2010-02-26 10:47:29 | Category: search engines

Prerequisite: MySQL must already be installed on your machine. If it is not, install it with the following command:

sudo apt-get install mysql-client-5.0 mysql-server-5.0

1. Download the required packages
    sphinx-for-chinese-0.9.9-r2117.tar.gz
    xdict_1.1.tar.gz
    Download page: http://code.google.com/p/sphinx-for-chinese/downloads/list

2. Extract the source archive
$ tar -zxvf sphinx-for-chinese-0.9.9-r2117.tar.gz

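3. Compile and install. The later steps use paths under /usr/local/sphinx, so pass that prefix to configure.
$ cd sphinx-for-chinese-0.9.9-r2117
$ ./configure --prefix=/usr/local/sphinx
$ make
$ sudo make install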
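
After the install, it can be worth confirming that the tools used in the later steps actually landed under the prefix. The following is a minimal sketch, not part of Sphinx: check_sphinx_tools is a hypothetical helper name, and the default prefix /usr/local/sphinx is an assumption taken from the paths used later in this post.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Sphinx): report which of the
# expected tools exist and are executable under an install prefix.
# Assumption: the prefix defaults to /usr/local/sphinx.
check_sphinx_tools() {
    prefix="${1:-/usr/local/sphinx}"
    missing=0
    for tool in indexer searchd search mkdict; do
        if [ -x "$prefix/bin/$tool" ]; then
            echo "ok       $prefix/bin/$tool"
        else
            echo "missing  $prefix/bin/$tool"
            missing=1
        fi
    done
    # Exit status is 0 only when every tool was found.
    return "$missing"
}
```

Calling check_sphinx_tools with no argument checks /usr/local/sphinx; pass a different prefix if you configured the build differently.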

4. Create the test database and a sphinx user
mysql> create database test;
mysql> create user 'sphinx'@'localhost' identified by 'sphinx';
mysql> grant all privileges on test.* to 'sphinx'@'localhost';

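5. Create the Sphinx configuration file from the sample shipped with the package. It lives in the etc directory under your install prefix (/usr/local/etc if you ran configure without --prefix).
$ cd /usr/local/sphinx/etc
$ sudo cp sphinx.conf.dist sphinx.conf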

6. Edit the configuration file
$ sudo vi sphinx.conf
Change the MySQL settings in the source section as follows:
sql_host        = localhost
sql_user        = sphinx
sql_pass        = sphinx
sql_db          = test
sql_port        = 3306 # optional, default is 3306
Note: sql_user, sql_pass, and sql_db must match the account and database created in step 4.
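
To double-check the values after editing, a small reader for simple "key = value" lines can help. This is only a sketch: conf_get is a hypothetical helper, it ignores multi-line values such as sql_query, and it treats everything after a # as a comment.

```shell
#!/bin/sh
# conf_get KEY FILE
# Hypothetical helper: print the value of the first uncommented
# "KEY = value" line in a sphinx.conf-style file.
conf_get() {
    awk -v key="$1" '
        /^[ \t]*#/ { next }                    # skip whole-line comments
        {
            line = $0
            sub(/#.*/, "", line)               # drop a trailing comment
            eq = index(line, "=")
            if (eq == 0) next                  # not a key = value line
            k = substr(line, 1, eq - 1)
            v = substr(line, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)     # trim the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)     # trim the value
            if (k == key) { print v; exit }
        }
    ' "$2"
}
```

For example, conf_get sql_user sphinx.conf should print the user name that the indexer will use to connect to MySQL.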

At this point Sphinx is usable, but it does not yet support Chinese word segmentation. The following steps add it.
1. Extract the dictionary archive xdict_1.1.tar.gz

$ tar zxvf xdict_1.1.tar.gz

2. Build the dictionary with the mkdict tool installed earlier

$ /usr/local/sphinx/bin/mkdict xdict.txt xdict

3. Copy the dictionary xdict to the /usr/local/sphinx/etc directory

$ sudo cp xdict /usr/local/sphinx/etc/

4. Configure Chinese word segmentation
Open sphinx.conf, find the line 'charset_type = sbcs', and replace it with:

charset_type    = utf-8
chinese_dictionary = /usr/local/sphinx/etc/xdict
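
If you would rather script this edit than make it by hand, the following sketch shows one way. enable_chinese is a hypothetical helper, not part of sphinx-for-chinese. It writes the modified file to standard output so that the original stays untouched, and it leaves commented-out lines alone.

```shell
#!/bin/sh
# enable_chinese CONF DICT
# Hypothetical helper: print CONF with every uncommented
# "charset_type = sbcs" line replaced by the utf-8 setting plus a
# chinese_dictionary line pointing at DICT.
enable_chinese() {
    awk -v dict="$2" '
        /^[ \t]*charset_type[ \t]*=[ \t]*sbcs/ {
            print "\tcharset_type\t\t= utf-8"
            print "\tchinese_dictionary\t= " dict
            next
        }
        { print }                              # all other lines unchanged
    ' "$1"
}
```

A possible use is enable_chinese sphinx.conf /usr/local/sphinx/etc/xdict > sphinx.conf.new, followed by reviewing the result before moving it over sphinx.conf.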

Chinese word segmentation is now configured. A simple test follows.
1. Edit the SQL script that ships with sphinx-for-chinese and add Chinese data
$ sudo vi /usr/local/sphinx/etc/example.sql

REPLACE INTO test.documents ( id, group_id, group_id2, date_added, title, content ) VALUES
  ( 1, 1, 5, NOW(), 'test one', 'this is my test document number one. also checking search within phrases.' ),
  ( 2, 1, 6, NOW(), 'test two', 'this is my test document number two' ),
  ( 3, 2, 7, NOW(), 'another doc', 'this is another group' ),
  ( 4, 2, 8, NOW(), 'doc number four', 'this is to test groups' ),
  ( 5, 2, 8, NOW(), 'doc number five', '一个' ),
  ( 6, 2, 8, NOW(), 'doc number six', '我' ),
  ( 7, 2, 8, NOW(), 'doc number seven', '中国人' );

Note: rows 5 to 7 are the added Chinese test data.

2. Import the data

$ mysql -usphinx -psphinx < /usr/local/sphinx/etc/example.sql

3. Build the indexes

$ sudo /usr/local/sphinx/bin/indexer --all

If indexing fails with the following error, create the missing directory and run the indexer again:

FATAL: failed to open /var/data/test1.spl: No such file or directory, will not index. Try --rotate option.

The indexer is not trying to read that file; it is trying to create it. Check that the /var/data directory exists and is writable (from http://sphinxsearch.com/forum/view.html?id=3511):

$ sudo mkdir -p /var/data
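
The same error can appear for any index whose path setting points into a directory that does not exist. As a rough sketch, the directories can be derived from the path entries in sphinx.conf. index_dirs and make_index_dirs are hypothetical helper names, and the parsing assumes simple one-line "path = ..." entries.

```shell
#!/bin/sh
# index_dirs CONF
# Hypothetical helper: print the parent directory of every uncommented
# "path = ..." entry in a sphinx.conf-style file, one per line.
index_dirs() {
    awk '
        /^[ \t]*path[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")   # keep only the value
            sub(/#.*/, "")             # drop a trailing comment
            sub(/[ \t]+$/, "")         # trim trailing whitespace
            sub("/[^/]*$", "")         # strip the index file name
            if ($0 != "") print
        }
    ' "$1" | sort -u
}

# make_index_dirs CONF
# Create each of those directories if it does not exist yet.
make_index_dirs() {
    index_dirs "$1" | while IFS= read -r dir; do
        mkdir -p "$dir"
    done
}
```

For system locations such as /var/data, make_index_dirs has to run with root privileges. Running index_dirs first shows what would be created.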

4. Search
$ /usr/local/sphinx/bin/search 我是一个中国人
Sphinx 0.9.9-release (r2117)
Copyright (c) 2001-2009, Andrew Aksyonoff

using config file '/usr/local/sphinx/etc/sphinx.conf'...
index 'test1': query '我是一个中国人 ': returned 0 matches of 0 total in 0.000 sec
words:
1. '我': 1 documents, 1 hits
2. '是': 0 documents, 0 hits
3. '一个': 1 documents, 1 hits
4. '中国人': 1 documents, 1 hits

index 'test1stemmed': query '我是一个中国人 ': returned 0 matches of 0 total in 0.000 sec
words:
1. '我': 1 documents, 1 hits
2. '是': 0 documents, 0 hits
3. '一个': 1 documents, 1 hits
4. '中国人': 1 documents, 1 hits
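
The per-word lines are the part of this output that shows how the query was segmented. The following sketch pulls them out of the search tool's output. segmented_words is a hypothetical helper, and it assumes the line format shown above.

```shell
#!/bin/sh
# segmented_words
# Hypothetical helper: read `search` output on stdin and print each
# distinct segmented word with its hit count, in first-seen order.
segmented_words() {
    awk -F"'" '
        /^[0-9]+\. / && NF >= 3 {
            if ($2 in seen) next       # same word reported by another index
            seen[$2] = 1
            split($3, parts, " ")      # ":", docs, "documents,", hits, "hits"
            print $2, parts[4]
        }
    '
}
```

One way to use it is /usr/local/sphinx/bin/search 我是一个中国人 | segmented_words, which lists each word once even though two indexes report it.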

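Sphinx-for-chinese is now installed and has passed the test. The query returns 0 matches because no single document contains all four words (the search tool matches all words by default), but the per-word statistics show that the query was segmented into 我, 是, 一个, and 中国人.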