Hands-on: building Sphinx + SCWS + MySQL on Linux CentOS 7 for full-text search over millions of products

Preparation

  • Server: CentOS 7.1
  • Environment: lnmp1.5 [MySQL 5.7.22 + PHP 5.6.36 + nginx 1.15.0]
  • SCWS: Simple Chinese Word Segmentation, a simple Chinese word segmentation system

How it works

Installing Sphinx

 Before installing, make sure the required dependency packages are present:

 yum -y install make gcc gcc-c++ libtool autoconf automake imake php-devel mysql-devel libxml2-devel expat-devel expat

Build and install Sphinx from source
 

yum -y install git

git clone https://github.com/sphinxsearch/sphinx.git

cd sphinx

mkdir -p /usr/local/sphinx

./configure --prefix=/usr/local/sphinx  --with-mysql --with-libexpat --enable-id64

make

make install
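
After make install it is worth confirming that the two binaries used later in this post, indexer and searchd, really landed under the prefix. The following is only a minimal sketch: check_sphinx_install is a hypothetical helper name, not part of Sphinx, and the default path simply repeats the --prefix used above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Sphinx): report whether the binaries
# this guide relies on exist and are executable under the install prefix.
check_sphinx_install() {
    prefix="${1:-/usr/local/sphinx}"   # same value as --prefix in ./configure
    missing=0
    for bin in indexer searchd; do
        if [ -x "$prefix/bin/$bin" ]; then
            echo "ok: $prefix/bin/$bin"
        else
            echo "missing: $prefix/bin/$bin"
            missing=1
        fi
    done
    # Exit status 0 when everything is present, 1 otherwise.
    return "$missing"
}
```

Calling check_sphinx_install with no argument checks /usr/local/sphinx; a non-zero exit status means the build or install step should be repeated.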

 

 

Configuring Sphinx
 

cd /usr/local/sphinx

ls

vi etc/sphinx.conf

 

source goods {
type = mysql
sql_host = localhost
sql_user = xxxx           # database user
sql_pass = xxxx            # database password
sql_db = xxxx         # database name
sql_port=3306         # database port
sql_sock=/tmp/mysql.sock
sql_query_pre = SET NAMES utf8


sql_query = select id,goods_desc as attr_desc,goods_name as attr_name,price as attr_price,storage as attr_storage from tp5_goods

sql_field_string=attr_desc 
sql_field_string=attr_name
sql_field_string=attr_price
sql_field_string=attr_storage


}


# main index
index goods {
source = goods
path = /usr/local/sphinx/var/data/goods
docinfo = extern
morphology = none
min_word_len = 1
min_prefix_len = 0
html_strip = 1
html_remove_elements = style, script
ngram_len = 1
ngram_chars = U+3000..U+2FA1F
#charset_type = utf-8
charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F
preopen = 1
min_infix_len = 1
}

indexer {
mem_limit = 256M
}

searchd {
listen                   = 9312
listen                   = 9306:mysql41 #Used for SphinxQL
log                      = /usr/local/sphinx/var/log/searchd.log
query_log          = /usr/local/sphinx/var/log/query.log
#compat_sphinxql_magics   = 0
attr_flush_period                 = 600
mva_updates_pool   = 16M
read_timeout           = 5
max_children           = 0
dist_threads             = 2
pid_file                    = /usr/local/sphinx/var/log/searchd.pid
#max_matches          = 1000
seamless_rotate       = 1
preopen_indexes     = 1
unlink_old               = 1
workers                  = threads # for RT to work
binlog_path            = /usr/local/sphinx/var/data

}

 

Import data into the database for testing

 

 

 

Building the Sphinx index
 

cd /usr/local/sphinx

ls

cd bin

./indexer --all

./searchd
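
Because sphinx.conf above also opens a MySQL-protocol listener on port 9306, the index can be checked from the shell before any PHP is written. The sketch below assumes the mysql command-line client is installed and searchd is running; build_sphinxql and query_goods are hypothetical helper names, and only single quotes are escaped, so keywords containing Sphinx query operators are not handled.

```shell
#!/bin/sh
# Build a SphinxQL full-text query for an index and a keyword.
# Single quotes in the keyword are backslash-escaped so the statement stays well-formed.
build_sphinxql() {
    index="$1"
    keyword=$(printf '%s' "$2" | sed "s/'/\\\\'/g")
    printf "SELECT id FROM %s WHERE MATCH('%s') LIMIT 10;" "$index" "$keyword"
}

# Send the statement to searchd's SphinxQL listener (port 9306 in sphinx.conf).
# 127.0.0.1 is used instead of localhost so the client connects over TCP
# rather than the local MySQL socket.
query_goods() {
    build_sphinxql goods "$1" | mysql -h 127.0.0.1 -P 9306
}
```

For example, query_goods '测试' should list the ids of matching rows if the index was built from rows containing that word.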

 

 

 

With the LNMP environment set up, the next step is to test the search.

Test file test.php:

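<?php
// Load the pure-PHP Sphinx client shipped with the Sphinx source (api/sphinxapi.php).
require 'sphinxapi.php';

$cl = new SphinxClient();

$q     = '测试';      // search keyword
$host  = 'localhost';
$port  = 9312;        // searchd API port from sphinx.conf
$index = 'goods';

$cl->SetServer($host, $port);
$cl->SetArrayResult(true);
$res = $cl->Query($q, $index);

echo '<pre>';
if ($res === false) {
    // Query() returns false on failure; print the reason instead of a PHP notice.
    echo 'Query failed: ' . $cl->GetLastError();
} else {
    // 'matches' is absent when nothing matched, so fall back to an empty array.
    print_r(isset($res['matches']) ? $res['matches'] : array());
}
?>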

Copy api/sphinxapi.php from the Sphinx source tree into the project directory:

 

 

cp sphinxapi.php /home/wwwroot/default/

 

Run test.php

 

 

Next, install the SCWS word segmenter


wget -c http://www.xunsearch.com/scws/down/scws-1.2.3.tar.bz2
tar jxvf scws-1.2.3.tar.bz2
cd scws-1.2.3
mkdir -p /usr/local/scws
./configure --prefix=/usr/local/scws
make && make install

 

Compiling and installing the SCWS PHP extension

cd ./phpext
phpize 
./configure --with-php-config=/usr/local/php/bin/php-config 
make && make install

 

Installing the dictionary

wget http://www.xunsearch.com/scws/down/scws-dict-chs-utf8.tar.bz2
tar jxvf scws-dict-chs-utf8.tar.bz2 -C /usr/local/scws/etc/
chown www:www /usr/local/scws/etc/dict.utf8.xdb

Add the following lines to PHP's configuration file, php.ini:

[scws]
extension = scws.so
scws.default.charset = utf-8
scws.default.fpath = /usr/local/scws/etc/

 

 

Run php -m to check whether the extension is installed.

You can also check in the browser by opening the phpinfo page: http://106.13.91.39/phpinfo.php
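
The php -m check can be scripted so that it gives a clear yes or no. This is a small sketch; has_php_ext is a hypothetical helper name, and it only inspects the module list it is given on standard input.

```shell
#!/bin/sh
# Succeed if the named extension appears as a whole line in the module list
# read from stdin (case-insensitive, fixed-string match).
# Intended use: php -m | has_php_ext scws
has_php_ext() {
    grep -qixF "$1"
}
```

For example, `php -m | has_php_ext scws && echo "scws loaded"` prints the message only when the CLI has loaded the extension. It says nothing about the web server's PHP, which is why the phpinfo page is still useful.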

 

 

 

Now test it in code.

Add the following method to the SphinxClient class in sphinxapi.php:

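    // Segment $keywords with SCWS and build a query string that ORs the segments together.
    public function wordSplit($keywords) {
        $fpath = ini_get('scws.default.fpath');   // dictionary directory set in php.ini
        $so = scws_new();
        $so->set_charset('utf-8');
        $so->add_dict($fpath . '/dict.utf8.xdb');
        //$so->add_dict($fpath .'/custom_dict.txt', SCWS_XDICT_TXT);
        $so->set_rule($fpath . '/rules.utf8.ini');
        $so->set_ignore(true);     // drop punctuation from the result
        $so->set_multi(false);     // no compound (multi-level) segmentation
        $so->set_duality(false);   // do not merge stray single characters into pairs
        $so->send_text($keywords);
        $words = [];
        // get_result() returns one batch per call and false once the text is exhausted.
        while ($results = $so->get_result()) {
            foreach ($results as $res) {
                $words[] = '(' . $res['word'] . ')';
            }
        }
        $so->close();
        $words[] = '(' . $keywords . ')';   // also match the original phrase as a whole
        return join('|', $words);
    }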

 

Write a test file

search2.php:

<?php
require 'sphinxapi.php';
 $cl=new SphinxClient();
 $words = $cl->wordSplit("长虹测试一下漂亮的电视机");
echo '<pre>';
print_r($words);
?>

 

 

 

 

Note that the result of running this under the PHP CLI differs from the result in the browser.
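
The post does not say why the two results differ. One thing that can be ruled out quickly is the CLI and the web server loading different php.ini files, and therefore different scws settings; this is only a possibility, not a confirmed cause. The sketch below extracts the php.ini path from the output of php --ini; loaded_ini is a hypothetical helper name.

```shell
#!/bin/sh
# Print the php.ini path found in `php --ini` output read from stdin.
# Intended use on the CLI side: php --ini | loaded_ini
loaded_ini() {
    sed -n 's/^Loaded Configuration File:[[:space:]]*//p'
}
```

The path it prints can be compared with the "Loaded Configuration File" row on the phpinfo page served by nginx.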

 

 

 

 

Installing the Sphinx extension for PHP

cd ~
cd sphinx
cd api/libsphinxclient
./configure --prefix=/usr/local/sphinx/libsphinxclient
make && make install


Download the PHP extension, then compile and install it:
cd ~

wget -c http://pecl.php.net/get/sphinx-1.3.3.tgz
tar zxvf sphinx-1.3.3.tgz
cd sphinx-1.3.3
phpize
./configure --with-sphinx=/usr/local/sphinx/libsphinxclient/ --with-php-config=/usr/local/php/bin/php-config
make && make install

 

 

If sphinx.so appears in the directory /usr/local/php/lib/php/extensions/no-debug-non-zts-20131226/, the build succeeded.
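
The extension directory name depends on the PHP build, so rather than typing it by hand it can be read from php-config. A minimal sketch, assuming php-config lives at /usr/local/php/bin/php-config as in the configure commands of this post; ext_installed is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed if <name>.so exists in the PHP extension directory.
# $1: extension name without the .so suffix, e.g. sphinx
# $2: optional directory; when omitted, php-config is asked for it
#     (the php-config path is an assumption based on the lnmp layout used here).
ext_installed() {
    ext_dir="${2:-$(/usr/local/php/bin/php-config --extension-dir)}"
    [ -f "$ext_dir/$1.so" ]
}
```

For example, `ext_installed sphinx && echo "sphinx.so is in place"` confirms the file exists; it does not check that php.ini actually loads it.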

Then add the configuration to /usr/local/php/etc/php.ini:

[Sphinx]
extension = sphinx.so

 vi /usr/local/php/etc/php.ini

 

 

 

 

Write searchgoods.php

To debug, turn on error display so that error messages are easy to see:

ini_set("display_errors", "On");

php searchgoods.php

 
