mysql的sphinx插件_php+mysql+sphinx优化全文检索 | 呆萌的程序猿

Sphinx概述

一个基于SQL的全文检索引擎,它提供了比数据库更专业的搜索功能,并且sphinx特别为一些脚本语言设计了搜索的API接口,如PHP、Python等,同时为MySQL也设计了一个存储引擎插件。

51da3c6875a2bc893fca3cfeabee0cfb.png

Sphinx应用场景

网站中的搜索

CMS中的搜索

安装和配置Sphinx

①。三种安装方式:

下载二进制

下载MAC OSx系统的二进制文件。

解压

1

tar -zxvf sphinx-2.2.10-release-osx10.10-x86_64.tar.gz移动到指定的安装目录

本人一般安装在~/Documents/software目录下面。

1

mv sphinx-2.2.10-release-osx10.10-x86_64 ~/Documents/software/sphinx-2.2.10

下载源码

编译

1

2

3

./configure

make

make install

使用brew安装

1

brew install sphinx --with-mysql

安装信息:使用命令安装的时候,使用–with参数,将相应的数据库的库文件也编译到里面。否则会报错:ERROR: source ‘user_core_0’: unknown type ‘pgsql’; skipping.

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

==> Using the sandbox

==> Downloading http://sphinxsearch.com/files/sphinx-2.2.11-release.tar.gz

######################################################################## 100.0%

==> Cloning https://github.com/snowballstem/snowball.git

Cloning into '/Users/zhoujunwen/Library/Caches/Homebrew/sphinx--stemmer--git'...

remote: Counting objects: 2433, done.

remote: Total 2433 (delta 0), reused 1 (delta 0), pack-reused 2432

Receiving objects: 100% (2433/2433), 578.26 KiB | 343.00 KiB/s, done.

Resolving deltas: 100% (1499/1499), done.

==> Checking out revision 9b58e92c965cd7e3208247ace3cc00d173397f3c

==> make dist_libstemmer_c

==> tar xzf dist/libstemmer_c.tgz -C /private/tmp/sphinx-20170531-15389-1rdhnqy/sphinx-2.2.11-release

==> ./configure --prefix=/usr/local/Cellar/sphinx/2.2.11 --localstatedir=/usr/local/var --with-libstemmer --with-mysql --without-pgsql

==> make install

==> Caveats

This is not sphinx - the Python Documentation Generator.

To install sphinx-python use pip.

Sphinx has been compiled with libstemmer support.

Sphinx depends on either MySQL or PostreSQL as a datasource.

You can install these with Homebrew with:

brew install mysql

For MySQL server.

brew install mysql-connector-c

For MySQL client libraries only.

brew install postgresql

For PostgreSQL server.

We don't install these for you when you install this formula, as

we don't know which datasource you intend to use.

==> Summary

? /usr/local/Cellar/sphinx/2.2.11: 15 files, 16.5MB, built in 1 minute 21 seconds

②。配置环境变量

注意: 我们根据第一种安装方式进行配置。其他方式相同,仅仅是文件的位置不同而已。命令行安装后的配置文件见附录命令行安装后的配置文件

1

2

export SPHINX_HOME=/Users/zhoujunwen/Documents/software/sphinx-2.2.10

export PATH=$PATH:$SPHINX_HOME/bin

③。创建测试数据

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

DROP TABLE IF EXISTS `test`.`user`;

CREATE TABLE `test`.`user` (

`id` int(11) NOT NULL AUTO_INCREMENT,

`name` varchar(20) NOT NULL,

`age` int(11) NOT NULL,

`gender` tinyint(4) NOT NULL DEFAULT '0',

`status` tinyint(4) DEFAULT '0',

`create_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,

PRIMARY KEY (`id`)

) ENGINE=InnoDB DEFAULT CHARSET=utf8

REPLACE INTO test.user (name, age, gender, status, create_time ) VALUES

( '张三', 22, 0, 0, NOW()),

( '李四', 28, 0, 0, NOW()),

( '王五', 45, 0, 0, NOW()),

( '周惇', 70, 0, 0, NOW()),

( '李广', 76, 0, 0, NOW()),

( '霍去病', 56, 0, 0, NOW());

REPLACE INTO test.user (name, age, gender, status, create_time ) VALUES('zhangsan',24,0,0,now());

REPLACE INTO test.user (name, age, gender, status, create_time ) VALUES('fanbinbin',24,0,0,now());

REPLACE INTO test.user (name, age, gender, status, create_time ) VALUES('chenyanxi',24,0,0,now());

REPLACE INTO test.user (name, age, gender, status, create_time ) VALUES('zhangsan123',24,0,0,now());

REPLACE INTO test.user (name, age, gender, status, create_time ) VALUES('zhangsan456',24,0,0,now());

REPLACE INTO test.user (name, age, gender, status, create_time ) VALUES('zhang123san',24,0,0,now());

REPLACE INTO test.user (name, age, gender, status, create_time ) VALUES('zhangsan',24,0,0,now());

④。配置spinx+mysql

注意:由于直接使用了编译好的二进制文件,因此不像其他操作系统或者教程一样,配置文件在/usr/local/etc/sphinx/下面,而是在我们的二进制解压目录下面。如果是命令行安装的,则各个文件均在不同的目录。

在解压目录下面,有两个.in结尾的文件:sphinx-min.conf.in和 sphinx.conf.in。这是两个默认的配置文件模板。

我们copy一份,做修改即可,一般conf文件的名称和数据库一致亦或从命名上可以看出是哪个数据库就OK。

1

cp sphinx.conf.in sphinx-test.conf

我们使用vim sphinx-test.conf打开test数据库对应的文件:

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

66

67

68

69

70

71

72

73

74

75

76

77

78

79

80

81

82

83

84

85

86

87

88

89

90

91

92

93

94

95

96

97

98

99

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

#############################################################################

## 数据源定义,数据从哪里来。

#############################################################################

source test

{

# 数据源类型,没有设置默认值,需要自己指定,可以指定的数据库类型:mysql, pgsql, mssql, xmlpipe, xmlpipe2, odbc

type = mysql

sql_host = localhost

sql_user = test

sql_pass =

sql_db = test

sql_port = 3306

}

# 注意: ERROR: index 'test_user': No fields in schema - will not index.

# 需要将属性中的某个字符串自动建立为sql_field_string。所以,此处修改

# name的sql_attr_string 为 sql_field_string。

source test_user : test # test_user继承test

{

sql_query = SELECT id,name, age, gender,create_time,status from user

sql_attr_uint = id

sql_attr_uint = age

sql_attr_uint = status

sql_field_string = name

sql_attr_uint = gender

sql_attr_timestamp = create_time

}

#############################################################################

## 索引定义。索引,当有数据源之后,从数据源处构建索引。索引实际上就是相当于一个字典检索。有了整本字典内容以后,才会有字典检索。

#############################################################################

index test

{

type = plain

source = test_user

path = @CONFDIR@/data/test_user

docinfo = extern

dict = keywords

mlock = 0

min_word_len = 1

morphology = none

}

# 注意:对于一个没有sql_query的source,不能建立index。

#index test_user: test

#{

# path = @CONFDIR@data/test_user

# source = test_user

# morphology = stem_en

#}

index rt

{

type = rt

path = @CONFDIR@/data/rt

rt_field = title

rt_field = content

rt_attr_uint = gid

# rt_attr_bigint = guid

# rt_attr_float = gpa

# rt_attr_timestamp = ts_added

# rt_attr_string = author

# rt_attr_multi = tags

# rt_attr_multi_64 = tags64

# rt_attr_json = extra_data

}

#############################################################################

## indexer settings

#############################################################################

indexer

{

# 建立索引的时候,索引内存限制

mem_limit = 128M

# 每秒最大I/O操作次数,用于限制I/O操作

max_iops = 40

# 最大允许的I/O操作大小,以字节为单位,用于I/O节流

max_iosize = 1048576

# 对于XMLLpipe2数据源允许的最大的字段大小,以字节为单位

max_xmlpipe2_field = 4M

# 写缓冲区的大小,单位是字节

write_buffer = 1M

# 文件字段可用的最大缓冲区大小,字节为单位

max_file_field_buffer = 32M

}

#############################################################################

## searchd settings

#############################################################################

searchd

{

# listen = 127.0.0.1

# listen = 192.168.0.1:9312

# listen = 9312

# listen = /var/run/searchd.sock

listen = 9312

listen = 9306:mysql41

log = @CONFDIR@/log/searchd.log

query_log = @CONFDIR@/log/query.log

read_timeout = 5

client_timeout = 300

max_children = 30

persistent_connections_limit = 30

pid_file = @CONFDIR@/log/searchd.pid

seamless_rotate = 1

preopen_indexes = 1

unlink_old = 1

mva_updates_pool = 1M

max_packet_size = 8M

max_filters = 256

max_filter_values = 4096

max_batch_queries = 32

}

Sphinx如何工作

sphinx配置文件

利用indexer工具生成索引数据

1

2

pwd

# /Users/zhoujunwen/Documents/software/sphinx-2.2.101

./bin/indexer --config ./shpinx-test.conf --all

索引生成成功的话会在 ./data下生成相应的索引文件。

运行indexer时,窗口信息如下,表示配置成功了:

./bin/indexer –config ./shpinx-test.conf –all

Sphinx 2.2.11-id64-release (95ae9a6)

Copyright (c) 2001-2016, Andrew Aksyonoff

Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)

using config file ‘/usr/local/etc/sphinx.conf’…

indexing index ‘test_user’…

collected 6 docs, 0.0 MB

total 6 docs, 39 bytes

total 0.268 sec, 145 bytes/sec, 22.37 docs/sec

skipping non-plain index ‘rt’…

total 11 reads, 0.000 sec, 2.9 kb/call avg, 0.0 msec/call avg

total 10 writes, 0.000 sec, 0.1 kb/call avg, 0.0 msec/call avg

在./data目录下执行tree或者ls命令查看文件:

9cd1e67324441673a0d02eba78269b1d.png

扩展名

描述

.spa

存储文档属性

.spd

存储每个词ID可匹配的文档ID列表

.sph

存储索引头信息

.spi

存储词列表

.spm

存储MVA数据

.spp

存储每个词的命中列表

searchd:守护进程,使用indexer工具生成的数据做查询

1

./bin/searchd --config ./shpinx-test.conf

默认shpinx会在端口9312、9306两个端口上监听,其中9312端口是走的shpinx协议,走这个端口需要通过api、shpinx自带的search客户端、mysql的shpinxSE引擎来访问,而9306走的是mysql41协议,这个端口可以直接通过mysql客户端连接,然后进行相关的搜索查询.

7e63ec1689f1ec42c62123f648b2b0ad.png

查看是否启动成功,我们在命令行通过Sphinx连接查询:

1

mysql -h0 -P 9306 # -h0 = -h 127.0.0.1

连接成功,可以向查询mysql一样查询索引里面的数据:

1

show tables;

e2a9eaba4e10e2d705443a485c8b3374.png

安装php扩展php-sphinx

如果安装的sphinx是通过下载的源码文件安装的,那么在源码的api目录,有libsphinxclient这个文件。执行./configure,编译安装 make && make install安装libsphinxclient。否则回报错误:

checking for libsphinxclient headers in default path… not found

configure: error: Cannot find libsphinxclient headers

然后再安装php-sphinx的扩展:

1

2

3

wget https://pecl.php.net/get/sphinx-1.3.3.tgz

tar zxvf sphinx-1.3.3.tgz

phpize && ./configure && make && make install

如果是brew命令安装的,请直接使用下面命令安装即可:

1

brew install php56-sphinx

是否安装成功,可以使用该命令查看:php -i "(command-line 'phpinfo()')"。

测试代码

创建文件test.php,编写代码:

1

2

3

4

5

6

7

8

$s = new SphinxClient;

$s->setServer("localhost", 9312);

$s->setMatchMode(SPH_MATCH_ANY);

$s->setMaxQueryTime(3);

$result = $s->query("zhangsan");

var_dump($result);

?>

执行: php test.php

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

array (

'error' => '',

'warning' => '',

'status' => 0,

'fields' =>

array (

0 => 'name',

),

'attrs' =>

array (

'uid' => 1,

'name' => 7,

'age' => 1,

'gender' => 1,

'create_time' => 2,

'status' => 1,

),

'matches' =>

array (

13 =>

array (

'weight' => 1,

'attrs' =>

array (

'uid' => 13,

'name' => 'zhangsan',

'age' => 24,

'gender' => 0,

'create_time' => 2017,

'status' => 0,

),

),

7 =>

array (

'weight' => 1,

'attrs' =>

array (

'uid' => 7,

'name' => 'zhangsan',

'age' => 24,

'gender' => 0,

'create_time' => 2017,

'status' => 0,

),

),

),

'total' => 2,

'total_found' => 2,

'time' => 0,

'words' =>

array (

'zhangsan' =>

array (

'docs' => 2,

'hits' => 2,

),

),

)

附录

命令行安装后的配置文件

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

66

67

68

69

70

71

72

73

74

75

76

77

78

79

80

81

82

83

84

85

86

87

88

89

90

91

92

93

94

95

96

97

98

99

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

source test

{

type = mysql

sql_host = localhost

sql_user = root

sql_pass = dj123456

sql_db = test

sql_port = 3306 # optional, default is 3306

}

source test_user : test

{

sql_ranged_throttle = 100

sql_query = SELECT id, id as uid,name, age, gender,create_time,status from user

sql_attr_uint = uid

sql_attr_uint = age

sql_attr_uint = status

sql_field_string = name

sql_attr_uint = gender

sql_attr_timestamp = create_time

}

index test_user

{

type = plain

source = test_user

path = /usr/local/var/data/test_user

docinfo = extern

mlock = 0

min_word_len = 1

morphology = none

stopwords = /usr/local/var/data/stopwords.txt

wordforms = /usr/local/var/data/wordforms.txt

exceptions = /usr/local/var/data/exceptions.txt

global_idf = /usr/local/var/sphinx/global.idf

}

#index test_user : test

#{

# path = /usr/local/var/data/test_user

# source = test_user

# morphology = stem_en

#}

index rt

{

# 'rt' index type must be specified to use RT index

type = rt

# index files path and file name, without extension

# mandatory, path must be writable, extensions will be auto-appended

path = /usr/local/var/data/rt

# RAM chunk size limit

# RT index will keep at most this much data in RAM, then flush to disk

# optional, default is 128M

#

# rt_mem_limit = 512M

# full-text field declaration

# multi-value, mandatory

rt_field = title

rt_field = content

# unsigned integer attribute declaration

# multi-value (an arbitrary number of attributes is allowed), optional

# declares an unsigned 32-bit attribute

rt_attr_uint = gid

# RT indexes currently support the following attribute types:

# uint, bigint, float, timestamp, string, mva, mva64, json

#

# rt_attr_bigint = guid

# rt_attr_float = gpa

# rt_attr_timestamp = ts_added

# rt_attr_string = author

# rt_attr_multi = tags

# rt_attr_multi_64 = tags64

# rt_attr_json = extra_data

}

indexer

{

# 建立索引的时候,索引内存限制

mem_limit = 128M

# 每秒最大I/O操作次数,用于限制I/O操作

max_iops = 40

# 最大允许的I/O操作大小,以字节为单位,用于I/O节流

max_iosize = 1048576

# 对于XMLLpipe2数据源允许的最大的字段大小,以字节为单位

max_xmlpipe2_field = 4M

# 写缓冲区的大小,单位是字节

write_buffer = 1M

# 文件字段可用的最大缓冲区大小,字节为单位

max_file_field_buffer = 32M

}

searchd

{

listen = 9312

listen = 9306:mysql41

log = /usr/local/var/log/searchd.log

query_log = /usr/local/var/log/query.log

read_timeout = 5

client_timeout = 300

max_children = 30

persistent_connections_limit = 30

pid_file = /usr/local/var/log/searchd.pid

seamless_rotate = 1

preopen_indexes = 1

unlink_old = 1

mva_updates_pool = 1M

max_packet_size = 8M

max_filters = 256

max_filter_values = 4096

max_batch_queries = 32

workers = threads

}

重要的命令

1

2

3

/usr/local/Cellar/sphinx/2.2.11/bin/indexer --config /usr/local/etc/sphinx.conf --all

/usr/local/bin/searchd --config /usr/local/etc/sphinx.conf

命令行安装php56-sphinx的输出信息

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

==> Installing php56-sphinx from josegonzalez/php

==> Installing dependencies for josegonzalez/php/php56-sphinx: libxml2, libsphinxclient

==> Installing josegonzalez/php/php56-sphinx dependency: libxml2

==> Downloading https://homebrew.bintray.com/bottles/libxml2-2.9.4_3.sierra.bottle.tar.gz

######################################################################## 100.0%

==> Pouring libxml2-2.9.4_3.sierra.bottle.tar.gz

==> Using the sandbox

==> Caveats

This formula is keg-only, which means it was not symlinked into /usr/local,

because macOS already provides this software and installing another version in

parallel can cause all kinds of trouble.

If you need to have this software first in your PATH run:

echo 'export PATH="/usr/local/opt/libxml2/bin:$PATH"' >> ~/.bash_profile

For compilers to find this software you may need to set:

LDFLAGS: -L/usr/local/opt/libxml2/lib

CPPFLAGS: -I/usr/local/opt/libxml2/include

For pkg-config to find this software you may need to set:

PKG_CONFIG_PATH: /usr/local/opt/libxml2/lib/pkgconfig

If you need Python to find bindings for this keg-only formula, run:

echo /usr/local/opt/libxml2/lib/python2.7/site-packages >> /usr/local/lib/python2.7/site-packages/libxml2.pth

mkdir -p /Users/zhoujunwen/Library/Python/2.7/lib/python/site-packages

echo 'import site; site.addsitedir("/usr/local/lib/python2.7/site-packages")' >> /Users/zhoujunwen/Library/Python/2.7/lib/python/site-packages/homebrew.pth

==> Summary

? /usr/local/Cellar/libxml2/2.9.4_3: 281 files, 10.4MB

==> Installing josegonzalez/php/php56-sphinx dependency: libsphinxclient

==> Downloading https://homebrew.bintray.com/bottles-php/libsphinxclient-2.2.11.sierra.bottle.tar.gz

######################################################################## 100.0%

==> Pouring libsphinxclient-2.2.11.sierra.bottle.tar.gz

? /usr/local/Cellar/libsphinxclient/2.2.11: 5 files, 67KB

==> Installing josegonzalez/php/php56-sphinx

==> Downloading https://pecl.php.net/get/sphinx-1.3.2.tgz

######################################################################## 100.0%

==> /usr/local/opt/php56/bin/phpize

==> ./configure --prefix=/usr/local/Cellar/php56-sphinx/1.3.2 --with-php-config=/usr/local/opt/php56/bin/php-config

==> make

==> Caveats

To finish installing sphinx for PHP 5.6:

* /usr/local/etc/php/5.6/conf.d/ext-sphinx.ini was created,

do not forget to remove it upon extension removal.

* Validate installation via one of the following methods:

*

* Using PHP from a webserver:

* - Restart your webserver.

* - Write a PHP page that calls "phpinfo();"

* - Load it in a browser and look for the info on the sphinx module.

* - If you see it, you have been successful!

*

* Using PHP from the command line:

* - Run `php -i "(command-line 'phpinfo()')"`

* - Look for the info on the sphinx module.

* - If you see it, you have been successful!

==> Summary

? /usr/local/Cellar/php56-sphinx/1.3.2: 3 files, 74.5KB, built in 19 seconds

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值