Using Elasticsearch as a Search Engine in Python

Latest recommended article published 2022-03-22 22:00:35

郭鑫垚


Article link: https://blog.csdn.net/weixin_36462094/article/details/113961353


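I have long wanted a fast full-text search tool. The candidates I have found so far are Sphinx, Xapian, Lucene, Solr, Elasticsearch, Whoosh, and Hyper Estraier. I was never fond of anything Java-based, because Java programs tend to be memory hogs. I tried Sphinx, Xapian, and Hyper Estraier. Xapian has too little documentation, and Hyper Estraier, although fairly simple, is thinly documented as well. Sphinx does have a Chinese-localized fork, Coreseek, and the documentation mentions that Sphinx supports unigram (single-character) segmentation, but following the query examples did not give me the results I wanted. Perhaps my query syntax was wrong. In addition, I was testing on Windows with Python 2.7, which cannot be used with Coreseek directly and would probably require recompiling.

Then I came across Elasticsearch and was blown away. It can be driven directly through RESTful JSON, and there are ready-made client libraries such as pyes and pyelasticsearch. Elasticsearch also supports distributed deployment, which makes scaling easier. Because it is written in Java, running it across platforms is not a problem. For a default single-machine trial no configuration changes are needed: just run bin/elasticsearch.bat.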
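To make the "RESTful JSON" point concrete, here is a minimal sketch of what such requests could look like without any client library. It only builds the method, URL, and JSON body and does not send anything. The helper names `build_index_request` and `build_term_search_request` are hypothetical, the host and port are the local defaults used later in this post, and the `/index/type/id` URL layout is assumed to be the typed endpoint style of the older Elasticsearch releases that pyes targeted (newer releases dropped mapping types).

```python
import json

# Assumption: a local single-node instance on the default port.
BASE_URL = "http://127.0.0.1:9200"


def build_index_request(index, doc_type, doc_id, document):
    """Return (method, url, body) for storing one document under an explicit id."""
    url = "%s/%s/%s/%s" % (BASE_URL, index, doc_type, doc_id)
    return "PUT", url, json.dumps(document, sort_keys=True)


def build_term_search_request(index, doc_type, field, value):
    """Return (method, url, body) for a term query on a single field."""
    url = "%s/%s/%s/_search" % (BASE_URL, index, doc_type)
    body = {"query": {"term": {field: value}}}
    return "POST", url, json.dumps(body, sort_keys=True)


if __name__ == "__main__":
    # Print the requests instead of sending them, so the sketch runs offline.
    print(build_index_request("test-index", "test-type", 1, {"name": "Joe Tester"}))
    print(build_term_search_request("test-index", "test-type", "name", "bill"))
```

Any HTTP client can send the resulting triple. The client libraries mentioned above wrap this kind of request construction behind Python objects.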

Installing pyes

pip install pyes

Usage example

# coding: utf-8

import pyes

conn = pyes.ES(['127.0.0.1:9200'])  # connect to Elasticsearch

conn.create_index('test-index')  # create a new index

# Define how each field is indexed and stored
mapping = {
    u'parsedtext': {'boost': 1.0,
                    'index': 'analyzed',
                    'store': 'yes',
                    'type': u'string',
                    'term_vector': 'with_positions_offsets'},
    u'name': {'boost': 1.0,
              'index': 'analyzed',
              'store': 'yes',
              'type': u'string',
              'term_vector': 'with_positions_offsets'},
    u'title': {'boost': 1.0,
               'index': 'analyzed',
               'store': 'yes',
               'type': u'string',
               'term_vector': 'with_positions_offsets'},
    u'position': {'store': 'yes',
                  'type': u'integer'},
    u'uuid': {'boost': 1.0,
              'index': 'not_analyzed',
              'store': 'yes',
              'type': u'string'},
}

# Define the mapping for test-type
conn.put_mapping("test-type", {'properties': mapping}, ["test-index"])

# Declare test-type as the parent type of test-type2 (a parent/child relation, not inheritance)
conn.put_mapping("test-type2", {"_parent": {"type": "test-type"}}, ["test-index"])
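The three analyzed text fields in the mapping share identical settings, so the dictionary could also be generated rather than written out by hand. The following is only a sketch of that idea: the helper names `analyzed_text_field` and `build_mapping` are hypothetical and not part of pyes, and the field settings are copied from the mapping in this post.

```python
def analyzed_text_field():
    """Return a fresh settings dict for an analyzed, stored string field."""
    return {
        "boost": 1.0,
        "index": "analyzed",
        "store": "yes",
        "type": "string",
        "term_vector": "with_positions_offsets",
    }


def build_mapping():
    """Build the same field mapping as the literal dictionary in this post."""
    # Each text field gets its own dict so later edits to one do not leak into the others.
    mapping = {name: analyzed_text_field() for name in ("parsedtext", "name", "title")}
    mapping["position"] = {"store": "yes", "type": "integer"}
    mapping["uuid"] = {
        "boost": 1.0,
        "index": "not_analyzed",
        "store": "yes",
        "type": "string",
    }
    return mapping


if __name__ == "__main__":
    for field, settings in sorted(build_mapping().items()):
        print(field, settings)
```

The result of `build_mapping()` could be passed as the `properties` value in the same way as the hand-written `mapping`.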

# Index some documents. The arguments to conn.index are:
#   {"name": "Joe Tester", "parsedtext": "Joe Testere nice guy", "uuid": "11111", "position": 1}: the document data
#   "test-index": the index name
#   "test-type": the document type
#   1: the id (optional; one is generated automatically if it is omitted)
conn.index({"name": "Joe Tester", "parsedtext": "Joe Testere nice guy", "uuid": "11111", "position": 1}, "test-index", "test-type", 1)
conn.index({"name": "data1", "value": "value1"}, "test-index", "test-type2", 1, parent=1)
conn.index({"name": "Bill Baloney", "parsedtext": "Bill Testere nice guy", "uuid": "22222", "position": 2}, "test-index", "test-type", 2)
conn.index({"name": "data2", "value": "value2"}, "test-index", "test-type2", 2, parent=2)

# I guess this ends up being the equivalent of unigram segmentation for Chinese -_-
conn.index({"name": u"百度中国"}, "test-index", "test-type")
conn.index({"name": u"百中度"}, "test-index", "test-type")

conn.default_indices = ["test-index"]  # set the default index
conn.refresh()  # refresh so the newly indexed documents become searchable

# Find records whose name contains the term "bill"
q = pyes.TermQuery("name", "bill")
results = conn.search(q)
for r in results:
    print(r)

# Find records whose name matches 百度
q = pyes.StringQuery(u"百度", 'name')
results = conn.search(q)
for r in results:
    print(r)

# Find records whose name matches 百度 or 中度
q = pyes.StringQuery(u"百度 OR 中度", 'name')
results = conn.search(q)
for r in results:
    print(r)
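The remark about unigram segmentation deserves a small illustration. If each Chinese character is indexed as its own token, as the comment in the script guesses, then a query such as 百度 is really a query for the characters 百 and 度. The toy matcher below is a sketch of that idea in plain Python, not how Elasticsearch is implemented. The function names are hypothetical, and the OR semantics are an assumption about how the query string is combined by default.

```python
# -*- coding: utf-8 -*-


def unigram_tokens(text):
    """Split text into single-character tokens, skipping whitespace."""
    return [ch for ch in text if not ch.isspace()]


def matches_any(query, document):
    """True if the document contains at least one query character (OR semantics)."""
    doc_tokens = set(unigram_tokens(document))
    return any(tok in doc_tokens for tok in unigram_tokens(query))


def matches_phrase(query, document):
    """True if the query characters appear in the document consecutively and in order."""
    q = unigram_tokens(query)
    d = unigram_tokens(document)
    n = len(q)
    return n > 0 and any(d[i:i + n] == q for i in range(len(d) - n + 1))


if __name__ == "__main__":
    documents = [u"百度中国", u"百中度"]
    for doc in documents:
        print(doc, matches_any(u"百度", doc), matches_phrase(u"百度", doc))
```

Under these assumptions both sample documents satisfy the OR match for 百度, because each contains 百 and 度 somewhere, while only 百度中国 satisfies the phrase match. If a search for 百度 returns more documents than expected, this is one plausible reason, and a phrase query or a dedicated Chinese analyzer may be worth exploring.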
