You often need to work with Elasticsearch (ES) from Python. This post records a basic way to do it.
1. Simple query with the Python API
# coding=utf-8
from elasticsearch import Elasticsearch
es = Elasticsearch(hosts=['elastic:changeme@192.168.*.*8:9200'])
query_all = {
    'query': {
        'match_all': {}
    }
}
# search
print('search:')
res = es.search(index="info", body=query_all)
print(res)
print("Got %d Hits:" % res['hits']['total'])
for hit in res['hits']['hits']:
    print("%(keyword)s%(title)s" % hit["_source"])
# get
print('get:')
res = es.get(index="info", doc_type='main', id=2)
print('res:', res)
Output:
search:
{'_shards': {'failed': 0, 'skipped': 0, 'total': 5, 'successful': 5}, 'took': 0, 'hits': {'max_score': 1.0, 'total': 2, 'hits': [{'_id': '2', '_score': 1.0, '_index': 'info', '_source': {'title': ['醋加一宝90岁都不老'], 'keyword': ['补肾', '黑芝麻糊', '三文鱼', '红酒', '养颜']}, '_type': 'main'}, {'_id': '1', '_score': 1.0, '_index': 'info', '_source': {'title': ['部分医疗机构可预约宫颈癌疫苗 专家详解接种注意事项'], 'keyword': ['HPV', '疫苗', '宫颈癌', '接种', '感染']}, '_type': 'main'}]}, 'timed_out': False}
Got 2 Hits:
['补肾', '黑芝麻糊', '三文鱼', '红酒', '养颜']['醋加一宝90岁都不老']
['HPV', '疫苗', '宫颈癌', '接种', '感染']['部分医疗机构可预约宫颈癌疫苗 专家详解接种注意事项']
get:
res: {'_version': 1, 'found': True, '_source': {'title': ['醋加一宝90岁都不老'], 'keyword': ['补肾', '黑芝麻糊', '三文鱼', '红酒', '养颜']}, '_id': '2', '_index': 'info', '_type': 'main'}
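One detail of the search example depends on the server version. In the response shown above, `res['hits']['total']` is a plain integer, but Elasticsearch 7 and later return an object such as `{'value': 2, 'relation': 'eq'}` by default, which would break the `%d` formatting. Below is a minimal sketch of a helper that accepts both shapes. The name `total_hits` and the two sample responses are hypothetical and not part of the elasticsearch package.

```python
def total_hits(response):
    """Return the hit count from a search response as an int."""
    total = response['hits']['total']
    # Newer servers wrap the count in a dict; older ones return an int.
    if isinstance(total, dict):
        return total['value']
    return total


# Hand-written sample responses (illustrative only, not real server output)
old_style = {'hits': {'total': 2, 'hits': []}}
new_style = {'hits': {'total': {'value': 2, 'relation': 'eq'}, 'hits': []}}

print("Got %d Hits:" % total_hits(old_style))
print("Got %d Hits:" % total_hits(new_style))
```

With this in place, the print call in the example could use `total_hits(res)` instead of indexing `res['hits']['total']` directly.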
2. Database connection error
GET http://192.168.7.38:9200/index/fulltext/_search [status:401 request:0.004s]
elasticsearch.exceptions.AuthenticationException: TransportError(401, 'security_exception', 'missing authentication token for REST request [/index/fulltext/_search]')
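This is an authentication problem: the request was sent without credentials.
Write the hosts entry with the username and password included, like this: elastic:changeme@192.168.*.38:9200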
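The hosts entry above embeds the credentials in the form `user:password@host:port`. If the password contains characters such as `@` or `:`, that string becomes ambiguous. The sketch below percent-encodes the credentials using only the standard library. It assumes the client decodes percent-encoded credentials when it parses the host string, which is worth verifying against your installed version. The function name `build_host` and the sample values (`localhost`, `p@ss:word`) are hypothetical.

```python
from urllib.parse import quote


def build_host(user, password, host, port=9200):
    """Build a 'user:password@host:port' string with encoded credentials."""
    # safe='' forces characters such as '@' and ':' to be percent-encoded
    user_enc = quote(user, safe='')
    pass_enc = quote(password, safe='')
    return '%s:%s@%s:%d' % (user_enc, pass_enc, host, port)


print(build_host('elastic', 'changeme', 'localhost'))
print(build_host('elastic', 'p@ss:word', 'localhost'))
```

The result would be passed as `Elasticsearch(hosts=[build_host(...)])`. Client versions in the 5.x to 7.x range also accept an `http_auth=(user, password)` argument, which sidesteps the encoding question.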
3. Bulk operations from Python
# coding=utf-8
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk
es = Elasticsearch(hosts=['elastic:changeme@192.168.*.*8:9200'])
# Bulk-index documents
actions = []
action = {
    "_index": "kad_info",
    "_type": "articles",
    "_id": "1",
    "_source": {
        "article_crawtime": "2017-03-10 15:05:00",
        "keywords01": "辅助用药 芜湖市 转化糖电解质注射液",
        "keywords02": [{"keyword": "辅助用药", "weight": 0.9}, {"keyword": "芜湖市", "weight": 0.8},
                       {"keyword": "转化糖电解质注射液", "weight": 0.7}],
        "title": ";01又有21个品种进辅助用药目录01;"
    }
}
actions.append(action)
action = {
    "_index": "kad_info",
    "_type": "articles",
    "_id": "2",
    "_source": {
        "article_crawtime": "2017-03-10 15:05:00",
        "keywords01": "辅助用药 芜湖市 转化糖电解质注射液",
        "keywords02": [{"keyword": "辅助用药", "weight": 0.9}, {"keyword": "芜湖市", "weight": 0.8},
                       {"keyword": "转化糖电解质注射液", "weight": 0.7}],
        "title": ";02又有21个品种进辅助用药目录01;"
    }
}
actions.append(action)
rs = bulk(es, actions=actions)
print('Successfully inserted %d documents...' % rs[0])
Output:
Successfully inserted 2 documents...
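The two actions in this example differ only in `_id` and `title`, so they can also be produced by a generator, since `bulk` accepts any iterable of actions. This is a minimal sketch of that idea. The function name `make_actions` and the sample documents are hypothetical, and the `_type` field assumes an older server with mapping types, such as the ES 5.6.0 used in this post.

```python
def make_actions(docs, index, doc_type):
    """Yield one bulk action dict per (doc_id, source) pair."""
    for doc_id, source in docs:
        yield {
            "_index": index,
            "_type": doc_type,  # assumption: server still uses mapping types
            "_id": doc_id,
            "_source": source,
        }


# Illustrative documents, not the data from the example above
docs = [
    ("1", {"title": "first document"}),
    ("2", {"title": "second document"}),
]

actions = list(make_actions(docs, "kad_info", "articles"))
print(len(actions), actions[0]["_id"])
```

In real use the generator can be handed to the helper directly, as in `bulk(es, actions=make_actions(docs, "kad_info", "articles"))`, so large batches do not have to be held in a list first.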
Note: pay attention to ES version compatibility. This was tested against ES 5.6.0 with the Python package at version 7.0.1.
happyprince , http://blog.csdn.net/ld326/article/details/79187851