Getting Started with Elasticsearch, Part 2: Importing and Exporting Data

BabY虎子

Article link: https://blog.csdn.net/u010041824/article/details/77947495

Importing data into Elasticsearch, exporting it, and querying it are common operations. Python provides convenient interfaces for importing and exporting the data.

Reference blogs

Bulk importing data into Elasticsearch with Python http://blog.csdn.net/u012236368/article/details/51284587
[ElasticSearch] Exact matching of Chinese string phrases with Term http://blog.csdn.net/sunnyyoona/article/details/51842221
Working with Elasticsearch in Python (Part 1: Examples) http://www.cnblogs.com/yxpblog/p/5141738.html
Elasticsearch queries (match and term) http://www.cnblogs.com/yjf512/p/4897294.html

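Code

The Python code for importing data into Elasticsearch is easy to follow. The main work is setting the body argument of the es.index method: a dictionary holds the data to be indexed. The code is recorded below: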

#coding:utf-8
'''
Import data into Elasticsearch with Python
and query it
'''
from datetime import datetime
from elasticsearch import Elasticsearch
from elasticsearch import helpers

es = Elasticsearch(['localhost:9200'])
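'''
# Import data (this block is wrapped in triple quotes, so it is disabled;
# remove the quotes to run the import)
with open('parse_table', 'r', encoding='utf-8') as f:
    count = 0
    str_title = ''
    str_context = ''
    for row in f:
        row = row.strip()
        # Each record spans 16 lines: lines 0-7 overwrite the title in turn,
        # so the 8th line is the one that is kept
        if count % 16 < 8:
            str_title = row
        # Lines 8-15 are prepended to the context, so they end up in reverse order
        if 8 <= count % 16 <= 15:
            str_context = row + str_context
        if count % 16 == 15:
            # The document to index is a plain dict passed as body
            action = {'title': str_title, 'context': str_context}
            #print(action)
            # doc_type applies to older Elasticsearch versions;
            # mapping types were removed in 8.x
            es.index(index='guwen', doc_type='guwen', body=action)
            str_context = ''
        count += 1
#print(count)
'''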
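
The grouping logic of the import loop can be separated from the Elasticsearch call, so that it can be checked without a running cluster. The following is a minimal sketch under the same assumption as the loop above (each record occupies 16 lines, the first 8 belonging to the title and the last 8 to the context). The name `group_records` and its parameters are hypothetical and not part of the elasticsearch package.

```python
def group_records(lines, block_size=16, title_lines=8):
    """Yield {'title', 'context'} dicts, mirroring the import loop above.

    Hypothetical helper: the block layout (16 lines, 8 of them title lines)
    is taken from the original loop and is an assumption about the file.
    """
    title = ''
    context = ''
    for count, row in enumerate(lines):
        row = row.strip()
        pos = count % block_size
        if pos < title_lines:
            title = row              # later title lines overwrite earlier ones
        else:
            context = row + context  # prepend, as the original loop does
        if pos == block_size - 1:
            yield {'title': title, 'context': context}
            context = ''


# Self-check with 16 synthetic lines: t0..t7 followed by c8..c15
sample = ['t%d\n' % i for i in range(8)] + ['c%d\n' % i for i in range(8, 16)]
records = list(group_records(sample))
```

Each yielded dictionary could then be passed as `body` to `es.index`, exactly as `action` is in the loop above.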

Data can be queried with the get and search methods; the search method supports structured queries.

result = es.search(index='guwen', doc_type='guwen', body={
    "fields": [
       "title",
       "context"
    ],
    "query": {
        "match_phrase": {
           "context": {
               "query": "若君父不敬其爲君父之道，則臣子便可以忿之耶？"
           }
        }
    }
})

for hit in result['hits']['hits']:
    print('this is title')
    print(hit['fields']['title'])
    print('this is context')
    print(hit['fields']['context'])

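When querying Elasticsearch, if you want to match a string exactly as a phrase, you can use match_phrase, as shown above; it requires the query terms to appear adjacent and in the same order. For other needs, see the reference post on match and term queries listed above.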
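
To make the difference concrete, the sketch below builds the two kinds of query body as plain dictionaries. The helper names `match_phrase_body` and `term_body` are hypothetical, and the title value `"guwen"` is only an illustrative placeholder. A term query does not analyze its input, so it is generally useful only on fields that were indexed without analysis.

```python
def match_phrase_body(field, text):
    """Query body for a phrase match: the analyzed terms of `text`
    must appear adjacent and in the same order in `field`."""
    return {"query": {"match_phrase": {field: {"query": text}}}}


def term_body(field, value):
    """Query body for a term query: `value` is compared as-is,
    without analysis, against the terms stored in the index."""
    return {"query": {"term": {field: value}}}


# "君父之道" is a fragment of the query string used above;
# "guwen" is an illustrative value, not data from the index
phrase = match_phrase_body("context", "君父之道")
exact = term_body("title", "guwen")
# Either dict could be passed as body= to es.search(index='guwen', ...)
```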

In addition, the Python client supports bulk indexing of data.

actions = []
i = 1
# for row_copy in f.readlines():
#     print(row_copy)
#     row_line = row_copy.split('<>')
#     action={
#         "_index":"xx",
#         "_type":"yy",
#         "_id":i,
#         "_source":{
#             'year':row_line[7],
#             'region':row_line[8],
#         }
#     }
#     i += 1
#     actions.append(action)
#     if len(actions) == 5:
#         helpers.bulk(es, actions)
#         del actions[0:len(actions)]
#
# if len(actions)>0:
#     helpers.bulk(es,actions)
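
The commented-out loop above batches actions by hand, five at a time. Since `helpers.bulk` accepts any iterable of actions and splits it into chunks itself (the `chunk_size` keyword), the same idea can be sketched with a generator. This is a sketch under assumptions: `make_actions` and `bulk_import` are hypothetical names, the input is assumed to have at least nine `<>`-separated fields per line with the year and region at positions 7 and 8 as in the loop above, and `_type` is omitted because mapping types were removed in Elasticsearch 8.x (on older versions it can be added back to each action).

```python
def make_actions(lines, index, sep='<>'):
    """Yield one bulk action per input line (hypothetical helper).

    Assumes each line has at least 9 fields separated by `sep`,
    with the year at position 7 and the region at position 8.
    """
    for i, line in enumerate(lines, start=1):
        fields = line.rstrip('\n').split(sep)
        yield {
            "_index": index,
            "_id": i,
            "_source": {"year": fields[7], "region": fields[8]},
        }


def bulk_import(es, lines, index, chunk_size=500):
    """Send the actions with helpers.bulk, which batches them in chunks."""
    # Imported here so make_actions can be used without the client installed
    from elasticsearch import helpers
    return helpers.bulk(es, make_actions(lines, index), chunk_size=chunk_size)


# Self-check of the action format with one synthetic line
sample = ['a<>b<>c<>d<>e<>f<>g<>2017<>Beijing\n']
actions = list(make_actions(sample, index='xx'))
```

With a connected client, a call such as `bulk_import(es, open_file, 'xx')` would stream the file through the generator, so the whole list of actions never has to be held in memory.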