elasticsearch

  1. elasticsearch is written in Java, so a JDK must be installed.
  2. elasticsearch 5 and later require Java 8 or later.
  3. Run java -version in cmd to check the installed Java version.

Installation

Step 1: Install elasticsearch
1. Download: https://github.com/medcl/elasticsearch-rtf
2. In cmd, change to G:\study_code\elasticsearch-rtf-master\bin and run elasticsearch. Then open http://127.0.0.1:9200/ in a browser; if the page loads, the installation succeeded.

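Step 2: Install the head plugin
1. Install node: https://blog.csdn.net/luxiangy1314/article/details/105693054
2. Install cnpm: cnpm is an npm client that uses the Taobao mirror. Run the following command in cmd to install it, then type cnpm to verify that the installation succeeded.

npm install -g cnpm --registry=https://registry.npm.taobao.org
3. Install the head plugin. Download: https://github.com/mobz/elasticsearch-head

4. Edit elasticsearch-rtf-master/config/elasticsearch.yml and append the following lines at the end: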

http.cors.enabled: true
http.cors.allow-origin: "*"
http.cors.allow-methods: OPTIONS, HEAD, GET, POST, PUT, DELETE
http.cors.allow-headers: "X-Requested-With, Content-Type, Content-Length, X-User"

Step 3: Install kibana
The kibana version must match the elasticsearch version.
Download: https://www.elastic.co/cn/downloads/past-releases
In cmd, change to G:\study_code\kibana-5.1.2-windows-x86\bin
Run kibana.bat
Open in a browser: http://localhost:5601
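
Because kibana has to match the elasticsearch version, it can help to read the version that the running node reports. The following is a minimal sketch, assuming the root endpoint (http://127.0.0.1:9200/) returns JSON with a version.number field. The helper names and the sample response are hypothetical and only for illustration.

```python
import json


def es_version_from_root(body: str) -> str:
    """Extract the version string from the JSON returned by the root endpoint."""
    return json.loads(body)["version"]["number"]


def versions_match(es_version: str, kibana_version: str) -> bool:
    """Return True when both version strings are identical after trimming."""
    return es_version.strip() == kibana_version.strip()


# Illustrative response body (not captured from a real node).
sample_root = '{"name": "node-1", "version": {"number": "5.1.2"}}'

es_version = es_version_from_root(sample_root)
print(es_version, versions_match(es_version, "5.1.2"))
```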

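Basic concepts

  1. Cluster: one or more nodes organized together.
  2. Node: a single server in a cluster, identified by a name. In elasticsearch versions before 5.0 the default name is that of a random comic-book character.
  3. Shard: the ability to split an index into multiple pieces. Sharding allows horizontal partitioning and scaling, and multiple shards can serve requests in parallel, which improves performance and throughput.
  4. Replica: the ability to create one or more copies of a shard, so that when one node fails the remaining nodes can take over.

elasticsearch        mysql
index                database
type                 table
documents            row
fields               column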

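Inverted index

Most modern search engines use an inverted index as their underlying index structure.
The inverted index arises from the practical need to look up records by the value of an attribute. Each entry in such an index contains an attribute value and the addresses of all records that have that value. Because the attribute value determines the location of the records, rather than the record determining the attribute value, it is called an inverted index. A file with an inverted index is called an inverted index file, or inverted file for short.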
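
The idea can be sketched in a few lines of Python. This is a toy illustration and not how elasticsearch stores its index: the documents are hypothetical, and splitting on whitespace stands in for a real analyzer.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Hypothetical documents; whitespace splitting stands in for a real analyzer.
docs = {
    1: "python django developer",
    2: "python scrapy redis crawler",
    3: "elasticsearch search engine",
}

index = build_inverted_index(docs)
print(index["python"])  # [1, 2]
print(index["search"])  # [3]
```

A lookup goes from the term to the documents, which is the reverse of reading a document to find its terms.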

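Usage

CRUD operations on documents and indices

# Initialize an index (the counterpart of a database)
# Set the number of shards and replicas; number_of_shards cannot be changed once the index is created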
PUT lagou
{
  "settings": {
    "index":{
      "number_of_shards":5,
      "number_of_replicas":1
    }
  }
}
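
A rough illustration of why the shard count is fixed: a document is assigned to a primary shard by hashing its routing value (the document id by default) modulo the number of primary shards. The sketch below assumes zlib.crc32 as a stand-in hash (elasticsearch itself uses a different hash function), and shard_for is a hypothetical helper.

```python
import zlib


def shard_for(doc_id: str, number_of_shards: int) -> int:
    """Pick a shard as hash(routing) % number_of_shards (crc32 is a stand-in hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_shards


ids = [str(i) for i in range(1, 11)]
with_5 = {doc_id: shard_for(doc_id, 5) for doc_id in ids}
with_6 = {doc_id: shard_for(doc_id, 6) for doc_id in ids}

# Documents whose shard would change if the shard count went from 5 to 6.
moved = [doc_id for doc_id in ids if with_5[doc_id] != with_6[doc_id]]
print(with_5)
print(moved)
```

Because the result depends on number_of_shards, changing it would send existing ids to different shards, so lookups by id would no longer find them.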


# Get the settings of the lagou index
GET lagou/_settings
# Get the settings of all indices
GET _all/_settings
# Get the settings of several indices; there must be no space between the index names
GET .kibana,lagou/_settings

# Update settings
PUT lagou/_settings
{
  "number_of_replicas":2
}

# Get index information
GET _all
GET lagou

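# Save a document
# The 1 here is the document id; use PUT when you specify it
# If no id is given, a unique id is generated automatically; use POST in that case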
PUT lagou/chemsrc/1
{
    "常用名": "甲醛",
    "英文名": "Formaldehyde",
    "CAS号": "50-00-0",
    "分子量": "30.026",
    "密度": "0.7±0.1 g/cm3",
    "沸点": "-19.5±9.0 °C at 760 mmHg",
    "分子式": "CH2O",
    "熔点": "-92 °C",
    "MSDS": "中文版\n美版",
    "闪点": "-75.1±13.7 °C",
    "符号": "GHS05, GHS06, GHS08",
    "信号词": "Danger"
}


POST lagou/chemsrc/
{
    "常用名": "乳酸",
    "英文名": "Lactic acid",
    "CAS号": "50-21-5",
    "分子量": "90.078",
    "密度": "1.3±0.1 g/cm3",
    "沸点": "227.6±0.0 °C at 760 mmHg",
    "分子式": "C3H6O3",
    "熔点": "18ºC",
    "MSDS": "中文版\n美版",
    "闪点": "109.9±16.3 °C",
    "符号": "GHS05",
    "信号词": "Danger"
}

# Get a document
GET lagou/chemsrc/1
# Get only certain fields of a document
GET lagou/chemsrc/1?_source=MSDS,常用名

# Update a document (PUT replaces the whole document)
PUT lagou/chemsrc/1
{
    "常用名": "甲醛111",
    "英文名": "Formaldehyde",
    "CAS号": "50-00-0",
    "分子量": "30.026",
    "密度": "0.7±0.1 g/cm3",
    "沸点": "-19.5±9.0 °C at 760 mmHg",
    "分子式": "CH2O",
    "熔点": "-92 °C",
    "MSDS": "中文版\n美版",
    "闪点": "-75.1±13.7 °C",
    "符号": "GHS05, GHS06, GHS08",
    "信号词": "Danger"
}
# Update only certain fields of a document
POST lagou/chemsrc/1/_update
{"doc":
  {
    "MSDS": "aaaaa"
  }
}


# Delete a document, or a whole index
DELETE lagou/chemsrc/1
DELETE lagou

# Batch retrieval (multi-get)
GET _mget
{
  "docs":[
    {
      "_index":"testdb",
      "_type":"job1",
      "_id":2
    },
    {
      "_index":"testdb",
      "_type":"job2",
      "_id":1
    }
    ]
}



GET testdb/job1/_mget
{
  "docs":[
    {
      "_id":2
    },
    {
      "_id":1
    }
    ]
}


GET testdb/job1/_mget
{
  "ids":[1,2]
}
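
When the list of documents comes from code, the first form of the _mget body can be generated. The following is a small sketch; mget_body is a hypothetical helper, and the index and type names are the ones used in the requests above.

```python
import json


def mget_body(refs):
    """Build a _mget request body from (index, type, id) tuples."""
    return {
        "docs": [
            {"_index": index, "_type": doc_type, "_id": doc_id}
            for index, doc_type, doc_id in refs
        ]
    }


body = mget_body([("testdb", "job1", 2), ("testdb", "job2", 1)])
print(json.dumps(body, indent=2))
```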

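Bulk operations

A bulk request can combine multiple operations, such as index, delete, update, and create, and can also help import data from one index into another.

# Bulk operation. Note: the body must not be pretty-printed; each action and each document must be on its own single line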
POST _bulk
{"index":{"_index":"chemsrc","_type":"chem","_id":"1"}}
{"常用名": "敌敌畏","英文名": "dichlorvos","CAS号": "62-73-7","分子量": "220.976","密度": "1.4±0.1 g/cm3","沸点": "176.8±40.0 °C at 760 mmHg","分子式": "C4H7Cl2O4P","熔点": "-60°C","MSDS": "中文版\n美版","闪点": "14.7±35.0 °C","符号": "GHS06, GHS09","信号词": "Danger"}
{"index":{"_index":"chemsrc","_type":"chem","_id":"2"}}
{"常用名": "N-亚硝基二甲胺","英文名": "N-Nitrosodimethylamine","CAS号": "62-75-9","分子量": "74.082","密度": "1.0±0.1 g/cm3","沸点": "152.0±0.0 °C at 760 mmHg","分子式": "C2H6N2O","熔点": "50ºC","MSDS": "中文版\n美版","闪点": "36.1±18.7 °C","符号": "GHS02, GHS06, GHS08","信号词": "Danger"}
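
Writing these lines by hand is error-prone, so the body is usually generated. The sketch below builds the newline-delimited body with the standard json module; bulk_body is a hypothetical helper and the documents are shortened versions of the two above.

```python
import json


def bulk_body(index, doc_type, docs):
    """Serialize {id: document} pairs into a newline-delimited _bulk body."""
    lines = []
    for doc_id, source in docs.items():
        action = {"index": {"_index": index, "_type": doc_type, "_id": str(doc_id)}}
        lines.append(json.dumps(action, ensure_ascii=False))
        lines.append(json.dumps(source, ensure_ascii=False))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


docs = {
    1: {"常用名": "敌敌畏", "CAS号": "62-73-7"},
    2: {"常用名": "N-亚硝基二甲胺", "CAS号": "62-75-9"},
}
body = bulk_body("chemsrc", "chem", docs)
print(body)
```

json.dumps without an indent argument never emits a raw line break, so each action and each document stays on one line as the bulk format requires.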

Mapping

Mapping: when creating an index, you can define the field types and related attributes in advance.

Basic queries

# Create the index and insert 4 documents
# title is a text field that is analyzed (tokenized) with ik_max_word
# keyword fields such as company_name and desc are not analyzed
PUT lagou
{"mappings":{
  "job":{
    "properties": {
      "title":{
        "store": true,
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "company_name":{
        "store": true,
        "type": "keyword"
      },
      "desc":{
        "type": "keyword"
      },
      "comments":{
        "type": "integer"
      },
      "add_time":{
        "type": "date",
        "format":"yyyy-MM-dd"
      }
    }
  }
}}

POST  lagou/job/
{"title":"python django开发工程师",
"company_name":"美图科技有限公司",
"desc":"对django的概念熟悉,熟悉python基础知识",
"comments":20,
"add_time":"2017-4-1"}

POST  lagou/job/
{"title":"python scrapy redis分布式爬虫基本",
"company_name":"百度科技有限公司",
"desc":"对scrapy的概念熟悉,熟悉redis的基本操作",
"comments":5,
"add_time":"2017-4-5"}

POST  lagou/job/
{"title":"elasticsearch打造搜索引擎",
"company_name":"阿里巴巴科技有限公司",
"desc":"熟悉数据结构算法,熟悉python的基本开发",
"comments":15,
"add_time":"2016-10-20"}

POST  lagou/job/
{"title":"python打造推荐引擎系统",
"company_name":"阿里巴巴科技有限公司",
"desc":"熟悉推荐引擎的原理以及算法,掌握C语言",
"comments":60,
"add_time":"2016-10-20"}

# match query: the query text is analyzed (tokenized) before searching
GET lagou/job/_search
{"query": {
  "match": {
    "title": "python"
  }
},
"from": 1,
"size": 2
}

# term query: the given value is not analyzed; it is matched as-is
GET lagou/job/_search
{"query": {
  "term": {
    "title": "python"
  }
}}

GET lagou/job/_search
{"query": {
  "terms": {
    "title": ["工程师","django","系统"]
  }
}}
# Limit the number of results returned
GET lagou/job/_search
{"query": {
  "match": {
    "title": "python"
  }
},
"from": 0,
"size": 2
  
}
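
from is an offset into the result list and size is the number of hits to return, so paging is simple arithmetic. The following is a minimal sketch, with page_to_from_size as a hypothetical helper and 1-based page numbers as an assumption.

```python
def page_to_from_size(page: int, per_page: int) -> dict:
    """Translate a 1-based page number into the from/size search parameters."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    return {"from": (page - 1) * per_page, "size": per_page}


search_body = {"query": {"match": {"title": "python"}}}
search_body.update(page_to_from_size(page=2, per_page=2))
print(search_body)
```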
# match_all query: return all documents
GET lagou/job/_search
{"query": {
  "match_all": {}
}
}
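# Phrase query: “python系统” is first analyzed into terms, then documents containing both python and 系统 are matched; slop sets the maximum distance allowed between the terms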
GET lagou/job/_search
{"query": {
  "match_phrase": {
    "title": {
      "query":"python系统",
      "slop": 6
    }
  }
}
}
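
The effect of slop can be sketched on a list of tokens. This is a simplification that only handles a two-term phrase with the terms in query order. The token list is hypothetical, and a real analyzer such as ik_max_word may split the title differently.

```python
def phrase_matches(tokens, first, second, slop=0):
    """Check whether `second` follows `first` with at most `slop` tokens between them.

    Simplification: only in-order, two-term phrases are handled.
    """
    first_positions = [i for i, tok in enumerate(tokens) if tok == first]
    second_positions = [i for i, tok in enumerate(tokens) if tok == second]
    return any(
        0 <= q - p - 1 <= slop
        for p in first_positions
        for q in second_positions
    )


# Hypothetical token stream; a real analyzer such as ik_max_word may split differently.
tokens = ["python", "打造", "推荐", "引擎", "系统"]

print(phrase_matches(tokens, "python", "系统", slop=6))  # True
print(phrase_matches(tokens, "python", "系统", slop=2))  # False
```

Three tokens sit between the two terms here, so any slop of 3 or more lets the phrase match.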
# multi_match query: search several fields at once; title^3 gives title three times the weight
GET lagou/job/_search
{"query": {
  "multi_match": {
      "query":"python",
      "fields": ["title^3", "desc"]
  }
}
}
# Specify which fields to return; stored_fields returns only fields mapped with store set to true
GET lagou/job/_search
{
  "stored_fields": ["title","company_name"],
  "query": {
    "multi_match": {
      "query":"python",
      "fields": ["title^3", "desc"]
    }
  }
}
# Sort
GET lagou/job/_search
{"query": {
  "match_all": {}
},
"sort": [
  {
    "comments": {
      "order": "desc"
    }
  }
]
}
# Range query: gte is greater than or equal, lte is less than or equal, boost is the weight
GET lagou/job/_search
{
  "query": {
    "range": {
      "comments": {
        "gte": 10,
        "lte": 20,
        "boost": 2.0
      }
    }
  }
}
# wildcard query
GET lagou/job/_search
{
  "query": {
    "wildcard": {
      "title": {
        "value": "pyth*n"
      }
    }
  }
}

Compound queries

bool query

# bool supports: must, should, must_not, filter
# The format is:
# bool:{
# "filter": [],
# "must": [],
# "should": [],
# "must_not": []
# }
# filter query: select * from testjob where salary=20
GET lagou/testjob/_search
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "term": {
          "salary": 20
        }
      }
    }
  }
}

GET lagou/testjob/_search
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "terms": {
          "salary": [10,20]
        }
      }
    }
  }
}

# A text field is analyzed when it is indexed and its terms are lowercased, so a term query must use the lowercase token
GET lagou/testjob/_search
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "term": {
          "title": "python"
        }
      }
    }
  }
}
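
The difference between term and match on an analyzed text field can be sketched as follows. The analyze function is a stand-in (lowercase plus whitespace split) and not the ik_max_word analyzer; all names here are hypothetical.

```python
def analyze(text):
    """Stand-in analyzer: lowercase the text and split on whitespace."""
    return text.lower().split()


def term_hit(indexed_tokens, value):
    """term: the value is compared to the indexed tokens exactly as given."""
    return value in indexed_tokens


def match_hit(indexed_tokens, query_text):
    """match: the query text is analyzed first, then any resulting token may hit."""
    return any(tok in indexed_tokens for tok in analyze(query_text))


indexed = analyze("Python Django developer")  # what the index stores for a text field

print(term_hit(indexed, "Python"))   # False: the index only holds "python"
print(term_hit(indexed, "python"))   # True
print(match_hit(indexed, "Python"))  # True: the query is lowercased as well
```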

# Inspect how an analyzer tokenizes text
GET _analyze
{"analyzer": "ik_max_word",
  "text": "django开发工程师"
}

# select * from testjob where (salary=20 or title=python) and salary!= 30
GET lagou/testjob/_search
{
  "query": {
    "bool": {
      "should": [
        {"term": {"salary":20}},
        {"term": {"title": "python"}}
      ],
      "must_not": {"term":{"salary":30}}
    }
  }
}

# select * from testjob where title=python or (title=django and salary=30)
GET lagou/testjob/_search
{
  "query": {
    "bool": {
      "should": [
        {"term": {"title": "python"}},
        {"bool": {
          "must": [
            {"term": {"title": "django"}},
            {"term": {"salary": 30}}
          ]
        }}
      ]
    }
  }
}
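
To see how the bool clauses combine, the sketch below evaluates a small subset of the query DSL (match_all, term, bool) against in-memory dictionaries. It is a toy model under stated assumptions: the documents are hypothetical, a text field is represented as a list of already-analyzed tokens, and scoring is ignored.

```python
def as_list(clauses):
    """bool accepts either a single clause or a list of clauses."""
    return clauses if isinstance(clauses, list) else [clauses]


def matches(query, doc):
    """Evaluate a small subset of the query DSL against an in-memory document."""
    kind, body = next(iter(query.items()))
    if kind == "match_all":
        return True
    if kind == "term":
        field, value = next(iter(body.items()))
        stored = doc.get(field)
        return value in stored if isinstance(stored, list) else stored == value
    if kind == "bool":
        required = as_list(body.get("must", [])) + as_list(body.get("filter", []))
        optional = as_list(body.get("should", []))
        excluded = as_list(body.get("must_not", []))
        if any(matches(q, doc) for q in excluded):
            return False
        if not all(matches(q, doc) for q in required):
            return False
        # With no must/filter clause, at least one should clause has to match.
        if optional and not required:
            return any(matches(q, doc) for q in optional)
        return True
    raise ValueError(f"unsupported query type: {kind}")


# title=python or (title=django and salary=30)
query = {"bool": {"should": [
    {"term": {"title": "python"}},
    {"bool": {"must": [
        {"term": {"title": "django"}},
        {"term": {"salary": 30}},
    ]}},
]}}

docs = [
    {"title": ["python", "爬虫"], "salary": 10},
    {"title": ["django"], "salary": 30},
    {"title": ["django"], "salary": 20},
]
print([matches(query, d) for d in docs])  # [True, True, False]
```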

Using elasticsearch from Python

Step 1: install with pip install elasticsearch-dsl==5.4.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
When installing, check your elasticsearch version; the library version must be compatible with it.
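
One way to express the version requirement is to pin the library to the same major version as the server. The sketch below assumes that the elasticsearch-dsl major version is meant to track the elasticsearch major version; dsl_requirement is a hypothetical helper that only builds the requirement string.

```python
def dsl_requirement(es_version: str) -> str:
    """Build a pip requirement that pins elasticsearch-dsl to the same major version."""
    major = int(es_version.split(".")[0])
    return f"elasticsearch-dsl>={major}.0.0,<{major + 1}.0.0"


print(dsl_requirement("5.1.2"))  # elasticsearch-dsl>=5.0.0,<6.0.0
```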
