NLP Text Analysis with Elasticsearch and BERT

What's new in Elasticsearch 8.0

https://www.elastic.co/cn/blog/whats-new-elastic-8-0-0

Machine learning algorithms added in newer Elasticsearch versions (for example, anomaly detection)

  • https://www.elastic.co/guide/en/machine-learning/current/anomaly-examples.html

Running Elasticsearch with Docker under WSL2

For how to install Docker under WSL2, see my earlier blog post.

Pull the Elasticsearch Docker image

Obtaining Elasticsearch for Docker is as simple as issuing a docker pull command against the Elastic Docker registry.

docker pull docker.elastic.co/elasticsearch/elasticsearch:8.2.0

Start a single Elasticsearch node

Start a single-node cluster with Docker
If you’re starting a single-node Elasticsearch cluster in a Docker container, security will be automatically enabled and configured for you. When you start Elasticsearch for the first time, the following security configuration occurs automatically:

  • Certificates and keys are generated for the transport and HTTP layers.
  • The Transport Layer Security (TLS) configuration settings are written to elasticsearch.yml.
  • A password is generated for the elastic user.
  • An enrollment token is generated for Kibana.

You can then start Kibana and enter the enrollment token, which is valid for 30 minutes. This token automatically applies the security settings from your Elasticsearch cluster, authenticates to Elasticsearch with the kibana_system user, and writes the security configuration to kibana.yml.

The following commands start a single-node Elasticsearch cluster for development or testing.

Create a new docker network for Elasticsearch and Kibana

docker network create elastic

Start Elasticsearch in Docker. A password is generated for the elastic user and output to the terminal, plus an enrollment token for enrolling Kibana.

docker run --name es01 --net elastic -p 9200:9200 -p 9300:9300 -it docker.elastic.co/elasticsearch/elasticsearch:8.2.0

You might need to scroll back a bit in the terminal to view the password and enrollment token.

Copy the generated password and enrollment token and save them in a secure location. These values are shown only when you start Elasticsearch for the first time.

If you need to reset the password for the elastic user or other built-in users, run the elasticsearch-reset-password tool. This tool is available in the Elasticsearch /bin directory of the Docker container. For example:

docker exec -it es01 /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic

Copy the http_ca.crt security certificate from your Docker container to your local machine.

docker cp es01:/usr/share/elasticsearch/config/certs/http_ca.crt .

Open a new terminal and verify that you can connect to your Elasticsearch cluster by making an authenticated call, using the http_ca.crt file that you copied from your Docker container. Enter the password for the elastic user when prompted.

curl --cacert http_ca.crt -u elastic https://localhost:9200
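
The same check can be scripted. The sketch below is a minimal standard-library Python counterpart to the curl call, assuming the cluster listens on https://localhost:9200 and that http_ca.crt sits in the current directory. The function names build_request and cluster_info are illustrative only and are not part of any Elastic client.

```python
import base64
import json
import ssl
import urllib.request


def build_request(url, user, password):
    """Build a GET request carrying HTTP Basic credentials (like `curl -u`)."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


def cluster_info(url, user, password, ca_path):
    """Call the cluster root endpoint, verifying TLS against the given CA file
    (like `curl --cacert`), and return the decoded JSON response."""
    context = ssl.create_default_context(cafile=ca_path)
    request = build_request(url, user, password)
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)
```

Calling cluster_info("https://localhost:9200", "elastic", password, "http_ca.crt") against the running container should return the same JSON document that curl prints, with the server version under info["version"]["number"].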

Installing Elasticsearch with Docker

Main reference:

  • https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html

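The official Docker registry is slow to reach from mainland China, so before working with images, point Docker at registry mirrors hosted in China.

Create the file /etc/docker/daemon.json with the following content: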

{
    "registry-mirrors" : [
        "https://registry.docker-cn.com",
        "https://docker.mirrors.ustc.edu.cn",
        "http://hub-mirror.c.163.com",
        "https://cr.console.aliyun.com/"
    ]
}

Then restart the Docker service:

systemctl restart docker
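
If /etc/docker/daemon.json already exists, overwriting it would discard any other settings in the file. The following is a minimal sketch of merging the mirror list into an existing file instead. The helper name merge_registry_mirrors is hypothetical, and the mirror URLs are simply copied from the configuration above without checking that they are still reachable.

```python
import json
from pathlib import Path

# Mirror URLs copied from the daemon.json shown above; not verified here.
MIRRORS = [
    "https://registry.docker-cn.com",
    "https://docker.mirrors.ustc.edu.cn",
    "http://hub-mirror.c.163.com",
    "https://cr.console.aliyun.com/",
]


def merge_registry_mirrors(path, mirrors):
    """Add mirrors to the "registry-mirrors" list in a daemon.json file,
    keeping every other key and skipping URLs that are already listed."""
    path = Path(path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}

    merged = list(config.get("registry-mirrors", []))
    for mirror in mirrors:
        if mirror not in merged:
            merged.append(mirror)

    config["registry-mirrors"] = merged
    path.write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return config
```

Running merge_registry_mirrors("/etc/docker/daemon.json", MIRRORS) needs write access to /etc/docker (typically root), and the Docker service still has to be restarted afterwards for the change to take effect.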

Earlier approach: bert-server

https://towardsdatascience.com/elasticsearch-meets-bert-building-search-engine-with-elasticsearch-and-bert-9e74bf5b4cf2

https://github.com/Hironsan/bertsearch
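
A common pattern for this kind of setup is that an external BERT service turns each document into a fixed-length vector, the vector is stored in a dense_vector field, and search results are ranked by cosine similarity through a script_score query. The sketch below builds only the search request body. The field name text_vector and the function name build_similarity_query are assumptions made for illustration, and the query vector is expected to come from the same encoder that produced the indexed vectors.

```python
def build_similarity_query(query_vector, vector_field="text_vector", size=10):
    """Build a script_score search body that ranks documents by cosine
    similarity between query_vector and a dense_vector field.

    1.0 is added to the similarity because Elasticsearch does not accept
    negative scores from a script.
    """
    source = f"cosineSimilarity(params.query_vector, '{vector_field}') + 1.0"
    return {
        "size": size,
        "query": {
            "script_score": {
                # Score every document; swap in a filter query to narrow the set.
                "query": {"match_all": {}},
                "script": {
                    "source": source,
                    "params": {"query_vector": list(query_vector)},
                },
            }
        },
    }
```

The returned dictionary can be sent as the JSON body of a _search request against the index that holds the vectors.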


The Elasticsearch 8.0 approach

To be continued.

Elasticsearch and NLP

https://www.elastic.co/guide/en/machine-learning/master/ml-nlp.html


References

Introduction to modern natural language processing with PyTorch in Elasticsearch

  • https://www.elastic.co/cn/blog/introduction-to-nlp-with-pytorch-models
  • https://eland.readthedocs.io/en/v8.1.0/