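Install Logstash
Download a release from https://www.elastic.co/cn/downloads/past-releases#logstash, picking a version that matches your Elasticsearch installation.
Logstash's main job is to convert data from one format to another and load it into a target store.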
Dataset
Use https://files.grouplens.org/datasets/movielens/ml-latest-small.zip
The archive is only about 1 MB.
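If you prefer to script the download rather than fetch the archive by hand, the following is a minimal sketch using only the Python standard library. The function names `download_dataset` and `extract_dataset` are hypothetical helpers made up for this example, and the sketch assumes the archive unpacks into an `ml-latest-small` directory that contains `movies.csv`, which is the layout the path used later in logstash.conf implies.

```python
import zipfile
from pathlib import Path
from urllib.request import urlretrieve

DATASET_URL = "https://files.grouplens.org/datasets/movielens/ml-latest-small.zip"


def extract_dataset(zip_path, dest_dir):
    """Unpack the archive into dest_dir and return the expected path of movies.csv.

    Assumption: the archive contains a top-level ml-latest-small/ directory.
    """
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
    return Path(dest_dir) / "ml-latest-small" / "movies.csv"


def download_dataset(dest_dir):
    """Download the archive into dest_dir, unpack it, and return the movies.csv path."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    zip_path = dest_dir / "ml-latest-small.zip"
    urlretrieve(DATASET_URL, zip_path)
    return extract_dataset(zip_path, dest_dir)


# Example call (needs network access), left commented out on purpose:
# print(download_dataset(Path.home() / "Downloads"))
```

The path returned by `download_dataset` is the one to put into the `path` setting of the `file` input.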
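Write logstash.conf
This file tells Logstash where to read the data from, how to transform it, and where in Elasticsearch to load it.
input specifies the data source
filter specifies the transformations
output specifies where the transformed data is stored
For the meaning of each parameter, see the official documentation: https://www.elastic.co/guide/en/logstash/7.17/index.html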
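input {
  file {
    # Location of the downloaded dataset
    path => "/Users/xieruixiang/Downloads/ml-latest-small/movies.csv"
    # Read the file from the start and do not persist the read position,
    # so every run imports the whole file again
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  # Parse each line into three named columns
  csv {
    separator => ","
    columns => ["id","content","genre"]
  }
  # Turn "A|B|C" into an array of genres
  mutate {
    split => { "genre" => "|" }
    remove_field => ["path", "host","@timestamp","message"]
  }
  # Split "Title (1995)" at "(" and copy the two parts into title and year
  mutate {
    split => { "content" => "(" }
    add_field => {
      "title" => "%{[content][0]}"
      "year" => "%{[content][1]}"
    }
  }
  # Convert year to an integer, trim title, and drop the helper fields
  mutate {
    convert => {
      "year" => "integer"
    }
    strip => ["title"]
    remove_field => ["path", "host","@timestamp","message","content"]
  }
}
output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "movies"
    # Use the movie id as the document id, so a re-run overwrites documents instead of duplicating them
    document_id => "%{id}"
  }
  # Also print every event to the console
  stdout {}
}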
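To make it easier to see what the filter section does to a single line of movies.csv, here is a rough Python approximation of the same steps. It is not how Logstash implements them. The helper names `transform` and `ruby_like_to_i` are made up for this sketch, the sample row is written in the MovieLens `movieId,title,genres` format, and the integer conversion assumes Logstash keeps only the leading digits of a string such as `1995)` and falls back to 0 otherwise. Fields that Logstash adds on its own, such as `@version`, are ignored.

```python
import csv
import re

# Same column names as in the csv filter
COLUMNS = ["id", "content", "genre"]


def ruby_like_to_i(text):
    """Parse leading digits the way Ruby's String#to_i does; return 0 if there are none.

    Assumption: this matches how the mutate filter converts strings to integers.
    """
    match = re.match(r"\s*[+-]?\d+", text)
    return int(match.group()) if match else 0


def transform(line):
    """Approximate the filter section for one CSV line and return the resulting fields."""
    row = next(csv.reader([line]))               # csv filter with separator ","
    event = dict(zip(COLUMNS, row))
    event["genre"] = event["genre"].split("|")   # first mutate: split genre on "|"
    parts = event.pop("content").split("(")      # second mutate: split content on "("
    event["title"] = parts[0].strip()            # add_field title, later strip
    year_text = parts[1] if len(parts) > 1 else ""
    event["year"] = ruby_like_to_i(year_text)    # convert year to integer
    return event


print(transform("1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy"))
```

For the sample row this prints `{'id': '1', 'genre': ['Adventure', 'Animation', 'Children', 'Comedy', 'Fantasy'], 'title': 'Toy Story', 'year': 1995}`. One thing the sketch makes visible: if a title contains an extra pair of parentheses before the year, the second part after splitting on `(` is not the year, so `year` most likely ends up as 0 for that movie.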
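Run the transformation and import
I put logstash.conf under the /Users/xieruixiang/elasticsearch/conf directory.
# "elasticsearch" is a directory name I chose myself; the Elasticsearch download itself I renamed to es
# This is the layout of the elasticsearch directory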
xieruixiang@xieruixiangdeMacBook-Pro elasticsearch % ls
conf docker es kibana logstash
# Change into the logstash directory and run Logstash to transform and import the data
bin/logstash -f /Users/xieruixiang/elasticsearch/conf/logstash.conf
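Once Logstash has processed the file, one way to check that the documents arrived is to ask Elasticsearch how many documents the movies index holds. The sketch below uses the `_count` endpoint and assumes Elasticsearch is reachable at http://localhost:9200 without authentication, as in the output section of logstash.conf. The helper names `count_url`, `parse_count`, and `fetch_count` are hypothetical.

```python
import json
from urllib.request import urlopen


def count_url(host="http://localhost:9200", index="movies"):
    """Build the URL of the count endpoint for an index."""
    return f"{host}/{index}/_count"


def parse_count(body):
    """Extract the document count from a _count response body."""
    return json.loads(body)["count"]


def fetch_count(host="http://localhost:9200", index="movies"):
    """Query Elasticsearch and return the number of documents in the index."""
    with urlopen(count_url(host, index), timeout=5) as response:
        return parse_count(response.read())


# Example call (needs a running Elasticsearch), left commented out on purpose:
# print(fetch_count())
```

A count greater than zero suggests the import worked. The exact number depends on the dataset version, so no expected value is given here.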