Background
The business goal is to analyze the logs that nginx and apache produce every day, monitor information such as URLs, IPs, and REST endpoints, and ship the data to an Elasticsearch service.
Comparison with Flume
- No duplicate consumption, and no data loss.
- Flume's main strength today is its HDFS support (my personal understanding).
Offline installation
First configure JAVA_HOME; Java 8 or later is required.
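A minimal sketch of the environment setup, assuming the JDK was unpacked to /usr/local/jdk1.8.0_144 (a placeholder path; adjust to your actual install):

export JAVA_HOME=/usr/local/jdk1.8.0_144   # assumed JDK location
export PATH=$JAVA_HOME/bin:$PATH
java -version   # should report 1.8 or later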
Standard input and output
bin/logstash -e 'input { stdin {} } output { stdout{} }'
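Each line typed at the console becomes one event. To see the structured event (with @timestamp, host, and message fields) instead of a flat line, swap in the rubydebug codec:

bin/logstash -e 'input { stdin {} } output { stdout { codec => rubydebug } }'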
File to standard output
First, under the Logstash directory, create the config file:
mkdir conf && touch conf/file-stdout.conf
vim conf/file-stdout.conf
input {
  file {
    path => "/home/bingo/data/test.log"
    start_position => "beginning"
    ignore_older => 0
  }
}
output {
  stdout {}
}
Finally, start it:
bin/logstash -f conf/file-stdout.conf
# Multiple files: path => "/home/bingo/data/*.log"
# Multiple directories: path => "/home/bingo/data/*/*.log"
# Parameter notes
start_position: defaults to end, i.e. parsing starts from the end of the file
ignore_older: by default, logs older than 24 hours are not parsed; 0 means no log is ignored however old it is
After running the command, the console prints the contents of the log file.
- This approach keeps monitoring the file for new input.
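One caveat worth knowing: the file input records its read position in a sincedb file, so start_position => "beginning" only applies the first time Logstash sees a file. When testing repeatedly against the same file, a common trick is to throw the bookmark away (a sketch; only sincedb_path is new here):

input {
  file {
    path => "/home/bingo/data/test.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"   # discard the read position; re-read from the top on every run
  }
}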
File to file
- Started the same way as "file to standard output"; only the config file differs:
touch file-file.conf
vim file-file.conf
input {
  file {
    path => "/home/connect/install/data/test.log"
    start_position => "beginning"
    ignore_older => 0
  }
}
output {
  file {
    path => "/home/connect/install/data/test1.log"
  }
  stdout {
    codec => rubydebug
  }
}
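Start it the same way and watch the output file; the file output writes events as JSON by default (one document per line in the plugin versions of this era), so tail is enough to verify:

bin/logstash -f conf/file-file.conf
tail -f /home/connect/install/data/test1.log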
Upstream to Elasticsearch
touch file-es.conf
vim file-es.conf
input {
  file {
    type => "flow"
    path => "/home/bingo/data/logstash/logs/*/*.txt"
    discover_interval => 5
    start_position => "beginning"
  }
}
output {
  if [type] == "flow" {
    elasticsearch {
      index => "flow-%{+YYYY.MM.dd}"
      hosts => ["master01:9200", "worker01:9200", "worker02:9200"]
    }
  }
}
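After starting with bin/logstash -f conf/file-es.conf, the daily indices can be checked directly against the cluster (any host from the list above works):

curl 'http://master01:9200/_cat/indices/flow-*?v'
curl 'http://master01:9200/flow-*/_search?size=1&pretty'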
Upstream to Kafka
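The original notes leave this section empty. A minimal sketch for shipping the same files into Kafka, assuming the Kafka 0.8-era output plugin that matches the zk_connect-style input used below (broker addresses and the topic name are placeholders):

input {
  file {
    path => "/home/bingo/data/logstash/logs/*/*.txt"
    start_position => "beginning"
  }
}
output {
  kafka {
    broker_list => "master01:9092,worker01:9092,worker02:9092"   # assumed broker addresses
    topic_id => "clm_bs_tracking_log_json"
    codec => "json"
  }
}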
Kafka to ES
touch kafka-es.conf
vim kafka-es.conf
input {
  kafka {
    zk_connect => "master01:2181,worker01:2181,worker02:2181"
    auto_offset_reset => "smallest"
    group_id => "bdes_clm_bs_tracking_log_json"
    topic_id => "clm_bs_tracking_log_json"
    consumer_threads => 2
    codec => "json"
    queue_size => 500
    fetch_message_max_bytes => 104857600
  }
}
output {
  elasticsearch {
    hosts => ["A:9900","B:9900","C:9900"]
    document_type => "bs_tracking_log"
    #document_id => "%{[mblnr]}%{[mjahr]}"
    flush_size => 102400
    index => "clm"
    timeout => 10
  }
}
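Start it with:

bin/logstash -f conf/kafka-es.conf

Note that consumer_threads => 2 only pays off if the topic has at least two partitions; threads beyond the partition count sit idle.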