Using logstash-input-file

To learn this plugin, step one is simply to get it running.
First, we create a text file with the following contents:

[sqczm@sqczm first]$ pwd
/opt/logstash-6.7.1/demo/first
[sqczm@sqczm first]$ more users.txt 
name: zhangsan, age: 21, addr: "中国 北京"
name: lisi, age:20,addr:"美国"
name:wangwu,age:19,addr:"beijing"

Next, let's write the Logstash config file:

[sqczm@sqczm first]$ pwd
/opt/logstash-6.7.1/demo/first
[sqczm@sqczm first]$ more first.conf 
input {
    file {
        path => ["/opt/logstash-6.7.1/demo/first/users.txt"]
    }
}
filter {
    
}
output {
    stdout {}
}
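
Incidentally, before starting for real you can have Logstash merely validate the config and exit, which catches syntax errors early. A quick check might look like this:

[sqczm@sqczm logstash-6.7.1]$ bin/logstash -f /opt/logstash-6.7.1/demo/first/first.conf --config.test_and_exit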

Finally, let's start Logstash:

[sqczm@sqczm logstash-6.7.1]$ pwd
/opt/logstash-6.7.1
[sqczm@sqczm logstash-6.7.1]$ bin/logstash -f /opt/logstash-6.7.1/demo/first/first.conf
Sending Logstash logs to /opt/logstash-6.7.1/logs which is now configured via log4j2.properties
[2019-04-20T16:18:32,057][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2019-04-20T16:18:32,083][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"6.7.1"}
[2019-04-20T16:18:40,628][INFO ][logstash.pipeline        ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2019-04-20T16:18:41,060][INFO ][logstash.inputs.file     ] No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"/opt/logstash-6.7.1/data/plugins/inputs/file/.sincedb_ccdcb2b886f0094c5a7fa2ddbbd759e3", :path=>["/opt/logstash-6.7.1/demo/first/users.txt"]}
[2019-04-20T16:18:41,112][INFO ][logstash.pipeline        ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x2ed54826 run>"}
[2019-04-20T16:18:41,202][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2019-04-20T16:18:41,248][INFO ][filewatch.observingtail  ] START, creating Discoverer, Watch with file and sincedb collections
[2019-04-20T16:18:41,658][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}

WTF, the console didn't print the file contents at all?
Let's check the official documentation first. There is a start_position option, and the official logstash-input-file docs describe it like this:

start_position

  • Value can be any of: beginning, end
  • Default value is “end”

Choose where Logstash starts initially reading files: at the beginning or at the end. The default behavior treats files like live streams and thus starts at the end. If you have old data you want to import, set this to beginning.

This option only modifies “first contact” situations where a file is new and not seen before, i.e. files that don’t have a current position recorded in a sincedb file read by Logstash. If a file has already been seen before, this option has no effect and the position recorded in the sincedb file will be used.

Now it makes sense: when this option is not set, reading starts from the end of the file by default, which is exactly why the console printed nothing after startup. Following the docs, let's set it to "beginning". The updated config looks like this:

[sqczm@sqczm logstash-6.7.1]$ pwd
/opt/logstash-6.7.1
[sqczm@sqczm logstash-6.7.1]$ more demo/first/first.conf 
input {
    file {
        path => ["/opt/logstash-6.7.1/demo/first/users.txt"]
        start_position => "beginning"
    }
}
filter {
    
}
output {
    stdout {}
}

With the change in place, let's start it again:

[sqczm@sqczm logstash-6.7.1]$ pwd
/opt/logstash-6.7.1
[sqczm@sqczm logstash-6.7.1]$ bin/logstash -f /opt/logstash-6.7.1/demo/first/first.conf
Sending Logstash logs to /opt/logstash-6.7.1/logs which is now configured via log4j2.properties
[2019-04-20T16:31:36,250][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2019-04-20T16:31:36,274][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"6.7.1"}
[2019-04-20T16:31:44,536][INFO ][logstash.pipeline        ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2019-04-20T16:31:44,864][INFO ][logstash.inputs.file     ] No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"/opt/logstash-6.7.1/data/plugins/inputs/file/.sincedb_ccdcb2b886f0094c5a7fa2ddbbd759e3", :path=>["/opt/logstash-6.7.1/demo/first/users.txt"]}
[2019-04-20T16:31:44,915][INFO ][logstash.pipeline        ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x5e479e9a run>"}
[2019-04-20T16:31:45,008][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2019-04-20T16:31:45,022][INFO ][filewatch.observingtail  ] START, creating Discoverer, Watch with file and sincedb collections
[2019-04-20T16:31:45,443][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}

WTF, we waited until the flowers wilted and still not a single line of file content showed up. Are we missing yet another option? Back to the official docs:

Tracking of current position in watched files

The plugin keeps track of the current position in each file by recording it in a separate file named sincedb. This makes it possible to stop and restart Logstash and have it pick up where it left off without missing the lines that were added to the file while Logstash was stopped.

By default, the sincedb file is placed in the data directory of Logstash with a filename based on the filename patterns being watched (i.e. the path option). Thus, changing the filename patterns will result in a new sincedb file being used and any existing current position state will be lost. If you change your patterns with any frequency it might make sense to explicitly choose a sincedb path with the sincedb_path option.

A different sincedb_path must be used for each input. Using the same path will cause issues. The read checkpoints for each input must be stored in a different path so the information does not override.

Files are tracked via an identifier. This identifier is made up of the inode, major device number and minor device number. In windows, a different identifier is taken from a kernel32 API call.

Sincedb records can now be expired meaning that read positions of older files will not be remembered after a certain time period. File systems may need to reuse inodes for new content. Ideally, we would not use the read position of old content, but we have no reliable way to detect that inode reuse has occurred. This is more relevant to Read mode where a great many files are tracked in the sincedb. Bear in mind though, if a record has expired, a previously seen file will be read again.

Sincedb files are text files with four (< v5.0.0), five or six columns:

  1. The inode number (or equivalent).
  2. The major device number of the file system (or equivalent).
  3. The minor device number of the file system (or equivalent).
  4. The current byte offset within the file.
  5. The last active timestamp (a floating point number)
  6. The last known path that this record was matched to (for old sincedb records converted to the new format, this is blank).

On non-Windows systems you can obtain the inode number of a file with e.g. ls -li.
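
To make the format concrete, here is what a single (v5+) sincedb record might look like; every value below is made up for illustration, with the six columns mapping to inode, major device number, minor device number, byte offset, timestamp, and path as described above:

[sqczm@sqczm logstash-6.7.1]$ more data/plugins/inputs/file/.sincedb_ccdcb2b886f0094c5a7fa2ddbbd759e3
34595674 0 64768 123 1555750668.507419 /opt/logstash-6.7.1/demo/first/users.txt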

So the sincedb file records the current read position of every watched file, which lets Logstash stop and restart without re-reading files from the beginning. And that explains our problem: our very first run already recorded this file's position (at the end of the file) in the sincedb, so start_position => "beginning" no longer has any effect; it only applies to files Logstash has never seen before.
What we need to do, then, is delete that sincedb file. The docs say it lives under the data directory, but if you go looking for it you'll be surprised to find it apparently doesn't exist. Actually, we were wrong: it's a hidden (dot-prefixed) file, which is why we couldn't find it. Run the following command to delete it:

[sqczm@sqczm logstash-6.7.1]$ pwd
/opt/logstash-6.7.1
[sqczm@sqczm logstash-6.7.1]$ ls data/plugins/inputs/file/
[sqczm@sqczm logstash-6.7.1]$ ls -a data/plugins/inputs/file/
.  ..  .sincedb_ccdcb2b886f0094c5a7fa2ddbbd759e3
[sqczm@sqczm logstash-6.7.1]$ rm -rf data/plugins/inputs/file/.sincedb_ccdcb2b886f0094c5a7fa2ddbbd759e3 
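
Incidentally, if you plan to repeat this experiment many times, a common trick on Unix-like systems is to point sincedb_path at /dev/null, so no read position is ever persisted and every run starts fresh. Below is a sketch of such a throwaway test config; it is not used in the steps that follow, where we keep deleting the sincedb by hand:

input {
    file {
        path => ["/opt/logstash-6.7.1/demo/first/users.txt"]
        start_position => "beginning"
        # /dev/null discards all writes, so no read position survives a restart
        sincedb_path => "/dev/null"
    }
}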

With the sincedb file deleted, let's start Logstash once more:

[sqczm@sqczm logstash-6.7.1]$ pwd
/opt/logstash-6.7.1
[sqczm@sqczm logstash-6.7.1]$ bin/logstash -f /opt/logstash-6.7.1/demo/first/first.conf

Sending Logstash logs to /opt/logstash-6.7.1/logs which is now configured via log4j2.properties
[2019-04-20T16:57:38,915][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2019-04-20T16:57:38,939][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"6.7.1"}
[2019-04-20T16:57:47,643][INFO ][logstash.pipeline        ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2019-04-20T16:57:48,093][INFO ][logstash.inputs.file     ] No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"/opt/logstash-6.7.1/data/plugins/inputs/file/.sincedb_ccdcb2b886f0094c5a7fa2ddbbd759e3", :path=>["/opt/logstash-6.7.1/demo/first/users.txt"]}
[2019-04-20T16:57:48,145][INFO ][logstash.pipeline        ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0xc6ca077 run>"}
[2019-04-20T16:57:48,233][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2019-04-20T16:57:48,251][INFO ][filewatch.observingtail  ] START, creating Discoverer, Watch with file and sincedb collections
[2019-04-20T16:57:48,693][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
/opt/logstash-6.7.1/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated
{
    "@timestamp" => 2019-04-20T08:57:48.917Z,
      "@version" => "1",
          "path" => "/opt/logstash-6.7.1/demo/first/users.txt",
          "host" => "sqczm",
       "message" => "name: lisi, age:20,addr:\"美国\""
}
{
    "@timestamp" => 2019-04-20T08:57:48.886Z,
      "@version" => "1",
          "path" => "/opt/logstash-6.7.1/demo/first/users.txt",
          "host" => "sqczm",
       "message" => "name: zhangsan, age: 21, addr: \"中国 北京\""
}
{
    "@timestamp" => 2019-04-20T08:57:48.918Z,
      "@version" => "1",
          "path" => "/opt/logstash-6.7.1/demo/first/users.txt",
          "host" => "sqczm",
       "message" => "name:wangwu,age:19,addr:\"beijing\""
}

What a relief, we finally see the file contents. Now let's take this one step further: as you may have noticed, the data I made up was really meant to look like JSON, so let's update the config to parse it as JSON by adding a json codec:

[sqczm@sqczm logstash-6.7.1]$ pwd
/opt/logstash-6.7.1
[sqczm@sqczm logstash-6.7.1]$ more demo/first/first.conf 
input {
    file {
        path => ["/opt/logstash-6.7.1/demo/first/users.txt"]
        start_position => "beginning"
        codec => "json"
    }
}
filter {
    
}
output {
    stdout {}
}

After the change, remember to delete the sincedb file again:

[sqczm@sqczm logstash-6.7.1]$ pwd
/opt/logstash-6.7.1
[sqczm@sqczm logstash-6.7.1]$ rm -rf data/plugins/inputs/file/.sincedb_ccdcb2b886f0094c5a7fa2ddbbd759e3 
[sqczm@sqczm logstash-6.7.1]$ bin/logstash -f /opt/logstash-6.7.1/demo/first/first.conf
……output omitted……
{
          "tags" => [
        [0] "_jsonparsefailure"
    ],
      "@version" => "1",
    "@timestamp" => 2019-04-20T11:48:45.377Z,
          "path" => "/opt/logstash-6.7.1/demo/first/users.txt",
          "host" => "sqczm",
       "message" => "name: lisi, age:20,addr:\"美国\""
}
{
          "tags" => [
        [0] "_jsonparsefailure"
    ],
      "@version" => "1",
    "@timestamp" => 2019-04-20T11:48:45.332Z,
          "path" => "/opt/logstash-6.7.1/demo/first/users.txt",
          "host" => "sqczm",
       "message" => "name: zhangsan, age: 21, addr: \"中国 北京\""
}
{
          "tags" => [
        [0] "_jsonparsefailure"
    ],
      "@version" => "1",
    "@timestamp" => 2019-04-20T11:48:45.381Z,
          "path" => "/opt/logstash-6.7.1/demo/first/users.txt",
          "host" => "sqczm",
       "message" => "name:wangwu,age:19,addr:\"beijing\""
}

The result is crushing: the tags field shows a _jsonparsefailure error. One glance back at the data and, sure enough, what I wrote isn't valid JSON at all. Let's fix it right away:

[sqczm@sqczm logstash-6.7.1]$ pwd
/opt/logstash-6.7.1
[sqczm@sqczm logstash-6.7.1]$ more demo/first/users.txt 
{"name": "zhangsan", "age": 21, "addr": "中国 北京"}
{"name": "lisi", "age":20,"addr":"美国"}
{"name":"wangwu","age":19,"addr":"beijing"}

With the data fixed, delete the sincedb file once more and restart:

[sqczm@sqczm logstash-6.7.1]$ pwd
/opt/logstash-6.7.1
[sqczm@sqczm logstash-6.7.1]$ rm -rf data/plugins/inputs/file/.sincedb_ccdcb2b886f0094c5a7fa2ddbbd759e3 
[sqczm@sqczm logstash-6.7.1]$ bin/logstash -f /opt/logstash-6.7.1/demo/first/first.conf
……output omitted……
{
      "@version" => "1",
          "path" => "/opt/logstash-6.7.1/demo/first/users.txt",
          "name" => "zhangsan",
    "@timestamp" => 2019-04-20T11:54:55.419Z,
           "age" => 21,
          "addr" => "中国 北京",
          "host" => "sqczm"
}
{
      "@version" => "1",
          "path" => "/opt/logstash-6.7.1/demo/first/users.txt",
          "name" => "lisi",
    "@timestamp" => 2019-04-20T11:54:55.460Z,
           "age" => 20,
          "addr" => "美国",
          "host" => "sqczm"
}
{
      "@version" => "1",
          "path" => "/opt/logstash-6.7.1/demo/first/users.txt",
          "name" => "wangwu",
    "@timestamp" => 2019-04-20T11:54:55.462Z,
           "age" => 19,
          "addr" => "beijing",
          "host" => "sqczm"
}
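
Since the plugin by default treats files as live streams and tails them, you can also try appending a record while Logstash is still running; a new event should show up on the console within a second or two. The record below is made up for illustration:

[sqczm@sqczm logstash-6.7.1]$ echo '{"name":"zhaoliu","age":25,"addr":"上海"}' >> demo/first/users.txt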

That wraps up our logstash-input-file example. The plugin's other options can be explored and practiced against the official documentation; a sketch of a few common ones follows below.
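
As a starting point, here is a sketch of a config exercising a few of the other commonly used options; the values are illustrative only, so check the official documentation for the exact semantics of each:

input {
    file {
        path => ["/opt/logstash-6.7.1/demo/first/*.txt"]  # glob patterns are supported
        exclude => ["*.gz"]               # filename patterns (no directory part) to skip
        start_position => "beginning"     # only affects files with no sincedb record yet
        sincedb_path => "/opt/logstash-6.7.1/data/first.sincedb"  # explicit sincedb location
        stat_interval => 1                # how often (seconds) to check watched files for changes
        ignore_older => 86400             # skip files not modified within the last N seconds
    }
}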
