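This plugin monitors a directory for files, parses each one, and moves it to a specified directory once parsing finishes. I use it to parse CSV files; other file types are left for you to explore.
Here is my configuration; each option is explained below.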
[[inputs.directory_monitor]]
## Override the measurement (table) name
name_override = "four_base_test"
## The directory to monitor and read files from (including sub-directories if "recursive" is true).
directory = "/mydata/Telegraf/input"
#
## The directory to move finished files to (maintaining directory hierarchy from source).
finished_directory = "/mydata/Telegraf/temp"
#
## Setting recursive to true will make the plugin recursively walk the directory and process all sub-directories.
recursive = false
#
## The directory to move files to upon file error.
## If not provided, erroring files will stay in the monitored directory.
error_directory = "/mydata/Telegraf/errorfile"
#
## The amount of time a file is allowed to sit in the directory before it is picked up.
## This time can generally be low but if you choose to have a very large file written to the directory and it's potentially slow,
## set this higher so that the plugin will wait until the file is fully copied to the directory.
directory_duration_threshold = "100ms"
#
## A list of the only file names to monitor, if necessary. Supports regex. If left blank, all files are ingested.
files_to_monitor = ["^.*\\.csv"]
#
## A list of files to ignore, if necessary. Supports regex.
files_to_ignore = [".DS_Store"]
#
## Maximum lines of the file to process that have not yet been written by the
## output. For best throughput set to the size of the output's metric_buffer_limit.
## Warning: setting this number higher than the output's metric_buffer_limit can cause dropped metrics.
# max_buffered_metrics = 10000
#
## The maximum amount of file paths to queue up for processing at once, before waiting until files are processed to find more files.
## Lowering this value will result in *slightly* less memory use, with a potential sacrifice in speed efficiency, if absolutely necessary.
# file_queue_size = 100000
#
## Name a tag containing the name of the file the data was parsed from. Leave empty
## to disable. Be cautious when file name variation is high, as this can increase the cardinality
## significantly. Read more about cardinality here:
## https://docs.influxdata.com/influxdb/cloud/reference/glossary/#series-cardinality
# file_tag = ""
#
## Specify if the file can be read completely at once or if it needs to be read line by line (default).
## Possible values: "line-by-line", "at-once"
# parse_method = "line-by-line"
#
## CSV parser option: the number of header rows at the top of each file.
csv_header_row_count = 1
#
## The dataformat to be read from the files.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "csv"
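In the configuration above, name_override = "four_base_test" sets the measurement (table) name. It is a general option for input plugins, so it can be used with other plugins as well.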
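To illustrate that name_override is not specific to directory_monitor, here is a minimal sketch that applies it to a different input. The cpu input and the name "host_cpu" are only illustrative choices, not part of my setup.

```toml
# Hypothetical example: rename the measurement produced by another input plugin.
[[inputs.cpu]]
  ## Metrics from this input are written as "host_cpu" instead of the default "cpu".
  name_override = "host_cpu"
```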
- directory:
  - Description: The source directory to monitor and read files from.
  - Example: "/path/to/source/directory"
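- finished_directory:
  - Description: The target directory that files are moved to after they have been processed. The directory hierarchy of the source directory is preserved.
  - Example: "/path/to/finished/directory"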
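- recursive:
  - Description: If set to true, the plugin recursively walks the specified directory and its sub-directories and processes all files in them.
  - Default: false
  - Example: true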
- error_directory:
  - Description: Files that fail during processing are moved to this directory. If it is not set, failed files stay in the monitored directory.
  - Example: "/path/to/error/directory"
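- directory_duration_threshold:
  - Description: How long a file must sit in the directory before it is picked up, so that it has been fully written. This can generally be low, but for large files that are written slowly, set it higher so the plugin waits until the file is completely copied.
  - Default: "50ms"
  - Example: "100ms"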
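- files_to_monitor:
  - Description: Only file names matching the given regular expressions are monitored. If left empty, all files are monitored.
  - Example: ["^.*\\.csv"] (monitors .csv files; the pattern has no trailing $, so append one if names such as data.csv.bak must be excluded)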
- files_to_ignore:
  - Description: A list of file names to ignore. Regular expressions are supported.
  - Example: [".DS_Store"] (ignores .DS_Store files)
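- max_buffered_metrics:
  - Description: The maximum number of lines read from a file that have not yet been written by the output. For best throughput, set it to the size of the output's metric_buffer_limit. Setting it higher than metric_buffer_limit can cause dropped metrics.
  - Default: 10000
  - Example: 5000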
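- file_queue_size:
  - Description: The maximum number of file paths queued for processing at once. When the queue is full, the plugin waits for files to be processed before looking for more. A lower value uses slightly less memory but may reduce speed.
  - Default: 100000
  - Example: 50000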
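- file_tag:
  - Description: The name of a tag added to the parsed data; the tag value is the name of the file the data came from. Leave it empty to disable. When file names vary a lot, this can significantly increase series cardinality.
  - Example: "filename"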
- parse_method:
  - Description: How the file is read. Possible values are "line-by-line" and "at-once".
  - Default: "line-by-line"
  - Example: "at-once"
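- data_format:
  - Description: The data format of the files being read. Several formats are supported, and each has its own set of configuration options.
  - Example: "influx" (the file contents are in InfluxDB line protocol)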
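As a rough sketch of how these options fit together for the CSV case, the fragment below keeps max_buffered_metrics equal to the agent's metric_buffer_limit, enables file_tag, and adds a few CSV parser options. The column names device_id and timestamp are hypothetical and assume a CSV whose header row contains them, with the timestamp stored as Unix seconds; adjust them to your own files.

```toml
[agent]
  ## Buffer size used by outputs; kept equal to max_buffered_metrics below.
  metric_buffer_limit = 10000

[[inputs.directory_monitor]]
  name_override = "four_base_test"
  directory = "/mydata/Telegraf/input"
  finished_directory = "/mydata/Telegraf/temp"
  error_directory = "/mydata/Telegraf/errorfile"
  files_to_monitor = ["^.*\\.csv$"]

  ## Do not exceed metric_buffer_limit, otherwise metrics may be dropped.
  max_buffered_metrics = 10000

  ## Adds a "filename" tag to every metric; watch cardinality if names vary a lot.
  file_tag = "filename"

  data_format = "csv"
  csv_header_row_count = 1
  ## Hypothetical columns: assumes the header row contains device_id and timestamp.
  csv_tag_columns = ["device_id"]
  csv_timestamp_column = "timestamp"
  csv_timestamp_format = "unix"
```

With csv_tag_columns set, the named column is written as a tag and the remaining columns become fields; leave these CSV options out if your files have no such columns.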