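First off, this part has much less filler than the previous ones; it is mostly examples.
1. Avro Source and Thrift Source
The Avro source listens on an Avro port and receives events from external Avro client streams. When it is paired with the built-in Avro sink on another Flume agent (the previous hop), it can form a tiered collection topology. The official wording is a bit roundabout, and my translation is not great either. Put simply, Flume agents can be chained in multiple tiers, and the agents talk to each other over Avro.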
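To make the multi-tier idea concrete, here is a minimal sketch of what an upstream agent might look like. The agent name a2, the spooldir source, and the /tmp/logs directory are assumptions for illustration only; the host and port are the ones the downstream Avro source in this section listens on.

```properties
# Upstream agent "a2" (hypothetical name, not from the original post)
a2.sources = r1
a2.sinks = k1
a2.channels = c1

# Source: pick up files dropped into a directory (path is an assumption)
a2.sources.r1.type = spooldir
a2.sources.r1.spoolDir = /tmp/logs
a2.sources.r1.channels = c1

# Sink: forward events over Avro to the downstream agent's Avro source
a2.sinks.k1.type = avro
a2.sinks.k1.hostname = 192.168.233.128
a2.sinks.k1.port = 55555
a2.sinks.k1.channel = c1

# Channel: buffer events in memory between source and sink
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100
```

The only coupling between the two tiers is that the Avro sink's hostname and port match the bind and port of the downstream Avro source.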
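Below are the source properties from the official documentation. Required properties are shown in bold there (channels, type, bind, and port); the descriptions are self-explanatory, so I won't go through them.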
Property Name | Default | Description |
channels | – | |
type | – | The component type name, needs to be avro |
bind | – | hostname or IP address to listen on |
port | – | Port # to bind to |
threads | – | Maximum number of worker threads to spawn |
selector.type | | |
selector.* | | |
interceptors | – | Space-separated list of interceptors |
interceptors.* | | |
compression-type | none | This can be “none” or “deflate”. The compression-type must match the compression-type of matching AvroSource |
ssl | false | Set this to true to enable SSL encryption. You must also specify a “keystore” and a “keystore-password”. |
keystore | – | This is the path to a Java keystore file. Required for SSL. |
keystore-password | – | The password for the Java keystore. Required for SSL. |
keystore-type | JKS | The type of the Java keystore. This can be “JKS” or “PKCS12”. |
ipFilter | false | Set this to true to enable ipFiltering for netty |
ipFilter.rules | – | Define N netty ipFilter pattern rules with this config. |
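As a rough sketch of how the optional properties in the table combine, the fragment below turns on compression and SSL for an Avro source. The keystore path and password are placeholders, not values from the original post, and the sending side would need matching compression and SSL settings for the connection to work.

```properties
# Avro source with optional compression and SSL (illustrative values only)
a1.sources.r1.type = avro
a1.sources.r1.channels = c1
a1.sources.r1.bind = 192.168.233.128
a1.sources.r1.port = 55555

# Compress traffic; the sender must use the same compression-type
a1.sources.r1.compression-type = deflate

# Enable SSL; keystore and keystore-password become required
# (path and password below are placeholders)
a1.sources.r1.ssl = true
a1.sources.r1.keystore = /path/to/keystore.jks
a1.sources.r1.keystore-password = changeit
a1.sources.r1.keystore-type = JKS
```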
I won't reproduce the official example; here is a real one instead.
# Config file avro_case2.conf; it is identical to pull.conf from Part 2
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.channels = c1
a1.sources.r1.bind = 192.168.233.128
a1.sources.r1.port = 55555
# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Run the command
flume-ng agent -c conf -f conf/avro_case2.conf -n a1 -Dflume.root.logger=INFO,console
I won't go over how to tell whether the agent started successfully; it is the same as with pull.conf in Part 2.
# Then test from another terminal (the port must match the source's port, 55555)
flume-ng avro-client -c conf -H 192.168.233.128 -p 55555 -F /tmp/logs/test.log
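This simulates the push agent from Part 2 sending data to the pull agent; here I skip writing a config file and test directly from the command line.
The events are sent successfully. Unlike the push agent, no spooling directory source is involved here, so the log file is not renamed.
Check the output on the receiving terminal.
OK, the data was sent successfully.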
Thrift Source works essentially the same way as Avro Source. Just change the source type to thrift, for example a1.sources.r1.type = thrift
It is simple enough that I won't go into more detail.
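For completeness, a minimal sketch of that change is shown below. It reuses the bind address and port of the Avro example; the upstream agent a2 and its Thrift sink are assumptions added for illustration. Note that the sender then has to speak Thrift as well, so the avro-client command used above would not work against this source.

```properties
# Receiving agent: same as the Avro example, only the source type changes
a1.sources.r1.type = thrift
a1.sources.r1.channels = c1
a1.sources.r1.bind = 192.168.233.128
a1.sources.r1.port = 55555

# Sending agent "a2" (hypothetical): use a Thrift sink instead of an Avro sink
a2.sinks.k1.type = thrift
a2.sinks.k1.hostname = 192.168.233.128
a2.sinks.k1.port = 55555
a2.sinks.k1.channel = c1
```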
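2. Exec Source
An Exec source is configured with a Unix (Linux) command that is expected to keep producing data on its standard output. If the process exits, the Exec source exits with it and produces no further data.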
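A typical use is tailing a log file. The fragment below is a minimal sketch, assuming the same agent name, logger sink, and memory channel as the Avro example and reusing /tmp/logs/test.log as the file to follow; the restart settings are optional and included only to show how the "process exits" case can be handled.

```properties
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Exec source: run a long-lived command; each output line becomes an event
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /tmp/logs/test.log
a1.sources.r1.channels = c1
# Optional: restart the command if it dies, waiting 10000 ms before retrying
a1.sources.r1.restart = true
a1.sources.r1.restartThrottle = 10000

# Print received events to the agent's log
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1

# Buffer events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
```

Using tail -F rather than tail -f keeps following the file by name, which matters if the log is rotated.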
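Below are the source properties from the official documentation. Required properties are shown in bold there (channels, type, and command); the descriptions are self-explanatory, so I won't go through them.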
Property Name | Default | Description |
channels | – | |
type | – | The component type name, needs to be exec |
command | – | The command to execute |
shell | – | A shell invocation used to run the command. e.g. /bin/sh -c. Required only for commands relying on shell features like wildcards, back ticks, pipes etc. |
restartThrottle | 10000 | Amount of time (in millis) to wait before attempting a restart |
restart | false | Whether the executed cmd should be restarted if it dies |
logStdErr | false | Whether the command’s stderr should be logged |
batchSize | 20 | The max number of lines to