Overview
Connecting log4j to Flume lets an application ship its log records straight to a Flume agent, which is a fast and effective way to collect logs. log4j can connect to Flume in two modes: standalone and load-balancing (cluster). Below we take a brief look at each.
Adding the Flume Dependencies
<dependency>
    <groupId>org.apache.flume.flume-ng-clients</groupId>
    <artifactId>flume-ng-log4jappender</artifactId>
    <version>1.9.0</version>
</dependency>
<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-flume-ng</artifactId>
    <version>2.11.2</version>
</dependency>
If these artifacts conflict with the logging libraries already in your project, add an <exclusions> block to the Flume dependency and exclude the conflicting transitive dependencies.
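With the dependency in place, the standalone appender is wired up in log4j.properties. A minimal sketch; the hostname and port here are placeholders and must match the Avro source of your Flume agent:

```properties
# Placeholder address; point these at your Flume agent's avro source
log4j.rootLogger = INFO, flume
log4j.appender.flume = org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.Hostname = flume-host.example.com
log4j.appender.flume.Port = 41414
# When true, delivery failures are logged instead of thrown (the unsafeMode
# flag that appears in the appender source below)
log4j.appender.flume.UnsafeMode = true
```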
Standalone Mode
The standalone appender is org.apache.flume.clients.log4jappender.Log4jAppender. This class has a few key methods:
@Override
public synchronized void append(LoggingEvent event) throws FlumeException {
  //If rpcClient is null, it means either this appender object was never
  //setup by setting hostname and port and then calling activateOptions
  //or this appender object was closed by calling close(), so we throw an
  //exception to show the appender is no longer accessible.
  if (rpcClient == null) {
    String errorMsg = "Cannot Append to Appender! Appender either closed or" +
        " not setup correctly!";
    LogLog.error(errorMsg);
    if (unsafeMode) {
      return;
    }
    throw new FlumeException(errorMsg);
  }
  if (!rpcClient.isActive()) {
    reconnect();
  }
  List<Event> flumeEvents = parseEvents(event);
  try {
    switch (flumeEvents.size()) {
      case 0:
        break;
      case 1:
        rpcClient.append(flumeEvents.get(0));
        break;
      default:
        rpcClient.appendBatch(flumeEvents);
    }
  } catch (EventDeliveryException e) {
    String msg = "Flume append() failed.";
    LogLog.error(msg);
    if (unsafeMode) {
      return;
    }
    throw new FlumeException(msg + " Exception follows.", e);
  }
}
The standalone appender has a flaw, or perhaps it was deliberately designed this way: once the application loses its connection to Flume, rpcClient.isActive() returns false, which triggers a reconnect. The reconnect performs two steps; the source is:
private void reconnect() throws FlumeException {
  close();
  activateOptions();
}
close() clears the Flume connection state, so after it runs rpcClient == null; activateOptions() then tries to reconnect. If activateOptions() fails, rpcClient is never reassigned, so every subsequent logging call falls into the if (rpcClient == null) branch. Short of restarting the application server, the connection can never be re-established. Load-balancing mode avoids this problem, as we will see next.
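The trap can be modeled in a few lines. This is a sketch, not the real Flume classes: rpcClient, close(), and activateOptions() below are stand-ins that mimic the control flow described above, with flumeReachable simulating whether the agent is up during the reconnect.

```java
// Minimal model of the standalone appender's reconnect trap: close()
// nulls the client, and if the subsequent activateOptions() fails,
// the client is never reassigned, so recovery is impossible.
public class ReconnectTrapDemo {
    static Object rpcClient = new Object(); // stand-in for the real RpcClient
    static boolean flumeReachable = false;  // simulate the agent being down

    static void close() {
        rpcClient = null; // connection state is cleared
    }

    static void activateOptions() {
        if (!flumeReachable) {
            return;       // connection attempt failed; rpcClient NOT reassigned
        }
        rpcClient = new Object();
    }

    static void reconnect() {
        close();
        activateOptions();
    }

    public static void main(String[] args) {
        reconnect(); // Flume happens to be down during this reconnect
        // From now on every append() hits the rpcClient == null branch
        System.out.println(rpcClient == null ? "stuck" : "recovered");
    }
}
```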
Load-Balancing (Cluster) Mode
This mode uses org.apache.flume.clients.log4jappender.LoadBalancingLog4jAppender.
It supports three host-selection mechanisms: ROUND_ROBIN, RANDOM, or a custom selector specified by its fully qualified class name.
ROUND_ROBIN picks hosts in order; RANDOM picks one at random.
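As a sketch, the corresponding log4j.properties for load-balancing mode; the host addresses are placeholders, and Hosts takes a space-separated list of host:port pairs:

```properties
log4j.appender.flume = org.apache.flume.clients.log4jappender.LoadBalancingLog4jAppender
# Placeholder addresses; space-separated host:port pairs
log4j.appender.flume.Hosts = flume1.example.com:41414 flume2.example.com:41414
log4j.appender.flume.Selector = ROUND_ROBIN
```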
Back to the source:
public class LoadBalancingLog4jAppender extends Log4jAppender
LoadBalancingLog4jAppender extends the Log4jAppender used in standalone mode.
The key logic is in the append path, which ends up calling this method:
@Override
public void append(Event event) throws EventDeliveryException {
  throwIfClosed();
  boolean eventSent = false;
  Iterator<HostInfo> it = selector.createHostIterator();
  while (it.hasNext()) {
    HostInfo host = it.next();
    try {
      RpcClient client = getClient(host);
      client.append(event);
      eventSent = true;
      break;
    } catch (Exception ex) {
      selector.informFailure(host);
      LOGGER.warn("Failed to send event to host " + host, ex);
    }
  }
  if (!eventSent) {
    throw new EventDeliveryException("Unable to send event to any host");
  }
}
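The failover loop above can be sketched in isolation. This is a simplified model, not the Flume implementation: hosts are plain strings, and a name starting with "dead" simulates an unreachable agent.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the failover loop: try each host the selector yields,
// stop at the first successful delivery, and fail only if every
// host in the list has failed.
public class FailoverLoopDemo {
    public static String sendWithFailover(List<String> hosts) {
        for (String host : hosts) {
            try {
                if (host.startsWith("dead")) {   // simulate a down agent
                    throw new RuntimeException("connection refused");
                }
                return host;                     // eventSent = true; break
            } catch (RuntimeException ex) {
                // selector.informFailure(host) in the real code
                System.out.println("Failed to send event to host " + host);
            }
        }
        throw new RuntimeException("Unable to send event to any host");
    }

    public static void main(String[] args) {
        String winner = sendWithFailover(Arrays.asList("dead-host1", "host2"));
        System.out.println("delivered via " + winner);
    }
}
```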
This method loops over the candidate hosts until the event is sent successfully. What lets cluster mode avoid the unrecoverable-disconnect problem is the code below: before every send it checks the connection, and if the connection has dropped it simply reconnects:
private synchronized RpcClient getClient(HostInfo info)
    throws FlumeException, EventDeliveryException {
  throwIfClosed();
  String name = info.getReferenceName();
  RpcClient client = clientMap.get(name);
  if (client == null) {
    client = createClient(name);
    clientMap.put(name, client);
  } else if (!client.isActive()) {
    try {
      client.close();
    } catch (Exception ex) {
      LOGGER.warn("Failed to close client for " + info, ex);
    }
    client = createClient(name);
    clientMap.put(name, client);
  }
  return client;
}
There is also the else if (!client.isActive()) check, but I have not found where isActive becomes false in load-balancing mode.
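The get-or-recreate caching pattern in getClient() is worth isolating. This is a sketch with a made-up FakeClient in place of the real RpcClient: look up a client by host name, create one if missing, and replace it if it has gone stale.

```java
import java.util.HashMap;
import java.util.Map;

// Model of the per-host client cache: a stale (inactive) client is
// closed and replaced with a fresh one on the next lookup, which is
// why this mode can always recover from a dropped connection.
public class ClientCacheDemo {
    static class FakeClient {
        boolean active = true;
        void close() { active = false; }
    }

    private final Map<String, FakeClient> clientMap = new HashMap<>();

    FakeClient getClient(String name) {
        FakeClient client = clientMap.get(name);
        if (client == null || !client.active) {
            client = new FakeClient();   // createClient(name) in the real code
            clientMap.put(name, client);
        }
        return client;
    }

    public static void main(String[] args) {
        ClientCacheDemo demo = new ClientCacheDemo();
        FakeClient first = demo.getClient("h1");
        first.close();                   // simulate a dropped connection
        FakeClient second = demo.getClient("h1");
        System.out.println(first != second); // a fresh client replaced the stale one
    }
}
```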
Connection Mechanism
The application connects to Flume over a long-lived Netty connection. Here is an excerpt of the relevant code:
@SuppressWarnings("unchecked")
public static RpcClient getInstance(Properties properties)
    throws FlumeException {
  String type = null;
  type = properties.getProperty(
      RpcClientConfigurationConstants.CONFIG_CLIENT_TYPE);
  if (type == null || type.isEmpty()) {
    type = ClientType.DEFAULT.getClientClassName();
  }
  Class<? extends AbstractRpcClient> clazz;
  AbstractRpcClient client;
ClientType.DEFAULT is defined as follows:
public static enum ClientType {
  OTHER(null),
  DEFAULT(NettyAvroRpcClient.class.getCanonicalName()),
  DEFAULT_FAILOVER(FailoverRpcClient.class.getCanonicalName()),
  DEFAULT_LOADBALANCE(LoadBalancingRpcClient.class.getCanonicalName()),
  THRIFT(ThriftRpcClient.class.getCanonicalName());
So the default client is NettyAvroRpcClient.
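The type string checked in getInstance above comes from the client.type property. As a sketch, a Properties file selecting the load-balancing client explicitly; the key names follow my reading of RpcClientConfigurationConstants, and the addresses are placeholders:

```properties
client.type = default_loadbalance
hosts = h1 h2
hosts.h1 = 10.0.0.1:41414
hosts.h2 = 10.0.0.2:41414
host-selector = round_robin
```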