Writing Logs to Elasticsearch with a Custom Log4j2 Appender

Background

Our data-processing jobs are scheduled with DolphinScheduler. The flow is: submit a shell task to DolphinScheduler; the shell runs `spark-submit` to launch a Java program; we then fetch the task log through DolphinScheduler's API (it is also visible in the DolphinScheduler web UI, which we have reskinned and customized). In cluster deploy mode, however, the log we get back is Spark's own submission log (pulling the jar, ACCEPTED, RUNNING, and so on). What we actually need are the `log.info()` messages printed inside our jar, which record which stage of the pipeline the task has reached. Those messages can be seen in the Hadoop/YARN logs, but not in DolphinScheduler. We didn't want the overhead of deploying Logstash for a full ELK stack, or of configuring Flume, so we decided to write the logs straight into Elasticsearch, where they can be viewed and searched. All the existing business logging already goes through `log.info()` and we didn't want to refactor it; after some research we found that Log4j2 supports custom appenders, so that is what this post builds.

The shell script submitted to DolphinScheduler

This is the script our customized front end submits to DolphinScheduler:

/home/software/spark-3.1.2/bin/spark-submit --class cnki.bdms.servicespark.BdcServiceSparkApplication \
    --conf spark.yarn.jars="hdfs://cluster1:8020/bdclib/*" \
    --driver-java-options "-Dspark.yarn.dist.files=/home/software/hadoop-3.3.0/etc/hadoop/yarn-site.xml" \
    --master yarn \
    --deploy-mode cluster \
    --driver-memory 1g \
    --executor-memory 3g \
    --executor-cores 3 \
    --num-executors 3 \
    --conf spark.yarn.maxAppAttempts=5 \
    --conf spark.yarn.preserve.staging.files=true \
    /data/jar/bdcServiceSpark-1.0.0.jar 3627

Writing a custom EsAppender

First, add the dependency to the pom. The artifact we submit is a Spring Boot jar:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-log4j2</artifactId>
</dependency>

The EsAppender implementation

package cnki.bdms.servicespark.utils;


import com.fasterxml.jackson.core.JsonProcessingException;
import org.apache.http.HttpHost;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.apache.http.impl.nio.reactor.IOReactorConfig;
import org.apache.logging.log4j.core.Filter;
import org.apache.logging.log4j.core.Layout;
import org.apache.logging.log4j.core.LogEvent;
import org.apache.logging.log4j.core.appender.AbstractAppender;
import org.apache.logging.log4j.core.config.plugins.Plugin;
import org.apache.logging.log4j.core.config.plugins.PluginAttribute;
import org.apache.logging.log4j.core.config.plugins.PluginElement;
import org.apache.logging.log4j.core.config.plugins.PluginFactory;
import org.apache.logging.log4j.core.layout.PatternLayout;
import org.apache.logging.log4j.util.ReadOnlyStringMap;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
import org.slf4j.MDC;
import org.springframework.util.CollectionUtils;

import java.io.IOException;
import java.io.Serializable;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.UnknownHostException;
import java.util.*;

@Plugin(name = "elasticSearchAppender", category = "Core", elementType = "appender", printObject = true)
public class EsAppender extends AbstractAppender {
    private String address;
    private Integer port;
    private String user;
    private String password;
    private String threshold;
    /**
     * Log buffer size. A value of 1 means every event is flushed to ES
     * immediately; otherwise events accumulate until the buffer reaches
     * bufferSize before being flushed. If bufferSize is 0, append() falls
     * back to a default of 50.
     */
    private int bufferSize;
    /**
     * Buffered log entries waiting to be flushed.
     */
    private List<Map<String, Object>> buffers = new ArrayList<>();
    /**
     * Client used to talk to the ES cluster.
     */
    private RestHighLevelClient client;
    /**
     * Target index to write into.
     */
    private String index;


    protected EsAppender(String name, Filter filter, Layout<? extends Serializable> layout,
                         String address,
                         Integer port,
                         String index,
                         String user,
                         String password,
                         String threshold,
                         int bufferSize) {
        super(name, filter, layout);
        this.address = address;
        this.port = port;
        this.index = index;
        this.user = user;
        this.password = password;
        this.threshold = threshold;
        this.bufferSize = bufferSize;
    }


    private void parseLog(LogEvent event) throws JsonProcessingException, UnknownHostException {
        String applicationId = MDC.get("applicationId");
        Map<String, Object> item = new HashMap<>();
        item.put("className", event.getSource().getClassName());
        item.put("fileName", event.getSource().getFileName());
        item.put("lineNumber", event.getSource().getLineNumber());
        item.put("methodName", event.getSource().getMethodName());
        item.put("serverIp", getIp());
        item.put("logName", event.getLoggerName());
        item.put("logLevel", event.getLevel().toString());
        item.put("logThread", event.getThreadName());
        item.put("logMills", new Date(event.getTimeMillis()));
        item.put("logMessage", JsonHelper.toJSONString(event.getMessage().getFormattedMessage()));
        item.put("applicationId", applicationId);
        // Buffer the event only if it matches the configured level (or threshold
        // is "all") and an applicationId has been set via MDC.
        if ((threshold.equalsIgnoreCase(event.getLevel().toString()) || threshold.equalsIgnoreCase("all"))
                && StringUtil.isNotBlank(applicationId)) {
            buffers.add(item);
        }
    }


    /**
     * Flush the buffered entries to ES and clear the buffer.
     */
    private void flushBuffer() {
        if (!CollectionUtils.isEmpty(buffers)) {
            try {
                bulkLoadMap(index, buffers);
                buffers.clear();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    /**
     * Bulk-write a batch of entries to ES.
     * @param esIndex  target index
     * @param datalist entries to write
     * @throws IOException if the bulk request fails
     */
    public void bulkLoadMap(String esIndex, List<Map<String, Object>> datalist) throws IOException {
        // Lazily create (or reuse) the client
        RestHighLevelClient client = getRestHighLevelClient();
        BulkRequest bulkRequest = new BulkRequest();
        for (Map<String, Object> data : datalist) {
            Object id = data.get("sys_uuid");
            // If the entry carries a sys_uuid field, use it as the document id
            if (id != null) {
                bulkRequest.add(new IndexRequest(esIndex).id(id.toString()).source(data));
            } else {
                // Otherwise let ES generate an id
                bulkRequest.add(new IndexRequest(esIndex).source(data));
            }
        }
        client.bulk(bulkRequest, RequestOptions.DEFAULT);
    }

    /**
     * Lazily build the ES RestHighLevelClient.
     */
    private RestHighLevelClient getRestHighLevelClient() {
        if (client == null) {
            RestClientBuilder builder = RestClient.builder(new HttpHost(address, port, "http"));

            CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
            credentialsProvider.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials(user, password));

            // setHttpClientConfigCallback replaces any previously registered callback,
            // so keepalive and credentials must be configured in a single callback.
            builder.setHttpClientConfigCallback(httpClientBuilder -> httpClientBuilder
                    .setDefaultIOReactorConfig(IOReactorConfig.custom()
                            .setSoKeepAlive(true)
                            .build())
                    .setDefaultCredentialsProvider(credentialsProvider));
            builder.setRequestConfigCallback(requestConfigBuilder -> requestConfigBuilder
                    .setConnectTimeout(40000)
                    .setSocketTimeout(40000));
            client = new RestHighLevelClient(builder);
        }
        return client;
    }

    /**
     * Resolve the host's LAN IP address, preferring site-local addresses.
     */
    private String getIp() throws UnknownHostException {
        try {
            InetAddress candidateAddress = null;
            // Iterate over all network interfaces
            for (Enumeration<NetworkInterface> ifaces = NetworkInterface.getNetworkInterfaces(); ifaces.hasMoreElements(); ) {
                NetworkInterface iface = ifaces.nextElement();
                // Iterate over all addresses bound to this interface
                for (Enumeration<InetAddress> inetAddrs = iface.getInetAddresses(); inetAddrs.hasMoreElements(); ) {
                    InetAddress inetAddr = inetAddrs.nextElement();
                    // Skip loopback addresses
                    if (!inetAddr.isLoopbackAddress()) {
                        if (inetAddr.isSiteLocalAddress()) {
                            // A site-local address is exactly what we want
                            return inetAddr.getHostAddress();
                        } else if (candidateAddress == null) {
                            // No site-local address found yet; remember this one as a candidate
                            candidateAddress = inetAddr;
                        }
                    }
                }
            }
            if (candidateAddress != null) {
                return candidateAddress.getHostAddress();
            }
            // No non-loopback address found; fall back to the JDK's answer
            InetAddress jdkSuppliedAddress = InetAddress.getLocalHost();
            if (jdkSuppliedAddress == null) {
                throw new UnknownHostException("The JDK InetAddress.getLocalHost() method unexpectedly returned null.");
            }
            return jdkSuppliedAddress.getHostAddress();
        } catch (Exception e) {
            UnknownHostException unknownHostException = new UnknownHostException(
                    "Failed to determine LAN address: " + e);
            unknownHostException.initCause(e);
            throw unknownHostException;
        }
    }
    /**
     * Called when the appender is stopped: flush any remaining buffered
     * entries and close the ES client.
     */
    @Override
    public void stop() {
        if (!buffers.isEmpty()) {
            flushBuffer();
        }
        if (client != null) {
            try {
                client.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        super.stop();
    }

    @Override
    public void append(LogEvent event) {
        try {
            parseLog(event);
            if (buffers.size() >= (bufferSize == 0 ? 50 : bufferSize)) {
                flushBuffer();
            }
        } catch (JsonProcessingException | UnknownHostException e) {
            e.printStackTrace();
        }

    }

    @PluginFactory
    public static EsAppender createAppender(@PluginAttribute("name") String name,
                                            @PluginElement("Filter") final Filter filter,
                                            @PluginElement("Layout") Layout<? extends Serializable> layout,
                                            @PluginAttribute("address") String address,
                                            @PluginAttribute("port") Integer port,
                                            @PluginAttribute("index") String index,
                                            @PluginAttribute("user") String user,
                                            @PluginAttribute("password") String password,
                                            @PluginAttribute("threshold") String threshold,
                                            @PluginAttribute("bufferSize") int bufferSize
    ) {
        if (name == null) {
            LOGGER.error("No name provided for EsAppender");
            return null;
        }
        if (layout == null) {
            layout = PatternLayout.createDefaultLayout();
        }
        return new EsAppender(name, filter, layout, address, port, index, user, password, threshold, bufferSize);
    }
}

The log4j2.xml configuration (YAML works too; personal preference). The attribute values here must match the `@PluginAttribute` names defined in the factory method above:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration packages="org.apache.logging.log4j.core,io.sentry.log4j2,cnki.bdms.servicespark.utils">
    <appenders>
        <Console name="Console" target="SYSTEM_OUT">
            <PatternLayout pattern="%X{applicationId} %d{HH:mm:ss.SSS} [%t] %-5level %logger{36} - %msg%n"/>
        </Console>
        <elasticSearchAppender name="esAppender" address="your-es-host-ip" port="9200" index="spark_log"
                    user="admin" password="admin" threshold="info" bufferSize="100">
        </elasticSearchAppender>
    </appenders>
    <loggers>
        <root level="INFO">
            <appender-ref ref="Console"/>
            <appender-ref ref="esAppender"/>
        </root>
    </loggers>
</Configuration>

You will have noticed that we use MDC above. It records the Spark applicationId, so your driver code will look something like this:

try {
    spark = SparkSession.builder()
            .enableHiveSupport()
            .master(master)
            .appName(appName)
            .config(conf)
            .config("spark.sql.shuffle.partition", shufflePartition)
            .config("hive.metastore.uris", metastoreUris)
            .config("spark.yarn.jars", yarnJars)
            .config("spark.sql.warehouse.dir", spark_warehouse_home)
            .config("spark.sql.legacy.timeParserPolicy", timeParserPolicy)
            .getOrCreate();
    String applicationId = spark.sparkContext().applicationId();
    MDC.put("applicationId", applicationId);
    // ... your business logic ...
    // When done:
    MDC.remove("applicationId");
    // Flush whatever is left in the buffer to ES
    removeEsAppender();
} catch (Exception ex) {
    log.info("Spark session initialization failed: " + ex.getMessage());
    throw new RuntimeException(ex);
}

Why the call to removeEsAppender? With this appender attached, the program stays in a running state after finishing (you can see this in IDEA), and the appender's stop() is never triggered. But our stop() carries real logic: it writes the remaining buffered entries (fewer than bufferSize) to ES.

/**
 * Stop and remove the EsAppender: first flush the buffered entries to ES,
 * then remove the appender after stopping it.
 */
private void removeEsAppender() {
    LoggerContext context = LoggerContext.getContext(false);
    org.apache.logging.log4j.core.config.Configuration config = context.getConfiguration();
    Appender esAppender = config.getAppender("esAppender");
    if (esAppender != null && config.getAppenders().remove(esAppender.getName()) != null) {
        esAppender.stop();
        config.getRootLogger().removeAppender(esAppender.getName());
        context.updateLoggers(config);
    }
}

That's it. One last thing: how we get the applicationId from DolphinScheduler. It appears in every task execution log.

We fetch that log through DolphinScheduler's API, extract the applicationId with a regex, and then look up the records with the same applicationId in ES. Done.

Pattern pattern = Pattern.compile("(?<=find app id: )([a-z0-9_]+)");
Matcher matcher = pattern.matcher(data);
if (matcher.find()) {
    String applicationId = matcher.group();
    System.out.println(applicationId);
}
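As a sanity check, here is a minimal self-contained version of that extraction. The log line below is an illustrative sample, not real DolphinScheduler output:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AppIdExtract {
    // Pull out the YARN applicationId that follows "find app id: " in a log line
    static String extract(String data) {
        Pattern pattern = Pattern.compile("(?<=find app id: )([a-z0-9_]+)");
        Matcher matcher = pattern.matcher(data);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        // Illustrative sample line only
        String line = "[INFO] task log: find app id: application_1668000000000_0042 state RUNNING";
        System.out.println(extract(line));  // application_1668000000000_0042
    }
}
```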