Deduplicating log content with Redis
If you have a similar need, feel free to borrow my approach, crude as it may be.
The precondition for deduplicating this way is that your logs have already been parsed at a fine granularity, so you know exactly which part can repeat.
Extract the data that may repeat and hash it; the hash becomes the Redis key. For each incoming log, first check whether that key exists: if it does not, set the key and keep the log; if it does, treat the log as a duplicate and return early.
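The check-and-set decision above can be sketched in a few lines. This is a self-contained simulation: the `ConcurrentHashMap` stands in for Redis (a real deployment would issue `SET key value NX EX <ttl>` instead), and the key name is a placeholder for the hash of the repeatable fields.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class DedupSketch {
    // Stand-in for Redis; a real deployment would call SET key value NX EX <ttl>.
    private final ConcurrentMap<String, String> store = new ConcurrentHashMap<>();

    /** Returns true the first time a given key is seen, false on repeats. */
    boolean firstSeen(String dedupKey) {
        // putIfAbsent is atomic, like Redis SET ... NX: null means the key was new.
        return store.putIfAbsent(dedupKey, "1") == null;
    }

    public static void main(String[] args) {
        DedupSketch dedup = new DedupSketch();
        // In the real flow the key would be the MD5 hash of the repeatable fields.
        String key = "md5-of-repeatable-fields";
        System.out.println(dedup.firstSeen(key)); // true  -> keep the log
        System.out.println(dedup.firstSeen(key)); // false -> drop as duplicate
    }
}
```

Note that `putIfAbsent` decides "new or duplicate" in one atomic step; the Redis equivalent is checking the reply of `SET ... NX` rather than issuing a separate `EXISTS` first.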
A small example.
Simulating the log sender:
import org.productivity.java.syslog4j.Syslog;
import org.productivity.java.syslog4j.SyslogIF;

import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class Main extends Thread {

    // Sends the same Windows event 5156 repeatedly so the receiver sees duplicates.
    private static void send() {
        SyslogIF syslog = Syslog.getInstance("udp");
        syslog.getConfig().setHost("127.0.0.1");
        syslog.getConfig().setPort(514);
        var buffer = new StringBuffer();
        buffer.append("AUDIT_SUCCESS 5156 The Windows Filtering Platform has allowed a connection.\n" +
                "Application Information:" +
                "\tProcess ID:\t\t4" +
                "\tApplication Name:\tSystem\n" +
                "Network Information:" +
                "\tDirection:\t\tInbound" +
                "\tSource Address:\t\t192.168.0.156" +
                "\tSource Port:\t\t0" +
                "\tDestination Address:\t224.0.0.251" +
                "\tDestination Port:\t\t0" +
                "\tProtocol:\t\t2" +
                "Filter Information:" +
                "\tFilter Run-Time ID:\t168106" +
                "\tLayer Name:\t\tReceive/Accept" +
                "\tLayer Run-Time ID:\t44");
        try {
            // The message contains no percent-escapes, so decode() is effectively a pass-through here.
            syslog.log(0, URLDecoder.decode(buffer.toString(), "utf-8"));
            System.out.println("+++++++send ok++++++++++");
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
    }

    // Same event with a slightly different application name, sent to another host.
    private static void send1() {
        SyslogIF syslog = Syslog.getInstance("udp");
        syslog.getConfig().setHost("192.168.0.13");
        syslog.getConfig().setPort(514);
        var buffer = new StringBuffer();
        buffer.append("AUDIT_SUCCESS 5156 The Windows Filtering Platform has allowed a connection.\n" +
                "Application Information:" +
                "\tProcess ID:\t\t4" +
                "\tApplication Name:\tSystems\n" +
                "Network Information:" +
                "\tDirection:\t\tInbound" +
                "\tSource Address:\t\t192.168.0.156" +
                "\tSource Port:\t\t0" +
                "\tDestination Address:\t224.0.0.251" +
                "\tDestination Port:\t\t0" +
                "\tProtocol:\t\t2" +
                "Filter Information:" +
                "\tFilter Run-Time ID:\t168106" +
                "\tLayer Name:\t\tReceive/Accept" +
                "\tLayer Run-Time ID:\t44");
        try {
            syslog.log(0, URLDecoder.decode(buffer.toString(), "utf-8"));
            System.out.println("+++++++send1 ok++++++++++");
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
    }

    @Override
    public void run() {
        // 100 identical events, then 3 near-duplicates.
        for (int i = 0; i < 100; i++) {
            send();
        }
        for (int i = 0; i < 3; i++) {
            send1();
        }
    }

    public static void main(String[] args) {
        Main main = new Main();
        main.start();
    }
}
After the receiver has parsed the raw log, extract the fields that may repeat:
public CrudeLog removeDuplication(CrudeLog crudeLog) {
    // Hash the fields that may repeat into a Redis key.
    var jedisId = mapToLocalId(crudeLog);
    if (jedisId == null) {
        return crudeLog;
    }
    try (var jedis = jedisPool.getResource()) {
        if (!jedis.exists(jedisId)) {
            // NX/EX: only set if absent, with a TTL so old entries age out.
            // Note: exists() + set() is two round trips and not atomic; the
            // "OK"/null reply of SET ... NX alone tells you whether the key was new.
            jedis.set(jedisId, crudeLog.getFineField().toString(), "NX", "EX", EXPIR_ETIME);
            return crudeLog;
        } else {
            // Key already present: this log is a duplicate, drop it.
            return null;
        }
    }
}
Because in my case the "fields that may repeat" are a map, I first convert the map to JSON and then hash it:
private String mapToLocalId(CrudeLog crudeLog) {
    Map<String, String> fineField = crudeLog.getFineField();
    if (fineField.isEmpty()) {
        return null;
    }
    // Serialize the map to JSON so the hash input is one stable string, then MD5 it.
    // Note: map iteration order affects the JSON and thus the hash; a TreeMap
    // (sorted keys) guarantees the same fields always produce the same key.
    var jsonNode = JsonUtil.OBJECT_MAPPER.convertValue(fineField, JsonNode.class);
    String id = EncryptionUtil.md5Hex(jsonNode.toString());
    LOGGER.info("dedup key (md5 of fineField JSON) :: {}", id);
    return id;
}
public final class JsonUtil {
    public static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
}

public final class EncryptionUtil {
    // Thin wrapper over commons-codec's DigestUtils.
    public static String md5Hex(String content) {
        return DigestUtils.md5Hex(content);
    }
}
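`EncryptionUtil.md5Hex` above leans on commons-codec. If you would rather not pull in that dependency, the same lowercase hex digest can be produced with the JDK's own `MessageDigest`; a minimal sketch:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public final class Md5Util {
    private Md5Util() {}

    /** Lowercase hex MD5 digest of the UTF-8 bytes of content. */
    public static String md5Hex(String content) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(Character.forDigit((b >> 4) & 0xf, 16))
                  .append(Character.forDigit(b & 0xf, 16));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is guaranteed to be present on standard JREs.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5Hex("hello")); // 5d41402abc4b2a76b9719d911017c592
    }
}
```

MD5 is fine here because the key is only a dedup fingerprint, not a security boundary; a wider hash like SHA-256 would work the same way via `MessageDigest.getInstance("SHA-256")`.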
That's the whole processing flow. If you see it differently, let me know (QQ: 1097172038). I'm still a rookie programmer.