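Notes on optimizing the code of a UDF I wrote.
Code logic: the UDF parses three inputs and combines them with addition and subtraction to get a user's order count at the current moment. The inputs are the offline order count (refreshed at 05:00 every day with the cumulative total up to 24:00 of the previous day, so there are two cases: from 00:00 to 05:00 it covers up to 24:00 of the day before yesterday, and from 05:00 to 24:00 it covers up to 24:00 yesterday), the real-time count of submitted orders (today and yesterday only), and the real-time count of cancelled orders (today and yesterday only).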
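To make the two offline cases concrete, here is a minimal sketch of the arithmetic alone, with all parsing stripped out. The class `OrderCountSketch`, the method `currentCount`, and the sample numbers are hypothetical and only illustrate the rule described above; they are not part of the UDF.

```java
class OrderCountSketch {
    /**
     * Hypothetical helper: combines the offline total with the real-time counters.
     * offlineCoversYesterday is true after the 05:00 refresh, false before it
     * (or when there is no offline data at all, in which case offline is 0).
     */
    static long currentCount(long offline, boolean offlineCoversYesterday,
                             long submittedToday, long submittedYesterday,
                             long cancelledToday, long cancelledYesterday) {
        // Today's real-time delta is always applied.
        long count = offline + submittedToday - cancelledToday;
        // Yesterday's delta is applied only when the offline total does not include it yet.
        if (!offlineCoversYesterday) {
            count += submittedYesterday - cancelledYesterday;
        }
        // A negative result is clamped to zero.
        return Math.max(count, 0L);
    }

    public static void main(String[] args) {
        // After 05:00: the offline total (100) already includes yesterday.
        System.out.println(currentCount(100, true, 3, 7, 1, 2));  // 100 + 3 - 1 = 102
        // Before 05:00: the offline total (95) stops at the day before yesterday.
        System.out.println(currentCount(95, false, 3, 7, 1, 2));  // 95 + 3 - 1 + 7 - 2 = 102
    }
}
```

Both calls describe the same user at different times of day, which is why they agree: the 5 net orders from yesterday are either inside the offline total or added from the real-time counters, never both.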
The original code is as follows:
@UdfFunction(udfName = "FLASH_HISTORY_ORDER_CNT")
public long computeFlashOrderCount(String offlineData, String orderCount, String orderCancel) {
    long offline = 0;
    long count = 0;
    SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd");
    try {
        String today = format.format(new Date(System.currentTimeMillis()));
        String yesterday = format.format(new Date(System.currentTimeMillis() - 86400000));
        Map<String, Long> orderCnt = new HashMap<>();
        if (!StringUtils.isBlank(orderCount)) {
            orderCnt = renderFeatureMap(today, yesterday, orderCount);
        }
        Map<String, Long> orderCnl = new HashMap<>();
        if (!StringUtils.isBlank(orderCancel)) {
            orderCnl = renderFeatureMap(today, yesterday, orderCancel);
        }
        if (!StringUtils.isBlank(offlineData)) { // offline data is present
            String offline_date = offlineData.split(":")[0];
            offline = Long.valueOf(getValue(offlineData.split(":")[1]));
            if (offline_date.equals(yesterday)) { // the offline total is yesterday's
                count = offline + orderCnt.get(today) - orderCnl.get(today);
            } else { // the offline total is not yesterday's
                count = orderCnt.get(today) + orderCnt.get(yesterday) - orderCnl.get(today) - orderCnl.get(yesterday);
                count += offline;
            }
        } else { // no offline data
            for (Long s : orderCnt.values()) {
                count += s;
            }
            for (Long s : orderCnl.values()) {
                count -= s;
            }
        }
        if (count < 0) {
            count = 0;
        }
    } catch (Exception e) {
        log.error("FeatureUdf.fxwGivenDateValue.jsonValue.ValueError", e);
        return 0L;
    }
    return count;
}
private Map<String, Long> renderFeatureMap(String today, String yesterday, String jsonStr) {
    Map<String, Long> map = new HashMap<>();
    if (!StringUtils.isBlank(jsonStr)) {
        Map<String, String> kvMap = (Map<String, String>) JSON.parseObject(jsonStr, Map.class);
        for (String k : kvMap.keySet()) {
            if (k.contains(today) || k.contains(yesterday))
                map.put(k.substring(0, 8), Long.valueOf(getValue(kvMap.get(k))));
        }
        map.computeIfAbsent(today, K -> 0L);
        map.computeIfAbsent(yesterday, K -> 0L);
    } else {
        map.put(today, 0L);
        return map;
    }
    return map;
}
private String getValue(String v) {
    int index;
    if (!StringUtils.isEmpty(v)) {
        if ((index = v.indexOf("^")) > -1) {
            return v.substring(0, index);
        } else {
            return v;
        }
    } else {
        return "";
    }
}
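The original code uses quite a few Map objects, and nearly every one is instantiated up front before it is used. That allocates a lot of short-lived memory on each call. At low QPS this is tolerable, but at high QPS it leads to frequent GC, so the code needed optimizing.
Optimized code: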
@UdfFunction(udfName = "FLASH_HISTORY_ORDER_CNT")
public long computeFlashOrderCount(String offlineData, String orderCount, String orderCancel) {
    SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd");
    long offline = 0;
    long count = 0;
    try {
        Date date = new Date();
        date.setTime(System.currentTimeMillis());
        String today = format.format(date) + "0000";
        date.setTime(System.currentTimeMillis() - 86400000);
        String yesterday = format.format(date) + "0000";
        // index 0 holds today's value, index 1 holds yesterday's
        long[] orderCnt = renderFeatureMap(today, yesterday, orderCount);
        long[] orderCnl = renderFeatureMap(today, yesterday, orderCancel);
        if (StringUtils.isNotBlank(offlineData)) { // offline data is present
            String offline_date = "";
            int index;
            String offlineData_value = getValue(offlineData);
            if ((index = offlineData_value.indexOf(":")) > -1) {
                offline_date = offlineData_value.substring(0, index) + "0000";
                offline = Long.valueOf(offlineData_value.split(":")[1]);
            }
            if (offline_date.equals(yesterday)) { // the offline total is yesterday's
                count = offline + orderCnt[0] - orderCnl[0];
            } else { // the offline total is not yesterday's
                count = orderCnt[0] + orderCnt[1] - orderCnl[0] - orderCnl[1];
                count += offline;
            }
        } else { // no offline data
            count = orderCnt[0] + orderCnt[1] - orderCnl[0] - orderCnl[1];
        }
        if (count < 0) {
            count = 0;
        }
    } catch (Exception e) {
        log.error("FeatureUdf.computeFlashOrderCount.jsonValue.ValueError offlineData:{}, orderCount:{}, orderCancel:{}", offlineData, orderCount, orderCancel);
        return 0L;
    }
    return count;
}
private long[] renderFeatureMap(String today, String yesterday, String jsonStr) {
    // featureArr[0] is today's value, featureArr[1] is yesterday's; both default to 0
    long[] featureArr = new long[2];
    if (StringUtils.isNotBlank(jsonStr)) {
        JSONObject jsonObject = JSON.parseObject(jsonStr);
        if (StringUtils.isNotBlank(jsonObject.getString(today))) {
            featureArr[0] = Long.valueOf(getValue(jsonObject.getString(today)));
        }
        if (StringUtils.isNotBlank(jsonObject.getString(yesterday))) {
            featureArr[1] = Long.valueOf(getValue(jsonObject.getString(yesterday)));
        }
    }
    return featureArr;
}
private String getValue(String v) {
    int index;
    if (!StringUtils.isEmpty(v)) {
        if ((index = v.indexOf("^")) > -1) {
            return v.substring(0, index);
        } else {
            return v;
        }
    } else {
        return "";
    }
}
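Optimization principles:
Replace a map with an array wherever an array will do. Where a map cannot be replaced, either create it with an initial capacity that matches its expected size, or do not create it up front at all.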
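For the case where a map has to stay, the sizing advice might look like the sketch below. It assumes `java.util.HashMap` with its default load factor of 0.75; the class `MapSizingSketch` and its methods are hypothetical names used only for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class MapSizingSketch {
    /** Initial capacity that lets expectedEntries fit without a resize at load factor 0.75. */
    static int capacityFor(int expectedEntries) {
        return (int) (expectedEntries / 0.75f) + 1;
    }

    /** Pre-sized map for exactly two day keys. */
    static Map<String, Long> dayCounts(String today, long todayCount,
                                       String yesterday, long yesterdayCount) {
        Map<String, Long> counts = new HashMap<>(capacityFor(2));
        counts.put(today, todayCount);
        counts.put(yesterday, yesterdayCount);
        return counts;
    }

    /** Nothing to store: hand back the shared immutable empty map instead of allocating. */
    static Map<String, Long> emptyCounts() {
        return Collections.emptyMap();
    }

    public static void main(String[] args) {
        System.out.println(capacityFor(2));
        System.out.println(dayCounts("202403050000", 3L, "202403040000", 7L));
        System.out.println(emptyCounts().isEmpty());
    }
}
```

With only two fixed keys, the `long[2]` used in the optimized code is still the cheaper choice, since it also avoids boxing each value into a `Long`.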
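A note on SimpleDateFormat: during the optimization I first moved it out of the method body into a static variable, but testing showed that SimpleDateFormat is not thread-safe, so I left it inside the method body.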
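One possible way to share a single formatter without the thread-safety problem is `java.time.format.DateTimeFormatter`, which the JDK documents as immutable and thread-safe. The following is a minimal sketch assuming Java 8 or later; the class `DayKeySketch` and the method `dayKey` are hypothetical, and the `"0000"` suffix simply mirrors the key format used in the optimized code.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class DayKeySketch {
    // Safe to share across threads: DateTimeFormatter instances are immutable.
    private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyyMMdd");

    /** Formats a date as yyyyMMdd followed by "0000". */
    static String dayKey(LocalDate day) {
        return day.format(DAY) + "0000";
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.now();      // uses the system default time zone
        System.out.println(dayKey(today));
        System.out.println(dayKey(today.minusDays(1)));
    }
}
```

`minusDays(1)` steps back one calendar day, whereas subtracting 86400000 milliseconds can land on the wrong date around a daylight-saving transition in zones that have one. If `java.time` is not an option, keeping one `SimpleDateFormat` per thread in a `ThreadLocal` is another commonly used approach.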