Chapter 11: Ad Retrieval System, Loading the Full Index (Part 2)

paynmind

Category: Project 2: Ad System. Tags: java, mysql, project, spring

Article link: https://blog.csdn.net/paynmind/article/details/109695528

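This blog is for personal study. The material comes from online sources and is organized here as notes on the key points.

1. Index operations:

Earlier we dumped the index data from the database into files so that the retrieval service can load it and build the full index. The records written to the files do not have the same shape as the index entries, so we need a layer that defines how each record is applied to an index. Because the add, delete, update, and get methods of each index were already defined, this layer only has to call them to build the full index from the file contents.

1.1 Defining the index operation handler:

Define an enum for the type of operation applied to an index.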

public enum OpType {

    ADD,
    UPDATE,
    DELETE,
    OTHER;
}

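The index methods we defined all take a key-value pair, so the handler in this class is written the same way.

/**
 * 1. Indexes are organized into levels, that is, by their dependencies on one another.
 * 2. Loading the full index is really a special case of the "add" operation of the incremental index.
 */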
@Slf4j
public class AdLevelDataHandler {

     private static <K,V> void handleBinlogEvent(IndexAware<K,V> index, K key, V value, OpType type){
         switch (type){
             case ADD:
                 index.add(key, value);
                 break;
             case UPDATE:
                 index.update(key, value);
                 break;
             case DELETE:
                 index.delete(key, value);
                 break;
             default:
                 break;
         }
     }
}
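
To make the dispatch above concrete, here is a minimal, self-contained sketch. The `IndexAware` interface comes from an earlier chapter and is not shown here, so `SimpleIndex`, `MapIndex`, `Op`, and `Dispatcher` are hypothetical stand-ins that only assume the four methods used above (`get`, `add`, `update`, `delete`).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the IndexAware interface from the earlier chapter.
interface SimpleIndex<K, V> {
    V get(K key);
    void add(K key, V value);
    void update(K key, V value);
    void delete(K key, V value);
}

// Hypothetical map-backed index, used only to exercise the dispatcher.
class MapIndex<K, V> implements SimpleIndex<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();

    @Override public V get(K key) { return store.get(key); }
    @Override public void add(K key, V value) { store.put(key, value); }
    @Override public void update(K key, V value) { store.put(key, value); }
    @Override public void delete(K key, V value) { store.remove(key); }
}

// Mirrors the OpType enum above under a different name.
enum Op { ADD, UPDATE, DELETE, OTHER }

class Dispatcher {
    // Routes one key-value pair to the index method that matches the operation type.
    static <K, V> void handle(SimpleIndex<K, V> index, K key, V value, Op type) {
        switch (type) {
            case ADD:
                index.add(key, value);
                break;
            case UPDATE:
                index.update(key, value);
                break;
            case DELETE:
                index.delete(key, value);
                break;
            default:
                // OTHER (or any future type) is ignored
                break;
        }
    }
}
```

Because the dispatcher is generic over `K` and `V`, one method serves every index, whatever its key and value types are. Loading the full index then amounts to calling it with the add operation for every record.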

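1.2 Level-2 index operations:

The ad plan and the ad creative do not depend on any other index, so they are treated as level-2 indexes, and their handlers are built first. The logic: build the index object from the table record, then call the index operation defined earlier.

public static void handleLevel2(AdPlanTable planTable, OpType type){
    AdPlanObject planObject = new AdPlanObject(
            // first argument is the plan ID (it is read back below via getPlanId());
            // the getter name getId() is assumed, since AdPlanTable is not shown here
            planTable.getId(),
            planTable.getUserId(),
            planTable.getPlanStatus(),
            planTable.getStartDate(),
            planTable.getEndDate()
    );
    handleBinlogEvent(
            DataTable.of(AdPlanIndex.class),
            planObject.getPlanId(),
            planObject,
            type
    );
}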

public static void handleLevel2(CreativeTable creativeTable, OpType type){
    CreativeObject creativeObject = new CreativeObject(
            creativeTable.getAdId(),
            creativeTable.getName(),
            creativeTable.getType(),
            creativeTable.getMaterialType(),
            creativeTable.getHeight(),
            creativeTable.getWidth(),
            creativeTable.getAuditStatus(),
            creativeTable.getAdUrl()
    );
    handleBinlogEvent(
            DataTable.of(CreativeIndex.class),
            creativeObject.getAdId(),
            creativeObject,
            type
    );
}

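1.3 Level-3 index operations:

An ad unit is associated with an ad plan, and the creative-unit join table is associated with an ad creative, so both belong to level 3. Compared with level 2, a handler here has to check not only whether its own index object is null but also whether the index objects it depends on are null.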

public static void handleLevel3(AdUnitTable unitTable,OpType type){
     AdPlanObject adPlanObject = DataTable.of(AdPlanIndex.class).get(unitTable.getPlanId());
     if (null == adPlanObject){
         log.error("handleLevel3 found AdPlanObject error:{}",unitTable.getPlanId());
         return;
     }
    AdUnitObject unitObject = new AdUnitObject(
            unitTable.getUnitId(),
            unitTable.getUnitStatus(),
            unitTable.getPositionType(),
            unitTable.getPlanId(),
            adPlanObject
    );
     handleBinlogEvent(
             DataTable.of(AdUnitIndex.class),
             unitObject.getUnitId(),
             unitObject,
             type
     );
}

public static void handleLevel3(CreativeUnitTable creativeUnitTable,OpType type){
    // Check for UPDATE first: this index does not support updates
    if (type == OpType.UPDATE){
        log.error("CreativeUnitIndex not support update");
        return;
    }
    AdUnitObject unitObject = DataTable.of(AdUnitIndex.class).get(creativeUnitTable.getUnitId());
    CreativeObject creativeObject = DataTable.of(CreativeIndex.class).get(creativeUnitTable.getAdId());
    if (null == unitObject || null == creativeObject){
        log.error("AdCreativeUnitTable index error:{}", JSON.toJSONString(creativeUnitTable));
        return;
    }
    CreativeUnitObject creativeUnitObject = new CreativeUnitObject(
            creativeUnitTable.getAdId(),
            creativeUnitTable.getUnitId()
    );
    handleBinlogEvent(
            DataTable.of(CreativeUnitIndex.class),
            CommonUtils.stringConcat(
                    creativeUnitObject.getAdId().toString(),
                    creativeUnitObject.getUnitId().toString()
            ),
            creativeUnitObject,
            type
    );
}

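Note that the key of the creative-unit join index is a concatenated String, so the utility class needs a method that joins all the arguments passed in.

/**
 * Joins the arguments into one string, separated by "-".
 * @param args the parts to join
 * @return the joined string, or an empty string when no arguments are given
 */
public static String stringConcat(String... args){
    // String.join also covers the empty case, where deleting a trailing "-"
    // from an empty buffer would throw StringIndexOutOfBoundsException
    return String.join("-", args);
}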
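
The two ideas in this section, checking the parent indexes first and keying the relation by a concatenated string, can be sketched together as follows. This is only an illustration: `CreativeUnitLinker` and its plain `HashMap` fields are hypothetical stand-ins for `AdUnitIndex`, `CreativeIndex`, and `CreativeUnitIndex`, whose real implementations live in earlier chapters.

```java
import java.util.HashMap;
import java.util.Map;

class CreativeUnitLinker {
    // Hypothetical in-memory stand-ins for the unit and creative indexes.
    final Map<Long, String> units = new HashMap<>();
    final Map<Long, String> creatives = new HashMap<>();
    // Relation index keyed by "adId-unitId"; the value holds the two IDs.
    final Map<String, long[]> links = new HashMap<>();

    // Builds the composite key in the same "a-b" form as stringConcat.
    static String compositeKey(Long adId, Long unitId) {
        return String.join("-", adId.toString(), unitId.toString());
    }

    // Stores the relation only when both parents are already indexed.
    // Returns false (and stores nothing) when either parent is missing.
    boolean link(Long adId, Long unitId) {
        if (!units.containsKey(unitId) || !creatives.containsKey(adId)) {
            return false;
        }
        links.put(compositeKey(adId, unitId), new long[]{adId, unitId});
        return true;
    }
}
```

With unit 20 and creative 10 present, `link(10L, 20L)` stores the relation under the key `10-20`. With either parent missing, nothing is stored, which corresponds to the early `return` after `log.error` in `handleLevel3`.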

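1.4 Level-4 index operations:

The keyword, district, and interest restrictions are associated with the level-3 ad unit, so each restriction is defined as level 4. The logic is similar to before: check that the record's own data and the associated index object exist, then build the key and value and call the handler.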

public static void handleLevel4(UnitDistrictTable unitDistrictTable, OpType type){
     if (type == OpType.UPDATE){
         log.error("district index can not support update");
         return;
     }
     AdUnitObject unitObject = DataTable.of(AdUnitIndex.class).get(unitDistrictTable.getUnitId());
     if (unitObject == null){
         log.error("UnitDistrictTable index error:{}",unitDistrictTable.getUnitId());
         return;
     }
     String key = CommonUtils.stringConcat(
             unitDistrictTable.getProvince(),
             unitDistrictTable.getCity()
     );
     Set<Long> value = new HashSet<>(Collections.singleton(unitDistrictTable.getUnitId()));
     handleBinlogEvent(
             DataTable.of(UnitDistrictIndex.class),
             key,value,
             type
     );
 }

public static void handleLevel4(UnitItTable unitItTable, OpType type){
    if (type == OpType.UPDATE){
        log.error("it index can not support update");
        return;
    }
    AdUnitObject unitObject = DataTable.of(AdUnitIndex.class).get(unitItTable.getUnitId());
    if (unitObject == null){
        log.error("UnitItTable index error:{}",unitItTable.getUnitId());
        return;
    }
    Set<Long> value = new HashSet<>(Collections.singleton(unitItTable.getUnitId()));
    handleBinlogEvent(
            DataTable.of(UnitItIndex.class), // interest index; class name assumed from the naming of the other indexes
            unitItTable.getItTag(),
            value,
            type
    );
}

public static void handleLevel4(UnitKeywordTable unitKeywordTable, OpType type){
    if (type == OpType.UPDATE){
        log.error("keyword index can not support update");
        return;
    }
    AdUnitObject unitObject = DataTable.of(AdUnitIndex.class).get(unitKeywordTable.getUnitId());
    if (unitObject == null){
        log.error("UnitKeywordTable index error:{}",unitKeywordTable.getUnitId());
        return;
    }
    Set<Long> value = new HashSet<>(Collections.singleton(unitKeywordTable.getUnitId()));
    handleBinlogEvent(
            DataTable.of(UnitKeywordIndex.class), // keyword index; class name assumed from the naming of the other indexes
            unitKeywordTable.getKeyword(),
            value,
            type
    );
}
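
All three level-4 handlers pass a one-element `Set<Long>` as the value. That makes sense if the restriction indexes are inverted indexes that merge the incoming set into the set already stored under the key. The sketch below shows that behavior under this assumption. `InvertedIndexSketch` is a hypothetical class, not the series' actual `UnitDistrictIndex`.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class InvertedIndexSketch {
    // Maps a restriction value (keyword, interest tag, "province-city") to unit IDs.
    private final Map<String, Set<Long>> keyToUnits = new ConcurrentHashMap<>();

    // Merges the given unit IDs into the set stored under the key.
    void add(String key, Set<Long> unitIds) {
        keyToUnits
                .computeIfAbsent(key, k -> ConcurrentHashMap.<Long>newKeySet())
                .addAll(unitIds);
    }

    // Removes only the given unit IDs; other units under the same key are kept.
    void delete(String key, Set<Long> unitIds) {
        Set<Long> current = keyToUnits.get(key);
        if (current != null) {
            current.removeAll(unitIds);
        }
    }

    // Returns a read-only view of the unit IDs for the key, or an empty set.
    Set<Long> get(String key) {
        Set<Long> current = keyToUnits.get(key);
        return current == null
                ? Collections.<Long>emptySet()
                : Collections.unmodifiableSet(current);
    }
}
```

Under this design an update has no natural meaning, since a record is just one (key, unit ID) pair that is either present or absent. That is consistent with the handlers above rejecting `OpType.UPDATE`.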

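2. Implementing full index loading:

The previous section implemented the data operations for each object, that is, building the in-memory indexes from the different record types. Next we read the saved full-dump files and load the full index.

The approach is to define a method that reads a file line by line into a list, then an initialization method that loads the full index for each of the levels defined above. Note that loading must proceed in ascending level order: if a record's dependencies have not been loaded yet, the record itself will fail to load.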

@Component
@DependsOn("dataTable")
public class IndexFileLoader {

    /**
     * Loads the full index
     */
    @PostConstruct
    public void init(){
        List<String> adPlanStrings = loadDumpData(
                String.format("%s%s",
                        DConstant.DATA_ROOT_DIR,
                        DConstant.AD_PLAN)
        );
        adPlanStrings.forEach(p -> AdLevelDataHandler.handleLevel2(
                JSON.parseObject(p, AdPlanTable.class),
                OpType.ADD
        ));
        List<String> adCreativeStrings = loadDumpData(
                String.format("%s%s",
                        DConstant.DATA_ROOT_DIR,
                        DConstant.AD_CREATIVE)
        );
        adCreativeStrings.forEach(c -> AdLevelDataHandler.handleLevel2(
                JSON.parseObject(c, CreativeTable.class),
                OpType.ADD
        ));

        List<String> adUnitStrings = loadDumpData(
                String.format("%s%s",
                        DConstant.DATA_ROOT_DIR,
                        DConstant.AD_UNIT
                )
        );
        adUnitStrings.forEach(u -> AdLevelDataHandler.handleLevel3(
                JSON.parseObject(u, AdUnitTable.class),
                OpType.ADD
        ));
        List<String> adCreativeUnitStrings = loadDumpData(
                String.format("%s%s",
                        DConstant.DATA_ROOT_DIR,
                        DConstant.AD_CREATIVE_UNIT
                )
        );
        adCreativeUnitStrings.forEach(cu -> AdLevelDataHandler.handleLevel3(
                JSON.parseObject(cu, CreativeUnitTable.class),
                OpType.ADD
        ));

        List<String> adUnitDistrictStrings = loadDumpData(
                String.format("%s%s",
                        DConstant.DATA_ROOT_DIR,
                        DConstant.AD_UNIT_DISTRICT
                )
        );
        adUnitDistrictStrings.forEach(d -> AdLevelDataHandler.handleLevel4(
                JSON.parseObject(d, UnitDistrictTable.class),
                OpType.ADD
        ));
        List<String> adUnitItStrings = loadDumpData(
                String.format("%s%s",
                        DConstant.DATA_ROOT_DIR,
                        DConstant.AD_UNIT_IT
                )
        );
        adUnitItStrings.forEach(i -> AdLevelDataHandler.handleLevel4(
                JSON.parseObject(i, UnitItTable.class),
                OpType.ADD
        ));
        List<String> adUnitKeywordStrings = loadDumpData(
                String.format("%s%s",
                        DConstant.DATA_ROOT_DIR,
                        DConstant.AD_UNIT_KEYWORD
                )
        );
        adUnitKeywordStrings.forEach(k -> AdLevelDataHandler.handleLevel4(
                JSON.parseObject(k, UnitKeywordTable.class),
                OpType.ADD
        ));
    }

    private List<String> loadDumpData(String fileName){
        try(BufferedReader br = Files.newBufferedReader(Paths.get(fileName))){
            return br.lines().collect(Collectors.toList());
        }catch (IOException ex){
            // pass the original exception as the cause so its stack trace is kept
            throw new RuntimeException(ex.getMessage(), ex);
        }
    }
}
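
The ordering requirement is easy to see in a reduced example. The sketch below is hypothetical throughout: it uses a simple `id,value` line format instead of the JSON lines in the real dump files, and plain maps instead of the real indexes, so that it runs without fastjson or Spring.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class OrderedLoadSketch {
    final Map<Long, String> plans = new HashMap<>();    // level 2: planId -> name
    final Map<Long, Long> unitToPlan = new HashMap<>(); // level 3: unitId -> planId

    // Reads a dump file into memory, one record per line.
    static List<String> readLines(Path file) {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            return reader.lines().collect(Collectors.toList());
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Level 2: each line is "planId,name" (an assumed format for this sketch).
    void loadPlans(List<String> lines) {
        for (String line : lines) {
            String[] parts = line.split(",", 2);
            plans.put(Long.parseLong(parts[0]), parts[1]);
        }
    }

    // Level 3: each line is "unitId,planId". A unit whose plan is not loaded
    // yet is skipped. Returns the number of skipped records.
    int loadUnits(List<String> lines) {
        int skipped = 0;
        for (String line : lines) {
            String[] parts = line.split(",", 2);
            long unitId = Long.parseLong(parts[0]);
            long planId = Long.parseLong(parts[1]);
            if (!plans.containsKey(planId)) {
                skipped++;
                continue;
            }
            unitToPlan.put(unitId, planId);
        }
        return skipped;
    }
}
```

If `loadUnits` runs before `loadPlans`, every unit is skipped because no plan exists yet. Running them in level order indexes each unit whose plan is present. The same reasoning explains why `init()` above loads plans and creatives first, then units and creative-unit relations, and the three restriction files last.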
