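Personal Learning Summary
Below I summarize the development work from my internship, organized by the business features I worked on. It also serves as a handover document.
Summary
During my three-month internship in the Search and Recommendation group (Pinpin and Platform R&D Department, Technology R&D, JD Group Innovative Retail), I worked on recommendation development for JD 7Fresh. My work is summarized below:
- Core business contributions: I was responsible for developing "Search Discovery", the "home page fine-ranking model upgrade", and the "Redis fallback for forward-warehouse campaign pages". I also worked with the team to deliver "new-product PB item support", "home page negative feedback", "bundle purchase", and "out-of-stock similar items", and helped the algorithm engineers run data for the search and recommendation fine-ranking model.
- Technical growth: I learned the typical use cases and operations of Kafka and Redis, used Flink to process and forward real-time stream data, and used HiveSQL to process offline data and load the results into jimdb.
- Overall gains: working on real projects let me connect theory with business scenarios and improved my problem-solving and engineering skills. I also gained industry experience that will help in my future career.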
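7Fresh Bundle Purchase and Out-of-Stock Similar Items
February 5 to 21, 2025
Code repository and related information
Code repository: odin-front
Recommendation slots: 621603 (bundle purchase), 621602 (out-of-stock similar items)
Developers: XXX (lead), 周文星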
PRD: https://joyspace.jd.com/pages/RY4qvFAG4r5vUNa6S6FS
TRD: https://joyspace.jd.com/pages/tnDqJu2n5ePFTKAsBJiM
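Bundle purchase: background and goals
In instant retail, purchases concentrate on fruit and vegetables, and users usually order several items at once. Items are often bought together (for example, the several vegetables needed for one dish), so the shopping-guide stage needs a bundle recommendation capability to meet this need. Bundle purchases can also raise the average order value and so have some positive effect on GMV.
Bundle purchase: trigger scenario
Adding item A to the cart triggers bundle purchase, which shows items that go with A (for example, if A is tomatoes, the bundle item can be eggs, which together make scrambled eggs with tomato).
Bundle purchase: main implementation
- Recall: recipe-configured item recall, large-model recall, and recall of items configured in 积理. Recall results are provided by the algorithm side and are read from Redis by the skuId of the item added to the cart.
- Ranking: items are ranked with the current conversion-related model.
- Reranking: mainly category dispersal and exposure demotion. Bundle items are dispersed by level-4 category, so that no level-4 category repeats within every 3 items; exposure demotion keeps the same items from appearing too often.
- Filtering: remove out-of-stock and delisted items.
- Result size: at most 20 and at least 3 items; if fewer than 3 remain, nothing is returned.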
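The reranking and result-size rules above can be sketched as follows. This is a minimal illustration, not the odin-front implementation: the `Item` type, the class name, and the greedy strategy are hypothetical, "every 3 items" is read as a sliding window of 3 consecutive items, and items that cannot be placed without a category clash are simply dropped.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

final class BundleReranker {
    static final int MIN_ITEMS = 3;  // from the rule above: fewer than 3 means return nothing
    static final int MAX_ITEMS = 20; // from the rule above: at most 20 items
    static final int WINDOW = 3;     // assumption: sliding window of 3 consecutive items

    /** Hypothetical item type: a SKU id plus its level-4 category id. */
    static final class Item {
        final String skuId;
        final String cat4;

        Item(String skuId, String cat4) {
            this.skuId = skuId;
            this.cat4 = cat4;
        }
    }

    /** Greedy dispersal; the input is assumed to be sorted by ranking score, best first. */
    static List<Item> rerank(List<Item> ranked) {
        List<Item> remaining = new LinkedList<>(ranked);
        List<Item> out = new ArrayList<>();
        while (!remaining.isEmpty() && out.size() < MAX_ITEMS) {
            Item picked = null;
            for (Item candidate : remaining) {
                if (!clashes(out, candidate)) {
                    picked = candidate;
                    break;
                }
            }
            if (picked == null) {
                break; // assumption: items that cannot be placed are dropped
            }
            remaining.remove(picked);
            out.add(picked);
        }
        return out.size() < MIN_ITEMS ? Collections.<Item>emptyList() : out;
    }

    /** True if the candidate's category already appears among the last WINDOW - 1 placed items. */
    private static boolean clashes(List<Item> out, Item candidate) {
        int from = Math.max(0, out.size() - (WINDOW - 1));
        for (int i = from; i < out.size(); i++) {
            if (out.get(i).cat4.equals(candidate.cat4)) {
                return true;
            }
        }
        return false;
    }
}
```

For a ranked input with categories A, A, B, C, A (SKUs 1 to 5), this sketch returns SKUs 1, 3, 4, 2; the last item is dropped because it would repeat category A inside the window.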
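Out-of-stock similar items: trigger scenario
When a searched item is out of stock, related items are shown below it. At this stage the feature reuses the bundle purchase recall channels directly; the main goal is that some items are shown below the search bar when the item is out of stock.
During my time on it, out-of-stock similar items was built mainly by following bundle purchase, and the remaining details are essentially the same, so they are not repeated here.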
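7Fresh Home Page Negative Feedback
February 17 to March 13, 2025
Code repository and related information
Code repositories: odin-front (engineering side), flink_user_cl_jdq (data side)
Recommendation slot: 621610 (7Fresh home page recommendation slot)
Developers: 周文星 (lead), XXX (lead)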
PRD: https://joyspace.jd.com/pages/8Gy3Se6crMOkC8PUO3hj
TRD: https://joyspace.jd.com/pages/tnDqJu2n5ePFTKAsBJiM
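7Fresh data overview
- Dwell time on the 7Fresh home page: 352 s (about 5.9 minutes).
- Browse depth of the 7Fresh home page recommendation feed: 17 SKUs, compared with 33 for competitor 盒马 and 30 for 小象.
- Refresh-and-browse count per user on the 7Fresh home page recommendation feed: 2.74.
Exposure tracking optimization
- Old rule: an exposure is reported as soon as any part of the item is visible.
- New rule: in the 7Fresh app, an item exposure is reported only after the item is 30% visible and has stayed visible for 300 ms.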
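The difference between the two reporting rules can be written as two predicates. This is only an illustrative sketch: the class and method names are hypothetical, the real check runs in the app's tracking code, and whether the 30% and 300 ms thresholds are inclusive is an assumption.

```java
final class ExposureRule {
    static final double MIN_VISIBLE_RATIO = 0.30; // from the new rule: 30% of the item visible
    static final long MIN_DWELL_MS = 300;         // from the new rule: visible for 300 ms

    /** Old rule: report as soon as any part of the item is visible. */
    static boolean shouldReportOld(double visibleRatio) {
        return visibleRatio > 0.0;
    }

    /** New rule: report only when enough of the item is visible for long enough. */
    static boolean shouldReportNew(double visibleRatio, long dwellMs) {
        // Assumption: both thresholds are inclusive.
        return visibleRatio >= MIN_VISIBLE_RATIO && dwellMs >= MIN_DWELL_MS;
    }
}
```

An item that is 10% visible for one second counts as an exposure under the old rule but not under the new one, which is why exposure counts collected after the change are stricter.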
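Home page negative feedback: background and goals
On the 7Fresh home page recommendation feed, a user who visits several times in one day sees the same items every time, which has led to feedback that the home page recommendations never change. The requirement is therefore to vary the feed based on the user's recent behavior toward the items shown.
Home page negative feedback: trigger scenario
If, within a given time window, an item on the user's home page has more than xx exposures, no more than xx clicks, and no more than xx add-to-cart actions, the item is demoted so that the items shown on the user's home page change. The xx thresholds are configured online through DUCC, JD's internal configuration management tool (DUCC can be thought of as similar to Nacos).
There are two time windows: real-time (the last 6 minutes) and offline (the last 7 days). While the user browses the home page, items that meet the real-time or the offline demotion criteria are demoted. The two demotions stack: an item that meets both criteria is demoted twice. Demotion is applied after the fine-ranking model has produced the weights, by multiplying the weight by a factor smaller than 1. (The weight from the fine-ranking model is a relative value that expresses how relevant an item is to the user; the larger the weight, the more relevant the item.)
The goal is to make the user's home page recommendations change, so why use both a real-time and an offline window? Intuitively, real-time demotion alone might seem enough. Consider, however, items that the user is not interested in but that are officially promoted: their final weight tends to be high, so with real-time demotion only, the user is likely to see them every time they open 7Fresh. The offline window is the safety net: if an item was exposed on the user's home page a given number of times within 7 days while clicks and add-to-cart actions stayed at or below their thresholds (for example, clicks <= 1 and never added to the cart), it is demoted. This broadly ensures that, over a period of time, the home page does not keep showing items that were exposed many times during the week but rarely clicked or added to the cart.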
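The stacking of the two demotions can be illustrated with a small sketch. The `Rule` and `Counts` types and all numbers are hypothetical (in the project the thresholds and factors come from DUCC), and the comparison follows the operator code in this document: exposures strictly above the threshold, clicks and add-to-cart actions at or below theirs.

```java
final class StackedDemotion {

    /** Hypothetical rule holder: thresholds plus a penalty factor below 1. */
    static final class Rule {
        final int expoThreshold;
        final int clkThreshold;
        final int cartThreshold;
        final double penalty;

        Rule(int expoThreshold, int clkThreshold, int cartThreshold, double penalty) {
            this.expoThreshold = expoThreshold;
            this.clkThreshold = clkThreshold;
            this.cartThreshold = cartThreshold;
            this.penalty = penalty;
        }

        boolean matches(Counts c) {
            return c.expo > expoThreshold && c.clk <= clkThreshold && c.cart <= cartThreshold;
        }
    }

    /** Hypothetical per-SKU counts inside one time window. */
    static final class Counts {
        final int expo;
        final int clk;
        final int cart;

        Counts(int expo, int clk, int cart) {
            this.expo = expo;
            this.clk = clk;
            this.cart = cart;
        }
    }

    /** Applies the real-time and the offline penalty independently, so they multiply when both match. */
    static double demote(double weight, Rule realTime, Counts last6Min, Rule offline, Counts last7Days) {
        double w = weight;
        if (realTime.matches(last6Min)) {
            w *= realTime.penalty;
        }
        if (offline.matches(last7Days)) {
            w *= offline.penalty;
        }
        return w;
    }
}
```

With a hypothetical real-time factor of 0.5 and offline factor of 0.8, an item with weight 0.9 that meets both criteria ends at 0.9 × 0.5 × 0.8 = 0.36, while an item that meets only the offline criteria ends at 0.72.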
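Home page negative feedback: main implementation
The work has two parts: the data side and the engineering side.
The data side collects three streams of user behavior data (exposure, click, and add-to-cart) from the 子午线 platform, where tracking events are reported in real time, processes them with Flink, and writes the result to the 星汇 platform, which stores user profiles and user behavior.
The exposure pipeline below is used as an example to show basic Flink usage.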
@Slf4j
public class FreshEPAllTopology {
private static final String ORANGE_COMPONENT_NEW_FEEDS_SKU_EXPO = "orangeComponent_newFeeds_skuExpose";
public static void main(String[] args) throws Exception {
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(60_000); // checkpoint every 1 minute
env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);
env.getCheckpointConfig().setMinPauseBetweenCheckpoints(30_000); // at least 30 seconds between checkpoints
env.setParallelism(10);
env.setBufferTimeout(10); // lower the buffer timeout (in ms) to reduce latency
KafkaParameters jdqAppKP = KafkaParameters.builder()
.user(JdqConf.USERNAME)
.password(JdqConf.FRESH_APP_NB_EP_CONSUMER_PASSWORD)
.clientId(JdqConf.FRESH_APP_NB_EP_CONSUMER_CLIENTID)
.topicList(Collections.singletonList(JdqConf.FRESH_APP_NB_EP_TOPIC))
.build();
jdqAppKP.selfValidate();
Properties props = jdqAppKP.properties4Consumer();
props.setProperty("fetch.max.wait.ms","500");
FlinkKafkaConsumer<String> consumerApp = new FlinkKafkaConsumer<>(
jdqAppKP.getTopicList(),
new SimpleStringSchema(),
props
);
KafkaParameters jdqXHKP = KafkaParameters.builder()
.user(JdqConf.FRESH_USER_BEHAVIOR_USERNAME)
.password(JdqConf.FRESH_USER_BEHAVIOR_PASSWORD)
.clientId(JdqConf.FRESH_USER_BEHAVIOR_CLIENTID)
.topicList(Collections.singletonList(JdqConf.FRESH_USER_BEHAVIOR_TOPIC))
.build();
jdqXHKP.selfValidate();
// Let a construction failure propagate: swallowing it would leave producer null and only fail later in addSink
FlinkKafkaProducer<String> producer = new FlinkKafkaProducer<>(
JdqConf.FRESH_USER_BEHAVIOR_TOPIC,
new SimpleStringSchema(),
jdqXHKP.properties4Producer()
);
producer.setLogFailuresOnly(false);
producer.setWriteTimestampToKafka(false);
try {
SourceFunction<String> sourceAppFunction = (SourceFunction) RichFunctionProxy.getRichFunctionProxy(consumerApp, new KafkaSecurityProviderCleaner());
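DataStream<AddUserBehaviorInDto> ep = env.addSource(sourceAppFunction) // consumer parallelism should equal the number of topic partitions
.name("ep-appEp2Dto")
// appEp2Dto returns an empty DTO (never null) for invalid records, so also drop DTOs without a SKU id
.map((MapFunction<String, AddUserBehaviorInDto>) FreshEPAllTopology::appEp2Dto).setParallelism(10).filter(dto -> dto != null && StringUtils.isNotEmpty(dto.getItem_sku_id())).setParallelism(10);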
DataStream<String> flattenedStream = ep.flatMap(new FreshSkuAllExposureAddUserSinkFunction()) .setParallelism(10);
flattenedStream.addSink(producer).setParallelism(10).name("user_behavior_sink");
env.execute("user_behavior_appEpStr");
} catch (Exception e) {
log.error("FreshEPAllTopology Producer is error: ", e);
}
}
private static AddUserBehaviorInDto appEp2Dto(String appData) {
AddUserBehaviorInDto addUserBehaviorInDto = new AddUserBehaviorInDto();
try {
log.info("FreshEPAllTopology appEp2Dto -> appData:{}", JSON.toJSON(appData));
JSONObject jsonObject = JSON.parseObject(appData);
String pin = jsonObject.getString("user_log_acct");
String uuid = jsonObject.getString("browser_uniq_id");
String time=jsonObject.getString("request_time_sec");
String eventId = jsonObject.getString("event_id");
if (StringUtils.isEmpty(eventId)) {
return addUserBehaviorInDto;
}
if (StringUtils.isEmpty(pin) && StringUtils.isEmpty(uuid)) {
return addUserBehaviorInDto;
}
String skuId = jsonObject.getString("skuId");
if (StringUtils.isEmpty(skuId)) {
skuId = jsonObject.getString("sku");
}
String jsonParamStr = jsonObject.getString("json_param");
if (StringUtils.isEmpty(skuId)
&& StringUtils.isNotEmpty(jsonParamStr)
&& jsonParamStr.length() > 5) {
JSONObject jsonParam = JSON.parseObject(jsonParamStr);
skuId = jsonParam.getString("skuId");
}
if (StringUtils.isEmpty(skuId)) {
skuId = jsonObject.getString("sku_id");
}
if (StringUtils.isEmpty(skuId)) {
return addUserBehaviorInDto;
}
if (StringUtils.isNotEmpty(pin)) {
addUserBehaviorInDto.setUser_id(pin);
addUserBehaviorInDto.setUser_id_type(UserIdTypeEnum.UserPinType.getValue());
} else if (StringUtils.isNotEmpty(uuid)) {
addUserBehaviorInDto.setUser_id(uuid);
addUserBehaviorInDto.setUser_id_type(UserIdTypeEnum.UuidType.getValue());
}
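// NOTE: the fallback is in milliseconds, so request_time_sec is assumed to carry milliseconds as well; verify against the tracking schema
addUserBehaviorInDto.setTime(StringUtils.isNotEmpty(time) ? Long.parseLong(time) : System.currentTimeMillis());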
addUserBehaviorInDto.setVersion(0);
addUserBehaviorInDto.setItem_sku_id(skuId);
addUserBehaviorInDto.setEvent_Id(eventId.trim());
addUserBehaviorInDto.setData_name("7fresh_sku_exposure_all");
addUserBehaviorInDto.setData_access_token("55d9d7d3dd2f4703");
log.info("FreshEPAllTopology appEp2Dto -> addUserBehaviorInDto:{}", JSON.toJSON(addUserBehaviorInDto));
return addUserBehaviorInDto;
} catch (Exception e) {
log.error("FreshEPAllTopology 7fresh_sku_exposure_all :{}", appData);
log.error("FreshEPAllTopology 7fresh_sku_exposure_all value is error: ", e);
}
return addUserBehaviorInDto;
}
}
@Slf4j
public class FreshSkuAllExposureAddUserSinkFunction extends RichFlatMapFunction<AddUserBehaviorInDto, String> {
@Override
public void open(Configuration parameters) throws Exception {
boolean status = DataCenter.init();
if (status) {
log.info("DataCenter initialized successfully");
}else {
log.error("DataCenter initialization failed");
}
}
@Override
public void flatMap(AddUserBehaviorInDto addUserBehaviorInDto, Collector<String> collector) throws Exception {
log.info("FreshEPAllTopology AddUserSinkFunction exposure in addUserBehaviorInDto: {}", JSON.toJSON(addUserBehaviorInDto));
UserBehaviorDataProto.UserBehaviorData.Builder userBehaviorDataBuilder = UserBehaviorDataProto.UserBehaviorData.newBuilder();
UserBehaviorDataProto.UserBehaviorInfo.Builder userBehaviorInfobuilder = UserBehaviorDataProto.UserBehaviorInfo.newBuilder();
// Exposure data only
String skuId = addUserBehaviorInDto.getItem_sku_id();
if(StringUtils.isBlank(skuId)){
return;
}
UserBehaviorDataProto.ExtendInfo.Builder extendInfo = UserBehaviorDataProto.ExtendInfo.newBuilder();
extendInfo.setSessionId(addUserBehaviorInDto.getEvent_Id());
userBehaviorInfobuilder.setExtendInfo(extendInfo);
userBehaviorInfobuilder.setItemType(UserBehaviorDataProto.ItemType.SKU);
userBehaviorInfobuilder.setItemId(skuId);
userBehaviorInfobuilder.setTime(addUserBehaviorInDto.getTime());
userBehaviorDataBuilder.addInfos(userBehaviorInfobuilder);
UserBehaviorDataProto.UserBehaviorData mergerUserBehaviorData = UserBehaviorService.fetchData(addUserBehaviorInDto.getData_name(), addUserBehaviorInDto.getVersion(),
addUserBehaviorInDto.getUser_id(), addUserBehaviorInDto.getUser_id_type(), addUserBehaviorInDto.getData_access_token());
if (Objects.nonNull(mergerUserBehaviorData)) {
List<UserBehaviorDataProto.UserBehaviorInfo> infos = mergerUserBehaviorData.getInfosList();
log.info("==FreshEPAllTopology show UserBehaviorInfo size:{}", infos.size());
if (CollectionUtils.isNotEmpty(infos)){
// 864000000L is 10 days in milliseconds
long miniTime = System.currentTimeMillis() - 864000000L;
try{
List<UserBehaviorDataProto.UserBehaviorInfo> collect = infos.stream()
// keep only records from the last 10 days
.filter(info -> info != null && info.getTime() > miniTime)
.collect(Collectors.toList());
if (CollectionUtils.isNotEmpty(collect)){
userBehaviorDataBuilder.addAllInfos(collect);
}
}catch (Exception e){
log.error("FreshEPAllTopology collect is error: ", e);
}
}
}
UserBehaviorDataProto.UserBehaviorData userData = userBehaviorDataBuilder.build();
String jsonString = JsonFormat.printer().print(userData);
log.info("FreshEPAllTopology AddUserSinkFunction exposure in build: {}", jsonString);
byte[] originalData = userData.toByteArray();
byte[] compressedData = Snappy.compress(originalData);
UserBehaviorDto userBehaviorDto = new UserBehaviorDto();
userBehaviorDto.setUser_id(addUserBehaviorInDto.getUser_id());
userBehaviorDto.setData_name(addUserBehaviorInDto.getData_name());
userBehaviorDto.setTime(addUserBehaviorInDto.getTime());
userBehaviorDto.setUser_id_type(addUserBehaviorInDto.getUser_id_type());
userBehaviorDto.setVersion(addUserBehaviorInDto.getVersion());
userBehaviorDto.setData_access_token(addUserBehaviorInDto.getData_access_token());
userBehaviorDto.setValue(Base64.getEncoder().encodeToString(compressedData));
collector.collect(JSON.toJSONString(userBehaviorDto));
log.info("FreshEPAllTopology AddUserSinkFunction exposure userBehaviorDto build end: {}", userBehaviorDto);
}
}
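Engineering side
- After fine ranking, two operators handle real-time demotion and offline demotion respectively. Both read the user's behavior data: real-time (the last 6 minutes) and offline (the last 7 days).
- From the real-time behavior data, build three maps (the offline operator builds the same three): SKU to exposure count, SKU to click count, and SKU to add-to-cart count.
- Iterate over the items recommended for the home page in this round. If an item is present in the SKU-to-exposure-count map, read its exposure, click, and add-to-cart counts from the three maps (default 0) and check whether it meets the demotion criteria: exposure count > a, click count <= b, and add-to-cart count <= c. If it does, multiply the item's weight by the demotion factor d. The values a, b, c, and d can all be configured online and take effect immediately.
Real-time demotion is used as the example below. We first added the graph node ${freshRealTimeExposureReweighter} to home page recommendation slot 621610 and defined its upstream and downstream dependencies, its operator FreshFeedRealTimeExposureReweighterProcessor, and the parameters the operator needs.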
freshRealTimeExposureReweighter = ${vertex} {
name = "freshRealTimeExposureReweighter"
dependencies = [{"name" = "broadwayRequest"}, {"name" = "hint"},{"name" = "userBehavior"}, {"name" = "behaviorReweighterResult"}]
emitters = ["freshRealTimeExposureReweighter"]
processor = "com.jd.rec.odin.processor.fresh.FreshFeedRealTimeExposureReweighterProcessor"
processorConf = {
mapping = {}
recItemName = behaviorReweighterResult
userBehaviorString = userBehavior
processorSwitch = true
lastNumSwitch = false // whether to count by number of records instead of by time window
minuteLevelDowngradeExpId = ""
expoEventIds = "expoEventIdList"
clkEventIds = "clkEventIdList"
cartEventIds = "cartEventIdList"
}
}
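Operator code
/**
* Real-time browsing-behavior demotion.
* SKUs that were exposed to the user within the last 6 minutes but were neither clicked nor added to the cart are demoted.
*/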
public class FreshFeedRealTimeExposureReweighterProcessor extends AbstractProcessor<List<RecItem>> {
private static final Logger LOGGER = HotSwitchLogger.of(FreshFeedRealTimeExposureReweighterProcessor.class);
@Parameter
private boolean processorSwitch;
@Parameter
private boolean lastNumSwitch;
@Parameter
protected String recItemName;
@Parameter
private String minuteLevelDowngradeExpId;
@Parameter
private String userBehaviorString;
@Parameter
private String expoEventIds;
@Parameter
private String clkEventIds;
@Parameter
private String cartEventIds;
@Parameter private Map<String, String> mapping = new HashMap<>();
@Inject
private FreshFeedRealTimeReweightDuccManager freshFeedRealtimeReweightDuccManager;
@Inject
private FreshFeedbackConfigDuccManager freshFeedbackConfigDuccManager;
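/**
* Applies real-time exposure demotion to the recommended items.
* @param vertex the vertex (graph node) currently being processed
* @return the recommended items after demotion
*/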
@Override
public CompletableFuture<List<RecItem>> process (Vertex vertex) {
// Stopwatch for logging how long this method takes
Stopwatch stopwatch = Stopwatch.createStarted();
BroadwayRequest broadwayRequest = VertexUtil.getBroadwayRequest(vertex);
// Recommended items passed in by the upstream operator
List<RecItem> recItems = VertexDependencyUtil.getDependencyData(vertex, recItemName);
if (!processorSwitch || CollectionUtils.isEmpty(recItems) || StringUtils.isBlank(minuteLevelDowngradeExpId)) {
CxlsDebugLogUtil.afterItem(vertex, recItems);
return CompletableFuture.completedFuture(recItems);
}
// 1 Read the user's behavior data from 星汇: exposure, click, and add-to-cart records of the configured views
UserBehavior userBehavior = VertexDependencyUtil.getDependencyData(vertex, this.mapping.getOrDefault(userBehaviorString, "userBehavior"));
if (Objects.isNull(userBehavior)){
return CompletableFuture.completedFuture(recItems);
}
Map<BehaviorType, List<BehaviorItem>> behaviors = userBehavior.getBehaviors();
// Without any behavior data there is nothing to demote; this also avoids a NullPointerException below
if (MapUtils.isEmpty(behaviors)) {
return CompletableFuture.completedFuture(recItems);
}
List<BehaviorItem> expoList = behaviors.get(BehaviorType.PAGE);
List<BehaviorItem> clkList = behaviors.get(BehaviorType.CLK7F);
List<BehaviorItem> cartList = behaviors.get(BehaviorType.CART7F);
// 2 Read from DUCC the eventIds a record must match to be counted, e.g. orangeComponent_newFeeds_skuExpose for exposure
Map<String, Set<String>> freshFeedbackConfig = freshFeedbackConfigDuccManager.getFreshFeedbackConfig();
if (MapUtils.isEmpty(freshFeedbackConfig)) {
return CompletableFuture.completedFuture(recItems);
}
Set<String> expoEventIdSet = freshFeedbackConfig.get(expoEventIds);
Set<String> clkEventIdSet = freshFeedbackConfig.get(clkEventIds);
Set<String> cartEventIdSet = freshFeedbackConfig.get(cartEventIds);
// 3 Look up the demotion rule in DUCC by the experiment id that was hit
FreshFeedbackReweightRule freshFeedRealTimeReweightRule = getFreshFeedRealTimeReweightRule(minuteLevelDowngradeExpId);
int expoCountRange = freshFeedRealTimeReweightRule.getExpoCountRange();
int clkCountRange = freshFeedRealTimeReweightRule.getClkCountRange();
int cartCountRange = freshFeedRealTimeReweightRule.getCartCountRange();
double penaltyWeight = freshFeedRealTimeReweightRule.getPenaltyWeight();
int lastNum = freshFeedRealTimeReweightRule.getLastNum();
long lastTime = freshFeedRealTimeReweightRule.getLastTime();
Map<Long, Integer> expoSkuCountMap;
Map<Long, Integer> clkSkuCountMap;
Map<Long, Integer> cartSkuCountMap;
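// 4 Process the behavior data from step 1 with the eventIds from step 2: count exposures, clicks, and add-to-cart actions per SKU within the window (6 minutes)
if (lastNumSwitch){ // count over the most recent lastNum records (e.g. 60)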
expoSkuCountMap = handlerDataAndBuildMapByLastNum(expoEventIdSet,expoList,lastNum);
clkSkuCountMap = handlerDataAndBuildMapByLastNum(clkEventIdSet,clkList,lastNum);
cartSkuCountMap = handlerDataAndBuildMapByLastNum(cartEventIdSet,cartList,lastNum);
}else {
expoSkuCountMap = handlerDataAndBuildMap(expoEventIdSet,expoList,lastTime);
clkSkuCountMap = handlerDataAndBuildMap(clkEventIdSet,clkList,lastTime);
cartSkuCountMap = handlerDataAndBuildMap(cartEventIdSet,cartList,lastTime);
}
if (MapUtils.isEmpty(expoSkuCountMap)){
return CompletableFuture.completedFuture(recItems);
}
if (broadwayRequest != null && broadwayRequest.getDebug()){
addDebug(vertex, "Map<Long, Integer> expoSkuCountMap"+JSON.toJSONString(expoSkuCountMap)+
"Map<Long, Integer> clkSkuCountMap"+JSON.toJSONString(clkSkuCountMap)+
"Map<Long, Integer> cartSkuCountMap"+JSON.toJSONString(cartSkuCountMap)
);
}
// 5 Iterate over recItems; for each SKU present in expoSkuCountMap, read its exposure, click, and add-to-cart counts and decide whether to demote it
List<RecItem> items = getRecItems(recItems, expoSkuCountMap, clkSkuCountMap, cartSkuCountMap, expoCountRange, clkCountRange, cartCountRange, penaltyWeight);
if (CollectionUtils.isEmpty(items)){
return CompletableFuture.completedFuture(recItems);
}
if (broadwayRequest != null && broadwayRequest.getDebug()) {
addDebug(vertex, "FreshFeedRealTimeExposureReweighterProcessor itemsReweightResult" + JSON.toJSONString(items)
);
}
LOGGER.info("FreshFeedRealTimeExposureReweighterProcessor, elapsed: {} ms", stopwatch.elapsed(TimeUnit.MILLISECONDS));
return CompletableFuture.completedFuture(items);
}
private Map<Long, Integer> handlerDataAndBuildMapByLastNum(Set<String> eventIdList, List<BehaviorItem> behaviorList, int lastNum) {
// 1 Keep only items whose eventId is in the configured set, then extract their SKUs
List<Long> behaviorSku = new ArrayList<>();
// eventIdList can be null when the DUCC config has no entry for this behavior type
if (CollectionUtils.isNotEmpty(behaviorList) && CollectionUtils.isNotEmpty(eventIdList)) {
behaviorSku = behaviorList.stream().filter(item -> {
// drop items with an empty event or an invalid timestamp
if (StringUtils.isNotEmpty(item.getEvent()) && item.getDatetime() > 0) {
return eventIdList.contains(item.getEvent());
}
return false;
}).map(BehaviorItem::getSku).collect(Collectors.toList()); // data from 星汇 is already sorted by time, most recent first
}
// Keep only the most recent lastNum SKUs (e.g. 60)
behaviorSku = behaviorSku.subList(0, Math.min(behaviorSku.size(), lastNum));
// 2 Count how many times each recently seen SKU occurs
Map<Long, Integer> expoSkuCountMap = new HashMap<>();
if (CollectionUtils.isEmpty(behaviorSku)){
return expoSkuCountMap;
}
return behaviorSku.stream().collect(Collectors.groupingBy(Function.identity(), summingInt(e -> 1)));
}
private static List<RecItem> getRecItems(List<RecItem> recItems, Map<Long, Integer> expoSkuCountMap, Map<Long, Integer> clkSkuCountMap, Map<Long, Integer> cartSkuCountMap, int expoCountRange, int clkCountRange, int cartCountRange, double penaltyWeight) {
// NumberUtils.isDigits guards Long.valueOf below (isNumber would also accept values such as "1.5" or "0x1A"); items with a non-numeric itemId are dropped
return recItems.stream().filter(z -> StringUtils.isNotBlank(z.getItemId()) && NumberUtils.isDigits(z.getItemId())).peek(x-> {
Long itemId = Long.valueOf(x.getItemId());
if (expoSkuCountMap.containsKey(itemId)){
Integer expoCnt = expoSkuCountMap.getOrDefault(itemId,0);
Integer clkCnt = clkSkuCountMap.getOrDefault(itemId,0);
Integer cartCnt = cartSkuCountMap.getOrDefault(itemId,0);
// demote when exposures exceed the threshold while clicks and add-to-cart actions stay at or below theirs
if (expoCnt > expoCountRange && clkCnt <= clkCountRange && cartCnt <= cartCountRange){
x.setWeight(x.getWeight() * penaltyWeight);
}
}
}).collect(Collectors.toList());
}
private Map<Long, Integer> handlerDataAndBuildMap(Set<String> eventIdList, List<BehaviorItem> behaviorList, long lastTime) {
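// 1 Keep only items inside the time window whose eventId is in the configured set
List<Long> behaviorSku = new ArrayList<>();
// eventIdList can be null when the DUCC config has no entry for this behavior type
if (CollectionUtils.isNotEmpty(behaviorList) && CollectionUtils.isNotEmpty(eventIdList)) {
long earliest = System.currentTimeMillis() - lastTime; // computed once instead of once per item
behaviorSku = behaviorList.stream().filter(item -> {
if (StringUtils.isNotEmpty(item.getEvent()) && item.getDatetime() > earliest){
return eventIdList.contains(item.getEvent());
}
return false;
}).map(BehaviorItem::getSku).collect(Collectors.toList());
}
// 2 Count per SKU how many matching events fall inside the window (e.g. the last 6 minutes)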
Map<Long, Integer> expoSkuCountMap = new HashMap<>();
if (CollectionUtils.isEmpty(behaviorSku)){
return expoSkuCountMap;
}
return behaviorSku.stream().collect(Collectors.groupingBy(Function.identity(), summingInt(e -> 1)));
}
public FreshFeedbackReweightRule getFreshFeedRealTimeReweightRule(String minuteLevelDowngradeExpId) {
// Read all home page negative-feedback demotion rules from DUCC
Map<String, FreshFeedbackReweightRule> freshFeedRealTimeReweightRules = freshFeedRealtimeReweightDuccManager.getFreshFeedRealtimeReweightRules();
// Return the default rule when the map is empty or has no entry for this experiment id
if (MapUtils.isEmpty(freshFeedRealTimeReweightRules) || !freshFeedRealTimeReweightRules.containsKey(minuteLevelDowngradeExpId)) {
return getDefaultRule();
}
return freshFeedRealTimeReweightRules.get(minuteLevelDowngradeExpId);
}
private FreshFeedbackReweightRule getDefaultRule() {
return new FreshFeedbackReweightRule(1, 1, 1, 1.0, 3600000, 60);
}
}
7Fresh Search Discovery
March 10 to 19, 2025
Code repository and related information
Code repository: cxls-search-gateway
Code location: the SearchQueryService class
Developers: 周文星 (lead), 张兆军 (lead)
PRD: https://joyspace.jd.com/pages/UZ6x8jXInhxCZzlXNxZi
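Search Discovery ("Guess what you want to search"): background and goals
Background: the search landing page is responsible for guiding searches and distributing items, and it is a lever for improving search experience and search efficiency. The 7Fresh search landing page currently offers only "search history" and "hot searches" and has no way to guide users to a quick search. In instant retail, user needs are fairly concentrated, mostly on fresh-food categories. Building a capability that analyzes the user's search intent and guides quick searches is therefore a fast way to improve search traffic, search experience, and accurate item distribution.
Requirement: "Search Discovery", that is "Guess what you want to search", takes the product words of the items in the user's real-time search, click, add-to-cart, and order behavior, recalls the product words of items similar to each product word (these recalled product words are called recommended words), and returns the recommended words to the front end. This keeps the content of "Guess what you want to search" changing with the user's behavior.
Search Discovery: trigger scenario
When the user searches, the "Search Discovery" area below the search bar is this feature.
7Fresh Search Discovery: main implementation
The implementation runs the two methods U2I and I2I in parallel on a thread pool.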
/**
* @program: cxls-search-gateway
* @description:
* @packagename: com.jd.cxls.service.impl
* @author: zhangzhaojun
* @date: 2025-02-13 18:32
**/
@Slf4j
@Service
@UMP("service")
public class SearchQueryService implements SearchQueryServiceImpl {
@Autowired
private UserServerRpc userServerRpc;
@Autowired
private ExecutorService horizontalExecutor;
@Resource
private JimdbUtil jimdbUtil;
@Override
public List<RecQueryVo> guessQueryList(GuessQueryListRequest request) {
log.info("SearchQueryService -> guessQueryList request:{}", JsonUtils.toJson(request));
if(StringUtils.isBlank(request.getPin())){
return douDiList();
}
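// 1 Fetch the search words produced by U2I and I2I in parallel
Future<List<RecQueryVo>> i2IFuture = horizontalExecutor.submit(()-> queryI2I(request));
Future<List<RecQueryVo>> u2IFuture = horizontalExecutor.submit(()-> queryU2I(request));
List<RecQueryVo> u2IList = null;
List<RecQueryVo> i2IList = null;
// Use two separate try-catch blocks so that a failure in one task does not prevent reading the other task's result
try {
u2IList = u2IFuture.get();
} catch (InterruptedException e) {
Thread.currentThread().interrupt(); // restore the interrupt flag
log.error("u2IFuture InterruptedException -> e:", e);
} catch (ExecutionException e) {
log.error("u2IFuture ExecutionException -> e:", e);
}
try {
i2IList = i2IFuture.get();
} catch (InterruptedException e) {
Thread.currentThread().interrupt(); // restore the interrupt flag
log.error("i2IFuture InterruptedException -> e:", e);
} catch (ExecutionException e) {
log.error("i2IFuture ExecutionException -> e:", e);
}
// If both U2I and I2I are empty, return the fallback words
if(CollectionUtils.isEmpty(i2IList) && CollectionUtils.isEmpty(u2IList)){
return douDiList();
}
// 2 Process the U2I and I2I search words; I2I is placed before U2I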
String posStr = DuccConfiguration.getConfigValue(DuccConfigKeyConstant.QUERY_QUERY_LIST_CHANNELS,"10,20");
String limitStr = DuccConfiguration.getConfigValue(DuccConfigKeyConstant.QUERY_QUERY_LIST_LIMIT,"30");
List<Integer> posList = Lists.newArrayList();
if(StringUtils.isBlank(posStr)){
posList.add(10);
posList.add(20);
}else{
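posList = Arrays.stream(posStr.split(","))
.map(String::trim) // tolerate spaces around the commas in the DUCC value
.map(Integer::parseInt)
.collect(Collectors.toList());
}
// 2.1 If I2I is empty, return U2I, capped at the configured limit (default 30)
if(CollectionUtils.isEmpty(i2IList)){
return u2IList.size()>Integer.parseInt(limitStr)? u2IList.subList(0, Integer.parseInt(limitStr)): u2IList;
}
// 2.2 If U2I is empty, return I2I, capped at the configured limit (default 30)
if(CollectionUtils.isEmpty(u2IList)){
return i2IList.size()>Integer.parseInt(limitStr)? i2IList.subList(0, Integer.parseInt(limitStr)): i2IList;
}
// 2.3 If both exist, keep the first posList.get(0) U2I words and the first posList.get(1) I2I words (defaults: 10 and 20)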
u2IList = u2IList.size() > posList.get(0) ? u2IList.subList(0, posList.get(0)) : u2IList;
i2IList = i2IList.size() > posList.get(1) ? i2IList.subList(0, posList.get(1)) : i2IList;
// 2.4 Merge I2I and U2I into recQueryVos
List<RecQueryVo> recQueryVos = mergeU2IAndI2I(i2IList, u2IList, limitStr);
if(CollectionUtils.isEmpty(recQueryVos)){
return douDiList();
}
// 2.5 Filter out blacklisted words and return
String blackStr = DuccConfiguration.getConfigValue(DuccConfigKeyConstant.QUERY_BLACK_LIST,"");
if(StringUtils.isBlank(blackStr)){
log.info("SearchQueryService -> blackStr is null");
return recQueryVos;
}
log.info("SearchQueryService -> blackStr is: {}", blackStr);
List<String> blackList = Arrays.asList(blackStr.split(","));
List<RecQueryVo> vos = recQueryVos.stream().filter(x -> !blackList.contains(x.getQuery())).collect(Collectors.toList());
log.info("SearchQueryService -> recQueryVos:{}", JsonUtils.toJson(vos));
return vos;
}
private static List<RecQueryVo> mergeU2IAndI2I(List<RecQueryVo> i2IList, List<RecQueryVo> u2IList, String limitStr) {
List<RecQueryVo> mergedList = new ArrayList<>();
HashSet<String> uniqueQueries = new HashSet<>();
// I2I search words come before U2I search words
for (int i = 0; i < i2IList.size(); i++) {
if (uniqueQueries.add(i2IList.get(i).getQuery())) {
mergedList.add(i2IList.get(i));
}
}
for (int i = 0; i < u2IList.size(); i++) {
if (uniqueQueries.add(u2IList.get(i).getQuery())) {
mergedList.add(u2IList.get(i));
}
}
return mergedList.size() > Integer.parseInt(limitStr) ? mergedList.subList(0, Integer.parseInt(limitStr)) : mergedList;
}
// Fallback logic
private List<RecQueryVo> douDiList() {
List<RecQueryVo> douDiList = Lists.newArrayList();
String redisKey = JimdbConstant.QIXIAN_CNXS + "null";
String str = jimdbUtil.get(redisKey);
if(StringUtils.isBlank(str)){
return null;
}
List<String> stringList = Arrays.asList(str.split(","));
stringList.forEach(x->{
RecQueryVo recQueryVo = new RecQueryVo();
recQueryVo.setQuery(x.trim());
recQueryVo.setSource("doudi");
douDiList.add(recQueryVo);
});
return douDiList;
}
// U2I
private List<RecQueryVo> queryU2I(GuessQueryListRequest request) {
List<RecQueryVo> recQueryVos = Lists.newArrayList();
String redisKey = JimdbConstant.QIXIAN_U2I + request.getPin();
String skuString = jimdbUtil.get(redisKey);
if(StringUtils.isBlank(skuString)){
return null;
}
List<String> skuInfoList = Arrays.asList(skuString.split(","));
if(CollectionUtils.isEmpty(skuInfoList)){
return null;
}
List<byte[]> redisList= buildRedisKeys(skuInfoList);
List<String> wordList = jimdbUtil.mget(redisList);
if(CollectionUtils.isEmpty(wordList)){
return null;
}
for(String word: wordList){
if(StringUtils.isBlank(word) || word.length()>7){
continue;
}
RecQueryVo recQueryVo = new RecQueryVo();
recQueryVo.setQuery(word.trim());
recQueryVo.setSource("u2i");
recQueryVos.add(recQueryVo);
}
return recQueryVos;
}
// I2I
private List<RecQueryVo> queryI2I(GuessQueryListRequest request) {
List<RecQueryVo> recQueryVos = Lists.newArrayList();
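// 1 Read the user's most recent behavior items (search, click, add-to-cart, order) from 星汇
UserServerProto.UserInfoRequest.Builder pluginRequest = buildUserInfoRequest(request);
if (pluginRequest == null) { // buildUserInfoRequest returns null when both pin and uuid are empty
return recQueryVos;
}
pluginRequest.setToken("e8c6aaf6a7872443");
pluginRequest.setSourceId("cxls_odin");
String requestStr = Base64.getEncoder().encodeToString(pluginRequest.build().toByteArray());
String rec = userServerRpc.getUserInfoMap(requestStr);
// Guard against a blank response so that the Base64 decode below does not throw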
if (StringUtils.isBlank(rec)) {
return recQueryVos;
}
byte[] data = Base64.getDecoder().decode(rec);
// 2 Extract the most recent 15 infos each for search, click, add-to-cart, and order (at most 60 in total) into infos
String searcherKey = "1.70.35"; // 70.35 is the view id for search on the 星汇 platform
String clkKey = "1.70.34";
String cartKey = "1.70.36";
String ordKey = "1.70.37";
List<UserBehaviorDataProto.UserBehaviorInfo> infos = Lists.newArrayList();
Set<String> searchItemIds = Sets.newHashSet();
try {
UserServerProto.UserInfoMapResponse response = UserServerProto.UserInfoMapResponse.parseFrom(data);
Map<String, ByteString> responseUserRawInfoMap = response.getUserRawInfoMap();
searchItemIds = queryUserBehaviorDataAndSearchItemId(responseUserRawInfoMap, searcherKey, infos);
queryUserBehaviorData(responseUserRawInfoMap, clkKey, infos);
queryUserBehaviorData(responseUserRawInfoMap, cartKey, infos);
queryUserBehaviorData(responseUserRawInfoMap, ordKey, infos);
} catch (Exception e) {
log.error("Error parsing or decompressing user behavior data", e);
}
if (CollectionUtils.isEmpty(infos)) {
log.info("SearchQueryService -> infos is null");
return recQueryVos;
}
// 3 Sort infos by time, newest first, and extract the itemIds (skuIds)
List<UserBehaviorDataProto.UserBehaviorInfo> infoList = infos.stream().sorted(Comparator.comparing(UserBehaviorDataProto.UserBehaviorInfo::getTime).reversed()).collect(Collectors.toList());
List<String> itemIds = infoList.stream().map(UserBehaviorDataProto.UserBehaviorInfo::getItemId).collect(Collectors.toList());
if (CollectionUtils.isEmpty(itemIds)) {
log.info("SearchQueryService I2I -> itemIds is null");
return recQueryVos;
}
// 4 Build redis keys from the itemIds of step 3 and look up the product words in jimdb
List<byte[]> redisQueryKeys = buildQueryRedisKeys(itemIds);
List<String> wordList = jimdbUtil.mget(redisQueryKeys);
if (CollectionUtils.isEmpty(wordList)) {
log.info("SearchQueryService -> wordList is null");
return recQueryVos;
}
// 5 Build keys from the product words of step 4 and look up the recommended words in jimdb
List<byte[]> wordKeys = buildQueryRedisKeys(wordList);
List<String> wordsToQuery = jimdbUtil.mget(wordKeys);
if (CollectionUtils.isEmpty(wordsToQuery)) {
log.info("SearchQueryService -> wordsToQuery is null");
return recQueryVos;
}
// 6 Build RecQueryVo objects from the words found, add them to recQueryVos, and return
buildQueryVos(recQueryVos,wordList,wordsToQuery,searchItemIds);
log.info("SearchQueryService -> recQueryVos:{}", JsonUtils.toJson(recQueryVos));
return recQueryVos;
}
private static void buildQueryVos(List<RecQueryVo> recQueryVos,List<String> wordList,List<String> wordToQueryList) {
// Same logic as the four-argument overload below, with no search itemIds to exclude
buildQueryVos(recQueryVos, wordList, wordToQueryList, Collections.<String>emptySet());
}
private static void buildQueryVos(List<RecQueryVo> recQueryVos,List<String> wordList,List<String> wordToQueryList, Set<String> searchItemIds) {
if(CollectionUtils.isEmpty(wordList)){
return ;
}
List<String> list = Lists.newArrayList();
int wSize = wordList.size();
int wQSize = wordToQueryList.size();
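int count = Math.min(wSize,wQSize); // iterate only over indexes present in both lists; Math.max could run past the shorter one
for (int i = 0; i < count; i ++){
String word = wordList.get(i);
String wordQ = wordToQueryList.get(i);
if(StringUtils.isBlank(word)){
continue;
}
// Keep at most 2 product words: a search behavior may map an itemId to several product words, while click, add-to-cart, and order map to exactly one
List<String> str = Arrays.stream(word.split(",")).filter(s -> !s.trim().isEmpty()).limit(2).collect(Collectors.toList());
// Keep at most 2 recommended words; wordQ may be null when the lookup found nothing for this key
List<String> strQ = wordQ == null ? Collections.<String>emptyList() : Arrays.stream(wordQ.split(",")).filter(s -> !s.trim().isEmpty()).limit(2).collect(Collectors.toList());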
if(!CollectionUtils.isEmpty(str)){
list.addAll(str);
}
if(!CollectionUtils.isEmpty(strQ)){
list.addAll(strQ);
}
}
// Deduplicate while preserving order
log.info("SearchQueryService buildQueryWord list :{}", list);
Set<String> uniqueSet = new LinkedHashSet<>(list);
log.info("SearchQueryService buildQueryWord uniqueSet :{}", uniqueSet);
uniqueSet.forEach(x->{
if(x.length()<=7 && !searchItemIds.contains(x)){
RecQueryVo recQueryVo = new RecQueryVo();
recQueryVo.setQuery(x.trim());
recQueryVo.setSource("xh");
recQueryVos.add(recQueryVo);
}
});
}
private void queryUserBehaviorData(Map<String, ByteString> responseUserRawInfoMap, String key, List<UserBehaviorDataProto.UserBehaviorInfo> infos) throws IOException {
ByteString searchData = responseUserRawInfoMap.get(key);
if (searchData!= null) {
byte[] uncompressedData = Snappy.uncompress(searchData.toByteArray());
UserBehaviorDataProto.UserBehaviorData behaviorData = UserBehaviorDataProto.UserBehaviorData.parseFrom(uncompressedData);
List<UserBehaviorDataProto.UserBehaviorInfo> userBehaviorInfos = behaviorData.getInfosList();
if (CollectionUtils.isNotEmpty(userBehaviorInfos)) {
// Sort userBehaviorInfos by time, newest first, and keep the most recent 15
userBehaviorInfos = userBehaviorInfos.stream().filter(x -> StringUtils.isNotBlank(x.getItemId())).sorted(Comparator.comparing(UserBehaviorDataProto.UserBehaviorInfo::getTime).reversed()).collect(Collectors.toList());
userBehaviorInfos = userBehaviorInfos.size() > 15 ? userBehaviorInfos.subList(0, 15) : userBehaviorInfos;
for (UserBehaviorDataProto.UserBehaviorInfo info : userBehaviorInfos) {
infos.add(info);
}
}
}
}
private Set<String> queryUserBehaviorDataAndSearchItemId(Map<String, ByteString> responseUserRawInfoMap, String key, List<UserBehaviorDataProto.UserBehaviorInfo> infos) throws IOException {
ByteString searchData = responseUserRawInfoMap.get(key);
List<UserBehaviorDataProto.UserBehaviorInfo> searchInfos = Lists.newArrayList();
if (searchData!= null) {
byte[] uncompressedData = Snappy.uncompress(searchData.toByteArray());
UserBehaviorDataProto.UserBehaviorData behaviorData = UserBehaviorDataProto.UserBehaviorData.parseFrom(uncompressedData);
List<UserBehaviorDataProto.UserBehaviorInfo> userBehaviorInfos = behaviorData.getInfosList();
if (CollectionUtils.isNotEmpty(userBehaviorInfos)) {
// Sort userBehaviorInfos by time descending and keep the 15 most recent entries
userBehaviorInfos = userBehaviorInfos.stream().filter(x -> StringUtils.isNotBlank(x.getItemId())).sorted(Comparator.comparing(UserBehaviorDataProto.UserBehaviorInfo::getTime).reversed()).collect(Collectors.toList());
userBehaviorInfos = userBehaviorInfos.size() > 15 ? userBehaviorInfos.subList(0, 15) : userBehaviorInfos;
for (UserBehaviorDataProto.UserBehaviorInfo info : userBehaviorInfos) {
infos.add(info);
searchInfos.add(info);
}
}
}
return searchInfos.stream().map(UserBehaviorDataProto.UserBehaviorInfo::getItemId).collect(Collectors.toSet());
}
/**
* Builds the UserInfoRequest.
* @return the request builder, or null when both pin and uuid are empty
*/
private UserServerProto.UserInfoRequest.Builder buildUserInfoRequest(
GuessQueryListRequest request) {
UserServerProto.UserInfoRequest.Builder requestBuilder = UserServerProto.UserInfoRequest.newBuilder();
String pin = request.getPin();
String uuid = request.getClientInfo().getUuid();
if (org.apache.commons.lang.StringUtils.isEmpty(pin) && org.apache.commons.lang.StringUtils.isEmpty(uuid)) {
return null;
}
// When pin is empty, fill user_id with uuid; user_id_type must match the real user_id
if (org.apache.commons.lang.StringUtils.isEmpty(pin)) {
requestBuilder.setUserId(uuid);
requestBuilder.setUserIdType(UserCommonProto.UserIdType.UuidType);
} else {
requestBuilder.setUserId(pin);
requestBuilder.setUserIdType(UserCommonProto.UserIdType.UserPinType);
}
if (!org.apache.commons.lang.StringUtils.isEmpty(uuid)) {
requestBuilder.setCurrentUserDevice(uuid);
requestBuilder.setCurrentUserDeviceType(UserCommonProto.UserIdType.UuidType);
}
// Request traffic flag: 1 = test traffic, 0 = production traffic
requestBuilder.setForcebot(0);
requestBuilder.setChannelType( UserServerProto.ChannelType.ChannelAPP.getNumber());
// IP address sent with the user request (taken from the local host here)
try {
requestBuilder.setIpRegion(InetAddress.getLocalHost().getHostAddress());
} catch (UnknownHostException e) {
log.error("buildUserInfoRequest failed to get client IP");
}
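// Note: fieldBuilder is reused below, so the second-level ids accumulate;
// the four added fields carry [35], [35, 34], [35, 34, 36], [35, 34, 36, 37].
com.jd.ad.user.domain.UserCommonProto.ReqDataTypeField.Builder fieldBuilder = UserCommonProto.ReqDataTypeField.newBuilder();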
fieldBuilder.setFirstLevelFieldId(70);
fieldBuilder.addSecondLevelFieldIds(35);
requestBuilder.addReqDataTypeFields(fieldBuilder);
fieldBuilder.setFirstLevelFieldId(70);
fieldBuilder.addSecondLevelFieldIds(34);
requestBuilder.addReqDataTypeFields(fieldBuilder);
fieldBuilder.setFirstLevelFieldId(70);
fieldBuilder.addSecondLevelFieldIds(36);
requestBuilder.addReqDataTypeFields(fieldBuilder);
fieldBuilder.setFirstLevelFieldId(70);
fieldBuilder.addSecondLevelFieldIds(37);
requestBuilder.addReqDataTypeFields(fieldBuilder);
requestBuilder.setReqUserDeviceType(UserServerProto.UserDeviceType.PinType_VALUE);
return requestBuilder;
}
private List<byte[]> buildRedisKeys(List<String> skuList) {
List<byte[]> redisKeys = new ArrayList<>();
skuList.forEach(e -> redisKeys.add((JimdbConstant.QIXIAN_CNXS + e).getBytes(Charsets.UTF_8)));
return redisKeys;
}
private List<byte[]> buildQueryRedisKeys(List<String> queryList) {
List<byte[]> redisKeys = new ArrayList<>();
queryList.forEach(e -> redisKeys.add((JimdbConstant.QIXIAN_CNXS + e).getBytes(Charsets.UTF_8)));
return redisKeys;
}
}
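7Fresh New-Product and PB-Product Boost
Code repository and related information
Code repositories: odin-front, cxls-search-gateway
Code location: 621610-7fresh-feed, 621610-7fresh-feed-v1
Developers: XXX (engineering side), Zhou Wenxing (data side)
PRD: https://joyspace.jd.com/pages/gRivHJZh1IqsV2Icvhco
TRD: https://joyspace.jd.com/pages/SFikEQbToCsCtQKLOpyA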
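New-product and PB-product boost: background and goals
Background: apply targeted recommendation intervention for new products and strategic PB products.
Goal: give new products reasonable exposure while controlling that exposure through dynamic rules rather than fixed weights, so that overall business metrics are not hurt. New products usually receive low scores from the fine-ranking model and therefore need traffic support. PB (private brand) products are the company's own-brand goods and need support as well.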
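New-product and PB-product boost: trigger scenario
Under the current logic, if one home page holds 10 products, one position is picked at random from 0~4 (first half of the page) and one from 5~9 (second half) on every page. These two positions hold one new product and one PB product, and which position gets which is also random.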
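The slot selection described above can be sketched as follows. This is a minimal illustration, not the production code: the class `BoostSlotPicker`, the `Slots` holder, and the `pick` method are hypothetical names, and the sketch assumes the page is split at `pageSize / 2`.

```java
import java.util.Random;

/** Hypothetical helper: picks the two boost positions for one page of the feed. */
final class BoostSlotPicker {

    /** Positions chosen for one page; field names are illustrative only. */
    static final class Slots {
        final int newProductSlot;
        final int pbSlot;

        Slots(int newProductSlot, int pbSlot) {
            this.newProductSlot = newProductSlot;
            this.pbSlot = pbSlot;
        }
    }

    /**
     * Picks one random position in the first half of the page and one in the
     * second half, then randomly decides which of the two holds the new product.
     */
    static Slots pick(int pageSize, Random random) {
        if (pageSize < 2) {
            throw new IllegalArgumentException("pageSize must be at least 2");
        }
        int half = pageSize / 2;
        int firstHalfSlot = random.nextInt(half);                    // 0 .. half-1
        int secondHalfSlot = half + random.nextInt(pageSize - half); // half .. pageSize-1
        return random.nextBoolean()
                ? new Slots(firstHalfSlot, secondHalfSlot)
                : new Slots(secondHalfSlot, firstHalfSlot);
    }

    public static void main(String[] args) {
        Slots slots = pick(10, new Random());
        System.out.println("new product at " + slots.newProductSlot + ", PB product at " + slots.pbSlot);
    }
}
```

Passing the `Random` in as a parameter keeps the choice reproducible in tests; with a page size of 10 the two positions always fall in 0~4 and 5~9 respectively.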
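New-product and PB-product boost: main implementation
On the data side, data is pulled on a daily schedule from the Hive data warehouse into Jimdb (JD's internal Redis). The SQL script used for the pull is as follows.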
WITH t1 AS (
SELECT
concat('odin_fresh_sku_type_attid_list_', current_date, '_', a.att_id) AS odin_fresh_sku_type_attid_list,
CAST(a.sku_id AS STRING) AS sku_id,
a.modified,
-- Dedup: keep only the latest record for each att_id + sku_id pair
ROW_NUMBER() OVER (
PARTITION BY a.att_id, a.sku_id
ORDER BY a.modified DESC
) AS sku_rn,
-- Top 500: rank rows within each att_id by modified time, newest first
ROW_NUMBER() OVER (
PARTITION BY a.att_id
ORDER BY a.modified DESC
) AS att_rn
FROM (
SELECT
used_obj_id AS sku_id,
att_id,
modified
FROM fdm.fdm_xstore_new_tag_new_tag_used_chain
WHERE dp='ACTIVE'
AND tenant_id = 1
AND tag_type = 4
AND delete_flag = 0
-- AND tag_id IN('100603436','100603437')
) a
LEFT JOIN (
SELECT att_id
FROM fdm.fdm_xstore_new_tag_tag_attribute_chain
WHERE dp='ACTIVE'
AND tenant_id = 1
-- AND tag_id IN('100603436','100603437')
) b
ON a.att_id = b.att_id
WHERE b.att_id IS NOT NULL -- keep only rows matched by the LEFT JOIN, excluding invalid att_id values
),
t2 AS (
SELECT
odin_fresh_sku_type_attid_list,
sku_id,
modified,
ROW_NUMBER() OVER (
PARTITION BY odin_fresh_sku_type_attid_list
ORDER BY modified ASC
) AS rcnt
FROM t1
WHERE sku_rn = 1 -- valid records after dedup
AND att_rn <= 500 -- the 500 most recent records per att_id
)
SELECT
odin_fresh_sku_type_attid_list,
sku_id
FROM t2
WHERE rcnt <= 500;
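On the Data Integration - DataBus platform, create a scheduled job that pushes the required Hive data to Jimdb (Redis).
The data side also uses Flink to count exposures. To save space, see the data-side implementation under "7Fresh Home Page Negative Feedback".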
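To make the intent of the SQL easier to follow, here is a small in-memory Java sketch of the same idea: keep the latest row per (att_id, sku_id), then keep the newest N SKUs per att_id under a dated key. It is a simplification under stated assumptions, not the job itself: `TagSkuListBuilder` and `Row` are hypothetical names, `modified` is modelled as a long, and the date string is passed in rather than taken from `current_date`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical in-memory mirror of the dedup + top-N logic in the SQL. */
final class TagSkuListBuilder {

    /** One tag-usage row; field names mirror the SQL columns. */
    static final class Row {
        final long attId;
        final long skuId;
        final long modified; // stands in for the `modified` timestamp

        Row(long attId, long skuId, long modified) {
            this.attId = attId;
            this.skuId = skuId;
            this.modified = modified;
        }
    }

    static Map<String, List<String>> build(List<Row> rows, String date, int limit) {
        // Step 1: keep only the latest row per (att_id, sku_id).
        Map<String, Row> latest = new LinkedHashMap<>();
        for (Row row : rows) {
            latest.merge(row.attId + ":" + row.skuId, row,
                    (a, b) -> a.modified >= b.modified ? a : b);
        }
        // Step 2: group by att_id, newest first, keep at most `limit` SKUs per key.
        Map<Long, List<Row>> byAtt = latest.values().stream()
                .collect(Collectors.groupingBy(r -> r.attId, LinkedHashMap::new, Collectors.toList()));
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<Long, List<Row>> entry : byAtt.entrySet()) {
            List<String> skus = entry.getValue().stream()
                    .sorted(Comparator.comparingLong((Row r) -> r.modified).reversed())
                    .limit(limit)
                    .map(r -> String.valueOf(r.skuId))
                    .collect(Collectors.toList());
            result.put("odin_fresh_sku_type_attid_list_" + date + "_" + entry.getKey(), skus);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(7L, 1001L, 100L));
        rows.add(new Row(7L, 1001L, 300L)); // newer duplicate of the same SKU
        rows.add(new Row(7L, 1002L, 200L));
        System.out.println(build(rows, "2025-03-01", 500));
    }
}
```

Note that the sketch deduplicates before ranking, whereas the SQL computes both row numbers in the same pass; the two agree whenever an att_id has no duplicate SKU rows among its newest entries.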
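7Fresh Home Page Fine-Ranking Model Upgrade
Code repository and related information
Code repository: odin-front
Code location: 621610-7fresh-feed-v1
Developer: Zhou Wenxing
Document provided by the algorithm engineer: https://joyspace.jd.com/pages/6VH0HXO8xPulm8eWtKXJ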
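Home page fine-ranking model upgrade: background and goals
Upgrade to the latest fine-ranking model, and run an A/B experiment that compares the old and new models on metrics such as GMV, click-through rate, and conversion rate.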
Home page fine-ranking model upgrade: trigger scenario
Home page recommendation; triggered automatically.
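Home page fine-ranking model upgrade: main implementation
Update the fine-ranking model in the aiFlowRankWeightResult node of recommendation slot 621610, and rewrite the related parameters to match the new model's requirements. The old version used a single fine-ranking model, whereas this update uses two, so the change is fairly large.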
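The two-model flow can be sketched in isolation as follows: the first model is awaited with a bounded timeout, and its output (or an empty list if it times out or fails) is passed to the second model. This is only an illustration of the chaining pattern; `TwoStageRanker`, `rank`, and the functional parameters are hypothetical stand-ins for the real request builders and the AiFlow plugin.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;

/** Hypothetical sketch of chaining two model calls with graceful degradation. */
final class TwoStageRanker {

    /**
     * Runs model one with a bounded wait, then feeds its output (or an empty
     * list on timeout or failure) into model two.
     */
    static CompletableFuture<List<Double>> rank(
            Supplier<CompletableFuture<List<Double>>> modelOne,
            Function<List<Double>, CompletableFuture<List<Double>>> modelTwo,
            long modelOneTimeoutMillis) {
        List<Double> stageOneOutput;
        try {
            stageOneOutput = modelOne.get().get(modelOneTimeoutMillis, TimeUnit.MILLISECONDS);
        } catch (Exception e) {
            if (e instanceof InterruptedException) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
            }
            // Degrade instead of failing: model two still runs without stage-one output.
            stageOneOutput = Collections.emptyList();
        }
        return modelTwo.apply(stageOneOutput)
                .exceptionally(e -> Collections.emptyList());
    }

    public static void main(String[] args) {
        CompletableFuture<List<Double>> scores = rank(
                () -> CompletableFuture.completedFuture(Arrays.asList(1.0, 2.0)),
                stageOne -> CompletableFuture.completedFuture(
                        stageOne.stream().map(v -> v + 1.0).collect(Collectors.toList())),
                50);
        System.out.println(scores.join());
    }
}
```

Blocking on the first model keeps the second request simple to build, at the cost of holding the calling thread for up to the timeout; that trade-off is why the timeout is kept short and configurable.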
Old operator
aiFlowRankWeightResult = ${aiFlowRankWeightProcessor} {
name = "aiFlowRankWeightResult"
dependencies = [{"name" = "hint"},{"name"= "aiFlowPsUserInfo"}, {"name" = "userBehavior"}, {"name" = "userProfile"},
{"name" = "freshUserRealTimeFeatureInfo"}, {"name" = "broadwayRequest"}]
emitters = ["aiFlowRankWeightResult"]
processor = "com.jd.rec.odin.processor.fresh.FreshAiFlowRankWeightProcessor"
processorConf = {
clientTimeout = 100
indexWeight = 4
limitCut = 500
singlePsIp = ""
modelName = qixian_feeds_v4_raw
recItemName = "freshUserRealTimeFeatureInfo"
processorSwitch = true
priceSwitch = true
partitionNum = 10
debugLevel = 0
requestBuilder = FreshRequestBuilder
outputPbFile = false
sliceBugfix = true
featureDump = false
dependencyMap = {"recItem" = "freshUserRealTimeFeatureInfo", "userBehavior" = "userBehavior", "userProfile" = "userProfile" , "broadwayRequest" = "broadwayRequest" , "hint" = "hint" , "aiFlowPsUserInfo" = "aiFlowPsUserInfo"}
bizName = vsmultisiterank
}
}
New operator
aiFlowRankWeightResult = ${aiFlowRankWeightProcessor} {
name = "aiFlowRankWeightResult"
dependencies = [{"name" = "hint"},{"name"= "aiFlowPsUserInfo"}, {"name" = "userBehavior"}, {"name" = "userProfile"},
{"name" = "freshUserRealTimeFeatureInfo"}, {"name" = "broadwayRequest"},{"name" = "userInfoListAIFlowFeature"}, {"name" = "userActionListAIFlowFeature"}, {"name" = "recListAIFlowFeature"}]
emitters = ["aiFlowRankWeightResult"]
processor = "com.jd.rec.odin.processor.fresh.FreshU2IAiFlowRankWeightProcessor"
processorConf = {
clientTimeout = 100
clientTimeoutTwo = 100
indexWeight = 4
limitCut = 500
singlePsIp = ""
modelName = qixian_rec_transformer_v5_raw
modelNameTwo = qixian_rec_score_v5_raw
recItemName = "freshUserRealTimeFeatureInfo"
processorSwitch = true
priceSwitch = false
partitionNum = 10
debugLevel = 0
requestBuilder = FreshTransformerRequestBuilder
requestBuilderTwo = FreshScoreRequestBuilder
cacheName = cxls_27109
cacheKeyUser = qixian_user_
cacheKeyUserAction = QQ_user_action_
cacheKeySku = qixian_sku_v1_
cacheKeyPriceCompare = qixian_price_compare_v1_
outputPbFile = false
sliceBugfix = true
featureDump = false
dependencyMap = {"recItem" = "freshUserRealTimeFeatureInfo", "userBehavior" = "userBehavior", "userProfile" = "userProfile" , "broadwayRequest" = "broadwayRequest" , "hint" = "hint" , "aiFlowPsUserInfo" = "aiFlowPsUserInfo","userInfoListAIFlowFeature"="userInfoListAIFlowFeature" ,"userActionListAIFlowFeature"="userActionListAIFlowFeature", "recListAIFlowFeature"="recListAIFlowFeature"}
bizName = vsmultisiterank
}
}
New operator code
public class FreshU2IAiFlowRankWeightProcessor extends AbstractProcessor<Map<Long, AiFlowRankWeightResult>> {
private static final Logger LOGGER = HotSwitchLogger.of(com.jd.rec.odin.processor.ranker.aiflow.AiFlowRankWeightProcessor.class);
private static final String BEFORE_PREFIX = "RankBefore";
private static final String AFTER_PREFIX = "RankAfter";
private static final String SAMPLE_PREFIX = "com.jd.rec.odin.processor.ranker.aiflow.sample.";
private static final String RANKER_DISPATCHER = "rankerDispatcher";
private static final String AIFLOW_RANK_KEY = "rec.odin.aiflow.rank";
@Inject
private AiFlowPluginImpl aiflowPlugin;
@Parameter
private int clientTimeout;
@Parameter
private int clientTimeoutTwo;
@Parameter
private String singlePsIp;
@Parameter
private String modelName;
@Parameter
private String modelNameTwo;
@Parameter
private String recItemName;
@Parameter
private boolean processorSwitch;
@Parameter
private int partitionNum;
@Parameter
private RequestBuilder requestBuilder;
@Parameter
private FineRequestBuilder.RequestBuilder requestBuilderTwo;
@Inject
private Injector injector;
@Parameter
public int debugLevel;
@Parameter
public boolean outputPbFile;
@Parameter
public boolean sliceBugfix;
@Parameter
public boolean featureDump;
@Parameter
public String bizName;
@Parameter
public SampleConfig sampleConfig;
@Parameter
public String sinkPartitionName;
@Parameter
private FineSortConfig fineSortConfig;
@Parameter
private String rankerType;
@Parameter
private MergerType mergerType;
@Parameter
private boolean asyncSwitch;
@Inject
private OdinMonitorPlugin odinMonitorPlugin;
@Parameter
List<MonitorStrategy> monitorStrategys;
@Parameter
private Map<String, String> dependencyMap;
@Parameter
private String responseParser;
@Parameter
private Integer indexWeight;
@Parameter
private Integer limitCut;
@Parameter
private boolean priceSwitch;
@Parameter
private String cacheName;
@Parameter
private String cacheKeyUser;
@Parameter
private String cacheKeyUserAction;
@Parameter
private String cacheKeySku;
@Parameter
private String cacheKeyPriceCompare;
@Inject
private JimdbPlugin jimdbPlugin;
@Inject
private KafkaProducerPlugin kafkaProducerPlugin;
@Named("rpc")
private FuturePool rpcFuturePool;
@Inject
private DowngradeSwitchDuccManager downgradeSwitchDuccManager;
public FreshU2IAiFlowRankWeightProcessor() {
}
@Override
public CompletableFuture<Map<Long, AiFlowRankWeightResult>> process(Vertex vertex) {
Stopwatch stopwatchx = Stopwatch.createStarted();
if (!this.processorSwitch) {
return CompletableFuture.completedFuture(ImmutableMap.of());
}
DowngradeSwitchDto downgradeSwitchDto = downgradeSwitchDuccManager.getDowngradeSwitchDto();
OdinGraphContext odinGraphContext = (OdinGraphContext)vertex.getGraph().getGraphContext();
AiFlowRankDependency aiFlowRankDependency = new AiFlowRankDependency(vertex, this.dependencyMap);
long startTime = System.currentTimeMillis();
List<RecItem> recListBro = VertexDependencyUtil.getDependencyData(vertex, this.recItemName);
if (CollectionUtils.isEmpty(recListBro)) {
return CompletableFuture.completedFuture(ImmutableMap.of());
}
Map<String, Map<String, String>> recMap = recListBro.stream().collect(Collectors.toMap(RecItem::getItemId, RecItem::getBrokerExt));
List<String> userInfoList = (List<String>)VertexDependencyUtil.getDependencyData(aiFlowRankDependency.getVertex(), "userInfoListAIFlowFeature");
List<String> userActionList = (List<String>)VertexDependencyUtil.getDependencyData(aiFlowRankDependency.getVertex(), "userActionListAIFlowFeature");
List<RecItem> recList = (List<RecItem>)VertexDependencyUtil.getDependencyData(aiFlowRankDependency.getVertex(), "recListAIFlowFeature");
if (CollectionUtils.isEmpty(recList)) {
return CompletableFuture.completedFuture(ImmutableMap.of());
}
for (RecItem recItem : recList) {
recItem.setBrokerExt(recMap.get(recItem.getItemId()));
}
recList = recList.size() > limitCut ? recList.subList(0, limitCut) : recList;
BroadwayRequest bwRequest = odinGraphContext.getBroadwayRequest();
if (bwRequest.getDebug()){
this.printRankDebugInfo(vertex, bwRequest, recList, "RankBefore");
}
MonitorRequest param = MonitorManager.buildMonitorRequest(vertex, recList, "before");
MonitorManager.monitorCalculate(param, this.monitorStrategys);
IoMonitor ioMonitor = IoMonitorFactory.empty();
if (IoMonitUtils.isEnableIoMonit(vertex)) {
ioMonitor = IoMonitorFactory.get(String.valueOf(bwRequest.getPid()), this.odinMonitorPlugin.getExpIdsInWhite(MonitorUtil.getEids(bwRequest), bwRequest.getPid()), bwRequest.getRequestId(), IoTypeEnum.AI_FLOW_RANK, OptionalParams.builder().withVertexName(vertex.getName()).build());
}
String pin = bwRequest.getUser().getPin();
Map<String, List<String>> paramStringListMap = new HashMap<>();
paramStringListMap.put("userInfoList",userInfoList);
paramStringListMap.put("userActionList", userActionList);
Stopwatch stopwatch1 = Stopwatch.createStarted();
AiFlowPlugin.FineRequest request = this.buildRequest(aiFlowRankDependency, recList, paramStringListMap, modelName, requestBuilder, clientTimeout, 1);
if (odinGraphContext.getBroadwayRequest().getDebug()) {
com.jd.model.ModelRequest.Builder builder = request.requestBuilder.buildPartitionRequest(request);
try {
String jsonString = com.google.protobuf.util.JsonFormat.printer().print(builder);
addDebug(vertex, "621610 model 1 request: " + jsonString);
} catch (InvalidProtocolBufferException e) {
LOGGER.error("621610 model 1 write log error");
}
}
CompletableFuture<List<Double>> actionfuture = this.aiflowPlugin.fetchFineFeature(request).thenApply((responses) -> {
List<Double> result = new ArrayList<>();
if (this.featureDump) {
return result;
} else {
for (AiFlowPlugin.FineResponse response : responses) {
AiFlowResponseParser parser = this.getAiFlowResponseParser();
if (parser == null) {
ModelResponse modelResponse = response.getModelResponse();
if (modelResponse.getSliceUnitCount() > 0) {
List<InferenceResponse> sliceUnits = modelResponse.getSliceUnitList();
if (CollectionUtils.isNotEmpty(sliceUnits)) {
result = sliceUnits.get(0).getOutput(0).getMatrix(0).getDoubleValList();
}
}
}
}
return result;
}
}).exceptionally((e) -> new ArrayList<>());
try {
List<Double> actionDoubles = Lists.newArrayList();
long timeOut = 50;
if(Objects.isNull(downgradeSwitchDto) || StringUtils.isBlank((downgradeSwitchDto.getFreshU2iModelOneTimeSwitch()))){
actionDoubles = actionfuture.get();
}else{
timeOut = Long.parseLong(downgradeSwitchDto.getFreshU2iModelOneTimeSwitch());
actionDoubles = actionfuture.get(timeOut,TimeUnit.MILLISECONDS );
}
List<String> actionStrings = actionDoubles.stream().map(String::valueOf).collect(Collectors.toList());
paramStringListMap.put("actionStrings", actionStrings);
} catch (Exception e) {
LOGGER.error("FreshU2IAiFlowRankWeightProcessor actionStrings is error: ",e);
}
LOGGER.info("FreshAiFlowRankWeightProcessor model 1, elapsed: {} ms", stopwatch1.elapsed(TimeUnit.MILLISECONDS));
Stopwatch stopwatch2 = Stopwatch.createStarted();
AiFlowPlugin.FineRequest scoreRequest = this.buildRequest(aiFlowRankDependency, recList, paramStringListMap, modelNameTwo, requestBuilderTwo, clientTimeoutTwo, partitionNum);
request.ioMonitor = ioMonitor;
IoMonitor finalIoMonitor = ioMonitor;
IoMonitor finalIoMonitor1 = ioMonitor;
List<RecItem> finalRecList = recList;
if (odinGraphContext.getBroadwayRequest().getDebug()) {
com.jd.model.ModelRequest.Builder builder = scoreRequest.requestBuilder.buildPartitionRequest(scoreRequest);
com.jd.model.ModelRequest.Builder modelBuilder = ModelRequest.newBuilder(builder.build());
ModelRequest modelRequest = scoreRequest.requestBuilder.buildPartitionRequest(modelBuilder, recList).build();
try {
String jsonString = com.google.protobuf.util.JsonFormat.printer().print(modelRequest);
addDebug(vertex, "621610 model 2 request: " + jsonString);
} catch (InvalidProtocolBufferException e) {
LOGGER.error("621610 model 2 write log error");
}
}
CompletableFuture<Map<Long, AiFlowRankWeightResult>> future = this.aiflowPlugin.fetchFineFeature(scoreRequest).thenApply((responses) -> {
finalIoMonitor.afterIoResponse();
Map<Long, AiFlowRankWeightResult> result = new HashMap();
Map<Long, String> responseMap = Maps.newHashMap();
if (this.featureDump) {
return result;
} else {
for (AiFlowPlugin.FineResponse response : responses) {
AiFlowResponseParser parser = this.getAiFlowResponseParser();
if (parser == null) {
this.parseResponse(result, response, responseMap);
} else {
try {
if (parser.valid(response)) {
parser.parse(result, response);
}
} catch (Exception var14) {
LOGGER.error("621610 AiFlowResponseParser error:", var14);
}
}
}
finalIoMonitor.afterDeserialize();
if (this.odinMonitorPlugin != null) {
Map<String, String> logTraceMap = Maps.newHashMap();
logTraceMap.put("pid", String.valueOf(request.pid));
logTraceMap.put("modelName", request.modelName);
List<String> expIdsInWhite = this.odinMonitorPlugin.getExpIdsInWhite(MonitorUtil.getEids(bwRequest), bwRequest.getPid());
for (String expId : expIdsInWhite) {
logTraceMap.put("expId", expId);
if (MapUtils.isNotEmpty(result)) {
this.odinMonitorPlugin.logObserver("rec.odin.aiflow.rank", 1, startTime, logTraceMap);
} else {
finalIoMonitor.onError();
this.odinMonitorPlugin.logObserver("rec.odin.aiflow.rank", 0, startTime, logTraceMap);
}
}
}
List<RecItem> monitorItems = finalRecList.stream().filter((item) -> result.containsKey(Long.parseLong(item.getItemId()))).collect(Collectors.toList());
MonitorRequest param2 = MonitorManager.buildMonitorRequest(vertex, monitorItems, "after");
MonitorManager.monitorCalculate(param2, this.monitorStrategys);
if(Objects.isNull(downgradeSwitchDto) || StringUtils.isBlank(downgradeSwitchDto.getFreshU2iSendKafkaSwitch()) || "false".equals(downgradeSwitchDto.getFreshU2iSendKafkaSwitch())) {
try {
List<CommonWeightDto> resultDtoList = convertMapAiFlowToCommonWeightDto(result);
List<CommonWeightDto> responseMapDtoList = convertMapToCommonWeightDto(responseMap);
kafkaProducerPlugin.send(
JsonUtils.toJson(resultDtoList) + "\u0008" + pin,
"odin_fresh_feature", "aiflow_model_result"
);
kafkaProducerPlugin.send(
JsonUtils.toJson(responseMapDtoList) + "\u0008" + pin,
"odin_fresh_feature", "aiflow_model_response"
);
} catch (Exception e) {
LOGGER.error("kafkaProducerPlugin.send toJson error:", e);
}
}
return result;
}
}).exceptionally((e) -> {
finalIoMonitor1.onError();
return ImmutableMap.of();
});
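// The model 2 future may still be pending here, so both timings below can exclude its response time.
LOGGER.info("FreshAiFlowRankWeightProcessor model 2, elapsed: {} ms", stopwatch2.elapsed(TimeUnit.MILLISECONDS));
LOGGER.info("FreshAiFlowRankWeightProcessor total model time, elapsed: {} ms", stopwatchx.elapsed(TimeUnit.MILLISECONDS));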
if (bwRequest.getDebug()){
this.printRankDebugInfo(vertex, bwRequest, recList, "RankAfter");
}
if (bwRequest.getDebug()){
this.printRankDebugInfo(vertex, bwRequest, finalRecList, "RankAfterFinalRecList");
}
return this.asyncSwitch ? CompletableFuture.completedFuture(ImmutableMap.of()) : future;
}
public static List<CommonWeightDto> convertMapToCommonWeightDto(Map<Long, String> map) {
List<CommonWeightDto> dtoList = new ArrayList<>();
for (Map.Entry<Long, String> entry : map.entrySet()) {
CommonWeightDto dto = new CommonWeightDto();
dto.setKey(entry.getKey().toString());
dto.setValue(entry.getValue());
dtoList.add(dto);
}
return dtoList;
}
public static List<CommonWeightDto> convertMapAiFlowToCommonWeightDto(Map<Long, AiFlowRankWeightResult> map) {
List<CommonWeightDto> dtoList = new ArrayList<>();
for (Map.Entry<Long, AiFlowRankWeightResult> entry : map.entrySet()) {
CommonWeightDto dto = new CommonWeightDto();
dto.setKey(entry.getKey().toString());
dto.setValue(JsonUtils.toJson(entry.getValue()));
dtoList.add(dto);
}
return dtoList;
}
private FineRequest buildRequest(AiFlowRankDependency aiFlowRankDependency, List<RecItem> recItems, Map<String, List<String>> paramStringListMap, String modelName, FineRequestBuilder.RequestBuilder requestBuilder, int clientTimeout, int partitionNum) {
OdinGraphContext odinGraphContext = (OdinGraphContext) aiFlowRankDependency.getVertex().getGraph().getGraphContext();
BroadwayRequest broadwayRequest = odinGraphContext.getBroadwayRequest();
FineRequest request = new FineRequest();
request.vertexName = aiFlowRankDependency.getVertex().getName();
request.vertex = aiFlowRankDependency.getVertex();
request.broadwayRequest = broadwayRequest;
request.pid = broadwayRequest.getPid();
request.requestId = broadwayRequest.getRequestId();
request.pin = broadwayRequest.getUser().getPin();
request.uuid = broadwayRequest.getUser().getUuid();
request.singlePsIp = this.singlePsIp;
request.clientTimeout = clientTimeout;
request.modelName = modelName;
request.partitionNum = partitionNum;
request.items = recItems;
request.requestBuilder = (FineRequestBuilder)this.injector.getInstance(requestBuilder.getClz());
request.debugLevel = this.debugLevel;
request.inferInvokerType = InferInvokerType.INFER_CONCURRENT;
request.outputPbFile = this.outputPbFile;
request.featureDump = this.featureDump;
request.testTraffic = broadwayRequest.getTestTraffic();
request.bizName = this.bizName;
request.sinkPartitionName = this.sinkPartitionName;
request.psShardType = PsShardType.ASSIGN_ITEM_NUM;
request.fineSortConfig = this.fineSortConfig;
request.userProfile = (UserProfile)VertexDependencyUtil.getDependencyData(aiFlowRankDependency.getVertex(), "userProfile");
request.skuProfileMap = (Map)VertexDependencyUtil.getDependencyData(aiFlowRankDependency.getVertex(), "itemProfile");
request.mergerType = this.mergerType;
request.userBehavior = (UserBehavior)VertexDependencyUtil.getDependencyData(aiFlowRankDependency.getVertex(), "userBehavior");
request.aiFlowRankDependency = aiFlowRankDependency;
request.paramStringListMap = paramStringListMap;
return request;
}
protected void printRankDebugInfo(Vertex vertex, BroadwayRequest bwRequest, List<RecItem> roughRecList, String key) {
if (bwRequest.getDebug()) {
String sizeMsg = key + " AiFlowRankWeightProcessor item size is %s";
String itemMsg = key + "AiFlowRankWeightProcessor, item list is [%s]";
if (CollectionUtils.isNotEmpty(roughRecList)) {
this.addDebug(vertex, String.format(sizeMsg, roughRecList.size()));
this.addDebug(vertex, String.format(itemMsg, ((List)roughRecList.stream().map((recItem) -> {
String var10000 = recItem.getItemId();
return var10000 + " - " + recItem.getRecallType().getTag() + " - " + recItem.getRoughRankWeight() + " - " + recItem.getWeight();
}).collect(Collectors.toList())).toString()));
} else {
this.addDebug(vertex, String.format(sizeMsg, 0));
this.addDebug(vertex, String.format(itemMsg, "null"));
}
}
}
private AiFlowResponseParser getAiFlowResponseParser() {
AiFlowResponseParser aiFlowResponseParser = null;
if (StringUtils.isNotBlank(this.responseParser)) {
try {
aiFlowResponseParser = (AiFlowResponseParser)this.injector.getInstance(Class.forName(this.responseParser));
} catch (ClassNotFoundException var3) {
LOGGER.error("aiFlowResponseParser init error:{}", var3);
}
}
return aiFlowResponseParser;
}
private void parseResponse(Map<Long, AiFlowRankWeightResult> result, FineResponse response,Map<Long, String> responseMap) {
ModelResponse modelResponse = response.getModelResponse();
if (modelResponse.getSliceUnitCount() > 0) {
List<InferenceResponse> sliceUnits = modelResponse.getSliceUnitList();
if (CollectionUtils.isNotEmpty(sliceUnits)) {
int weightNum = sliceUnits.stream().map((sliceUnitx) -> sliceUnitx.getOutput(0).getMatrix(0).getDoubleValList()).mapToInt((weightsx) -> Optional.ofNullable(weightsx).orElse(new ArrayList(0)).size()).sum();
if (weightNum / indexWeight == response.getItemList().size()) {
int currentWeightSize = 0;
for(int m = 0; m < sliceUnits.size(); ++m) {
int weightSize;
if (m == 0) {
weightSize = 0;
} else {
weightSize = ((InferenceResponse)sliceUnits.get(m - 1)).getOutput(0).getMatrix(0).getDoubleValList().size()/indexWeight;
}
currentWeightSize += weightSize;
InferenceResponse sliceUnit = (InferenceResponse)sliceUnits.get(m);
List<Double> weights = getWeights(sliceUnit.getOutput(0).getMatrix(0).getDoubleValList());
List<String> weightStrs = Lists.newArrayList();
try{
weightStrs = getWeightsList(sliceUnit.getOutput(0).getMatrix(0).getDoubleValList(),
Math.toIntExact(modelResponse.getSliceUnit(0).getOutput(0).getMatrix(0).getPinoShape().getDim(1).getSize()));
}catch (Exception e){
LOGGER.error("AiFlowRankWeightProcessor getWeightsList is error: ",e);
}
if (!CollectionUtils.isNotEmpty(weights)) {
LOGGER.error("AiFlowRankWeightProcessor proxy server return wrong weights");
break;
}
for(int i = 0; i < weights.size(); ++i) {
double weight = (Double)weights.get(i);
if (!(weight < 0.0D)) {
RecItem recItem = (RecItem)response.getItemList().get(currentWeightSize + i);
if (recItem != null) {
Long itemId = Long.parseLong(recItem.getItemId());
AiFlowRankWeightResult aiFlowRankWeightResult = new AiFlowRankWeightResult();
SkuInfo skuInfo = (SkuInfo)recItem.getItemInfo();
if(this.priceSwitch){
if(skuInfo.getJdPrice() != 0){
weight = weight * skuInfo.getJdPrice();
}else {
LOGGER.error("AiFlowRankWeightProcessor jd price is null sku: "+itemId);
}
}
aiFlowRankWeightResult.setWeight((float)weight);
result.put(itemId, aiFlowRankWeightResult);
try{
responseMap.put(itemId, weightStrs.get(i));
}catch (Exception e){
LOGGER.error("AiFlowRankWeightProcessor responseMap put is error: ",e);
}
}
}
}
}
} else {
LOGGER.error("AiFlowRankWeightProcessor proxy server return wrong num weights");
}
}
}
}
private List<Double> getWeights(List<Double> weights) {
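// Collapse every group of indexWeight raw scores into one weight by summing the group's last three values.
// Assume the list has already been filled with 120 elements.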
List<Double> result = new ArrayList<>();
for (int i = 0; i < weights.size(); i += indexWeight) {
int lastElementIndex = Math.min(i + indexWeight-1, weights.size() - 1);
result.add(weights.get(lastElementIndex) + weights.get(lastElementIndex-1)+weights.get(lastElementIndex-2));
}
return result;
}
private List<String> getWeightsList(List<Double> weights, int size) {
// Split into chunks
List<List<Double>> chunks = splitIntoChunks(weights, size);
return convertToJson(chunks);
}
/**
* Splits a List<Double> into chunks of the given size.
*
* @param weights the original data
* @param size the size of each chunk
* @return the list of chunks
*/
public static List<List<Double>> splitIntoChunks(List<Double> weights, int size) {
List<List<Double>> chunks = new ArrayList<>();
for (int i = 0; i < weights.size(); i += size) {
int end = Math.min(i + size, weights.size());
chunks.add(new ArrayList<>(weights.subList(i, end)));
}
return chunks;
}
/**
* Converts each chunk into a JSON string.
*
* @param chunks the chunked data
* @return the list of JSON strings
*/
public static List<String> convertToJson(List<List<Double>> chunks) {
List<String> jsonChunks = new ArrayList<>();
for (List<Double> chunk : chunks) {
jsonChunks.add(JsonUtils.toJson(chunk));
}
return jsonChunks;
}
private List<byte[]> buildRedisKeys(List<RecItem> addList) {
List<byte[]> redisKeys = new ArrayList<>();
addList.forEach(e -> redisKeys.add((cacheKeySku + e.getItemId()).getBytes(Charsets.UTF_8)));
return redisKeys;
}
private List<byte[]> buildPriceCompareRedisKeys(List<RecItem> addList, String storeId) {
List<byte[]> redisKeys = new ArrayList<>();
addList.forEach(e -> redisKeys.add((cacheKeyPriceCompare + storeId + "_" + e.getItemId()).getBytes(Charsets.UTF_8)));
return redisKeys;
}
}
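As a worked example of the weight aggregation in getWeights, the sketch below collapses a flat list of raw scores into one weight per item by summing the last three values of each group. `WeightCollapser` and `collapse` are hypothetical names; unlike the original method, this version simply skips a trailing partial group instead of clamping the index, which is an assumption made to keep the example safe on arbitrary input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-alone version of the per-item weight aggregation. */
final class WeightCollapser {

    /** Sums the last three values of every full group of `groupSize` raw scores. */
    static List<Double> collapse(List<Double> raw, int groupSize) {
        if (groupSize < 3) {
            throw new IllegalArgumentException("groupSize must be at least 3");
        }
        List<Double> result = new ArrayList<>();
        for (int start = 0; start + groupSize <= raw.size(); start += groupSize) {
            int last = start + groupSize - 1;
            result.add(raw.get(last) + raw.get(last - 1) + raw.get(last - 2));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two items with four raw scores each (indexWeight = 4 in the operator config).
        List<Double> raw = Arrays.asList(1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0);
        System.out.println(collapse(raw, 4)); // prints [9.0, 90.0]
    }
}
```

With a group size of 4, the first value of each group is ignored and the remaining three are added, so 120 raw scores yield 30 item weights.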
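Redis Fallback for the Front-Warehouse Campaign Page
Code repository and related information
Code repository: odin-front
Code location: 619657-front-did-sku
Developer: Zhou Wenxing
PRD: https://joyspace.jd.com/pages/sgfTtx7X6HPmBVy5PcSO
Front-warehouse campaign page: background and goals
Provide a fallback for the 618 sales promotion.
Front-warehouse campaign page: main implementation
After the recall results are merged, if the result is empty, the following logic runs to supply fallback content.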
lbsRedis = ${vertex} {
dependencies = [
{"name" = "recallResult"},{"name" = "hint"}
]
processor = "com.jd.rec.odin.processor.front.LbsRedisProcessor"
emitters = ["lbsRedis"]
processorConf = {
recItemName = recallResult
recallType = USERPROFILE_CID3_LBS_JDDJ_SKU
}
}
public class LbsRedisProcessor extends AbstractProcessor<List<RecItem>> {
private static final Logger LOGGER = LoggerFactory.getLogger(LbsRedisProcessor.class);
@Parameter
private String recItemName;
@Inject
private InnerJimdbPlugin innerJimdbPlugin;
@Parameter
protected RecallType recallType;
private static final String CACHE_NAME_cxls_27110 = "cxls_27110";
private static final String CACHE_KEY_lbs_did = "front_did_bind_";
@Override
public CompletableFuture<List<RecItem>> process(Vertex vertex) {
// Read the merged recall result and the hint from the upstream dependencies
List<RecItem> recItems = VertexDependencyUtil.getDependencyData(vertex, this.recItemName);
JsonObject hint = VertexDependencyUtil.getDependencyData(vertex, DependencyConstants.HINT);
// If the merged recall result is empty, the LBS data has not been updated, so load the SKU list from Redis instead
if (CollectionUtils.isEmpty(recItems)){
if (hint == null) {
return CompletableFuture.completedFuture(recItems);
}
String did = JsonUtil.getAsString(hint, "lbs_did");
if (StringUtils.isEmpty(did)) {
return CompletableFuture.completedFuture(recItems);
}
CompletableFuture<List<String>> completableFuture = innerJimdbPlugin.lRange(CACHE_KEY_lbs_did + did, 0, -1, CACHE_NAME_cxls_27110);
try {
List<String> skuList = completableFuture.get(200, TimeUnit.MILLISECONDS);
if (CollectionUtils.isEmpty(skuList)) {
LOGGER.error("front_did_bind_ get skuList is empty, did is: "+ did);
return CompletableFuture.completedFuture(recItems);
}else{
// Build RecItems from the skuList
List<RecItem> recItemList=new ArrayList<>();
for (String sku : skuList) {
if (StringUtils.isEmpty(sku)) {
continue;
}
RecItem recItem = new RecItem();
recItem.setItemId(sku);
SkuInfo skuInfo = new SkuInfo();
skuInfo.setSku(Long.parseLong(sku));
recItem.setItemInfo(skuInfo);
recItem.setRankWeight(0.0f);
recItem.setRecallType(recallType);
recItem.setTag(recallType.getTag());
recItem.setItemType(ItemType.SKU);
RecallItemDetail detail = new RecallItemDetail(Long.parseLong(recItem.getItemId()), recallType, recItem.getWeight(), recallType.getTagName());
recItem.setRecallItemDetails(Lists.newArrayList(detail));
recItemList.add(recItem);
}
return CompletableFuture.completedFuture(recItemList);
}
} catch (Exception e) {
LOGGER.error("read cxls_27110 front_did_bind_ redis cache error ", e);
}
}
return CompletableFuture.completedFuture(recItems);
}
}
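Stripped of framework types, the fallback pattern above reduces to "use the recall result if there is one, otherwise read a cached list with a bounded wait". The sketch below shows that shape only; `RecallFallback`, `recallOrFallback`, and the `cacheLookup` function are hypothetical, and the cache is modelled as any function that returns a future.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Hypothetical sketch of an empty-result fallback backed by a cache lookup. */
final class RecallFallback {

    private static final String KEY_PREFIX = "front_did_bind_"; // same prefix as in the processor above

    static List<String> recallOrFallback(
            List<String> recalled,
            String did,
            Function<String, CompletableFuture<List<String>>> cacheLookup,
            long timeoutMillis) {
        if (recalled != null && !recalled.isEmpty()) {
            return recalled; // normal path: recall produced content
        }
        if (did == null || did.isEmpty()) {
            return Collections.<String>emptyList(); // nothing to key the fallback on
        }
        try {
            List<String> cached = cacheLookup.apply(KEY_PREFIX + did)
                    .get(timeoutMillis, TimeUnit.MILLISECONDS);
            return cached == null ? Collections.<String>emptyList() : cached;
        } catch (Exception e) {
            if (e instanceof InterruptedException) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
            }
            return Collections.<String>emptyList(); // cache slow or failing: return empty rather than throw
        }
    }

    public static void main(String[] args) {
        List<String> skus = recallOrFallback(
                Collections.<String>emptyList(),
                "d1",
                key -> CompletableFuture.completedFuture(Arrays.asList("1001", "1002")),
                200);
        System.out.println(skus);
    }
}
```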
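Personal Takeaways and Summary of Interning at JD
Project experience
During the internship my main job was developing recommendation features. The requirements I took part in include bundle purchase, similar items for out-of-stock products, and home page negative feedback.
My impression is that the department lead, the team lead, and my mentor are all very nice; whenever I asked about something I did not understand, they explained it. I strongly recommend interning at JD.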
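Daily life
Working hours
I usually arrive around 9:30 in the morning (JD's product and R&D teams have flexible clock-in between 9:00 and 11:00), though most people are at their desks by 10:00 at the latest. Lunch plus the midday break runs from 11:30 to 13:00; before joining I worried there would be only one hour for both eating and resting, but that turned out not to be the case. In the evening I usually take a walk after dinner and get back to my desk before 20:00 to keep working. It was manageable at first, but once the workload grew I mostly left after 22:00; luckily I live nearby. On Fridays I can go home right after dinner (interns only need to log 8 working hours a day).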
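Benefits
High pay. The salary is probably at the ceiling for internships; you can look up the details on Xiaohongshu, haha.
Free dinner. Dinner is free after 18:50, and every day you can also take either a yogurt or a piece of fruit (much better, I think, than Huawei's late-night snack at 20:35).
A daily meal allowance of 20 yuan. The canteen has plenty of variety; combined with the free dinner after 18:50, eating at JD costs basically nothing, and any unused allowance can be cashed out when you leave.
Free meals on weekends too. I live close to the office and there is not much to eat nearby, so on weekends I sometimes come in for lunch or dinner.
Free coffee machines. Americano, cappuccino, latte, mocha, plain milk, and more are all free; some floors even offer iced drinks.
Free gym. It is well equipped and fairly large, with personal trainers, but you have to book a day in advance.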
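Intern training
JD seems to value interns more than any other company I know of. Every intake of interns gets three days of training that covers general workplace skills, experience sharing and Q&A with alumni one or two years ahead of us, and JD's history and values. These three days are paid. Most importantly, interns are trained in groups of about eight or nine, which means you are assigned a cohort of fellow interns right away; you can grab meals together and help each other out at work later. This was a real highlight, which is why I mention it separately.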
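Personal impressions
So far it feels great, but the internship is still ongoing; I will share more later.
Below are two photos taken during the internship.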