A Pareto-Efficient Algorithm for Multiple Objective Optimization in E-Commerce Recommendation: Reading Notes and Translation



Recommendation with multiple objectives is an important but difficult problem, where the inherent difficulty lies in the possible conflicts between objectives. In this case, multi-objective optimization is expected to be Pareto efficient, meaning that no single objective can be further improved without hurting the others. However, existing approaches to Pareto efficient multi-objective recommendation still lack good theoretical guarantees.
In this paper, we propose a general framework for generating Pareto efficient recommendations. Assuming that there are formal differentiable formulations for the objectives, we coordinate these objectives with a weighted aggregation. Then we propose a condition that theoretically ensures Pareto efficiency, together with a two-step Pareto efficient optimization algorithm. The algorithm can also be easily adapted for Pareto Frontier generation and fair recommendation selection. We specifically apply the proposed framework to E-Commerce recommendation to optimize GMV and CTR simultaneously. Extensive online and offline experiments are conducted on a real-world E-Commerce recommender system, and the results validate the Pareto efficiency of the framework.
To the best of our knowledge, this work is among the first to provide a Pareto efficient framework for multi-objective recommendation with theoretical guarantees. Moreover, the framework can be applied to any other objectives with differentiable formulations and any model with gradients, which shows its strong scalability.





Recommender systems play an increasingly crucial role in online services and platforms, where they protect users from information overload. Recommendation algorithms (for example, Learning To Rank) generate personalized rankings of items, and the top-ranked items are recommended to users. Usually, the algorithms need very careful design to fulfill multiple objectives. However, it is difficult to optimize multiple objectives simultaneously, and the core difficulty lies in the conflicts between different objectives. In E-Commerce recommendation, CTR (Click Through Rate) and GMV (Gross Merchandise Volume) are two important objectives that are not entirely consistent. To validate this inconsistency, we collect one week of online data from a real-world E-Commerce platform and plot how GMV varies with CTR. As the trends in Fig. 1 show, CTR is not entirely consistent with GMV, and a CTR-optimal or GMV-optimal recommendation can be rather sub-optimal or even bad in terms of the other objective.

Therefore, a solution is considered optimal for two objectives when neither objective can be further improved without hurting the other. This notion of optimality is widely acknowledged in multiple objective optimization and is called Pareto efficiency or Pareto optimality. In the context of Pareto efficiency, solution A is considered to dominate solution B only when A performs at least as well as B on all the objectives. The aim of Pareto efficiency is to find solutions that are not dominated by any others.

Existing approaches for Pareto optimization fall into two categories: heuristic search and scalarization. Evolutionary algorithms are popular choices among heuristic search approaches. However, heuristic search cannot guarantee Pareto efficiency; it only ensures that the resulting solutions are not dominated by each other (they can still be dominated by the Pareto efficient solutions) [45]. Unlike heuristic search, scalarization transforms multiple objectives into a single one with a weighted sum of all the objective functions. With proper scalarization, Pareto efficient solutions can be achieved by optimizing the reformulated objective function. However, the scalarization weights of the objective functions are usually determined manually, so Pareto efficiency is still not guaranteed. To summarize, it is very difficult for existing evolutionary algorithms and scalarization algorithms to find Pareto efficient solutions with a guarantee. Recently, it has been pointed out that the Karush-Kuhn-Tucker (KKT) conditions can be used to guide the scalarization [11]. We build our algorithm upon the KKT conditions and propose a novel algorithmic framework that generates the scalarization weights with theoretical guarantees.

Specifically, we propose a Pareto-Efficient algorithmic framework, “PE-LTR”, that optimizes multiple objectives with an LTR procedure. Given the candidate items generated for each user, PE-LTR ranks the candidates so that the ranking is Pareto efficient with respect to multiple objectives. Assuming that there exists a differentiable formulation for each objective, we adopt the scalarization technique to coordinate the different objectives into a single objective function. As stated before, the scalarization technique cannot guarantee Pareto efficiency unless the weights are carefully chosen. Therefore, we first propose a condition on the scalarization weights that ensures the solution is Pareto efficient. The condition is equivalent to a constrained optimization problem, and we propose an algorithm that solves the problem in two steps. First, we simplify the problem by relaxing the constraints so that an analytic solution is obtained; then we obtain a feasible solution through a projection procedure. With PE-LTR as the cornerstone, we provide methods to generate either the Pareto Frontier or a specific recommendation, depending on the needs of service providers. To generate the Pareto Frontier, one can run PE-LTR while evenly setting the bounds of the objective scalarization weights. To generate a specific recommendation, one can either run PE-LTR once with proper bounds, or generate the Pareto Frontier first and choose a “fair” solution according to a specific fairness metric.

In this paper we apply this framework to optimize two important objectives for E-Commerce recommendation, i.e. GMV and CTR. For E-Commerce platforms, the primary objective is to improve the GMV, but too much sacrifice of CTR may cause a severe decrease of daily active users (DAU) in the long term. Therefore we aim to find Pareto efficient solutions with respect to both objectives. We propose two differentiable formulations for GMV and CTR respectively and apply the PE-LTR framework for generating Pareto-optimal solutions. We conduct extensive experiments on a real-world E-Commerce recommender system and compare the results with state-of-the-art approaches. The online and offline experimental results both indicate that our solution outperforms other baselines significantly and the solutions are nearly Pareto efficient.

The contributions of this work are:

  • We propose a general Pareto efficient algorithmic framework
    (PE-LTR) for multi-objective recommendation. The framework is both model and objective agnostic, which shows its great scalability.
  • We propose a two-step algorithm which theoretically guarantees Pareto efficiency. Although the algorithm is built upon the scalarization technique, it differs from other scalarization approaches in its theoretical guarantee and in its automatic learning of scalarization weights rather than manual assignment.
  • With PE-LTR as the cornerstone, we present how to generate the Pareto Frontier and a specific recommendation. Specifically, we propose to select a fair recommendation from the Pareto Frontier with proper fairness metrics.
  • We use E-Commerce recommendation as a concrete instance of PE-LTR, and conduct extensive online and offline experiments on a real-world recommender system. The results indicate that our algorithm outperforms other state-of-the-art approaches significantly and the solutions generated are Pareto efficient.
  • We open-source a large-scale E-Commerce recommendation dataset, EC-REC, which contains real records of impressions, clicks and purchases. To the best of our knowledge, no other public dataset includes all three labels together with enough features, so this dataset can be used for further studies.


In this section, we provide a detailed introduction to the related studies from the following aspects: recommendation with multiple objectives, E-Commerce recommendation and learning to rank.

2.1 Recommendation with Multiple Objectives

We look at the studies on multi-objective recommendation from two aspects, i.e. the objectives concerned and the approaches for multi-objective recommendations.
Although recommendation accuracy is the main concern, some studies argue that other characteristics such as availability, profitability, or usefulness should be considered simultaneously [15, 22]. Some studies attempt to model the trade-offs between relevance and diversity in recommendation [14, 17, 41]. When multiple objectives are concerned, a Pareto efficient recommendation is expected [27, 28]. Recently, it has been pointed out that some of these objectives are related to users [7, 16, 23, 29]. On one hand, different objectives are related to different user behaviors. For example, both clicks and hides are considered in LinkedIn feeds [33]. On the other hand, the objectives are related to different user statuses, for example different stakeholders [8, 23].


The approaches to recommendation with multiple objectives can be categorized into evolutionary algorithms [45] and scalarization [38]. Evolutionary algorithms have been used for long-tail recommendation [35], diversified recommendation [10], and novelty-aware recommendation [28]. They have also been used for Pareto efficient hybridization [28] of multiple recommendation algorithms. The scalarization technique is also used for recommendation with multiple objectives [38]. However, existing studies mostly depend on manually assigned weights for scalarization, so Pareto efficiency cannot be guaranteed. Recently, the KKT conditions have been used to guide scalarization techniques [11, 32]. However, existing algorithms based on these conditions are limited to the unconstrained case and cannot fit the requirements of real-world scenarios.

2.2 E-Commerce Recommendation

E-Commerce recommendation is also a popular research topic. Some studies adopt economic theory models and Markov chains for recommendation [12, 19, 42, 43], while others focus on different aspects of E-Commerce recommendation [1, 30, 34, 40], such as feature learning and diversification. It has been pointed out that learning to rank is a good practice in E-Commerce search [18], which coincides with the motivation of our framework. Usually there are multiple stages in E-Commerce recommendation, for example clicks and purchases. Therefore, learning-to-rank algorithms need to jointly optimize multiple stages [36]. Some studies focus on the post-click stage in search and recommendation. For example, the bidding price and revenue are jointly considered with relevance [26, 44]. Recently, two studies focused on the connection between clicks and purchases in E-Commerce search and advertising [21, 36]. As optimizing clicks and optimizing purchases are not entirely consistent, it is necessary to find a Pareto efficient trade-off between them, which is not considered in previous studies on purchase optimization [21, 36].

2.3 Learning to Rank

Learning To Rank (LTR) has been a popular research topic for quite a long time. The studies on LTR can be categorized into point-wise, pair-wise and list-wise approaches. The point-wise scheme [20] predicts each individual instance separately; the pair-wise scheme [4, 13] is approximated as a binary classification problem that focuses on the relative order of a pair of instances; and the list-wise scheme [5, 6, 37, 39] directly optimizes the metric of a ranking list. Usually, list-wise LTR achieves better performance than the other schemes.
Several ranking methods have been proposed, such as RankNet [4], RankBoost [13], AdaRank [39], LambdaRank [5], ListNet [6] and LambdaMART [37]. Due to the similarity between search and recommendation in ranking, LTR approaches are widely used in both scenarios. Recently, it has been pointed out that LTR is a key component in E-Commerce search [18], where it is able to exploit multiple user feedback signals for relevance modeling, including clicks, add-to-cart ratios, and revenue.
According to previous studies, LambdaMART is one of the best performing algorithms [36]. As the focus of this paper is not the ranking model, we choose a simple point-wise ranking model for the proposed framework.
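To make the three LTR schemes concrete, the sketch below writes one representative loss for each. These are illustrative choices rather than the losses used in the paper: a logistic loss for the point-wise case, a RankNet-style logistic loss on a score difference for the pair-wise case, and a ListNet-style top-one cross entropy for the list-wise case. All function names are hypothetical.

```python
import math
from typing import Sequence


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pointwise_loss(score: float, label: int) -> float:
    """Logistic loss on one item; each instance is scored independently."""
    return softplus(score) - label * score


def pairwise_loss(score_pos: float, score_neg: float) -> float:
    """RankNet-style loss: penalizes the preferred item scoring lower."""
    return softplus(-(score_pos - score_neg))


def softmax(xs: Sequence[float]) -> list:
    """Softmax with max-shift for numerical stability."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]


def listwise_loss(scores: Sequence[float], labels: Sequence[float]) -> float:
    """ListNet-style top-one cross entropy over a whole ranked list."""
    target = softmax(labels)
    predicted = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))
```

The point-wise loss only looks at one item at a time, which is why it is the simplest to plug into a weighted multi-objective setup like the one the paper adopts.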




In this section, we first provide a brief introduction to the concept of Pareto efficiency. Then we introduce the details of the proposed framework, i.e. Pareto-Efficient Learning-to-Rank (PE-LTR). Assuming that there are differentiable loss functions for multiple objectives correspondingly, we propose a condition that guarantees the Pareto efficiency of the solution. We show that the proposed condition is equivalent to a constrained Quadratic Programming problem. Then we propose a two-step algorithm to solve this problem. Moreover, we provide methods to generate both Pareto Frontier and specific single recommendation with PE-LTR.
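The "relax, solve analytically, then project" idea can be illustrated on the smallest possible case, K = 2. This section does not state the quadratic program itself, so the sketch below makes an assumption: it uses the min-norm objective from the multiple-gradient descent literature, i.e. it picks weights that minimize the squared norm of the weighted gradient sum, subject to the weights summing to 1 and respecting lower bounds. The names `two_objective_weights`, `c1` and `c2` are hypothetical, and this is not claimed to be the paper's exact procedure.

```python
from typing import Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def two_objective_weights(
    g1: Sequence[float],
    g2: Sequence[float],
    c1: float = 0.0,
    c2: float = 0.0,
) -> Tuple[float, float]:
    """Weights (w1, w2) with w1 + w2 = 1, w1 >= c1, w2 >= c2 that minimize
    ||w1 * g1 + w2 * g2||^2, where g1 and g2 are the two loss gradients."""
    if c1 < 0.0 or c2 < 0.0 or c1 + c2 > 1.0:
        raise ValueError("bounds must be non-negative and sum to at most 1")
    d = [a - b for a, b in zip(g1, g2)]  # g1 - g2
    dd = dot(d, d)
    if dd == 0.0:
        # Identical gradients: every feasible weight gives the same direction.
        w1 = c1
    else:
        # Step 1 (relaxed problem): drop the bounds and solve
        # d/dw ||g2 + w * (g1 - g2)||^2 = 0 in closed form.
        w1 = -dot(g2, d) / dd
    # Step 2 (projection): clip back into the feasible interval [c1, 1 - c2].
    # For a one-dimensional convex quadratic, clipping gives the exact optimum.
    w1 = min(max(w1, c1), 1.0 - c2)
    return w1, 1.0 - w1
```

For example, `two_objective_weights([1.0, 0.0], [0.0, 1.0])` balances two orthogonal gradients equally, while raising `c2` forces a minimum share for the second objective even when the unconstrained solution would ignore it.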

3.1 Preliminary

First, we provide a brief introduction to Pareto efficiency and some related concepts. Pareto efficiency is an important concept in multiple objective optimization. Given a system which aims to minimize a series of objective functions $f_1, \dots, f_K$, Pareto efficiency is a state in which it is impossible to improve one objective without hurting another.

Definition 3.1. Denote the outcomes of two solutions as $s_i = (f_1^i, \dots, f_K^i)$ and $s_j = (f_1^j, \dots, f_K^j)$. $s_i$ dominates $s_j$ if and only if $f_1^i \le f_1^j, f_2^i \le f_2^j, \dots, f_K^i \le f_K^j$ (for minimization objectives).
The concept of Pareto efficiency is built upon the definition of domination:
Definition 3.2. A solution $s_i = (f_1^i, \dots, f_K^i)$ is Pareto efficient if there is no other solution $s_j = (f_1^j, \dots, f_K^j)$ that dominates $s_i$.
Therefore, a solution that is not Pareto efficient can still be improved on at least one objective without hurting the others, and multi-objective optimization should always aim for Pareto efficient solutions. It is worth mentioning that Pareto efficient solutions are not unique, and the set of all such solutions is called the “Pareto Frontier”.
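The two definitions translate almost directly into code. The sketch below is a minimal illustration for minimization objectives, with hypothetical function names. It adds one assumption on top of Definition 3.1: following the common convention, a dominating solution must be strictly better on at least one objective, so that two solutions with identical outcomes do not dominate each other.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if outcome `a` dominates outcome `b` (all objectives minimized).

    `a` must be no worse on every objective and, by the added convention,
    strictly better on at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_frontier(points: Sequence[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Return the outcomes that no other outcome dominates (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Toy outcomes (loss_1, loss_2): (3, 3) is dominated by (2, 2), the rest are not.
outcomes = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0)]
frontier = pareto_frontier(outcomes)
```

The three surviving points show why the frontier is a set rather than a single answer: each one trades a lower value on one objective for a higher value on the other.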


3.2 Pareto-Efficient Learning to Rank

To achieve a Pareto efficient solution, we propose a Learning-to-Rank scheme that optimizes multiple objectives with the scalarization technique. Assuming that there are K objectives in a given recommender system, a model F(θ) needs to optimize these objectives simultaneously, where θ denotes the model parameters. Without loss of generality, we assume that there exist K differentiable loss functions $L_i(\theta), \forall i \in \{1, \dots, K\}$, for the K objectives correspondingly.

Given the formulations, optimizing the i-th objective is equivalent to minimizing $L_i$. However, optimizing these K objectives simultaneously is non-trivial, since the optimal solution to one objective is usually sub-optimal for another one. Therefore, we use the scalarization technique to merge multiple objectives into a single one. Specifically, we aggregate the loss functions $L_i$ with weights $\omega_i, \forall i \in \{1, \dots, K\}$:

$$L(\theta) = \sum_{i=1}^{K} \omega_i L_i(\theta)$$

where $\sum_{i=1}^{K} \omega_i = 1$
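As a small illustration of the weighted aggregation, the sketch below builds the scalarized loss from a list of per-objective losses and minimizes it on a toy problem. Everything here is an assumption for illustration: the parameter θ is a single float, the two quadratic losses are invented, the weights are fixed by hand (which is exactly the manual assignment the paper aims to replace), and the names `scalarize` and `minimize_1d` are hypothetical.

```python
from typing import Callable, Sequence

Loss = Callable[[float], float]


def scalarize(losses: Sequence[Loss], weights: Sequence[float]) -> Loss:
    """Return L(theta) = sum_i w_i * L_i(theta), requiring sum_i w_i = 1."""
    if len(losses) != len(weights):
        raise ValueError("need exactly one weight per loss")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")

    def total(theta: float) -> float:
        return sum(w * loss(theta) for w, loss in zip(weights, losses))

    return total


def loss_a(theta: float) -> float:
    """Toy objective whose own optimum is theta = 1."""
    return (theta - 1.0) ** 2


def loss_b(theta: float) -> float:
    """Toy objective whose own optimum is theta = -1, conflicting with loss_a."""
    return (theta + 1.0) ** 2


def minimize_1d(f: Loss, theta: float = 0.0, lr: float = 0.1,
                steps: int = 200, eps: float = 1e-6) -> float:
    """Plain gradient descent using a central-difference gradient estimate."""
    for _ in range(steps):
        grad = (f(theta + eps) - f(theta - eps)) / (2.0 * eps)
        theta -= lr * grad
    return theta


combined = scalarize([loss_a, loss_b], [0.75, 0.25])
theta_star = minimize_1d(combined)
```

With weights 0.75 and 0.25 the minimizer sits between the two individual optima, closer to the optimum of the more heavily weighted loss; changing the weights moves the solution, which is why the choice of weights determines which trade-off is reached.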



