
Article link: https://blog.csdn.net/xingzhe123456789000/article/details/139635471

Reading Notes on the Paper "Federated Social Recommendation with Graph Neural Network"

Paper overview
Intro
Methodology
Summary

Paper overview

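This post summarizes a paper on federated social recommendation that I read recently, "Federated Social Recommendation with Graph Neural Network". The paper is by Zhiwei Liu and colleagues from the group of Philip S. Yu, a leading data mining researcher at UIC (University of Illinois Chicago), together with Hao Peng and colleagues at Beihang University. It was published in the journal TIST (ACM Transactions on Intelligent Systems and Technology) in 2022. Its main contribution is FeSoG, a social recommendation model built within a federated learning framework.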


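Overall, the writing flows fairly well and is easy to read. However, the paper also contains a noticeable number of typos and grammatical errors, which makes it less reader-friendly. Since this was my introductory reading on federated recommendation, I summarize it here; the details follow.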

Intro

The authors first summarize the main challenges of a federated social recommender system (FSRS):

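Heterogeneity. Social recommendation involves both user-user (U-U) and user-item (U-I) connections.
Personalization. User interests are non-IID (not independent and identically distributed) across users, so each user should be handled individually.
Security. In a federated recommendation setting, users' personal privacy must be protected.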

To address these challenges, the authors propose FeSoG (Federated Social Recommendation with Graph Neural Networks), a federated social recommendation model based on graph neural networks.

Methodology

Problem formulation

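We start with the problem formulation and the key definitions.
Let the user set be $\mathcal{U}= \left\{ u_1, u_2, \ldots, u_N\right\}$ and the item set be $\mathcal{T}= \left\{t_1, t_2, \ldots, t_M \right\}$ . User-item interactions form the rating matrix $\mathbf{R} \in \mathbb{R}^{N \times M}$ , and user-user relations form the social adjacency matrix $\mathbf{S} \in\{0,1\}^{N \times N}$ .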

In the federated recommendation setting:

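Client: each user $n$ is a client $c_n$ . The client holds that user's item ratings ( $\mathbf{R}_{n\cdot}$ ) and the user's adjacency relations with other users ( $\mathbf{S}_{n\cdot}$ ).
Center server: the center server holds neither the users' adjacency data nor their interaction data. It stores the model parameters used on the clients and the embeddings of every user and every item. In each round, clients upload gradients to the server, the server updates the parameters with them, and the updated parameters are then sent back to every client.
Federated social recommender system (FSRS): for any client $c_n$ , given the interaction vector of the current user $n$ , $\mathbf{r}_{n} = \{{r}_{n1}, {r}_{n2}, \cdots, {r}_{nk}\}$ , and the first-order neighbor vector $\mathbf{s}_{n} = \{{s}_{n1}, {s}_{n2}, \cdots, {s}_{np}\}$ (where the neighbor index ranges over users, $p \in \{1, 2, \cdots, N\}$ , and the item index ranges over items, $k \in \{1, 2, \cdots, M\}$ ), the FSRS predicts the user's interactions without access to the raw data of any client.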

Problem formulation:

The FSRS task defined above can be written as follows:
Given the items user $n$ has interacted with, $\mathcal{T}^{(n)}=\left\{t_1^{(n)}, t_2^{(n)}, \ldots, t_k^{(n)}\right\}$ , and the user's social neighbors $\mathcal{U}^{(n)}=\left\{u_1^{(n)}, u_2^{(n)}, \ldots, u_p^{(n)}\right\}$ , the FSRS predicts the interaction score of a masked item $t^* \in \mathcal{T} \backslash \mathcal{T}^{(n)}$ .

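A local heterogeneous graph $\mathcal{G}_{n}$ (containing both U-U and U-I edges) is built on each client $c_n$ .

Formal definition: given the set of local heterogeneous graphs $\left\{\mathcal{G}_{n}\right\}_{n=1}^{N}$ of all clients, and without accessing the raw data in any local graph, the FSRS aims to predict the value of the edge $(u_n, t^*)$ .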
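
To make the client-side data layout concrete, here is a minimal Python sketch that simulates how a rating matrix and a social adjacency matrix split into per-client local graphs. It is only an illustration: the function name `build_local_graphs`, the dictionary layout, and the convention that a rating of 0 means "no interaction" are my own assumptions, not something taken from the paper. In a real federated deployment the full matrices never exist in one place; each client only ever holds its own row.

```python
def build_local_graphs(ratings, social):
    """Simulate the per-client local heterogeneous graphs.

    ratings: N x M list of lists; 0 means "no interaction" (assumption of this sketch).
    social:  N x N list of lists with 0/1 entries.
    """
    graphs = []
    for n, (r_row, s_row) in enumerate(zip(ratings, social)):
        graphs.append({
            "user": n,
            # U-I edges: item index -> rating given by user n
            "items": {t: r for t, r in enumerate(r_row) if r != 0},
            # U-U edges: indices of the first-order social neighbors of user n
            "neighbors": [p for p, s in enumerate(s_row) if s == 1 and p != n],
        })
    return graphs


# Toy data: 3 users, 3 items (hypothetical values)
ratings = [[5, 0, 3],
           [0, 4, 0],
           [1, 0, 0]]
social = [[0, 1, 0],
          [1, 0, 1],
          [0, 1, 0]]

local_graphs = build_local_graphs(ratings, social)
for g in local_graphs:
    print(g)
```

Each dictionary corresponds to one $\mathcal{G}_{n}$ : it contains only first-order information, which is why the local model described next can only aggregate over direct neighbors.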

FeSoG


Local graph design

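This part designs a GAT (graph attention network) that propagates embeddings over the heterogeneous graph. The embeddings are downloaded from the center server, and a local user stores only their own interacted items and friends, so only first-order neighbor information can be propagated. I will not go into detail here; the formulas are listed below.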

$o_{n p}=\operatorname{LeakyReLU}\left(\mathbf{a}^{\top}\left[\mathbf{W}_1 \mathbf{e}_{u_n} \| \mathbf{W}_1 \mathbf{e}_{u_p}\right]\right).\tag{1}$

$\alpha_{n p}=\operatorname{softmax}_p\left(o_{n p}\right)=\frac{\exp \left(o_{n p}\right)}{\sum_{i=1}^P \exp \left(o_{n i}\right)}.\tag{2}$

$v_{n k}=\operatorname{LeakyReLU}\left(\mathbf{b}^{\top}\left[\mathbf{W}_2 \mathbf{e}_{u_n} \| \mathbf{W}_2 \mathbf{e}_{t_k}\right]\right).\tag{3}$

$\beta_{n k}=\operatorname{softmax}_k\left(v_{n k}\right)=\frac{\exp \left(v_{n k}\right)}{\sum_{i=1}^K \exp \left(v_{n i}\right)}.\tag{4}$

$\mathbf{h}_u^{(n)}=\sum_{p=1}^P \alpha_{n p} \mathbf{W}_h \mathbf{e}_{u_p}.\tag{5}$

$\mathbf{h}_t^{(n)}=\sum_{k=1}^K \beta_{n k} \mathbf{W}_h \mathbf{e}_{t_k}.\tag{6}$
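
Equations (1), (2), (5) and equations (3), (4), (6) follow the same pattern: score each neighbor against the user, normalize with a softmax, then take the weighted sum of projected neighbor embeddings. The sketch below shows that pattern in plain Python. The helper names (`matvec`, `attend`), the LeakyReLU slope of 0.2, and the toy matrices are assumptions made for illustration; this is not the authors' implementation.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def leaky_relu(x, slope=0.2):
    # The slope 0.2 is an assumed value, not stated in these notes.
    return x if x > 0 else slope * x


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(e_self, neighbors, W_att, a, W_h):
    """Attention over first-order neighbors.

    Scores follow Eq. (1)/(3), weights Eq. (2)/(4), aggregation Eq. (5)/(6).
    """
    q = matvec(W_att, e_self)
    scores = []
    for e_nb in neighbors:
        concat = q + matvec(W_att, e_nb)  # list concatenation plays the role of [ . || . ]
        scores.append(leaky_relu(sum(ai * ci for ai, ci in zip(a, concat))))
    weights = softmax(scores)
    h = [0.0] * len(W_h)
    for w, e_nb in zip(weights, neighbors):
        proj = matvec(W_h, e_nb)
        h = [hi + w * pi for hi, pi in zip(h, proj)]
    return weights, h


# Toy example with 2-dimensional embeddings and identity projections
identity = [[1.0, 0.0], [0.0, 1.0]]
e_user = [1.0, 0.0]
friend_embeddings = [[1.0, 0.0], [0.0, 1.0]]
a = [0.0, 0.0, 1.0, 0.0]  # attention vector over the concatenated 4-dim input

weights, h_u = attend(e_user, friend_embeddings, identity, a, identity)
print(weights, h_u)
```

Calling `attend` once with the friends' embeddings and once with the interacted items' embeddings (and the corresponding $\mathbf{W}_1, \mathbf{a}$ or $\mathbf{W}_2, \mathbf{b}$ ) gives the two aggregated vectors of Eq. (5) and Eq. (6).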

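$\mathbf{h}_t^{(n)}$ is the interaction-graph embedding and $\mathbf{h}_u^{(n)}$ is the social-graph embedding. Together with the user's own term (weighted by $\gamma_s$ ), they are combined by an attention-weighted sum, with the coefficients computed as follows: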

$\begin{aligned} & \gamma_u=\frac{\exp \left(\mathbf{c}^{\top}\left[\mathbf{h}_u^{(n)} \| \mathbf{v}_u\right]\right)}{\exp \left(\mathbf{c}^{\top}\left[\mathbf{h}_u^{(n)} \| \mathbf{v}_u\right]\right)+\exp \left(\mathbf{c}^{\top}\left[\mathbf{h}_t^{(n)} \| \mathbf{v}_t\right]\right)+\exp \left(\mathbf{c}^{\top}\left[\mathbf{h}_s^{(n)} \| \mathbf{v}_s\right]\right)}, \\ & \gamma_t=\frac{\exp \left(\mathbf{c}^{\top}\left[\mathbf{h}_t^{(n)} \| \mathbf{v}_t\right]\right)}{\exp \left(\mathbf{c}^{\top}\left[\mathbf{h}_u^{(n)} \| \mathbf{v}_u\right]\right)+\exp \left(\mathbf{c}^{\top}\left[\mathbf{h}_t^{(n)} \| \mathbf{v}_t\right]\right)+\exp \left(\mathbf{c}^{\top}\left[\mathbf{h}_s^{(n)} \| \mathbf{v}_s\right]\right)}, \\ & \gamma_s=\frac{\exp \left(\mathbf{c}^{\top}\left[\mathbf{h}_s^{(n)} \| \mathbf{v}_s\right]\right)}{\exp \left(\mathbf{c}^{\top}\left[\mathbf{h}_u^{(n)} \| \mathbf{v}_u\right]\right)+\exp \left(\mathbf{c}^{\top}\left[\mathbf{h}_t^{(n)} \| \mathbf{v}_t\right]\right)+\exp \left(\mathbf{c}^{\top}\left[\mathbf{h}_s^{(n)} \| \mathbf{v}_s\right]\right)}.\end{aligned} \tag{7}$
$\mathbf{e}_{n}^*=\gamma_s \mathbf{e}_{u_n}+\gamma_u \mathbf{h}_u^{(n)}+\gamma_t \mathbf{h}_t^{(n)} \tag{8}$
The interaction score is computed as an inner product:
$\hat{\mathbf{R}}_{n t}={\mathbf{e}_n^*}^\top \mathbf{e}_t.\tag{9}$
The local training loss is defined as:
$\mathcal{L}_u=\sqrt{\frac{\sum_{t \in \mathcal{T}^{(u)}}\left(\mathbf{R}_{u t}-\hat{\mathbf{R}}_{u t}\right)^2}{\left|\mathcal{T}^{(u)}\right|}}\tag{10}$
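
The chain from Eq. (7) to Eq. (10) can be traced with a small numeric sketch. Because these notes do not define the vectors $\mathbf{v}_u, \mathbf{v}_t, \mathbf{v}_s$ , the three attention scores $\mathbf{c}^{\top}[\cdot \| \cdot]$ are simply passed in as numbers here; that shortcut, the function names, and all toy values are assumptions for illustration only.

```python
import math


def gate_weights(score_u, score_t, score_s):
    """Softmax over the three scores of Eq. (7); returns (gamma_u, gamma_t, gamma_s)."""
    m = max(score_u, score_t, score_s)
    exps = [math.exp(s - m) for s in (score_u, score_t, score_s)]
    total = sum(exps)
    return tuple(e / total for e in exps)


def fuse(gamma_u, gamma_t, gamma_s, e_self, h_u, h_t):
    """Eq. (8): weighted sum of the self, social and interaction representations."""
    return [gamma_s * s + gamma_u * u + gamma_t * t
            for s, u, t in zip(e_self, h_u, h_t)]


def predict(e_star, e_item):
    """Eq. (9): inner product between the fused user vector and an item embedding."""
    return sum(a * b for a, b in zip(e_star, e_item))


def rmse(labels, preds):
    """Eq. (10): root mean squared error over the user's rated items."""
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(labels, preds)) / len(labels))


# Toy values (hypothetical)
gamma_u, gamma_t, gamma_s = gate_weights(0.0, 0.0, 0.0)  # equal scores give equal weights
e_star = fuse(gamma_u, gamma_t, gamma_s,
              e_self=[3.0, 0.0], h_u=[0.0, 3.0], h_t=[3.0, 3.0])
score = predict(e_star, [1.0, 0.5])
loss = rmse([4.0, 3.0], [score, 3.0])
print(e_star, score, loss)
```

Since the three coefficients come from one softmax, they always sum to 1, so $\mathbf{e}_{n}^*$ is a convex combination of the three representations.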

Local Graph Neural Network
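The steps above make up the entire local computation: the embeddings and model parameters are downloaded from the center server, the GAT forward pass runs locally, and the gradients of the loss in Eq. (10) are obtained by backpropagation for the parameter update.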

Local differential privacy

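Because this scheme involves uploading gradients and downloading parameters between client and server, a local differential privacy (LDP) step is applied before upload to prevent leakage of user information (according to FedMF, once the gradients from two consecutive uploads are obtained, the user's rating information can be inferred).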

Specifically, the gradient is clipped (bounded by $\delta$ ) and Laplacian noise with strength $\lambda$ is added, as follows:
$\tilde{\mathbf{g}}^{(n)}=\operatorname{clip}\left(\mathbf{g}^{(n)}, \delta\right)+\operatorname{Laplacian}(0, \lambda). \tag{11}$

In practice, to make the noise dynamic, the following form is used:
$\tilde{\mathbf{g}}^{(n)}=\operatorname{clip}\left(\mathbf{g}^{(n)}, \delta\right)+\operatorname{Laplacian}\left(0, \lambda \cdot \operatorname{mean}\left(\mathbf{g}^{(n)}\right)\right), \tag{12}$
where $\mathbf{g}^{(n)} = \{ \mathbf{g}_t^{(n)}, \mathbf{g}_m^{(n)}, \mathbf{g}_u^{(n)} \} = \frac{\partial \mathcal{L}_u} {\partial \Theta}$ .
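
A rough sketch of Eq. (12) in plain Python is given below. Several details are my own assumptions rather than facts from the paper: clipping is done element-wise to the range $[-\delta, \delta]$ (other implementations clip the gradient norm instead), the noise scale uses the absolute value of the mean so that it is never negative, and the function names are hypothetical. Laplace noise is drawn as the difference of two exponential samples, which is a standard way to sample it with the standard library.

```python
import random


def laplace(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with that scale.
    """
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def ldp_perturb(grad, delta, lam, rng=random):
    """Clip a flat gradient vector and add dynamic Laplace noise, as in Eq. (12)."""
    # Assumption: element-wise clipping to [-delta, delta]
    clipped = [max(-delta, min(delta, g)) for g in grad]
    # Assumption: absolute value keeps the scale non-negative
    scale = lam * abs(sum(grad) / len(grad))
    return [c + laplace(scale, rng) for c in clipped]


grad = [0.5, -2.0, 3.0]  # hypothetical local gradient
noisy = ldp_perturb(grad, delta=1.0, lam=0.1, rng=random.Random(0))
print(noisy)
```

Setting `lam=0` switches the noise off and leaves only the clipping, which is a convenient way to check the two steps separately.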

Pseudo-Item Labeling

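To further protect user privacy (items the user has not interacted with receive zero gradients, so the nonzero item gradients reveal which items were interacted with), the authors add $q$ pseudo-labeled items, denoted $\tilde{\mathcal{T}}^{(n)}=\left\{ \tilde{t}_1^{(n)}, \tilde{t}_2^{(n)}, \ldots, \tilde{t}_q^{(n)} \right\}$ . During training, pseudo items are sampled and their predicted scores are rounded to serve as labels, which makes the uploaded gradients safer in transit. Specifically:
$\tilde{\mathcal{L}}_u=\sqrt{\frac{\sum_{t \in \mathcal{T}^{(u)} \cup \tilde{\mathcal{T}}^{(u)}}\left(\mathbf{R}_{u t}-\hat{\mathbf{R}}_{u t} \right)^2}{\left|\mathcal{T}^{(u)}\right|}}, \tag{13}$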
where $\hat{\mathbf{R}}_{u t} \in \mathbb{R}$ is the predicted value and $\mathbf{R}_{u t} \in \mathbb{N}$ is the label (for a pseudo item, the rounded prediction).
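
The pseudo-item step can be sketched as follows. The function name `sample_pseudo_items`, the uniform sampling over non-interacted items, and the use of a callback for the current model's prediction are assumptions for illustration; the notes only state that $q$ pseudo items are sampled and labeled with rounded predictions.

```python
import random


def sample_pseudo_items(interacted, num_items, q, predict, rng=random):
    """Pick up to q non-interacted items and label them with rounded predictions.

    interacted: dict mapping item index -> real rating of the local user.
    predict:    callable returning the model's current score for an item index.
    """
    candidates = [t for t in range(num_items) if t not in interacted]
    chosen = rng.sample(candidates, min(q, len(candidates)))
    # Note: Python's round() rounds exact halves to the nearest even integer.
    return {t: round(predict(t)) for t in chosen}


real_ratings = {0: 5, 2: 3}  # hypothetical local ratings
pseudo = sample_pseudo_items(real_ratings, num_items=6, q=2,
                             predict=lambda t: 3.4, rng=random.Random(0))
training_labels = {**real_ratings, **pseudo}
print(pseudo, training_labels)
```

Training on `training_labels` instead of `real_ratings` means the uploaded item gradients are nonzero for some items the user never rated, so the server can no longer read the interaction set directly off the gradient's sparsity pattern.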

Model optimization

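The authors split the gradients into three parts: the model gradient $\overline{\mathbf{g}}_m$ , the item-embedding gradient $\overline{\mathbf{g}}_t$ , and the user-embedding gradient $\overline{\mathbf{g}}_u$ . The center server computes each average gradient as a weighted mean over clients, weighting each client by its number of ratings, and updates the model parameters with these gradients. The updated parameters are then downloaded to every client. Specifically: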
$\begin{aligned} &\overline{\mathbf{g}}_m =\frac{\sum_{n \in \mathcal{N}}\left|\mathcal{R}_n\right| \cdot \tilde{\mathbf{g}}_m^{(n)}}{\sum_{n \in \mathcal{N}}\left|\mathcal{R}_n\right|}, \\ &\overline{\mathbf{g}}_t =\frac{\sum_{n \in \mathcal{N}}\left|\mathcal{R}_n\right| \cdot \tilde{\mathbf{g}}_t^{(n)}}{\sum_{n \in \mathcal{N}}\left|\mathcal{R}_n\right|}, \\ &\overline{\mathbf{g}}_u =\frac{\sum_{n \in \mathcal{N}} \left|\mathcal{R}_n\right| \cdot \tilde{\mathbf{g}}_u^{(n)} }{\sum_{n \in \mathcal{N}} \left| \mathcal{R}_n \right|}, \end{aligned} \tag{14}$
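
The server-side averaging can be illustrated with a short sketch. It treats each client's gradient as a flat list and weights it by that client's number of ratings; the plain gradient-descent step and the learning rate are assumptions added to complete the example, since these notes only say that the parameters are updated from the averaged gradients.

```python
def aggregate(client_grads, client_sizes):
    """Weighted mean of client gradients, weighted by each client's rating count."""
    total = sum(client_sizes)
    dim = len(client_grads[0])
    return [sum(w * g[i] for w, g in zip(client_sizes, client_grads)) / total
            for i in range(dim)]


def sgd_step(params, avg_grad, lr):
    """Assumed update rule: one plain gradient-descent step on the server."""
    return [p - lr * g for p, g in zip(params, avg_grad)]


# Two hypothetical clients: the first has 3 ratings, the second has 1
client_grads = [[1.0, 0.0], [0.0, 2.0]]
client_sizes = [3, 1]

avg_grad = aggregate(client_grads, client_sizes)
new_params = sgd_step([1.0, 1.0], avg_grad, lr=0.1)
print(avg_grad, new_params)
```

The same `aggregate` call would be applied separately to the model, item-embedding, and user-embedding gradients, matching the three lines of the averaging formula.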