LG-Fedavg

f_t_l

Last modified: 2023-08-10 17:45:29

Category: Federated Learning. Tags: learning

First published: 2023-08-09 22:33:51

Article link: https://blog.csdn.net/f_t_l/article/details/132192621

Contributions (Abstract)

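local representation learning on each device: for learning useful and compact features from raw data (I do not follow this point; I could not find where the paper demonstrates it)
Reduces the communication overhead between the server and the clients
for fair representation learning (I did not read this part closely)
Addresses the non-i.i.d. problem through personalized models (reference)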
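
To make the communication saving concrete, here is a rough back-of-the-envelope sketch. The layer names and parameter counts are illustrative assumptions (a LeNet-style network), and the choice of which layers count as the shared part is also assumed here rather than taken from the paper.

```python
# Hypothetical per-layer parameter counts for a LeNet-style network.
# These numbers are for illustration only, not taken from the paper.
layer_params = {
    "conv1": 456,
    "conv2": 2416,
    "fc1": 48120,
    "fc2": 10164,
    "fc3": 850,
}

# Assumption: only the last two layers are exchanged with the server.
global_layers = ["fc2", "fc3"]


def params_sent_per_round(layer_params, shared):
    """Number of parameters a client uploads in one round."""
    return sum(layer_params[name] for name in shared)


full = params_sent_per_round(layer_params, list(layer_params))  # FedAvg-style: everything
partial = params_sent_per_round(layer_params, global_layers)    # shared part only
fraction = partial / full

print(f"full model: {full} params, shared part: {partial} params")
print(f"fraction communicated per round: {fraction:.3f}")
```

Under these assumed sizes, each round moves well under a fifth of the parameters that exchanging the full model would. The actual saving depends entirely on where the local/global split is placed.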

core idea

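Each client first learns its own local representation (corresponding to the shallow layers? since the shallow layers do not take part in aggregation), after which the global model operates on the higher-level representation (i.e. the deep layers that are uploaded to the server).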

The core idea of our method is to augment federated learning with local representation learning on each device before a global model operating on higher-level representation is trained on the data (now as representations rather than raw data) from all devices.

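This suggests:

Conjecture: the deep layers of the different clients' models are close to one another, because they extract global features;
the shallow layers are far apart, because they extract local features.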
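
One way to check this conjecture would be to measure the distance between two clients' weights layer by layer. The sketch below uses made-up toy weights and hypothetical layer names (`shallow`, `deep`), so it only shows how the measurement could be done and says nothing about what real models would show.

```python
import math


def layer_distance(model_a, model_b):
    """Euclidean distance between two models, computed separately per layer."""
    return {
        name: math.sqrt(
            sum((x - y) ** 2 for x, y in zip(model_a[name], model_b[name]))
        )
        for name in model_a
    }


# Toy weights, invented for illustration: each layer is a flat list of floats.
client_1 = {"shallow": [0.9, -0.4, 0.1], "deep": [0.50, 0.21]}
client_2 = {"shallow": [-0.3, 0.8, 0.7], "deep": [0.52, 0.19]}

distances = layer_distance(client_1, client_2)
for name, dist in distances.items():
    print(f"{name}: {dist:.4f}")
```

With real models, each layer's tensor would be flattened first, and the distance would probably need normalizing by layer size before layers are compared.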

workflow (Local Global Federated Averaging)

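Client side

Each client maintains two models, a local net and a global net; in the global net only the deep layers are updated.

The client first updates its local net and uploads the deep part of the updated local net to the server.
The server aggregates the deep parts of all local nets to obtain the deep part of the global net, and sends it back to the clients.
The client replaces the deep part of its local net with the global net's deep part, which yields the global net maintained on the client, and compares the accuracy of the global net and the local net on the local dataset. If the global net is more accurate it replaces the local net; otherwise the local net is left unchanged.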
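
The client-side selection step described above can be sketched as follows. This is a minimal illustration of my reading of the workflow, not the paper's code: models are plain dicts of weight lists, and `merge_deep`, `client_select` and `toy_accuracy` are hypothetical names, with `toy_accuracy` standing in for a real evaluation on the local dataset.

```python
def merge_deep(local_net, global_deep):
    """Copy of local_net whose deep layers are replaced by the server's version."""
    candidate = dict(local_net)
    candidate.update(global_deep)
    return candidate


def client_select(local_net, global_deep, evaluate):
    """Keep whichever of (local net, local net + global deep part) scores higher."""
    candidate = merge_deep(local_net, global_deep)
    if evaluate(candidate) > evaluate(local_net):
        return candidate
    return local_net


def toy_accuracy(net):
    # Stand-in for accuracy on the local dataset (assumption for this demo):
    # the score is higher the closer the single "deep" weight is to 1.0.
    return 1.0 - abs(net["deep"][0] - 1.0)


local_net = {"shallow": [0.3, 0.7], "deep": [0.2]}

# A helpful global update is adopted; the shallow layers stay local.
better = client_select(local_net, {"deep": [0.9]}, toy_accuracy)
# A harmful global update is rejected and the local net is kept as-is.
worse = client_select(local_net, {"deep": [-0.5]}, toy_accuracy)
```

Note that the original `local_net` dict is never mutated, so rejecting the candidate really does leave the local net unchanged.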

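Server side

In every training round the server aggregates the deep parts of the local models. (Only the deep part of the model is exchanged; the shallow part is kept on the client to preserve accuracy on the local dataset, which seems reasonable.)
The final global model is obtained by aggregating all of the complete local models. (I have doubts about this: it means the shallow part of the model is aggregated only once, which should in principle work very poorly, yet in the paper's experiments the result is only slightly worse than FedAvg. My understanding is that aggregating the shallow layers more often should be a way to improve the global model's generalization.)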
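
The two kinds of server-side aggregation described above can be sketched as below. It assumes a plain unweighted mean over clients (FedAvg normally weights by local dataset size) and uses hypothetical layer names and toy weights, so treat it as an illustration of the idea rather than the paper's implementation.

```python
def average_layers(models, layer_names):
    """Element-wise unweighted mean of the named layers across client models."""
    n = len(models)
    return {
        name: [sum(values) / n for values in zip(*(m[name] for m in models))]
        for name in layer_names
    }


# Toy client models: each layer is a flat list of weights (invented values).
clients = [
    {"shallow": [1.0, 2.0], "deep": [0.0, 4.0]},
    {"shallow": [3.0, 6.0], "deep": [2.0, 0.0]},
]

# Every round: only the deep layers are averaged and sent back to the clients.
round_update = average_layers(clients, ["deep"])

# Once, at the end: the complete local models are averaged into a final global model.
final_global = average_layers(clients, ["shallow", "deep"])

print(round_update)
print(final_global)
```

In this sketch the shallow layers only ever meet in the final call, which is exactly the one-off aggregation that the note above questions.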

最终global model效果好的一个可能的解释

如果要提高全局模型的准确率，local representation按道理应该多聚合，因为local representation的部分，如果一直自己训练，很容易过拟合，但是文中2.2节global aggregation似乎给出了解释：

We argue that this synchronizes the local and global models: while local models can flexibly fit the data distribution on their device, the global model acts as a regularizer to synchronize the representations from all devices: each local model cannot overfit to local data because otherwise, the global model would incur a high loss.