Differentially Private Learning with Adaptive Clipping

motivation: This paper adds DP-satisfying noise during the model training stage to achieve privacy protection. In papers read previously, the dataset size, the optimizer, and the choice of activation function all affect overall model performance. Much of the prior work focuses on optimizing the clipping threshold C: a value that is too large or too small both hurts training, so a suitable threshold C must be found.

In the federated learning (FL) setting, existing methods for training neural networks with user-level differential privacy (e.g., DP Federated Averaging) bound each user's contribution by clipping their model update to some constant value.
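A minimal sketch of this per-user update clipping (the function name and the small tolerance constant are illustrative, not from the paper):

```python
import numpy as np

def clip_update(delta, clip_norm):
    """Scale a user's model update so its L2 norm is at most clip_norm."""
    norm = np.linalg.norm(delta)
    return delta * min(1.0, clip_norm / (norm + 1e-12))
```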

method: Building on this premise, the paper proposes a quantile-based idea: use a quantile to find a suitable clipping threshold. The parameter γ in the first formula on the left is the quantile; by setting the expectation of the derivative (with respect to C) to zero, one obtains a threshold C* determined by the quantile (C* is the γ-quantile of X).
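Written out, this is the standard pinball-style quantile loss (a sketch of the formula the paragraph refers to, reconstructed from the description above):

```latex
\ell_\gamma(C; x) =
\begin{cases}
(1-\gamma)\,(C - x), & x \le C,\\[2pt]
\gamma\,(x - C),     & x > C,
\end{cases}
\qquad
\frac{\partial \ell_\gamma}{\partial C} = -\gamma + \mathbb{1}[x \le C].
```

Taking expectations gives $\mathbb{E}[\partial \ell_\gamma / \partial C] = \Pr[x \le C] - \gamma$, which is zero exactly when C = C* is the γ-quantile of X.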

Formula 4 assumes that in a given round there are m sample values of X (x1, ..., xm). The average derivative of the loss for that round is Formula 4, in which b̄ denotes the fraction of samples whose value is at most C (i.e., the fraction that would not be clipped); the clipping threshold C is then updated iteratively by gradient descent on this loss.
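A minimal sketch of that threshold update under the description above (eta_C and the linear form of the update are illustrative; the paper also discusses a geometric variant):

```python
import numpy as np

def update_clip_threshold(samples, C, gamma, eta_C):
    """One quantile-tracking gradient step on the clipping threshold C.

    samples : the m observed values x_1..x_m for this round
    gamma   : target quantile
    eta_C   : step size for the threshold
    """
    b_bar = np.mean(np.asarray(samples) <= C)  # fraction of samples not exceeding C
    # The average derivative of the quantile loss is (b_bar - gamma); descending it
    # raises C when too many samples exceed the current threshold.
    return C - eta_C * (b_bar - gamma)
```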

On the right is the flow of the paper's algorithm. For each sampled user, client training is performed; gradient descent and gradient clipping produce the values returned to the server: the model update delta and the indicator b. Because the model update and b are both quantities the paper protects, noise is added to each. The server then updates the model parameters (scaled by the beta parameter) to obtain theta, and uses the noised b together with the target quantile to control the size of the clipping threshold.
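A minimal sketch of one such server round, assuming this structure; client_update, the noise scales sigma_delta and sigma_b, and the geometric form of the threshold update are placeholders and illustrative choices rather than the paper's exact pseudocode:

```python
import numpy as np

def server_round(theta, C, clients, gamma, eta_C, sigma_delta, sigma_b, client_update):
    """One round of DP-FedAvg with adaptive clipping (illustrative sketch).

    client_update(user, theta, C) -> (delta, b): the clipped model update and
    the indicator b = 1 if the unclipped update norm was <= C, else 0.
    """
    m = len(clients)
    deltas, bs = [], []
    for user in clients:
        delta, b = client_update(user, theta, C)
        deltas.append(delta)
        bs.append(b)

    # Noisy aggregates of the two protected quantities: the summed updates
    # (sensitivity C per user) and the summed indicators b.
    delta_sum = np.sum(deltas, axis=0) + np.random.normal(0, sigma_delta * C, theta.shape)
    b_sum = np.sum(bs) + np.random.normal(0, sigma_b)

    theta = theta + delta_sum / m              # server model step
    b_bar = b_sum / m
    C = C * np.exp(-eta_C * (b_bar - gamma))   # geometric quantile-tracking update
    return theta, C
```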

The method tracks the quantile closely while using a negligible amount of privacy budget, is compatible with other federated learning techniques such as compression and secure aggregation, and admits a straightforward joint DP analysis with DP-FedAvg. Experiments show that adaptively clipping to the median update norm works well across a range of realistic federated learning tasks, sometimes even outperforming the best fixed clipping value chosen in hindsight, and without the need to tune any clipping hyperparameter.
 

Here is completed example code for Differentially Private Stochastic Gradient Descent, including per-example clipping, Gaussian noise addition, and privacy budget composition:

```python
import math

import numpy as np


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def per_example_clipping(grads, clip_factor):
    """Clip each per-example gradient to an L2 norm of at most clip_factor."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    return grads * np.minimum(1.0, clip_factor / (norms + 1e-12))


def add_gaussian_noise(grad, sigma):
    """Add Gaussian noise with standard deviation sigma to the gradient."""
    return grad + np.random.normal(0, sigma, grad.shape)


def get_epsilon(epoch, delta, sigma, sensitivity, batch_size, training_nums):
    """Compute epsilon with basic composition.

    Each step is (eps_step, delta)-DP under the Gaussian mechanism with
    eps_step = sensitivity * sqrt(2 * ln(1.25 / delta)) / sigma; basic
    composition sums eps_step over all steps taken so far.
    """
    steps = math.ceil(training_nums / batch_size) * epoch
    eps_step = sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / sigma
    return eps_step * steps


def dp_sgd(X, y, epochs, batch_size, clip_factor, sigma, delta, lr=1.0):
    """Differentially private SGD for logistic regression."""
    n, d = X.shape
    w = np.zeros(d)
    for epoch in range(epochs):
        for i in range(0, n, batch_size):
            X_batch = X[i:i + batch_size]
            y_batch = y[i:i + batch_size]
            # Per-example gradients of the logistic loss, shape (B, d).
            errors = sigmoid(X_batch.dot(w)) - y_batch
            per_example_grads = X_batch * errors.reshape(-1, 1)
            # Clip each example's gradient, average, then add noise.
            clipped = per_example_clipping(per_example_grads, clip_factor)
            noisy_grad = add_gaussian_noise(clipped.mean(axis=0), sigma)
            w -= lr * noisy_grad
        epsilon = get_epsilon(epoch + 1, delta, sigma,
                              clip_factor / batch_size, batch_size, n)
        print("Epoch {}: Epsilon = {}".format(epoch + 1, epsilon))
    return w
```

The `per_example_clipping` function clips each example's gradient to a given L2 norm bound. The `add_gaussian_noise` function adds Gaussian noise with a given standard deviation to the (averaged) gradient. The `get_epsilon` function computes epsilon with basic composition from the given epoch, delta, sigma, sensitivity, batch size, and number of training examples.

The `dp_sgd` function performs Differentially Private Stochastic Gradient Descent. For each epoch, it loops over the training set in batches, computes per-example gradients of the logistic loss, clips each one, averages them, adds Gaussian noise, and updates the weight vector. Finally, it computes the privacy budget using `get_epsilon` and prints it out.

Note that `get_epsilon` uses basic composition: it calculates the total number of steps from the number of epochs and the batch size, then sums the per-step epsilon of the Gaussian mechanism. Basic composition does not give the tightest privacy bound; the Moments Accountant (or Rényi DP accounting) provides a much tighter one.
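A quick usage sketch on synthetic data (the data generation and hyperparameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
true_w = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = (X.dot(true_w) > 0).astype(float)

w = dp_sgd(X, y, epochs=5, batch_size=100,
           clip_factor=1.0, sigma=0.5, delta=1e-5, lr=0.5)
print("Learned weights:", w)
```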
