Differentially Private Next-Token Prediction of Large Language Models

This article is part of the LLM series and presents a translation of "Differentially Private Next-Token Prediction of Large Language Models".

Abstract

Ensuring the privacy of large language models (LLMs) is becoming increasingly important. The most widely adopted technique for this is DP-SGD, which trains a model with a differential privacy (DP) guarantee. However, DP-SGD overestimates an adversary's capability by assuming white-box access to the model, and as a result incurs longer training times and higher memory usage than SGD. In contrast, commercial LLM deployments are predominantly cloud-based, so adversarial access to the LLM is black-box. Motivated by these observations, we present Private Mixing of Ensemble Distributions (PMixED): a private prediction protocol for next-token prediction that leverages the inherent stochasticity of next-token sampling, together with a public model, to achieve differential privacy. We formalize this by introducing RD mollifiers, which project each output distribution from an ensemble of fine-tuned LLMs onto a set around the output distribution of a public LLM, then average the projected distributions and sample from the result. Unlike DP-SGD, which must account for the model architecture during training, PMixED is model-agnostic, making it a very attractive solution for current deployments. Our results show that PMixED achieves a stronger privacy guarantee than sample-level privacy and outperforms DP-SGD at a privacy budget of ε = 8 on large-scale datasets. PMixED therefore offers a practical alternative to DP training methods for achieving strong generative utility without compromising privacy.
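To make the mixing-and-sampling step described in the abstract more concrete, below is a minimal Python sketch, not the authors' implementation: it assumes discrete next-token distributions from an ensemble of fine-tuned models and from the public model, projects each ensemble distribution toward the public one via a bisection search over a mixing weight until a Rényi-divergence radius `beta` is satisfied, averages the projected distributions, and samples a token. The function names and the parameters `alpha` and `beta` are illustrative assumptions; the paper's actual projection and privacy accounting are more involved.

```python
import numpy as np

def renyi_divergence(q, p, alpha=2.0, eps=1e-12):
    """Renyi divergence D_alpha(q || p) between two discrete distributions."""
    q, p = np.clip(q, eps, None), np.clip(p, eps, None)
    return np.log(np.sum(q**alpha * p**(1.0 - alpha))) / (alpha - 1.0)

def project_onto_ball(p_i, p_pub, beta, alpha=2.0, iters=30):
    """Find the largest mixing weight lam such that the mixture
    lam * p_i + (1 - lam) * p_pub stays within Renyi radius beta of p_pub.
    Bisection search; a simple stand-in for the paper's projection."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        mix = mid * p_i + (1 - mid) * p_pub
        if renyi_divergence(mix, p_pub, alpha) <= beta:
            lo = mid
        else:
            hi = mid
    return lo * p_i + (1 - lo) * p_pub

def pmixed_next_token(ensemble_dists, p_pub, beta, alpha=2.0, rng=None):
    """Average the projected per-model distributions and sample a token id."""
    rng = rng or np.random.default_rng()
    projected = [project_onto_ball(p, p_pub, beta, alpha) for p in ensemble_dists]
    avg = np.mean(projected, axis=0)
    avg = avg / avg.sum()  # renormalize against numerical drift
    return rng.choice(len(avg), p=avg)
```

The intuition this sketch tries to capture: because every projected distribution stays within a bounded Rényi divergence of the public distribution, no single fine-tuned model (and hence no single data partition) can move the averaged distribution far from what the public model alone would produce.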

1 Introduction

2 Related Work

3 Preliminaries

4 PMixED: A Private Next-Token Prediction Protocol

Here is completed code for Differentially Private Stochastic Gradient Descent (DP-SGD), including per-example clipping, Gaussian noise addition, and privacy budget composition:

```python
import math
import numpy as np


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def per_example_clipping(per_example_grads, clip_factor):
    """
    Clip each example's gradient so that its L2 norm is at most clip_factor.
    per_example_grads has shape (batch_size, d).
    """
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_factor / np.maximum(norms, 1e-12))
    return per_example_grads * scale


def add_gaussian_noise(grad, sigma):
    """Add Gaussian noise with standard deviation sigma to the gradient."""
    return grad + np.random.normal(0, sigma, grad.shape)


def get_epsilon(epochs, delta, sigma, sensitivity, batch_size, training_nums):
    """
    Compute epsilon with basic composition from the number of epochs, delta,
    the noise standard deviation sigma, the per-step L2 sensitivity, the batch
    size, and the size of the training set.
    """
    steps = math.ceil(training_nums / batch_size) * epochs
    # Gaussian mechanism: sigma = sensitivity * sqrt(2 * ln(1.25 / delta)) / epsilon
    epsilon_per_step = sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / sigma
    return epsilon_per_step * steps


def dp_sgd(X, y, epochs, batch_size, clip_factor, sigma, delta, lr=0.1):
    """Differentially private SGD for logistic regression."""
    n, d = X.shape
    w = np.zeros(d)
    for epoch in range(epochs):
        for i in range(0, n, batch_size):
            X_batch = X[i:i + batch_size]
            y_batch = y[i:i + batch_size]
            # Per-example gradients of the logistic loss, shape (batch, d).
            errors = sigmoid(X_batch.dot(w)) - y_batch
            per_example_grads = X_batch * errors.reshape(-1, 1)
            clipped = per_example_clipping(per_example_grads, clip_factor)
            grad = np.mean(clipped, axis=0)
            noisy_grad = add_gaussian_noise(grad, sigma)
            w -= lr * noisy_grad
        epsilon = get_epsilon(epoch + 1, delta, sigma,
                              clip_factor / batch_size, batch_size, n)
        print("Epoch {}: Epsilon = {}".format(epoch + 1, epsilon))
    return w
```

The `per_example_clipping` function clips each example's gradient to a maximum L2 norm given by the clip factor. The `add_gaussian_noise` function adds Gaussian noise with a given standard deviation to the averaged gradient. The `get_epsilon` function computes epsilon under basic composition from the number of epochs, delta, sigma, the per-step sensitivity, the batch size, and the size of the training set.

The `dp_sgd` function performs Differentially Private Stochastic Gradient Descent. For each epoch, it loops over the training set in batches, computes the per-example gradients of the logistic loss using the sigmoid function, clips each example's gradient, averages the clipped gradients, adds Gaussian noise, and updates the weight vector. It then computes the privacy budget with `get_epsilon` and prints it.

Note that `get_epsilon` uses basic composition: it calculates the total number of steps from the number of epochs and the batch size, then multiplies the per-step Gaussian-mechanism epsilon by that step count. Basic composition does not give the tightest bound on privacy; the Moments Accountant method provides a tighter one.
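As a quick sanity check, here is a small usage example of the `dp_sgd` sketch above on synthetic logistic-regression data; the dataset, hyperparameter values, and the 0.5 decision threshold are illustrative assumptions, not settings from the paper.

```python
import numpy as np

# Synthetic binary-classification data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
true_w = rng.normal(size=20)
y = (X.dot(true_w) > 0).astype(float)

# Example hyperparameters; clip_factor, sigma, and delta are assumptions.
w = dp_sgd(X, y, epochs=5, batch_size=100,
           clip_factor=1.0, sigma=0.5, delta=1e-5, lr=0.5)

preds = (sigmoid(X.dot(w)) > 0.5).astype(float)
print("Training accuracy:", np.mean(preds == y))
```

With basic composition the reported ε grows linearly in the number of steps, so even this small run reports a large budget; a Rényi-DP (moments accountant) analysis of the subsampled Gaussian mechanism yields a much tighter bound and is what production DP-SGD implementations typically use.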
