Federated Learning Security Defenses: Differential Privacy

Blog post URL: https://security.blog.csdn.net/article/details/124092828

1. Centralized Differential Privacy

Centralized differential privacy is defined over two adjacent datasets D and D′, where adjacent means that D and D′ differ in exactly one record. Differential privacy guarantees that an observer cannot tell from the released output whether it was computed from D or from D′, thereby protecting the privacy of individual records.
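Formally, a randomized mechanism $M$ satisfies $(\varepsilon, \delta)$-differential privacy if, for every pair of adjacent datasets $D$ and $D'$ and every set of outputs $S$:

$\Pr[M(D) \in S] \le e^{\varepsilon} \Pr[M(D') \in S] + \delta$

The smaller $\varepsilon$ is, the closer the two output distributions are, and the less any observer can learn about a single record.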

Differentially private stochastic gradient descent (DP-SGD) differs from ordinary SGD mainly in that, at every iteration, it clips the per-sample gradients and adds Gaussian noise.

The differentially private training algorithm for deep learning is as follows:

---------------------------------------------------------------------
input:  training samples $\{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$;
        loss function $\mathcal{L}(\theta) = \frac{1}{N}\sum_{i=1}^{N} \mathcal{L}(\theta; x_i, y_i)$;
        learning rate $\eta_t$;
        gradient clipping bound $C$;
        Gaussian noise standard deviation $\sigma$; per-round batch size $B$
output: model parameters $\theta_T$

Randomly initialize the model parameters $\theta_0$
for each round $t = 1, 2, \ldots, T$, perform the following update do
        Randomly sample a set $B_t$ of size $B$ from the training set
        for each sample $i \in B_t$ do
                Compute the per-sample gradient: $g_t(x_i, y_i) \leftarrow \nabla_{\theta_t} \mathcal{L}(\theta; x_i, y_i)$
                # This step is one of the two core operations of DP-SGD
                Clip the gradient: $\bar{g}_t(x_i, y_i) \leftarrow g_t(x_i, y_i) \big/ \max\left(1, \frac{\| g_t(x_i, y_i) \|_2}{C}\right)$
        end
        # This step is the other core operation of DP-SGD
        Aggregate the clipped gradients and add Gaussian noise: $\tilde{g}_t \leftarrow \frac{1}{B} \left( \sum_{i} \bar{g}_t(x_i, y_i) + \mathcal{N}(0, \sigma^2 C^2 I) \right)$
        Update the model parameters: $\theta_t \leftarrow \theta_{t-1} - \eta_t \tilde{g}_t$
end
---------------------------------------------------------------------
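To make the two core operations concrete, here is a minimal PyTorch sketch of a single DP-SGD step. The function name dp_sgd_step and the default hyperparameters are illustrative, not taken from the text; a real training run would use a vetted library such as Opacus, which also tracks the privacy budget.

import torch

def dp_sgd_step(model, loss_fn, batch, lr=0.01, C=1.0, sigma=1.0):
	# One DP-SGD step: per-sample gradient clipping, then Gaussian noise on the sum
	xs, ys = batch
	summed = [torch.zeros_like(p) for p in model.parameters()]
	for x, y in zip(xs, ys):
		model.zero_grad()
		loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
		# Clip: scale each per-sample gradient so its total L2 norm is at most C
		norm = torch.sqrt(sum(p.grad.norm() ** 2 for p in model.parameters())).item()
		scale = min(1.0, C / (norm + 1e-12))
		for s, p in zip(summed, model.parameters()):
			s.add_(p.grad * scale)
	with torch.no_grad():
		for s, p in zip(summed, model.parameters()):
			# Noise is calibrated to the clipping bound C; then average and step
			noisy = (s + torch.normal(0.0, sigma * C, s.shape)) / len(xs)
			p.sub_(lr * noisy)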

2. Federated Differential Privacy

Compared with centralized differential privacy, applying differential privacy in a federated learning setting must address user-level privacy on top of record-level privacy. It has to keep each client's local data private, and also protect information about the clients themselves: from a local model received at the server, it should be impossible to infer which client uploaded it, or whether a given client participated in the current round at all.

The client-side federated differential privacy algorithm is as follows:

---------------------------------------------------------------------
input:  client ID $k$;
        global model $\theta_0$;
        model hyperparameters: learning rate $\eta$, number of local epochs $E$;
        batch size $B$
output: model update $\Delta^k = \theta - \theta_0$

Initialize the local model from the global parameters sent down by the server: $\theta \leftarrow \theta_0$
for each local epoch $e = 1, 2, \ldots, E$ do
        Split the local data into a set $\mathcal{B}$ of batches of size $B$
        for each batch $b \in \mathcal{B}$ do
                Gradient descent step: $\theta \leftarrow \theta - \eta \nabla \ell(\theta; b)$
                Parameter clipping: $\theta \leftarrow \theta_0 + \mathrm{clip}(\theta - \theta_0)$
        end
end
return $\Delta^k = \theta - \theta_0$

---------------------------------------------------------------------

The server-side federated differential privacy algorithm is as follows:

---------------------------------------------------------------------
input:  training samples $\{x_1, x_2, \ldots, x_N\}$;
        loss function $L(\theta) = \frac{1}{N}\sum_{i=1}^{N} L(\theta; x_i)$;
        learning rate $\eta_t$;
        clipping bound $C$ (the sensitivity, written $S$ below);
        noise scale $z$; batch size $B$

Randomly initialize the model parameters $\theta_0$
Define the client weights: client $c_k$ has weight $w_k = \min\left(\frac{n_k}{\hat{w}}, 1\right)$, where $n_k$ is the number of samples held by client $k$; let $W = \sum_{k} w_k$
for each round $t = 1, 2, \ldots, T$, perform the following update do
        Select the set $C^t$ of clients participating in this round, each chosen with probability $q$
        for each client $k \in C^t$ do
                Run local training: $\Delta_k^t = \mathrm{ClientUpdate}(k, \theta_{t-1})$
        end
        Aggregate the client updates: $\Delta^t = \begin{cases} \frac{\sum_{k \in C^t} w_k \Delta_k^t}{qW} & \text{for } \tilde{f}_f \\ \frac{\sum_{k \in C^t} w_k \Delta_k^t}{\max\left(qW_{\min},\, \sum_{k \in C^t} w_k\right)} & \text{for } \tilde{f}_c \end{cases}$
        Clip $\Delta^t$: $\Delta^t \leftarrow \Delta^t \big/ \max\left(1, \frac{\|\Delta^t\|}{S}\right)$
        Set the Gaussian noise standard deviation: $\sigma = \begin{cases} \frac{zS}{qW} & \text{for } \tilde{f}_f \\ \frac{2zS}{qW_{\min}} & \text{for } \tilde{f}_c \end{cases}$
        Update the global model: $\theta_t \leftarrow \theta_{t-1} + \Delta^t + \mathcal{N}(0, \sigma^2 I)$
end
---------------------------------------------------------------------
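Here $\tilde{f}_f$ and $\tilde{f}_c$ are the two weighted-average estimators from the DP-FedAvg approach of McMahan et al.: $\tilde{f}_f$ divides by the fixed expected total weight $qW$, while $\tilde{f}_c$ divides by the actual weight of the sampled clients, bounded below by $qW_{\min}$. The sketch below implements the $\tilde{f}_f$ branch in PyTorch; the function name server_aggregate, the per-layer clipping, and the argument layout are illustrative rather than taken from the text.

import torch

def server_aggregate(updates, weights, q, W, z, S):
	# updates: one {param_name: delta} dict per selected client; weights: matching w_k
	agg = {}
	for name in updates[0]:
		# Fixed-denominator estimator f~_f: weighted sum divided by qW
		delta = sum(w * u[name] for w, u in zip(weights, updates)) / (q * W)
		# Clip the aggregated update (per layer, for simplicity) to norm at most S
		delta = delta / max(1.0, delta.norm().item() / S)
		# Gaussian noise with standard deviation zS/(qW)
		agg[name] = delta + torch.normal(0.0, z * S / (q * W), delta.shape)
	return agg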

3. Implementing the Differential Privacy Defense

3.1 Configuration

Detailed comments have been added to the code below; the specifics can be read directly from it. Two points about the configuration: dp must be set to true for the defense to take effect, and C is the clipping bound.

conf.json

{
	"model_name" : "resnet18",
	"no_models" : 10,
	"type" : "cifar",
	"global_epochs" : 100,
	"local_epochs" : 3,
	"k" : 2,
	"batch_size" : 32,
	"lr" : 0.01,
	"momentum" : 0.0001,
	"lambda" : 0.5,
	"dp" : true,
	"C" : 1000,
	"sigma" : 0.001,
	"q" : 0.1,
	"W" : 1
}
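Since standard JSON does not allow inline comments, the file keeps only the key-value pairs. A typical way to load it:

import json

with open("conf.json") as f:
	conf = json.load(f)

# The DP branches in client.py and server.py only run when this flag is true
assert conf["dp"] is True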

3.2 The Client

Compared with ordinary local training, the main change is that after each gradient-descent step, the parameters are clipped.

client.py

import models, torch, copy

class Client(object):
	def __init__(self, conf, model, train_dataset, id = -1):
		self.conf = conf
		self.local_model = models.get_model(self.conf["model_name"])
		self.client_id = id
		self.train_dataset = train_dataset

		# Partition the training set evenly across clients and take this client's slice
		all_range = list(range(len(self.train_dataset)))
		data_len = int(len(self.train_dataset) / self.conf['no_models'])
		train_indices = all_range[id * data_len: (id + 1) * data_len]

		self.train_loader = torch.utils.data.DataLoader(self.train_dataset, batch_size=conf["batch_size"], 
									sampler=torch.utils.data.sampler.SubsetRandomSampler(train_indices))

	def local_train(self, model):
		# Initialize the local model with the global parameters sent by the server
		for name, param in model.state_dict().items():
			self.local_model.state_dict()[name].copy_(param.clone())

		optimizer = torch.optim.SGD(self.local_model.parameters(), lr=self.conf['lr'], momentum=self.conf['momentum'])
		self.local_model.train()
		for e in range(self.conf["local_epochs"]):

			for batch_id, batch in enumerate(self.train_loader):
				data, target = batch
				if torch.cuda.is_available():
					data = data.cuda()
					target = target.cuda()

				optimizer.zero_grad()
				output = self.local_model(data)
				loss = torch.nn.functional.cross_entropy(output, target)
				loss.backward()
				optimizer.step()

				if self.conf["dp"]:
					# L2 norm of the difference between the local and global parameters
					model_norm = models.model_norm(model, self.local_model)

					# Clip the update to norm at most C: theta <- theta_0 + clip(theta - theta_0)
					norm_scale = min(1, self.conf['C'] / (model_norm))
					for name, layer in self.local_model.named_parameters():
						clipped_difference = norm_scale * (layer.data - model.state_dict()[name])
						layer.data.copy_(model.state_dict()[name] + clipped_difference)

			print("Epoch %d done." % e)
		# Return the model update (difference from the global model) for aggregation
		diff = dict()
		for name, data in self.local_model.state_dict().items():
			diff[name] = (data - model.state_dict()[name])

		return diff
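client.py calls two helpers from models.py that are not shown here: get_model builds the network by name, and model_norm, judging by how it is used above, plausibly computes the L2 norm of the parameter difference between two models, along these lines:

import torch

# A plausible implementation of models.model_norm, matching its use in local_train
def model_norm(model_1, model_2):
	squared_sum = 0
	for name, layer in model_1.named_parameters():
		squared_sum += torch.sum(torch.pow(layer.data - model_2.state_dict()[name].data, 2))
	return torch.sqrt(squared_sum)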

3.3 The Server

When aggregating the global model parameters, the server adds noise drawn from a Gaussian distribution.

server.py

import models, torch

class Server(object):
	def __init__(self, conf, eval_dataset):

		self.conf = conf
		self.global_model = models.get_model(self.conf["model_name"])
		self.eval_loader = torch.utils.data.DataLoader(eval_dataset, batch_size=self.conf["batch_size"], shuffle=True)

	def model_aggregate(self, weight_accumulator):
		for name, data in self.global_model.state_dict().items():
			# Scale the accumulated client updates by the aggregation coefficient lambda
			update_per_layer = weight_accumulator[name] * self.conf["lambda"]
			if self.conf['dp']:
				# Add Gaussian noise N(0, sigma^2) to the aggregated update
				sigma = self.conf['sigma']
				if torch.cuda.is_available():
					noise = torch.cuda.FloatTensor(update_per_layer.shape).normal_(0, sigma)
				else:
					noise = torch.FloatTensor(update_per_layer.shape).normal_(0, sigma)
				update_per_layer.add_(noise)

			# Integer buffers (e.g. num_batches_tracked) need an explicit cast before add_
			if data.type() != update_per_layer.type():
				data.add_(update_per_layer.to(torch.int64))
			else:
				data.add_(update_per_layer)

	def model_eval(self):
		# Evaluate the global model on the held-out set, returning accuracy and mean loss
		self.global_model.eval()
		total_loss = 0.0
		correct = 0
		dataset_size = 0
		for batch_id, batch in enumerate(self.eval_loader):
			data, target = batch
			dataset_size += data.size()[0]

			if torch.cuda.is_available():
				data = data.cuda()
				target = target.cuda()
			output = self.global_model(data)

			# Sum (rather than average) the batch losses so we divide once at the end
			total_loss += torch.nn.functional.cross_entropy(output, target,
											  reduction='sum').item()
			pred = output.data.max(1)[1]
			correct += pred.eq(target.data.view_as(pred)).cpu().sum().item()

		acc = 100.0 * (float(correct) / float(dataset_size))
		total_l = total_loss / dataset_size

		return acc, total_l
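For completeness, here is a sketch of a driving loop that wires the two classes together for one training run. The Client and Server names match the classes above, but datasets.get_dataset and the exact module layout are assumptions about the surrounding repository:

import json, random, torch
import models, datasets
from client import Client
from server import Server

with open("conf.json") as f:
	conf = json.load(f)

# datasets.get_dataset is assumed to return (train_set, eval_set) for conf["type"]
train_datasets, eval_datasets = datasets.get_dataset("./data/", conf["type"])
server = Server(conf, eval_datasets)
clients = [Client(conf, server.global_model, train_datasets, c) for c in range(conf["no_models"])]

for e in range(conf["global_epochs"]):
	# Pick k clients for this round and accumulate their model updates
	candidates = random.sample(clients, conf["k"])
	weight_accumulator = {name: torch.zeros_like(params)
						  for name, params in server.global_model.state_dict().items()}
	for c in candidates:
		diff = c.local_train(server.global_model)
		for name in weight_accumulator:
			weight_accumulator[name].add_(diff[name])
	server.model_aggregate(weight_accumulator)
	acc, loss = server.model_eval()
	print("Epoch %d, acc: %f, loss: %f" % (e, acc, loss))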
