【Adversarial Attack Series_PGD_2018_ICLR】Towards Deep Learning Models Resistant to Adversarial Attacks

A PyTorch implementation of the PGD attack: starting from an (optionally random) point inside the eps-ball around the input, repeatedly take a signed gradient step of size iter_eps and project the perturbation back into the ball.

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F


def clip_perturbation(perturbation, ord, eps):
    # Project the perturbation back into the eps-ball of the chosen norm.
    if ord == np.inf:
        return np.clip(perturbation, -eps, eps)
    elif ord == 2:
        # Rescale each sample whose L2 norm exceeds eps.
        flat = perturbation.reshape(perturbation.shape[0], -1)
        norms = np.linalg.norm(flat, axis=1, keepdims=True)
        factor = np.minimum(1.0, eps / (norms + 1e-12))
        return (flat * factor).reshape(perturbation.shape)
    else:
        raise ValueError("only ord=np.inf or ord=2 is supported")


class PGD(nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model  # must be a PyTorch model
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    def generate(self, x, **params):
        self.parse_params(**params)
        labels = self.y  # labels must be passed via y=...
        adv_x = self.attack(x, labels)
        return adv_x

    def parse_params(self, eps=0.3, iter_eps=0.01, nb_iter=40, clip_min=0.0, clip_max=1.0, C=0.0,
                     y=None, ord=np.inf, rand_init=True, flag_target=False):
        self.eps = eps              # radius of the perturbation ball
        self.iter_eps = iter_eps    # step size per iteration
        self.nb_iter = nb_iter      # number of PGD iterations
        self.clip_min = clip_min
        self.clip_max = clip_max
        self.y = y                  # true labels (or target labels if flag_target)
        self.ord = ord              # norm of the ball (np.inf or 2)
        self.rand_init = rand_init  # random start inside the eps-ball
        self.model.to(self.device)
        self.flag_target = flag_target
        self.C = C                  # margin for the optional CW-style loss

    def single_step_attack(self, x, perturbation, labels):
        # One gradient step on the adversarial example, then projection back into the eps-ball.
        adv_x = (x + perturbation).detach()
        adv_x.requires_grad = True
        loss_func = nn.CrossEntropyLoss()
        preds = self.model(adv_x)
        if self.flag_target:
            # Targeted attack: move towards the target labels.
            loss = -loss_func(preds, labels)
        else:
            # Untargeted attack: move away from the true labels.
            loss = loss_func(preds, labels)
            # Alternative CW-style margin loss:
            # label_mask = F.one_hot(labels, preds.shape[1]).float()
            # correct_logit = torch.mean(torch.sum(label_mask * preds, dim=1))
            # wrong_logit = torch.mean(torch.max((1 - label_mask) * preds, dim=1)[0])
            # loss = -F.relu(correct_logit - wrong_logit + self.C)

        self.model.zero_grad()
        loss.backward()
        grad = adv_x.grad.data
        # Take a signed gradient step of size iter_eps.
        new_adv_x = adv_x.detach().cpu().numpy() + self.iter_eps * torch.sign(grad).cpu().numpy()
        x_np = x.detach().cpu().numpy()

        # Keep the adversarial example in the valid data range, then project onto the eps-ball.
        perturbation = np.clip(new_adv_x, self.clip_min, self.clip_max) - x_np
        perturbation = clip_perturbation(perturbation, self.ord, self.eps)
        return perturbation

    def attack(self, x, labels):
        x = x.to(self.device)
        labels = labels.to(self.device)
        if self.rand_init:
            # Random start inside the eps-ball around x.
            perturbation = torch.empty_like(x).uniform_(-self.eps, self.eps)
        else:
            perturbation = torch.zeros_like(x)
        for i in range(self.nb_iter):
            perturbation = self.single_step_attack(x, perturbation=perturbation, labels=labels)
            perturbation = torch.from_numpy(perturbation).type_as(x).to(self.device)
        adv_x = x + perturbation
        adv_x = adv_x.detach().cpu().numpy()
        # Final clip to the valid data range.
        adv_x = np.clip(adv_x, self.clip_min, self.clip_max)
        return adv_x
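
A minimal usage sketch, assuming a trained PyTorch classifier `model` and one batch `(images, labels)` whose pixel values are already scaled to [0, 1]; these names and the eps/step settings are illustrative, not values from the original post:

import torch

# `model`, `images`, and `labels` are assumed to exist; eps = 8/255 and
# iter_eps = 2/255 are common L_inf settings chosen only for illustration.
attacker = PGD(model)
adv_images = attacker.generate(images, y=labels,
                               eps=8 / 255, iter_eps=2 / 255, nb_iter=40, rand_init=True)

# generate() returns a NumPy array; convert back to a tensor for evaluation.
adv_images = torch.from_numpy(adv_images).type_as(images)
device = next(model.parameters()).device
with torch.no_grad():
    adv_preds = model(adv_images.to(device)).argmax(dim=1)
robust_acc = (adv_preds.cpu() == labels.cpu()).float().mean().item()
print("accuracy under PGD:", robust_acc)

With rand_init=True this corresponds to the random-start PGD used for evaluation in the paper; the attack budget itself is a free choice per dataset.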
Adversarial attacks are a major concern in deep learning because they can cause misclassification and undermine the reliability of deep learning models. In recent years, researchers have proposed several techniques to improve the robustness of deep learning models against adversarial attacks, including:

1. Adversarial training: generate adversarial examples during training and use them to augment the training data, so the model learns to be more robust to adversarial attacks (see the training-loop sketch after this list).
2. Defensive distillation: train a second model to mimic the behavior of the original model and use it to make predictions, making it harder for an adversary to craft adversarial examples that fool the model.
3. Feature squeezing: reduce the input data to a lower dimensionality, making it more difficult for an adversary to generate adversarial examples.
4. Gradient masking: add noise to the gradients during training to prevent an adversary from estimating the gradients accurately and generating adversarial examples.
5. Adversarial detection: train a separate model to detect adversarial examples and reject them before they can be used to fool the main model.
6. Model compression: reduce the complexity of the model, making it more difficult for an adversary to generate adversarial examples.

In conclusion, improving the robustness of deep learning models against adversarial attacks is an active area of research, and new techniques and approaches continue to be developed.
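
As a concrete illustration of point 1, here is a minimal sketch of one epoch of PGD adversarial training built on the PGD class above; `model`, `train_loader`, `optimizer`, and the attack budget are assumptions made for the example, not values given in the text:

import torch
import torch.nn as nn


def adversarial_train_epoch(model, train_loader, optimizer):
    # One epoch of adversarial training: replace each clean batch with
    # PGD adversarial examples crafted against the current parameters
    # (the inner-max / outer-min formulation of Madry et al.).
    loss_func = nn.CrossEntropyLoss()
    attacker = PGD(model)        # the attack class defined earlier
    device = attacker.device     # PGD moves the model to this device
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)

        # Craft adversarial examples (eval mode keeps BatchNorm statistics fixed).
        model.eval()
        adv_images = attacker.generate(images, y=labels,
                                       eps=8 / 255, iter_eps=2 / 255, nb_iter=7)
        adv_images = torch.from_numpy(adv_images).type_as(images).to(device)

        # Standard training step on the adversarial batch.
        model.train()
        optimizer.zero_grad()
        loss = loss_func(model(adv_images), labels)
        loss.backward()
        optimizer.step()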
