[Paper Reading 1] DIFUSCO: Graph-based Diffusion Solvers for Combinatorial Optimization

Summary: this paper introduces a new model, DIFUSCO. In the paper the model is applied to two problems, TSP and MIS. The core idea is a diffusion model: a forward process gradually adds noise to a solution, and a reverse process then removes the noise. The denoiser is a 12-layer Anisotropic Graph Neural Network with a hidden dimension of 256, trained with supervision.

The reported results are very strong.

Code is available here.

A more detailed walkthrough follows:


The Introduction covers three families of approaches to NP-complete problems: autoregressive constructive, non-autoregressive constructive, and improvement-heuristics methods.

It also lists the drawbacks of each:

Autoregressive: those methods typically suffer from the costly computation in their sequential decoding parts and hence are difficult to scale up to large problems.

Non-autoregressive: this unavoidably limits the capability of those methods to capture the multimodal nature of the problems, for example, when multiple optimal solutions exist for the same graph.

Improvement heuristics: these have also suffered from the difficulty in scaling up and the latency in inference, partly due to the sparse rewards and sample-efficiency issues of RL training.

The paper then argues that DIFUSCO improves on all three:

Firstly, DIFUSCO can perform inference on all variables in parallel with a few (≪ N ) denoising steps (Sec. 3.3), avoiding the sequential generation problem of autoregressive constructive solvers.

Secondly, DIFUSCO can model a multimodal distribution via iterative refinements, which alleviates the expressiveness limitation of previous non-autoregressive constructive models.

Last but not least, DIFUSCO is trained in an efficient and stable manner with supervised denoising (Sec. 3.2), which solves the training scalability issue of RL-based improvement heuristics methods. 

The main approach is graph-based diffusion (it can be paired with both "constructive heuristics solvers" and "improvement heuristics solvers"). The model comes in two variants: continuous diffusion with Gaussian noise and discrete diffusion with Bernoulli noise; the discrete variant works somewhat better.

Related work: mainly Diffusion Models for Discrete Data, covering typical (continuous) diffusion models, another line of work that applies diffusion models to discrete data, and discrete diffusion models.

Model design:

Forward (noising) process:
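For reference, here is the standard D3PM-style formulation that the Bernoulli-noise (discrete) variant follows; treat it as a sketch rather than the paper's exact notation. With each binary solution variable encoded as a one-hot vector, the forward process multiplies by a noise transition matrix at every step:

$$
q(\mathbf{x}_t \mid \mathbf{x}_{t-1}) = \mathrm{Cat}\!\left(\mathbf{x}_t;\ \mathbf{p} = \tilde{\mathbf{x}}_{t-1}\mathbf{Q}_t\right),
\qquad
\mathbf{Q}_t = \begin{pmatrix} 1-\beta_t & \beta_t \\ \beta_t & 1-\beta_t \end{pmatrix},
$$

$$
q(\mathbf{x}_t \mid \mathbf{x}_0) = \mathrm{Cat}\!\left(\mathbf{x}_t;\ \mathbf{p} = \tilde{\mathbf{x}}_0\,\overline{\mathbf{Q}}_t\right),
\qquad
\overline{\mathbf{Q}}_t = \mathbf{Q}_1\mathbf{Q}_2\cdots\mathbf{Q}_t,
$$

so each binary variable is independently flipped with probability $\beta_t$ at step $t$, and as $t \to T$ the marginal approaches a uniform Bernoulli(1/2) distribution.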

Reverse (denoising) process:
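Again as a hedged sketch of the standard discrete-diffusion parameterization that the paper builds on: the GNN takes the noisy graph $\mathbf{x}_t$ (plus timestep and node coordinates) and predicts the clean solution $\tilde{\mathbf{x}}_0$; the reverse transition then marginalizes the closed-form posterior of the forward process:

$$
p_\theta(\mathbf{x}_{t-1} \mid \mathbf{x}_t)
= \sum_{\tilde{\mathbf{x}}_0} q(\mathbf{x}_{t-1} \mid \mathbf{x}_t, \tilde{\mathbf{x}}_0)\; p_\theta(\tilde{\mathbf{x}}_0 \mid \mathbf{x}_t).
$$

This is also why training can be done with supervised denoising: given a reference optimal solution $\mathbf{x}_0^*$, one samples $\mathbf{x}_t \sim q(\mathbf{x}_t \mid \mathbf{x}_0^*)$ and trains $p_\theta(\tilde{\mathbf{x}}_0 \mid \mathbf{x}_t)$ with an (essentially cross-entropy) denoising loss.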

Ways to speed up inference:

1. One way to speed up the inference of denoising diffusion models is to reduce the number of steps.

2. Run the reverse process only on a subset of the timesteps (a fast-sampling sub-sequence of the full schedule).

PS: the cosine schedule for picking that subset works a bit better than the linear one (see the sketch below).
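A minimal sketch of what such a sub-sequence selection might look like (my own illustration in Python; the exact spacing formula in the paper may differ). With a cosine spacing the kept timesteps are denser near $t = T$ and coarser near $t = 0$, whereas a linear spacing spreads them evenly:

```python
import math

def inference_schedule(T: int, M: int, kind: str = "cosine") -> list[int]:
    """Pick M of the T diffusion timesteps to denoise at inference time.

    Illustrative sketch only; the paper's exact spacing formula may differ.
    """
    if kind == "linear":
        # evenly spaced timesteps from T down toward 0
        taus = [round(T * (1 - i / M)) for i in range(M)]
    else:
        # cosine spacing: kept steps are denser near t = T, coarser near t = 0
        taus = [round(T * math.cos(math.pi / 2 * i / M)) for i in range(M)]
    # drop zeros and duplicates while preserving the decreasing order
    seen, out = set(), []
    for t in taus:
        if t > 0 and t not in seen:
            seen.add(t)
            out.append(t)
    return out

# e.g. inference_schedule(1000, 10) -> [1000, 988, 951, 891, 809, 707, 588, 454, 309, 156]
```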

The paper then describes the network architecture (the anisotropic GNN used as the denoiser).
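To make "anisotropic GNN" concrete, here is a minimal PyTorch sketch of one edge-gated layer in the spirit of the architecture DIFUSCO builds on (12 such layers, width 256). The layer structure, normalization, and activation choices are my assumptions for illustration, and the timestep embedding that conditions each layer is omitted:

```python
import torch
import torch.nn as nn

class AnisotropicGNNLayer(nn.Module):
    """One edge-gated ("anisotropic") graph convolution layer, as an illustrative sketch.

    DIFUSCO stacks 12 layers of this kind with hidden width 256; the exact update
    rules, normalization, and timestep conditioning in the paper may differ.
    """

    def __init__(self, d: int = 256):
        super().__init__()
        self.U = nn.Linear(d, d)   # self node update
        self.V = nn.Linear(d, d)   # neighbor message
        self.A = nn.Linear(d, d)   # edge self term
        self.B = nn.Linear(d, d)   # edge source-node term
        self.C = nn.Linear(d, d)   # edge target-node term
        self.norm_h = nn.LayerNorm(d)
        self.norm_e = nn.LayerNorm(d)

    def forward(self, h: torch.Tensor, e: torch.Tensor):
        # h: (n, d) node features, e: (n, n, d) dense edge features
        e_upd = self.A(e) + self.B(h)[:, None, :] + self.C(h)[None, :, :]
        gate = torch.sigmoid(e_upd)                        # per-edge anisotropic gates
        msg = gate * self.V(h)[None, :, :]                 # gated neighbor messages
        agg = msg.sum(dim=1) / (gate.sum(dim=1) + 1e-6)    # normalized aggregation
        h = h + torch.relu(self.norm_h(self.U(h) + agg))   # residual node update
        e = e + torch.relu(self.norm_e(e_upd))             # residual edge update
        return h, e
```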

Next come the Decoding Strategies.

To turn the generated heatmap for an instance into a concrete solution, plain Bernoulli sampling would discard the comparative information between variables; to retain that information:

TSP Decoding:

1. Greedy decoding: all edges are ranked by (Aij + Aji)/∥ci − cj∥ and inserted greedily into the partial tour, followed by 2-opt refinement (see the sketch after this list).

2. MCTS: k-opt transformation moves are sampled, guided by the heatmap scores (Aij), to iteratively improve the solution.
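An illustrative sketch of strategy 1 (my own Python, not the authors' code): rank edges by the symmetrized heatmap score divided by edge length, then insert them greedily while keeping node degrees at most two and avoiding premature subtours; a 2-opt local search would normally be run on the resulting tour afterwards.

```python
import numpy as np

def greedy_tsp_decode(A: np.ndarray, coords: np.ndarray) -> list[tuple[int, int]]:
    """Greedy edge insertion from a heatmap A (illustrative sketch).

    Edges are ranked by (A[i, j] + A[j, i]) / ||c_i - c_j|| and inserted greedily,
    skipping any edge that would give a node degree > 2 or close a premature subtour.
    2-opt refinement of the resulting tour is omitted here.
    """
    n = len(coords)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    score = (A + A.T) / (dist + 1e-9)
    # candidate edges sorted by descending score
    edges = sorted(((i, j) for i in range(n) for j in range(i + 1, n)),
                   key=lambda ij: -score[ij])
    degree = np.zeros(n, dtype=int)
    parent = list(range(n))                      # union-find to detect subtours

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tour_edges = []
    for i, j in edges:
        if degree[i] >= 2 or degree[j] >= 2:
            continue
        ri, rj = find(i), find(j)
        if ri == rj and len(tour_edges) < n - 1:  # would close a subtour too early
            continue
        parent[ri] = rj
        degree[i] += 1
        degree[j] += 1
        tour_edges.append((i, j))
        if len(tour_edges) == n:                  # final edge closes the full tour
            break
    return tour_edges
```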

Finally, the paper reports the experimental results.
