A Survey of Graph Anomaly Detection Based on Machine Learning and Deep Learning

Latest recommended article published on 2025-03-09 17:00:33

AI蜗牛车


Tags: algorithms, machine learning, artificial intelligence, python, deep learning

Original link: https://mp.weixin.qq.com/s?__biz=MzA4ODUxNjUzMQ==&mid=2247496506&idx=1&sn=cb5ffff91f9288e3497a0610666a2f19&chksm=902a41e6a75dc8f04775fc7b49c5826c73a9ffb90a66ec3cf4e1534a4a2957215a6d9b5d74ef&scene=126&&sessionid=0

Background

A graph is widely used to model structural or relational data: nodes (vertices) represent entities, and edges represent the relationships between them.

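Anomaly detection means finding the objects in a dataset (anomalies, or outliers) that differ from the majority. The distribution and generating mechanism of these objects differ from those of the rest.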

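Graph anomaly detection covers two kinds of problems:

Object-level: identifying anomalous graph objects within a single graph, such as anomalous nodes, edges, or subgraphs (node/edge/sub-graph-level anomalies);
Graph-level: identifying anomalous graphs within a collection or sequence of graphs.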
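To make the two problem levels concrete, here is a minimal sketch using only the Python standard library. It is not a method from the survey: the degree-based and edge-count-based z-scores are simple stand-in baselines, and the function names (`zscores`, `node_level_scores`, `graph_level_scores`) are hypothetical, chosen only for illustration.

```python
from statistics import mean, pstdev


def zscores(values):
    """Return the absolute z-score of each value (all zeros if there is no spread)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [abs(v - mu) / sigma for v in values]


def node_level_scores(edges):
    """Object-level: score every node of ONE graph by how unusual its degree is."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    nodes = sorted(degree)
    return dict(zip(nodes, zscores([degree[n] for n in nodes])))


def graph_level_scores(graphs):
    """Graph-level: score every graph in a COLLECTION by how unusual its edge count is."""
    return zscores([len(edges) for edges in graphs])


# Object-level toy example: a star graph whose hub (node 0) has an unusual degree.
star = [(0, i) for i in range(1, 6)]
node_scores = node_level_scores(star)
most_anomalous_node = max(node_scores, key=node_scores.get)

# Graph-level toy example: three sparse graphs and one complete graph on 6 nodes.
complete = [(a, b) for a in range(6) for b in range(a + 1, 6)]
graphs = [[(0, 1), (1, 2)], [(0, 1), (1, 2)], [(0, 1), (1, 2), (2, 3)], complete]
graph_scores = graph_level_scores(graphs)
most_anomalous_graph = max(range(len(graphs)), key=graph_scores.__getitem__)

print(most_anomalous_node, most_anomalous_graph)
```

The only difference between the two functions is what gets scored: elements inside one graph, or whole graphs within a set. Real detectors replace the hand-picked statistic with learned representations.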

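The original post illustrates the difference between traditional anomaly detection and graph anomaly detection with a figure. Traditional methods do not scale efficiently to large graph learning tasks and struggle to capture the relationships between nodes. As a result, deep-learning-based graph anomaly detection methods have gained ground, driven in particular by the surge of interest in graph neural networks (GNNs).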

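Deep-learning-based graph anomaly detection nevertheless faces many open challenges:

How can an anomaly-aware objective function be designed so that the model learns to separate anomalous objects during training?
How can the anomalies a model detects be explained, especially in traditional industries such as finance?
How can training efficiency be improved and computing resources saved?
How can the hyperparameters of a deep model be tuned when supervision is lacking?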

Anomaly detection on graph data applies across many domains, such as finance, internet security, social-network mining, and telecom fraud detection.

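This article studies and summarizes current deep-learning-based graph anomaly detection algorithms. It mainly follows the outline of the survey paper [^1], simplifying and supplementing it.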

Algorithms and Models

By detection level, graph anomaly detection tasks fall into roughly three categories:

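Within each category, methods can be further divided by graph data type: ① static graphs (plain graphs and attributed graphs) and ② dynamic graphs. 📢 The same model can address problems at different levels.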
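The graph data types named above can be sketched as plain data structures. This is an illustrative assumption rather than a representation taken from the survey: the class names (`PlainGraph`, `AttributedGraph`, `DynamicGraph`) and the choice to model a dynamic graph as a list of timestamped edge events are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PlainGraph:
    """Static plain graph: structure only, stored as a set of (u, v) edges."""
    edges: set


@dataclass
class AttributedGraph(PlainGraph):
    """Static attributed graph: structure plus a feature vector per node."""
    node_features: dict = field(default_factory=dict)


@dataclass
class DynamicGraph:
    """Dynamic graph: edges arrive over time as (timestamp, u, v) events."""
    events: list = field(default_factory=list)

    def snapshot(self, t):
        """Return the static plain graph made of all edges seen up to time t."""
        return PlainGraph({(u, v) for ts, u, v in self.events if ts <= t})


plain = PlainGraph({("a", "b"), ("b", "c")})
attributed = AttributedGraph(
    edges={("a", "b")},
    node_features={"a": [0.1, 3.0], "b": [0.2, 2.5]},
)
dynamic = DynamicGraph([(1, "a", "b"), (2, "b", "c"), (5, "a", "c")])
early = dynamic.snapshot(2)  # contains only the two edges with timestamp <= 2
```

A detector for static plain graphs can only use `edges`, one for attributed graphs can also use `node_features`, and one for dynamic graphs can compare successive snapshots or process the event stream directly.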

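Because the material is long, this series presents it in three parts and closes with a summary. Readers can navigate the series using the categories above.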

Datasets

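Many graph datasets are already open source and can be downloaded from the sources listed in the paper:

Personal collection: a summary of GNN benchmark datasets [^2]
Paper summary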

Learning Resources

awesome-fraud-detection-papers

Open-Source Models

Open-source implementations include, but are not limited to, the following:

Model	Language	Platform	Graph	Code
Sedanspot	C++	-	Dynamic Graph	https://www.github.com/dhivyaeswaran/sedanspot
AnomalyDAE	Python	TensorFlow	Dynamic Attributed Graph	https://github.com/haoyfan/AnomalyDAE
MADAN	Python	-	Static Attributed Graph	https://github.com/leoguti85/MADAN
PAICAN	Python	TensorFlow	Static Attributed Graph	http://www.kdd.in.tum.de/PAICAN/
Changedar	Matlab	-	Dynamic Attributed Graph	https://bhooi.github.io/changedar/
ONE	Python	-	Static Plain Graph	https://github.com/sambaranban/ONE
DONE&AdONE	Python	TensorFlow	Static Attributed Graph	https://bit.ly/35A2xHs
SLICENDICE	Python	-	Static Attributed Graph	http://github.com/hamedn/SliceNDice/
SemiGNN	Python	TensorFlow	Static Attributed Graph	https://github.com/safe-graph/DGFraud
CARE-GNN	Python	PyTorch	Static Attributed Graph	https://github.com/YingtongDou/CARE-GNN
GraphConsis	Python	TensorFlow	Static Attributed Graph	https://github.com/safe-graph/DGFraud
GLOD	Python	PyTorch	Static Attributed Graph	https://github.com/LingxiaoShawn/GLOD-Issues
GCAN	Python	Keras	Heterogeneous Graph	https://github.com/l852888/GCAN
HGATRD	Python	PyTorch	Heterogeneous Graph	https://github.com/201518018629031/HGATRD
GLAN	Python	PyTorch	Heterogeneous Graph	https://github.com/chunyuanY/RumorDetection
ANOMRANK	C++	-	Dynamic Graph	https://github.com/minjiyoon/anomrank
DAGMM	Python	PyTorch	Dynamic Graph	https://github.com/danieltan07/dagmm
OCAN	Python	TensorFlow	Static Graph	https://github.com/PanpanZheng/OCAN
DevNet	Python	TensorFlow	Static Graph	https://github.com/GuansongPang/deviation-network
RDA	Python	TensorFlow	Static Graph	https://github.com/zc8340311/RobustAutoencoder
GAD	Python	TensorFlow	Static Graph	https://github.com/raghavchalapathy/gad
GEM	Python	-	Static Graph	https://github.com/safe-graph/DGFraud/tree/master/algorithms/GEM
MIDAS	C++	-	Dynamic Graph	https://github.com/Stream-AD/MIDAS
DeFrauder	Python	-	Static Graph	https://github.com/LCS2-IIITD/DeFrauder
DeepFD	Python	PyTorch	Bipartite Graph	https://github.com/JiaWu-Repository/DeepFD-pyTorch
STS-NN	Python	PyTorch	Static Graph	https://github.com/JiaWu-Repository/STS-NN
UPFD	Python	PyTorch	Graph Database	https://github.com/safe-graph/GNN-FakeNews
DeepSphere	Python	TensorFlow	Dynamic Graph	https://github.com/picsolab/DeepSphere
OCGIN	Python	PyTorch	Graph Database	https://github.com/LingxiaoShawn/GLOD-Issues
DeepSAD	Python	PyTorch	Non Graph	https://github.com/lukasruff/Deep-SAD-PyTorch
DATE	Python	PyTorch	Non Graph	https://github.com/Roytsai27/Dual-Attentive-Tree-aware-Embedding