(2019) Outlier Detection in Graphs: A Study on the Impact of Multiple Graph Models (Paper Notes)

麦地与诗人

Category: Anomaly Detection

Article link: https://blog.csdn.net/YPP0229/article/details/102739087

1. a single graph representation derived from a given dataset

2. multiple graph models to represent a given database.

The classical approach for detecting outliers in a dataset is to model the data as a single graph and to apply a single outlier detection method, as sketched in Figure 1(a).
[Figure 1(a): a single graph model with a single outlier detection method]
By using this approach, the identification of outliers is biased by the given model and the selected algorithm.

Alternatively, one could use an ensemble approach to apply a set of complementary outlier detection methods on a single graph and combine their results, such that the algorithm bias is reduced. This approach is sketched in Figure 1(b).
[Figure 1(b): a single graph model with an ensemble of outlier detection methods]
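The single-graph ensemble of Figure 1(b) could be sketched roughly as follows. This is only an illustration: the toy graph, the two detectors (`degree_score`, `clustering_score`), and the min-max normalization with mean combination are assumptions made for this example, not the methods used in the paper.

```python
from statistics import mean

# Toy undirected graph as an adjacency dict (hypothetical data).
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "e"},
    "e": {"d"},
}

def degree_score(g):
    """Illustrative detector 1: nodes with fewer neighbors score higher."""
    return {v: 1.0 / (1 + len(nbrs)) for v, nbrs in g.items()}

def clustering_score(g):
    """Illustrative detector 2: nodes whose neighbors are less interconnected score higher."""
    scores = {}
    for v, nbrs in g.items():
        k = len(nbrs)
        if k < 2:
            scores[v] = 1.0  # no neighbor pair exists; treated as maximally unusual
            continue
        # Count edges among the neighbors of v (each unordered pair once).
        links = sum(1 for u in nbrs for w in nbrs if u < w and w in g[u])
        scores[v] = 1.0 - links / (k * (k - 1) / 2)
    return scores

def min_max(scores):
    """Rescale scores to [0, 1] so different detectors are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {v: (s - lo) / span if span else 0.0 for v, s in scores.items()}

def ensemble_single_graph(g, detectors):
    """Run every detector on the same graph and average the normalized scores."""
    normalized = [min_max(d(g)) for d in detectors]
    return {v: mean(n[v] for n in normalized) for v in g}

combined = ensemble_single_graph(graph, [degree_score, clustering_score])
top = max(combined, key=combined.get)
```

Both detectors still see the same edges, so whatever the graph model leaves out stays invisible to the whole ensemble.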

Existing work for outlier detection in graphs follows the methodologies in Figures 1(a) and 1(b). As a consequence, the built-in bias from the graph model selection is not addressed.

Here we propose a new methodology that tackles the reduction of graph model bias towards outlier detection by generating multiple graph models to represent the same data.

The overall workflow for an ensemble method combining outlier detection results from multiple graphs is depicted in Figure 1(c).
First, multiple graph models represent the same dataset, possibly taking different aspects of the dataset into account for deriving different graph models. We assume, though, that the nodes in different graphs represent the same entities. Only their relations change from model to model.

Next, an algorithm to detect (node) outliers in graphs is applied to each graph model.

In the last step, results from the outlier detection on the different graph representations are combined.

Through the ensemble of different graphs modeling the same data, we can expect increased precision and robustness of the outlier detection.
[Figure 1(c): multiple graph models of the same dataset with combined outlier detection results]
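The three steps of the multi-graph workflow can be sketched as below. All names here are hypothetical: the relation names `co_purchase` and `same_region`, the `isolation_score` detector, and the mean combination are placeholders chosen for illustration, not the algorithms evaluated in the paper. The one property taken from the text is that every graph model shares the same node set and only the edges differ.

```python
from statistics import mean

# Same five entities, two graph models with different relations (hypothetical data).
nodes = ["a", "b", "c", "d", "e"]
graph_models = {
    "co_purchase": [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")],
    "same_region": [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d"), ("a", "c")],
}

def to_adjacency(nodes, edges):
    """Step 1: build one graph per model over the shared node set."""
    adj = {v: set() for v in nodes}
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)
    return adj

def isolation_score(adj):
    """Step 2 (illustrative detector): sparsely connected nodes score higher."""
    return {v: 1.0 / (1 + len(nbrs)) for v, nbrs in adj.items()}

def normalize(scores):
    """Rescale scores to [0, 1] so graphs of different density are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {v: (s - lo) / span if span else 0.0 for v, s in scores.items()}

def multi_graph_outliers(nodes, models, detector, combine=mean):
    """Step 3: combine the per-graph scores of each node into one score."""
    per_graph = [normalize(detector(to_adjacency(nodes, e))) for e in models.values()]
    return {v: combine([s[v] for s in per_graph]) for v in nodes}

scores = multi_graph_outliers(nodes, graph_models, isolation_score)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy data, node `e` is weakly connected in one model and isolated in the other, so it ranks first after combination.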

Conclusion

Outlier detection is a subjective and unsupervised task that demands good knowledge and understanding of the data.

Using a single graph model of relation-rich datasets may only model some aspects of the data, thus not making proper use of potential information.

Using multiple graph models may capture more and complementary information.

We therefore suggest, based on our findings, to explore real world data using multiple graph models that are as complementary as possible.

In a practical application, a data analyst is interested in certain entities that lend themselves as a set of nodes in a graph representation while several attributes or inter-relational connections may be represented as edges between nodes. Instead of looking for the one and only, best-ever graph representation of some given raw data, the data analyst should therefore generate multiple graph models describing different aspects of the raw data, capturing a large variety of characteristics, or putting different emphasis on certain characteristics. That is, the graphs may differ both quantitatively (how dense they are) and qualitatively (which relationships are expressed in the graph structure).

These multiple graph models aim to materialize the various perspectives that the analyst wants to highlight, that is, they should cover the problem scenario as well as possible and in as many different ways as suitable.
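One way to obtain graph models that differ both qualitatively and quantitatively is to build k-nearest-neighbor graphs over different attribute subsets and with different values of k. The sketch below assumes numeric attributes and Euclidean distance; the records, the attribute names, and the `knn_graph` helper are hypothetical and are not the construction used in the paper.

```python
import math

# Hypothetical raw data: each entity has two numeric attributes.
records = {
    "a": {"age": 25, "income": 30},
    "b": {"age": 27, "income": 32},
    "c": {"age": 26, "income": 90},
    "d": {"age": 60, "income": 31},
}

def knn_graph(records, attributes, k):
    """Connect each entity to its k nearest neighbors on the chosen attributes.

    Returns a set of undirected edges, each stored as a frozenset of two node ids.
    Ties in distance are broken by node id.
    """
    edges = set()
    for u, ru in records.items():
        point_u = [ru[a] for a in attributes]
        dists = sorted(
            (math.dist(point_u, [rv[a] for a in attributes]), v)
            for v, rv in records.items()
            if v != u
        )
        for _, v in dists[:k]:
            edges.add(frozenset((u, v)))
    return edges

# Same entities, three different graph models.
graph_models = {
    "age_k1": knn_graph(records, ["age"], k=1),              # relation: similar age
    "income_k1": knn_graph(records, ["income"], k=1),        # relation: similar income
    "both_k2": knn_graph(records, ["age", "income"], k=2),   # denser, mixed relation
}
```

Changing the attribute subset changes which relationships the edges express, while changing `k` changes how dense the graph is.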

Clearly, many questions remain open. We focused in this study purely on the aspect of the impact of multiple graph models for a given dataset.

We evaluated this impact using two different outlier detection algorithms, four combination functions, and two similarity measures on synthetic and real world data.
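These notes do not name the four combination functions, so the sketch below uses mean, median, max, and min purely as common examples. It is an assumption for illustration, with hypothetical scores, and shows how the choice of combination function can change which node ranks first.

```python
from statistics import mean, median

# Hypothetical normalized outlier scores of three nodes on three graph models.
per_graph_scores = {
    "a": [0.1, 0.2, 0.95],  # flagged strongly by one graph only
    "b": [0.8, 0.7, 0.9],   # flagged consistently by all graphs
    "c": [0.0, 0.1, 0.2],
}

# Common combination functions (assumed here, not taken from the paper).
combiners = {"mean": mean, "median": median, "max": max, "min": min}

def combine(per_graph_scores, how):
    """Collapse each node's per-graph scores into one score."""
    fn = combiners[how]
    return {v: fn(s) for v, s in per_graph_scores.items()}

def top_outlier(per_graph_scores, how):
    """Return the node with the highest combined score."""
    combined = combine(per_graph_scores, how)
    return max(combined, key=combined.get)
```

With these scores, `top_outlier(per_graph_scores, "max")` picks `a`, which is extreme in a single graph, while `"mean"`, `"median"`, and `"min"` pick `b`, which is unusual in every graph.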

For a practical application, various aspects will have a strong influence on the achievable quality, for example the algorithm used to detect outliers on the individual graphs and the method used to combine the individual results (as we have seen in this evaluation).
However, based on our study, we can maintain the recommendation to consider several different graph representations in any case.
