Improving Symbolic Data Visualization for Pattern Recognition and Knowledge Discovery

VISINF


Article link: https://blog.csdn.net/VISINF/article/details/103988180


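Symbolic data is usually aggregated from large data sets; it hides entry-specific details and turns huge amounts of data (such as big data) into analyzable quantities. It offers an overview where general trends matter more than individual details. Symbolic data comes in many forms, such as intervals, histograms, categories, and modal multi-valued objects, and it can also be regarded as a distribution. Currently, the de facto visualization approach for symbolic data is zoomstars, which has many limitations. The biggest is that the default distributions (histograms) are not supported in 2D, because an additional dimension is required.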
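To make the aggregation step concrete, here is a minimal sketch of how raw records might be condensed into three of the symbolic forms mentioned above: an interval, a histogram, and a modal (category-frequency) object. The function names, bin edges, and sample values are illustrative assumptions and do not come from the paper.

```python
from collections import Counter


def to_interval(values):
    """Summarize numeric values as an interval (min, max)."""
    return (min(values), max(values))


def to_histogram(values, edges):
    """Relative frequencies over bins [edges[i], edges[i+1]).

    The last bin is closed on the right so the maximum edge is counted.
    Values outside the edges are ignored.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    if total == 0:
        raise ValueError("no value falls inside the given bin edges")
    return [c / total for c in counts]


def to_modal(labels):
    """Map each category to its relative frequency."""
    counts = Counter(labels)
    return {label: c / len(labels) for label, c in counts.items()}


# Hypothetical raw records for one aggregated object (e.g. one region).
ages = [20, 25, 30, 35, 40]
area_types = ["urban", "urban", "rural", "rural"]

print(to_interval(ages))                 # (20, 40)
print(to_histogram(ages, [20, 30, 40]))  # [0.4, 0.6]
print(to_modal(area_types))              # {'urban': 0.5, 'rural': 0.5}
```

Each aggregated object then carries one such summary per feature instead of the individual records, which is what makes the data compact but harder to plot.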
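This paper examines the visualization of symbolic data and the challenges arising from its complex structure. It proposes several improvements to zoomstars that enable histograms to be visualized in 2D using a quantile or an equivalent interval approach. It also proposes several improvements for categorical and modal variables so that the presented categories are indicated more clearly.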
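As a rough illustration of the quantile idea, the sketch below reduces a histogram to a fixed number of quantile values, which can then be drawn along a single axis in 2D. It assumes values are uniformly distributed within each bin (a piecewise-linear cumulative distribution); this is a common simplification and not necessarily the exact procedure used in the paper. The function name and the parameter `m` are hypothetical.

```python
def histogram_quantiles(edges, probs, m=4):
    """Return the m + 1 quantile values at levels 0, 1/m, ..., 1.

    edges: bin boundaries, len(edges) == len(probs) + 1
    probs: relative frequency of each bin
    Assumes a uniform distribution inside every bin.
    """
    # Cumulative probability at each bin boundary.
    cum = [0.0]
    for p in probs:
        cum.append(cum[-1] + p)
    total = cum[-1]

    quantiles = []
    for k in range(m + 1):
        target = total * k / m
        for i, p in enumerate(probs):
            # Skip empty bins; pick the first bin whose cumulative
            # probability reaches the target level.
            if p > 0 and target <= cum[i + 1]:
                frac = (target - cum[i]) / p
                quantiles.append(edges[i] + frac * (edges[i + 1] - edges[i]))
                break
        else:
            # Guard against floating-point overshoot at the top level.
            quantiles.append(edges[-1])
    return quantiles


# Two equally likely bins, [0, 10) and [10, 20], split into quartiles.
print(histogram_quantiles([0, 10, 20], [0.5, 0.5], m=4))
# [0.0, 5.0, 10.0, 15.0, 20.0]
```

With `m = 4` the histogram becomes five numbers (minimum, quartiles, maximum), so each feature needs only one axis of a star-style plot instead of an extra dimension.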

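Depending on the data type and the desired goal, the paper recommends different zoomstar-based visualization approaches. It also proposes shape encoding, an approach that visualizes the whole data set in a comprehensive table-like graph. The visualizations and their usefulness are verified on three symbolic data sets in the exploratory data mining phase, where they are used to identify trends, similar objects, and important features, and to detect outliers and discrepancies in the data.
Example of zoomstars using SODAS software: (a) 2D (with opened distribution window for feature “Urbancity”) and (b) 3D zoomstars of environment data.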

Keywords:
Data visualization, symbolic data, Zoomstar, shape encoding, exploratory data analysis

Full paper information

Improving symbolic data visualization for pattern recognition and knowledge discovery

BY: Kadri Umbleja, Manabu Ichino, Hiroyuki Yaguchi

Abstract:
This paper examines the visualization of symbolic data and considers the challenges arising from its complex structure. Symbolic data is usually aggregated from large data sets and used to hide entry-specific details and to transform huge amounts of data (like big data) into analyzable quantities. It is also used to offer an overview in places where general trends are more important than individual details. Symbolic data comes in many forms, like intervals, histograms, categories and modal multi-valued objects. Symbolic data can also be considered as a distribution. Currently, the de facto visualization approach for symbolic data is zoomstars, which has many limitations. The biggest limitation is that the default distributions (histograms) are not supported in 2D, as an additional dimension is required. This paper proposes several new improvements for zoomstars which would enable it to visualize histograms in 2D by using a quantile or an equivalent interval approach. In addition, several improvements for categorical and modal variables are proposed for a clearer indication of presented categories. Recommendations for different approaches to zoomstars are offered depending on the data type and the desired goal. Furthermore, an alternative approach that allows visualizing the whole data set in a comprehensive table-like graph, called shape encoding, is proposed. These visualizations and their usefulness are verified with three symbolic data sets in the exploratory data mining phase to identify trends, similar objects and important features, and to detect outliers and discrepancies in the data.

Keywords: Data visualization, Symbolic data, Zoomstar, Shape encoding, Exploratory data analysis

Link: https://www.sciencedirect.com/science/article/pii/S2468502X19300014

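Journal information:
Visual Informatics (Chinese title: 《可视信息学》) is an international, open-access academic journal sponsored by Zhejiang University, jointly published by Zhejiang University Press and Elsevier, and distributed online. The journal focuses on the modeling, analysis, synthesis, augmentation, and natural interaction of visual information oriented toward human perception. The editors-in-chief are Prof. Kun Zhou (周昆) and Prof. Hans-Peter Seidel.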
Elsevier link (including First Online Articles): https://www.journals.elsevier.com/visual-informatics
Submit your paper:
https://www.editorialmanager.com/VISINF/default.aspx
Tel:(86-571)88206681-519
E-mail: lujinzhi@cad.zju.edu.cn
LinkedIn: Visual Informatics