Relational Deep Learning: Graph Representation Learning on Relational Databases


Paper source
https://relbench.stanford.edu/paper.pdf
https://github.com/snap-stanford/relbench

Abstract

Background: Much of the world’s most valued data is stored in data warehouses, where the data is spread across many tables connected by primary-foreign key relations.
Problem: The core problem is that no machine learning method is capable of learning directly on the data spread across multiple relational tables.
Method: Here we introduce an end-to-end deep representation learning approach to directly learn on data spread across multiple tables. We name our approach Relational Deep Learning. The core idea is to view relational tables as a heterogeneous graph, with a node for each row in each table, and edges specified by primary-foreign key relations. Message Passing Neural Networks can then automatically learn across multiple tables to extract representations that leverage all input data, without any manual feature engineering.
Results: To facilitate research, we also develop RELBENCH, a set of benchmark datasets and an implementation of Relational Deep Learning. Overall, we define a new research area that generalizes graph machine learning and broadens its applicability to a wide set of AI use cases.

1、Introduction

Many predictive problems over relational data have significant implications for human decision making. However, existing learning paradigms, notably tabular learning, cannot be directly applied to interlinked relational tables.

There are several issues with feature engineering: (1) it is a manual, slow and labor-intensive process; (2) feature choices are likely highly suboptimal; (3) only a small fraction of the overall space of possible features can be manually explored; (4) by forcing data into a single table, information is aggregated into lower-granularity features, thus losing out on valuable fine-grained signal; (5) whenever data distribution changes or drifts, current features become obsolete and new features have to be manually reinvented.

The core of our approach is to represent relational tables as a heterogeneous Relational Entity Graph, where each row defines a node, columns define node features, and primary-foreign key relations define edges.
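The mapping from tables to a graph can be illustrated with a minimal sketch. The toy `customers` and `orders` tables, the `foreign_keys` schema dictionary, and the helper `build_entity_graph` are all hypothetical names invented for this example; they are not part of the paper or of RELBENCH.

```python
from collections import defaultdict

# Hypothetical toy database: each table maps a primary key to a row.
tables = {
    "customers": {1: {"name": "Ann"}, 2: {"name": "Bob"}},
    "orders": {
        10: {"customer_id": 1, "amount": 25.0},
        11: {"customer_id": 1, "amount": 40.0},
        12: {"customer_id": 2, "amount": 15.0},
    },
}

# Hypothetical schema: (child table, foreign-key column) -> parent table.
foreign_keys = {("orders", "customer_id"): "customers"}


def build_entity_graph(tables, foreign_keys):
    """Return (nodes, edges) of a heterogeneous graph over the tables."""
    fk_columns = set(foreign_keys)

    # One node per row, identified by (table name, primary key).
    # Node features are the row's columns, excluding foreign-key columns.
    nodes = {}
    for table, rows in tables.items():
        for pk, row in rows.items():
            nodes[(table, pk)] = {
                col: val for col, val in row.items()
                if (table, col) not in fk_columns
            }

    # One edge per primary-foreign key link, grouped by edge type.
    edges = defaultdict(list)
    for (child, column), parent in foreign_keys.items():
        edge_type = (child, column, parent)
        for pk, row in tables[child].items():
            edges[edge_type].append(((child, pk), (parent, row[column])))
    return nodes, dict(edges)


nodes, edges = build_entity_graph(tables, foreign_keys)
```

Here the node type is the table name and the edge type is the (child table, key column, parent table) triple, which is one simple way to keep the graph heterogeneous.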

Graph Neural Networks (GNNs) can then be applied to build end-to-end predictive models.

2、Predictive Tasks on Relational Tables

Thus, when the model is trained on a training example that was sampled from a specific time t in the past, it is of utmost importance to ensure that the model only sees the state of the database as it was before that time t.
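A minimal sketch of this temporal constraint, assuming every row carries a timestamp column: before building a training example for time t, drop all rows created at or after t. The `orders` rows, the `time` column name, the helper `snapshot_before`, and the choice of a strict comparison are assumptions made for illustration.

```python
from datetime import date

# Hypothetical rows, each with a creation timestamp.
orders = [
    {"order_id": 10, "customer_id": 1, "time": date(2023, 1, 5)},
    {"order_id": 11, "customer_id": 1, "time": date(2023, 3, 9)},
    {"order_id": 12, "customer_id": 2, "time": date(2023, 6, 1)},
]


def snapshot_before(rows, t, time_column="time"):
    """Keep only rows whose timestamp is strictly earlier than t."""
    return [row for row in rows if row[time_column] < t]


# A training example sampled at t = 2023-03-09 may only use earlier rows.
visible = snapshot_before(orders, date(2023, 3, 9))
```

Filtering this way keeps information from the future out of the model's input, which would otherwise leak the label.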

3、Predictive Tasks as Graph Representation Learning Problems

Here, we formulate a generic machine learning architecture based on Graph Neural Networks, which solves predictive tasks on relational databases.
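To give a feel for what a GNN layer computes, here is a rough, untrained sketch of one round of mean-aggregation message passing in plain Python. It is not the paper's architecture: real GNN layers apply learned weights and nonlinearities, typically per node and edge type. The function `message_passing_step` and the toy node names are hypothetical.

```python
def message_passing_step(features, edges):
    """One round of mean-aggregation message passing.

    features: dict mapping node -> list of floats
    edges: list of (source, target) pairs; messages flow source -> target
    """
    # Collect the feature vectors sent to each node.
    incoming = {node: [] for node in features}
    for source, target in edges:
        incoming[target].append(features[source])

    updated = {}
    for node, own in features.items():
        messages = incoming[node]
        if messages:
            # Element-wise mean over all incoming messages.
            mean = [sum(vals) / len(messages) for vals in zip(*messages)]
        else:
            mean = [0.0] * len(own)
        # Untrained update: own features plus the neighbor mean.
        updated[node] = [a + b for a, b in zip(own, mean)]
    return updated


features = {"customer_1": [1.0], "order_10": [25.0], "order_11": [40.0]}
edges = [("order_10", "customer_1"), ("order_11", "customer_1")]
updated = message_passing_step(features, edges)
```

After one step, the customer node carries a summary of its orders. Stacking several such steps lets information travel across multiple tables.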

4、RELBENCH: A Benchmark for Relational Deep Learning

RELBENCH enables training and evaluation of machine learning models on relational data. RELBENCH supports deep learning framework agnostic data loading, task specification, standardized data splitting, and transforming data into graph format. RELBENCH provides standardized evaluation metric computations, and a leaderboard for tracking progress. We additionally provide example training scripts built using PyTorch Geometric and PyTorch Frame.

The goal of RELBENCH is to facilitate scalable, robust, and reproducible machine learning research on relational tables.
