2021-07-22 DGL Graph Neural Network Open-Source Project Update (Translation)

This is a new major release with various system optimizations, new features and enhancements, new models and bug fixes.

Important change: installation from PyPI now differs from previous releases.

Table of Contents

Important: Change on PyPI Installation

GPU-based Neighbor Sampling

New Tutorials for Multi-GPU and Distributed Training

Improved CPU Message Passing Kernel

More efficient NodeEmbedding for multi-GPU and distributed training

Sparse-sparse Matrix Multiplication and Addition Support

PyTorch Lightning Compatibility

New Models

New Datasets

New Functionalities

Performance Optimizations

Other Enhancements

Bug Fixes


Important: Change on PyPI Installation

DGL pip wheels are no longer shipped on PyPI. Use the following command to install DGL with pip:

New installation commands:

  • pip install dgl -f https://data.dgl.ai/wheels/repo.html for CPU.
  • pip install dgl-cuXX -f https://data.dgl.ai/wheels/repo.html for CUDA.
  • pip install --pre dgl -f https://data.dgl.ai/wheels-test/repo.html for CPU nightly builds.
  • pip install --pre dgl-cuXX -f https://data.dgl.ai/wheels-test/repo.html for CUDA nightly builds.
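
Here "XX" stands for the CUDA version; for example, assuming a cu111 wheel is published for your platform:

    pip install dgl-cu111 -f https://data.dgl.ai/wheels/repo.html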

This does not impact conda installation.

GPU-based Neighbor Sampling

DGL now supports uniform neighbor sampling and MFG conversion on GPU, contributed by @nv-dlasalle from NVIDIA. Experiments for GraphSAGE on the ogbn-products graph get a >10x speedup (reduced from 113s to 11s per epoch) on a g3.16x instance. The following docs have been updated accordingly:

  • A new user guide chapter, Using GPU for Neighborhood Sampling, on when and how to use this new feature.
  • The API doc of NodeDataLoader.
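
Below is a minimal sketch of GPU-based sampling with NodeDataLoader. The toy graph, seed nodes, fan-outs, and batch size are ours for illustration; it assumes a CUDA build of DGL and a graph that fits in GPU memory:

    import torch
    import dgl
    from dgl.dataloading import MultiLayerNeighborSampler, NodeDataLoader

    # Toy graph; GPU sampling needs the graph and seed nodes on the GPU.
    g = dgl.rand_graph(10000, 100000).to('cuda')
    train_nids = torch.arange(1000, device='cuda')

    sampler = MultiLayerNeighborSampler([15, 10])   # fan-out per layer
    dataloader = NodeDataLoader(
        g, train_nids, sampler,
        device='cuda',    # sampling and MFG construction happen on the GPU
        batch_size=1024,
        shuffle=True,
        num_workers=0)    # GPU sampling runs in the main process

    for input_nodes, output_nodes, blocks in dataloader:
        pass              # blocks are MFGs already resident on the GPU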

New Tutorials for Multi-GPU and Distributed Training

The release brings two new tutorials about multi-GPU training for node classification and graph classification, respectively. There is also a new tutorial about distributed training across multiple machines. All of them are available at https://docs.dgl.ai/.

Improved CPU Message Passing Kernel

The update includes a new CPU implementation of the core GSpMM kernel for GNN message passing, thanks to @sanchit-misra from Intel. The new kernel performs tiling on the sparse CSR matrix and leverages Intel's LibXSMM for kernel generation, which gives an up to 4.4x speedup over the old kernel. Please read their paper https://arxiv.org/abs/2104.06700 for details.
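
The optimization is transparent to users: any message passing that lowers to GSpMM benefits without code changes. A minimal sketch (toy graph and feature sizes are ours):

    import torch
    import dgl
    import dgl.function as fn

    g = dgl.rand_graph(1000, 5000)        # toy CPU graph
    g.ndata['h'] = torch.randn(1000, 16)

    # copy_u + sum lowers to a single GSpMM over the graph's CSR matrix,
    # which the new tiled, LibXSMM-based CPU kernel accelerates.
    g.update_all(fn.copy_u('h', 'm'), fn.sum('m', 'h_sum'))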

More efficient NodeEmbedding for multi-GPU and distributed training

DGL now utilizes NCCL to synchronize the gradients of sparse node embeddings (dgl.nn.NodeEmbedding) during training (credits to @nv-dlasalle from NVIDIA). The NCCL feature is available in both dgl.optim.SparseAdam and dgl.optim.SparseAdagrad. Experiments show a 20% speedup (reduced from 47.2s to 39.5s per epoch) on a g4dn.12xlarge (4 T4 GPUs) instance for training RGCN on the ogbn-mag graph. The optimization is automatically turned on when NCCL backend support is detected.
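
A minimal single-process sketch of the NodeEmbedding + SparseAdam pattern (sizes, the embedding name, and the stand-in loss are ours); with multiple GPU trainers, the same pattern picks up NCCL-based gradient synchronization automatically when available:

    import torch
    import dgl
    from dgl.optim import SparseAdam

    num_nodes, dim = 10000, 128
    emb = dgl.nn.NodeEmbedding(num_nodes, dim, name='node_emb')
    optimizer = SparseAdam(params=[emb], lr=0.01)

    nids = torch.randint(0, num_nodes, (1024,))
    feats = emb(nids, torch.device('cpu'))   # gather only this batch's rows
    loss = feats.pow(2).mean()               # stand-in loss
    loss.backward()
    optimizer.step()                         # sparse update of touched rows only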

The sparse optimizers for dgl.distributed.DistEmbedding now use a synchronized gradient update strategy. We add a new optimizer dgl.distributed.optim.SparseAdam. dgl.distributed.SparseAdagrad has been moved to dgl.distributed.optim.SparseAdagrad.

Sparse-sparse Matrix Multiplication and Addition Support

We add two new APIs, dgl.adj_product_graph and dgl.adj_sum_graph, that perform sparse-sparse matrix multiplication and addition as graph operations, respectively. They can run on both CPU and GPU with autograd support. An example usage of these functions is Graph Transformer Networks.
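
A minimal sketch (the toy graphs and the weight name 'w' are ours); both APIs treat a graph with scalar edge weights as a sparse adjacency matrix:

    import torch
    import dgl

    A = dgl.rand_graph(100, 500)
    B = dgl.rand_graph(100, 500)
    A.edata['w'] = torch.rand(A.num_edges(), requires_grad=True)
    B.edata['w'] = torch.rand(B.num_edges(), requires_grad=True)

    C = dgl.adj_product_graph(A, B, 'w')   # adj(C) = adj(A) @ adj(B)
    D = dgl.adj_sum_graph([A, B], 'w')     # adj(D) = adj(A) + adj(B)

    # Autograd: gradients flow back to A.edata['w'] and B.edata['w'].
    C.edata['w'].sum().backward()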

PyTorch Lightning Compatibility

DGL is now compatible with PyTorch Lightning for single-GPU training or training with DistributedDataParallel. See this example of training GraphSAGE with PyTorch Lightning.

We thank @justusschock for making DGL DataLoaders compatible with PyTorch Lightning (#2886).

New Models

A batch of 19 new model examples are added to DGL in 0.7, bringing the total number to 90+. Users can now use the search bar on https://www.dgl.ai/ to quickly locate the examples with tagged keywords. Below is the list of new models added.

New Datasets

New Functionalities

  • KD-Tree, brute-force family, and NN-descent implementations of KNN (#2767, #2892, #2941) (@lygztq)
  • BLAS-based KNN implementation on GPU (#2868) (@milesial)
  • A new API dgl.sample_neighbors_biased for biased neighbor sampling, where each node has a tag and each tag has its own (unnormalized) probability (#1665, #2987) (@soodoshll). We also provide two helper functions sort_csr_by_tag and sort_csc_by_tag to sort the internal storage of a graph based on tags to allow such neighbor sampling (#1664) (@soodoshll).
  • Distributed sparse Adam node embedding optimizer (#2733)
  • Heterogeneous graph's multi_update_all now supports user-defined cross-type reducers (#2891) (@Secbone)
  • Add in_degrees and out_degrees support to dgl.DistGraph (#2918)
  • A new API dgl.sampling.node2vec_random_walk for node2vec random walks (#2992) (@Smilexuhc); see the sketch after this list.
  • dgl.node_subgraph, dgl.edge_subgraph, dgl.in_subgraph and dgl.out_subgraph all have a relabel_nodes argument to allow graph compaction, i.e. removing the nodes with no edges (#2929)
  • Allow direct slicing of a batched graph without constructing a new data structure (#2349, #2851, #2965)
  • Allow setting the distributed node embeddings with NodeEmbedding.all_set_embedding() (#3047)
  • Graphs can be directly created from CSR or CSC representations on either CPU or GPU (#3045); see the API doc of dgl.graph for more details.
  • A new dgl.reorder API to permute a graph according to RCMK, METIS, or a custom strategy (#3063)
  • dgl.nn.GraphConv now has a left normalization option which divides the outgoing messages by out-degrees, equivalent to random-walk normalization (#3114)
  • Add a new exclude='self' option to EdgeDataLoader that, during neighbor sampling, excludes only the edges sampled in the current minibatch, for use when reverse edges are not available (#3122)
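
A short sketch exercising two of the new APIs above (toy graph, seeds, and parameters are ours; exact return values may differ slightly by version):

    import torch
    import dgl

    g = dgl.rand_graph(1000, 5000)

    # node2vec-style second-order random walks
    # (p: return parameter, q: in-out parameter)
    traces = dgl.sampling.node2vec_random_walk(
        g, torch.arange(10), p=0.5, q=2.0, walk_length=5)

    # Edge-induced subgraph; relabel_nodes=True compacts away
    # nodes left without edges and relabels the remaining IDs.
    sg = dgl.edge_subgraph(g, torch.arange(100), relabel_nodes=True)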

Performance Optimizations

  • Check whether a COO is sorted to avoid a sync during forward/backward, and parallelize sorted COO/CSR conversion (#2645) (@nv-dlasalle)
  • Faster uniform sampling with replacement (#2953)
  • Eliminate ctor/dtor/IsNullArray overheads in random walks (#2990) (@AjayBrahmakshatriya)
  • GatedGCNConv shortcut with one edge type (#2994)
  • Hierarchical partitioning in distributed training, with a 25% speedup (#3000) (@soodoshll)
  • Reduce memory usage in node_split and edge_split during partitioning (#3132) (@JingchengYu94)

Other Enhancements

  • Graph partitioning now returns an ID mapping from old nodes/edges to new ones (#2857)
  • Better error message when idx_list is out of bounds (#2848)
  • Kill training jobs on remote machines in distributed training when receiving KeyboardInterrupt (#2881)
  • Provide a dgl.multiprocessing namespace for multiprocess training with fork and OpenMP (#2905)
  • GAT supports multidimensional input features (#2912)
  • Users can now specify the graph format for distributed training (#2948)
  • CI now runs on Kubernetes (#2957)
  • to_heterogeneous(to_homogeneous(hg)) now returns the same hg (#2958)
  • remove_nodes and remove_edges now preserve batch information (#3119)

Bug Fixes

  • Multiprocessing sampling in distributed training hangs in Python 3.8 (#2315, #2826)
  • Use the correct NIC for distributed training (#2798) (@Tonny-Gu)
  • Fix potential TypeError in HGT example (#2830) (@zhangtianle)
  • Distributed training initialization fails with graphs without node/edge data (#2366, #2838)
  • DGL sparse optimizers would crash when some DGL NodeEmbedding is not involved in the forward pass (#2856, #2859)
  • Fix GATConv shape issues with residual connections (#2867, #2921, #2922, #2947, #2962) (@xieweiyi, @jxgu1016)
  • Moving a graph to GPU would change the default CUDA device (#2895, #2897)
  • Remove __len__ method to stop polluting PyCharm outputs (#2902)
  • Inconsistency in the typing of node types and edge types returned by load_partition (#2742) (@chwan-rice)
  • NodeDataLoader and EdgeDataLoader now support DistributedDataParallel with proper shuffling and batching (#2539, #2911)
  • Nonuniform sampling with replacement may dereference a null pointer (#2942, #2943) (@nv-dlasalle)
  • Strange behavior of bipartite_from_networkx() (#2808, #2917)
  • Make GCMC example compatible with torchtext 0.9+ (#2985) (@alexpod1000)
  • dgl.to_homogeneous doesn't work correctly on graphs with 0 nodes of a given type (#2870, #3011)
  • TU regression datasets throw errors (#2952, #3010)
  • RGCN generates NaN in PyTorch 1.8 but not in PyTorch 1.7.x (#2760, #3013) (@nv-dlasalle)
  • Deal with the situation where num_layers equals 1 for GraphSAGE (#3066) (@Wang-Yu-Qing)
  • Lengthen the timeout for distributed node embedding (#2966, #2967) (@sojiadeshina)
  • Misc fixes in code and documentation (#2844, #2869, #2840, #2879, #2863, #2822, #2907, #2928, #2935, #2960, #2938, #2968, #2961, #2983, #2981, #3017, #3051, #3040, #3064, #3065, #3133, #3139) (@Theheavens, @ab-10, @yunshiuan, @moritzblum, @kayzliu, @universvm, @europeanplaice, etc.)