Livestream Preview | The Last ICML Session Is Here! Come Join Us in the Livestream Room


AI TIME welcomes every AI enthusiast to join us!


September 28, 15:00-20:30

AI TIME has specially invited several PhD speakers to bring you ICML-6!


Bilibili livestream channel

Scan the QR code to follow the official AITIME Bilibili account

Watch the livestream

Link: https://live.bilibili.com/21813994

15:00-17:00

★ Speaker Introductions ★


贾彬彬

He has been pursuing a PhD at the School of Computer Science and Engineering, Southeast University since September 2017. His main research direction is multi-dimensional classification (Multi-Dimensional Classification). To date he has published 1 CCF-A journal paper and 3 CCF-A conference papers, 3 CCF-B journal papers, and 1 CCF-C conference paper.

Talk title:


Multi-Dimensional Classification via Sparse Label Encoding

Abstract:

In multi-dimensional classification (MDC), there are multiple class variables in the output space with each of them corresponding to one heterogeneous class space. Due to the heterogeneity of class spaces, it is quite challenging to consider the dependencies among class variables when learning from MDC examples. In this paper, we propose a novel MDC approach named SLEM which learns the predictive model in an encoded label space instead of the original heterogeneous one. Specifically, SLEM works in an encoding-training-decoding framework. In the encoding phase, each class vector is mapped into a real-valued one via three cascaded operations including pairwise grouping, one-hot conversion and sparse linear encoding. In the training phase, a multi-output regression model is learned within the encoded label space. In the decoding phase, the predicted class vector is obtained by adapting orthogonal matching pursuit over outputs of the learned multi-output regression model. Experimental results clearly validate the superiority of SLEM against state-of-the-art MDC approaches.
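
To make the encoding-training-decoding framework above concrete, here is a minimal sketch on toy data. The pairwise grouping rule, the dense Gaussian encoding matrix, and the ridge regressor are illustrative assumptions standing in for the paper's choices, not the authors' reference code.

```python
import numpy as np
from sklearn.linear_model import Ridge, OrthogonalMatchingPursuit

rng = np.random.default_rng(0)

def pairwise_onehot(Y, sizes):
    """Pairwise grouping + one-hot conversion: merge class variables two at a
    time into joint variables, then one-hot encode each joint variable."""
    n, q = Y.shape
    blocks = []
    for j in range(0, q - 1, 2):
        joint = Y[:, j] * sizes[j + 1] + Y[:, j + 1]   # joint class index
        block = np.zeros((n, sizes[j] * sizes[j + 1]))
        block[np.arange(n), joint] = 1.0
        blocks.append(block)
    return np.hstack(blocks)

# toy MDC data: 100 examples, 4 class variables with 3 classes each
X = rng.normal(size=(100, 20))
Y = rng.integers(0, 3, size=(100, 4))
sizes = [3, 3, 3, 3]

H = pairwise_onehot(Y, sizes)            # sparse {0,1} label representation
A = rng.normal(size=(H.shape[1], 12))    # linear encoder (toy stand-in)
Z = H @ A                                # real-valued encoded labels

model = Ridge(alpha=1.0).fit(X, Z)       # multi-output regression in code space

# decoding for one example: OMP recovers a sparse h with one spike per block
z_hat = model.predict(X[:1]).ravel()
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=len(sizes) // 2,
                                fit_intercept=False)
omp.fit(A.T, z_hat)                      # solve z_hat ≈ A.T @ h with sparse h
h_hat = omp.coef_

joint0 = int(np.argmax(h_hat[: sizes[0] * sizes[1]]))  # first pair's joint class
y0, y1 = divmod(joint0, sizes[1])        # back to the two original class labels
```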


吴齐天

PhD student at Shanghai Jiao Tong University, advised by Prof. Junchi Yan. His research covers machine learning and data mining. He has published multiple first-author papers at ICML, NeurIPS, and KDD, and was named a Baidu AI Rising Star in 2021.

Talk title:


Inductive Collaborative Filtering via Graph Structure Learning

Abstract:

Recommendation models can effectively estimate underlying user interests and predict one's future behaviors by factorizing an observed user-item rating matrix into products of two sets of latent factors. However, the user-specific embedding factors can only be learned in a transductive way, making it difficult to handle new users on-the-fly. In this paper, we propose an inductive collaborative filtering framework that contains two representation models. The first model follows conventional matrix factorization which factorizes a group of key users' rating matrix to obtain meta latents. The second model resorts to attention-based structure learning that estimates hidden relations from query to key users and learns to leverage meta latents to inductively compute embeddings for query users via neural message passing. Our model enables inductive representation learning for users and meanwhile guarantees equivalent representation capacity as matrix factorization. Experiments demonstrate that our model achieves promising results for recommendation on few-shot users with limited training ratings and new unseen users which are commonly encountered in open-world recommender systems.
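
As a rough illustration of the two representation models described above, the numpy sketch below factorizes a key-user rating matrix into meta latents, then computes an unseen query user's embedding by attending over key users. The rating-overlap attention score and all constants are assumptions for illustration, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n_key, n_items, d = 50, 200, 16

# Model 1: matrix factorization on key users' ratings -> meta latents P
R_key = rng.integers(0, 2, size=(n_key, n_items)).astype(float)
P = rng.normal(scale=0.1, size=(n_key, d))    # key-user (meta) latents
Q = rng.normal(scale=0.1, size=(n_items, d))  # item latents
for _ in range(200):                          # plain gradient steps on ||R - PQ^T||^2
    E = R_key - P @ Q.T
    P += 0.01 * (E @ Q - 0.1 * P)
    Q += 0.01 * (E.T @ P - 0.1 * Q)

# Model 2: attention from a query user to key users -> inductive embedding
def query_embedding(r_query):
    """Estimate hidden query->key relations and pass messages over meta latents."""
    logits = R_key @ r_query                  # rating overlap as attention logits
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return w @ P                              # weighted combination of meta latents

r_new = rng.integers(0, 2, size=n_items).astype(float)  # new user's ratings
u_new = query_embedding(r_new)                # no retraining needed for this user
scores = u_new @ Q.T                          # predicted preferences over items
top5 = np.argsort(-scores)[:5]
```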


唐瀚霖

He began his CS studies at the University of Rochester in September 2017 and received his PhD in computer science in July 2021. His main research directions are optimizer design for communication acceleration in large-scale training, as well as generalized optimizers. His advisor was Prof. Ji Liu. He has published related papers at ICML and NeurIPS.

Talk title:


An Adam Optimizer Using 1-Bit Communication

Abstract:

Scalable training of large models (like BERT and GPT-3) requires careful optimization rooted in model design, architecture, and system capabilities. From a system standpoint, communication has become a major bottleneck, especially on commodity systems with standard TCP interconnects that offer limited network bandwidth. Communication compression is an important technique to reduce training time on such systems. One of the most effective methods is error-compensated compression, which offers robust convergence speed even under 1-bit compression. However, state-of-the-art error compensation techniques only work with basic optimizers like SGD and momentum SGD, which are linearly dependent on the gradients. They do not work with non-linear gradient-based optimizers like Adam, which offer state-of-the-art convergence efficiency and accuracy for models like BERT. In this paper, we propose 1-bit Adam that reduces the communication volume by up to 5×, offers much better scalability, and provides the same convergence speed as uncompressed Adam. Our key finding is that Adam's variance (non-linear term) becomes stable (after a warmup phase) and can be used as a fixed precondition for the rest of the training (compression phase). Experiments on up to 256 GPUs show that 1-bit Adam enables up to 3.3× higher throughput for BERT-Large pre-training and up to 2.9× higher throughput for SQuAD fine-tuning. In addition, we provide theoretical analysis for our proposed work.
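
The warmup-then-compression structure is easy to see in a single-worker toy sketch: run vanilla Adam until the variance stabilizes, freeze it, and from then on communicate only a 1-bit sign vector (plus one scalar scale) with a local error-compensation buffer. This is a schematic under simplified assumptions (no distributed all-reduce, no bias correction), not the DeepSpeed implementation.

```python
import numpy as np

def one_bit_adam(grad_fn, w, steps=500, warmup=100,
                 lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = np.zeros_like(w)      # momentum
    v = np.zeros_like(w)      # variance (frozen after warmup)
    err = np.zeros_like(w)    # error-compensation buffer
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g
        if t <= warmup:
            # warmup phase: full-precision Adam, variance still adapting
            v = beta2 * v + (1 - beta2) * g * g
            update = m / (np.sqrt(v) + eps)
        else:
            # compression phase: v is frozen and acts as a fixed preconditioner;
            # the momentum is what a worker would send, compressed to 1 bit
            c = m + err                      # error feedback: add back residual
            scale = np.abs(c).mean()         # one scalar + sign bits on the wire
            m_hat = scale * np.sign(c)
            err = c - m_hat                  # remember what compression lost
            update = m_hat / (np.sqrt(v) + eps)
        w = w - lr * update
    return w

# usage: minimize ||w||^2, whose gradient is 2w
w_final = one_bit_adam(lambda w: 2.0 * w, np.ones(10))
```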


牛帅程

He has been pursuing a PhD at the School of Software Engineering, South China University of Technology since September 2018, advised by Prof. Mingkui Tan together with Jiaxiang Wu and Peilin Zhao, researchers at Tencent AI Lab. His main research directions are neural architecture search and transfer learning, and he has published multiple papers at related conferences and in journals, including ICML, CVPR, IJCAI, TIP, and TKDE.

Talk title:


Automatic Architecture Adjustment for Dynamically Changing Data

Abstract:

In real-world applications, data often come in a growing manner, where the data volume and the number of classes may increase dynamically. This brings a critical challenge for learning: given the increasing data volume or number of classes, one has to instantaneously adjust the capacity of the neural model to obtain promising performance. Existing methods either ignore the growing nature of data or seek to independently search an optimal architecture for a given dataset, and thus are incapable of promptly adjusting the architectures for the changed data. To address this, we present a neural architecture adaptation method, namely Adaptation eXpert (AdaXpert), to efficiently adjust previous architectures on the growing data. Specifically, we introduce an architecture adjuster to generate a suitable architecture for each data snapshot, based on the previous architecture and the extent of difference between the current and previous data distributions. Furthermore, we propose an adaptation condition to determine the necessity of adjustment, thereby avoiding unnecessary and time-consuming adjustments. Extensive experiments on two growth scenarios (increasing data volume and number of classes) demonstrate the effectiveness of the proposed method.
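
The adjust-only-when-necessary logic can be sketched as follows: compute a drift measure between the previous and current data snapshots, and invoke the architecture adjuster only when the drift crosses a threshold. The MMD drift measure, the threshold, and the width-scaling growth rule below are simple stand-in assumptions; the paper conditions a learned adjuster on the distribution difference rather than applying a fixed rule.

```python
import numpy as np

def mmd(x, y, sigma=1.0):
    """Maximum mean discrepancy with an RBF kernel (biased estimator)."""
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

def maybe_adjust(arch, prev_feats, new_feats, threshold=0.05):
    """Adaptation condition: skip the costly adjustment when drift is small."""
    drift = mmd(prev_feats, new_feats)
    if drift < threshold:
        return arch                            # reuse the previous architecture
    # architecture adjuster (toy rule): grow width with the drift magnitude
    widths = [int(w * (1.0 + drift)) for w in arch["widths"]]
    depth = arch["depth"] + int(drift > 2 * threshold)
    return {"widths": widths, "depth": depth}

rng = np.random.default_rng(0)
arch = {"widths": [64, 128, 256], "depth": 10}
prev = rng.normal(size=(128, 32))              # features of previous snapshot
new = rng.normal(loc=0.5, size=(128, 32))      # shifted current snapshot
print(maybe_adjust(arch, prev, new))
```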

19:30-20:30


钟方威

He is currently a postdoctoral researcher at the Institute for Artificial Intelligence, Peking University, supported by the Postdoctoral Innovative Talent Support Program. He received his PhD from Peking University in 2021. His research interest lies in combining knowledge from computer vision, robot learning, multi-agent systems, virtual reality, and cognitive science to build efficient and autonomous robots. He has published multiple papers in top AI journals and conferences, including IEEE TPAMI, ICML, ICLR, NeurIPS, CVPR, and AAAI, and has repeatedly been invited to serve as a program committee member/reviewer for top international AI conferences such as NeurIPS, ICML, ICLR, CVPR, ICCV, and AAAI.

Talk title:


Towards Distraction-Robust Active Visual Tracking

Abstract:

In active visual tracking, it is notoriously difficult when distracting objects appear, as distractors often mislead the tracker by occluding the target or bringing a confusing appearance. To address this issue, we propose a mixed cooperative-competitive multi-agent game, where a target and multiple distractors form a collaborative team to play against a tracker and make it fail to follow. Through learning in our game, diverse distracting behaviors of the distractors naturally emerge, thereby exposing the tracker's weakness, which helps enhance the distraction-robustness of the tracker. For effective learning, we then present a set of practical methods, including a reward function for distractors, a cross-modal teacher-student learning strategy, and a recurrent attention mechanism for the tracker. The experimental results show that our tracker performs desired distraction-robust active visual tracking and generalizes well to unseen environments. We also show that the multi-agent game can be used to adversarially test the robustness of trackers.
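
A toy sketch of the mixed cooperative-competitive reward structure described above: the tracker is rewarded for keeping the target close and centered, while the target-distractor team shares the negated reward, so behaviors that confuse the tracker are directly reinforced. The reward shape and constants are illustrative assumptions, not the paper's reward function.

```python
import numpy as np

def tracker_reward(rel_dist, rel_angle, d_max=5.0, a_max=np.pi / 4):
    """Higher when the target stays close and centered in the tracker's view."""
    return 1.0 - min(rel_dist / d_max, 1.0) - min(abs(rel_angle) / a_max, 1.0)

def team_reward(rel_dist, rel_angle):
    """Target and distractors cooperate: their shared reward is the tracker's
    loss, so distracting behaviors that hurt the tracker are rewarded."""
    return -tracker_reward(rel_dist, rel_angle)

# one game step: target is 3 m away and 10 degrees off-center
print(tracker_reward(3.0, np.deg2rad(10.0)))   # tracker's payoff
print(team_reward(3.0, np.deg2rad(10.0)))      # distractor team's payoff
```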


孙胜扬

A fifth-year PhD student at the University of Toronto, Canada. His main research direction is Bayesian uncertainty estimation, with emphasis on Gaussian processes, kernel methods, and Bayesian deep learning. He has published papers at ICML, ICLR, NeurIPS, AISTATS, and other venues.

Talk title:


Variational Gaussian Processes via Harmonic Kernel Decomposition

Abstract:

We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability. We propose the harmonic kernel decomposition (HKD), which uses Fourier series to decompose a kernel as a sum of orthogonal kernels. Our variational approximation exploits this orthogonality to enable a large number of inducing points at a low computational cost. We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections, and it significantly outperforms standard variational methods in scalability and accuracy. Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
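
A toy 1D illustration of the decomposition idea: under a cyclic group of n shifts, Fourier characters split a kernel into n harmonic components that sum back to the original kernel. The RBF kernel, the period, and the group size are assumptions for illustration; the paper builds its variational inducing-point approximation on top of such orthogonal components.

```python
import numpy as np

def rbf(x, y, ell=0.7):
    """Stationary RBF kernel on 1D inputs."""
    return np.exp(-0.5 * (x[:, None] - y[None, :]) ** 2 / ell ** 2)

def harmonic_component(x, y, m, n=8, T=2 * np.pi):
    """m-th harmonic kernel under the cyclic group of n shifts of size T/n:
    k_m(x, y) = (1/n) * sum_j exp(-2*pi*i*m*j/n) * k(x, y + j*T/n)."""
    total = sum(np.exp(-2j * np.pi * m * j / n) * rbf(x, y + j * T / n)
                for j in range(n))
    return total / n

x = np.linspace(0.0, 2.0 * np.pi, 6)
components = [harmonic_component(x, x, m) for m in range(8)]
reconstruction = sum(components).real
print(np.abs(rbf(x, x) - reconstruction).max())  # ~1e-16: components sum to k
```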


After the livestream, we will invite the speakers to answer questions and chat with everyone in the WeChat group. Please add the AI TIME assistant (WeChat ID: AITIME_HY) and reply "icml" to be added to the "AI TIME ICML Conference Discussion Group"!


AI TIME WeChat assistant


Organizer: AI TIME

Media partners: 学术头条, AI 数据派

Partners: 智谱·AI, 中国工程院知领直播, 学堂在线, 学术头条, biendata, Ever链动

AI TIME welcomes submissions from scholars in the AI field; we look forward to analyses of the field's historical development and frontier technologies. For hot topics, we will invite experts to debate together. We are also continuously recruiting excellent writers: a top platform needs a top you.

Please send your résumé and related information to yun.he@aminer.cn!

WeChat contact: AITIME_HY

AI TIME is a community founded by a group of young scholars in the Department of Computer Science at Tsinghua University who care about the development of artificial intelligence and have intellectual aspirations. It aims to promote the spirit of scientific debate, inviting people from all fields to explore the fundamental questions of AI theory, algorithms, scenarios, and applications, to encourage the collision of ideas, and to build a gathering place for knowledge sharing.


Scan the QR code to follow us for more news


Click "Read the original article" to reserve the livestream
