Daily Academic Digest 2.8

CV - Computer Vision | ML - Machine Learning | RL - Reinforcement Learning | NLP - Natural Language Processing

Subjects: cs.CV

1. MAP: Memory-aware Automated Intra-op Parallel Training For Foundation Models

Authors: Yuliang Liu, Shenggui Li, Jiarui Fang, Yanjun Shao, Boyuan Yao, Yang You

Paper: https://arxiv.org/abs/2302.02599v1

Code: https://github.com/hpcaitech/colossalai

Abstract:

Recently, large models have achieved state-of-the-art performance in various fields. In order to support large model training, we have to use distributed training techniques. However, finding an efficient distributed execution plan not only requires fine-grained model statistics, such as the memory and compute overhead of each operator, but is also a labor-intensive task even for an expert in the field of distributed training. In this paper, we introduce MAP, a compiler built upon PyTorch to implement Memory-aware Automated Parallelization. To profile operator costs, existing training systems and machine learning pipelines either physically execute each operator or estimate memory usage with a scaled input tensor, approaches that are often time-consuming and misleading. Compared with existing methods, MAP provides an easy-to-use symbolic profiler that generates memory and compute statistics for an arbitrary PyTorch model at trivial time cost, boosting productivity for ML developers. In addition, MAP can seamlessly speed up different static planning tasks on PyTorch computation graphs, and requires only a few lines of modification to user code to generate a new module instance with a top-performing distributed execution plan. The source code is publicly available at https://github.com/hpcaitech/ColossalAI.
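The key claim above is that MAP profiles memory and compute from tensor shapes alone rather than by physically executing the model. The snippet below is a minimal sketch of that shape-only idea using plain PyTorch meta tensors; it is not the MAP / Colossal-AI interface, and the layer sizes are made up for illustration.

```python
import torch.nn as nn

# Hedged sketch, not the MAP/Colossal-AI API: build the model on PyTorch's
# "meta" device so parameter shapes and dtypes are known but no real memory
# is allocated, then derive a cost estimate from the shapes alone.
model = nn.Sequential(
    nn.Linear(4096, 4096, device="meta"),
    nn.GELU(),
    nn.Linear(4096, 4096, device="meta"),
)

# numel() and element_size() only need shape and dtype, so this estimate
# is essentially free to compute.
param_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
print(f"Estimated parameter memory: {param_bytes / 2**20:.1f} MiB")
```

A full symbolic profiler would additionally account for activation memory and per-operator compute across the computation graph, which is what the abstract says MAP automates.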

Subjects: cs.CL

2. LoFT: Enhancing Faithfulness and Diversity for Table-to-Text Generation via Logic Form Control

Authors: Yilun Zhao, Zhenting Qi, Linyong Nan, Lorenzo Jaime Yu Flores, Dragomir Radev

Paper: https://arxiv.org/abs/2302.02962v1

Code: https://github.com/yale-lily/loft

Abstract:

Logical Table-to-Text (LT2T) generation is tasked with generating logically faithful sentences from tables. There currently exist two challenges in the field: 1) Faithfulness: how to generate sentences that are factually correct given the table content; 2) Diversity: how to generate multiple sentences that offer different perspectives on the table. This work proposes LoFT, which utilizes logic forms as fact verifiers and content planners to control LT2T generation. Experimental results on the LogicNLG dataset demonstrate that LoFT is the first model that addresses unfaithfulness and lack of diversity issues simultaneously. Our code is publicly available at https://github.com/Yale-LILY/LoFT.
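Since the abstract hinges on using logic forms as fact verifiers, here is a toy illustration of that idea. The operator names loosely echo LogicNLG-style logic forms, but the grammar, table, and claim are invented for this sketch and are not LoFT's actual implementation.

```python
# Toy fact verification with a logic form (hypothetical example, not LoFT's grammar):
# a candidate sentence about the table is checked by executing a small program
# over the rows, so unfaithful generations can be filtered out.
table = [
    {"player": "A", "points": 31},
    {"player": "B", "points": 24},
]

def filter_eq(rows, col, val):
    """Keep rows whose column equals the given value."""
    return [r for r in rows if r[col] == val]

def hop(rows, col):
    """Read a column from the first matching row."""
    return rows[0][col]

def greater(x, y):
    return x > y

# Candidate sentence: "Player A scored more than 25 points."
is_faithful = greater(hop(filter_eq(table, "player", "A"), "points"), 25)
print(is_faithful)  # True -> the sentence is supported by the table
```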

Subjects: cs.LG, cs.ML, cs.CV

3. Probabilistic Contrastive Learning Recovers the Correct Aleatoric Uncertainty of Ambiguous Inputs

Authors: Michael Kirchhof, Enkelejda Kasneci, Seong Joon Oh

Paper: https://arxiv.org/abs/2302.02865v1

Code: https://github.com/mkirchhof/probabilistic_contrastive_learning

Abstract:

Contrastively trained encoders have recently been proven to invert the data-generating process: they encode each input, e.g., an image, into the true latent vector that generated the image (Zimmermann et al., 2021). However, real-world observations often have inherent ambiguities. For instance, images may be blurred or only show a 2D view of a 3D object, so multiple latents could have generated them. This makes the true posterior for the latent vector probabilistic with heteroscedastic uncertainty. In this setup, we extend the common InfoNCE objective and encoders to predict latent distributions instead of points. We prove that these distributions recover the correct posteriors of the data-generating process, including its level of aleatoric uncertainty, up to a rotation of the latent space. In addition to providing calibrated uncertainty estimates, these posteriors allow the computation of credible intervals in image retrieval. They comprise images with the same latent as a given query, subject to its uncertainty.
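To make the abstract's main move concrete, the sketch below shows an encoder that outputs a Gaussian over latents instead of a point embedding, with an InfoNCE-style loss computed on reparameterized samples. This is an assumption-laden illustration, not the authors' exact objective: the architecture, dimensions, and diagonal-Gaussian parameterization are all made up for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hedged sketch (not the paper's exact loss): predict a latent distribution
# per input and contrast samples from it with a standard InfoNCE objective.
class ProbabilisticEncoder(nn.Module):
    def __init__(self, dim_in=128, dim_z=16):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(dim_in, 64), nn.ReLU())
        self.mu = nn.Linear(64, dim_z)       # predicted posterior mean
        self.log_var = nn.Linear(64, dim_z)  # predicted (diagonal) variance

    def forward(self, x):
        h = self.backbone(x)
        mu, log_var = self.mu(h), self.log_var(h)
        z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()  # sample a latent
        return F.normalize(z, dim=-1)

def info_nce(z1, z2, temperature=0.1):
    # Matching views of the same input sit on the diagonal of the similarity matrix.
    logits = z1 @ z2.t() / temperature
    targets = torch.arange(z1.size(0))
    return F.cross_entropy(logits, targets)

encoder = ProbabilisticEncoder()
view1, view2 = torch.randn(8, 128), torch.randn(8, 128)  # two augmented views
loss = info_nce(encoder(view1), encoder(view2))
loss.backward()
```

The predicted variance is what carries the aleatoric uncertainty: ambiguous inputs can be assigned broader distributions, which is the quantity the abstract says is recovered up to a rotation of the latent space.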
