If You're Hyped About GPT-3 Writing Code, You Haven't Heard of NAS

GPT-3 made headlines in the machine learning community recently over viral videos showing human language-to-code translation. The model could be used to automate much redundant and repetitive coding, for example in HTML/CSS or in the construction of a simple neural network. As many others have pointed out, GPT-3 is only a tool, and is very limited by its training data.

Many of the more sophisticated things developers want to do (for instance, adding some specific momentum-based animation to a site) cannot be done by GPT-3 because it a) isn't advanced enough, b) doesn't have enough training data, and c) arguably isn't "creative". GPT-3 may become a helpful tool that lets developers spend less time retyping old commands and more time brainstorming creative infrastructure designs and debugging complex, cross-system bugs, but it's in no way a threat to the livelihoods of programmers.

Of particular interest was a demonstration in which GPT-3 wrote code in Keras, the popular high-level deep learning library, for a neural network, given natural-language input like "I have a dataset of 60 by 60 pixel images and want to classify them into 10 classes". GPT-3 would then spit out default code for constructing a basic template convolutional neural network.

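To make this concrete, below is a minimal Keras sketch of the kind of template CNN such a prompt yields; the layer choices and hyperparameters are illustrative assumptions, not GPT-3's verbatim output:

```python
# Illustrative template CNN for "60x60 pixel images, 10 classes".
# Layer sizes and hyperparameters are assumptions for this sketch.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(60, 60, 3)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),  # 10 output classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```
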
GPT-3 'writing code' for neural networks is much like a beginner typing out sample code they remember from a tutorial: it doesn't care about performance or about fitting the architecture to the dataset; it is just trying to predict the next most likely character based on what it saw earlier in its training data. Although it's a blunt way of putting it, using GPT-3 to write code is like automating your search for simple code snippets on StackOverflow. Perhaps it will find commercial success in code autocomplete.

However, this idea, AI learning to create AI, isn't new; it was first formally conceived in 2002. While GPT-3 regurgitates code it has seen previously as broad prescriptions, Neural Architecture Search (NAS for short) takes the idea to the next level by actually searching for the neural network best suited to the dataset. Much of a machine learning engineer's work relies on testing a handful of the trillions of potential neural network structures based on intuition and experience, and NAS can considerably reduce that cost.

NAS uses AI to create new and better AI through intelligent search over potential structures that humans have never thought of. It's not generalization; it's discovery.

Generally, implementations of NAS have three components (a toy code sketch of how they fit together follows the list):

  • A search space. This is the space in which the Neural Architecture Search can explore potential structures. The space is bounded by a set of possible layers defined by the author (convolutions, pooling, dense, dropout, etc.) and by how they can be connected (skip connections, layer sizes, etc.). Outlining the search space requires some human experience.

  • A search algorithm. The NAS algorithm samples candidate network architectures and receives rewards or penalties based on the performance of these 'child models', measured through metrics like accuracy or latency (running time). A human can specify which metrics matter more; for instance, if a small, lightweight model is desired even at the cost of accuracy, smaller size and lower latency may receive higher rewards.

  • An evaluation strategy. NAS must measure or predict the performance of many proposed child models to obtain feedback from which to learn. This could be done by fully training each child model, but that is obviously very expensive, and new methods (such as training for fewer epochs or sharing weights between child models) have been proposed to conserve computational and time resources.

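Put together, the three components form a feedback loop. Below is a deliberately toy sketch of that loop, assuming a tiny search space over dense-layer widths and a mock scoring function in place of real training:

```python
import random

# 1. Search space: the structures the search is allowed to explore.
#    Toy version: 2-4 dense layers, each 32-256 units wide.
SEARCH_SPACE = {"depth": [2, 3, 4], "width": [32, 64, 128, 256]}

def sample_architecture():
    depth = random.choice(SEARCH_SPACE["depth"])
    return [random.choice(SEARCH_SPACE["width"]) for _ in range(depth)]

def evaluate(arch):
    # 3. Evaluation strategy: stand-in for training the child model and
    #    returning a reward (e.g. validation accuracy minus a latency penalty).
    return random.random()  # mock score so the sketch runs

# 2. Search algorithm: here, plain random search over the space.
best_arch, best_reward = None, float("-inf")
for _ in range(100):
    arch = sample_architecture()
    reward = evaluate(arch)
    if reward > best_reward:
        best_arch, best_reward = arch, reward

print(best_arch, best_reward)
```

Random search, as used here, also doubles as the baseline discussed later; smarter search algorithms simply replace the sampling step.
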
Image created by author.

In this sense, NAS resembles a typical reinforcement learning (RL) approach to a search problem, but with some changes that tailor it to its task.

For one, there are many potential ways to represent a neural network architecture. Since NAS revolves around selecting and evolving such complex structures, it's very important to find an appropriate representation. Because it's incredibly inefficient to manipulate a neural network as an actual, instantiated network, many depictions of architectures have been proposed that additionally support easier evolution.

One approach is the cell-based approach, inspired by the designs of modern image recognition models like Inception, in which humans decide a general architecture built by stacking two types of cells: a normal cell (input and output have the same dimensions) and a reduction cell (the output's spatial size is 1/2 of the input's). After the framework of the network, a sequence of cells, perhaps several of the same cell back-to-back, is set, NAS creates the internal structures of these two cells.

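As a rough sketch of that division of labor (hand-written Keras code illustrating the idea, not the NASNet implementation; the cell bodies below are placeholders for what NAS would actually discover):

```python
from tensorflow.keras import layers, models

def normal_cell(x, filters):
    # Placeholder wiring: in real NAS, this internal structure is searched.
    # A normal cell preserves the input's spatial dimensions.
    return layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

def reduction_cell(x, filters):
    # A reduction cell halves the spatial dimensions (stride 2).
    return layers.Conv2D(filters, 3, strides=2, padding="same",
                         activation="relu")(x)

# Human-fixed skeleton: N normal cells back-to-back, then a reduction cell.
inputs = layers.Input(shape=(60, 60, 3))
x = inputs
for filters in (32, 64):
    for _ in range(2):                 # N = 2 normal cells
        x = normal_cell(x, filters)
    x = reduction_cell(x, filters)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(10, activation="softmax")(x)
model = models.Model(inputs, outputs)
```
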
Source: Zoph et al. 2018, image free to share.

This model is highly transferrable, since architectures are not dataset-specific and can be reused across tasks, and the search is much faster to train. However, a cell-based representation depends heavily on humans constructing the cells well.

Another approach is to represent neural networks as computational graphs, or 'motifs'. Level-1 operations (primitives such as 1x1 convolutions or pooling) can be assembled into level-2 motifs, and so on recursively into higher-level network structures. This method is similar to the cell-based structure in that it aggregates several smaller structures, but it allows for much more complexity.

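A toy illustration of the recursion, loosely following the hierarchical representation of Liu et al. (the motif contents and the printing helper are assumptions for this sketch):

```python
# Each level's motifs are small graphs whose edges are labeled with
# structures from the level below; level 1 holds primitive operations.
LEVEL_1_OPS = ["conv1x1", "conv3x3", "maxpool"]

# A level-2 motif: edges of a 3-node graph, each labeled with a level-1 op.
level2_motif = {(0, 1): "conv1x1", (0, 2): "conv3x3", (1, 2): "maxpool"}

# A level-3 structure wires level-2 motifs together the same way.
level3_structure = {(0, 1): level2_motif, (1, 2): level2_motif}

def show(motif, depth=0):
    """Recursively print a motif, expanding nested sub-motifs."""
    for edge, op in motif.items():
        if isinstance(op, dict):
            print("  " * depth + f"edge {edge}: nested motif")
            show(op, depth + 1)
        else:
            print("  " * depth + f"edge {edge}: {op}")

show(level3_structure)
```
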
Source: Liu et al. 2017, image free to share.

This method of representation can be thought of as more 'natural', since neural networks are essentially one-way (directed) graphs. Additionally, it can lead to interesting, non-symmetrical neural networks. However, this representation is costly to store and manipulate.

With an established neural network representation, there are still many algorithms for creating and selecting architectures. A random search, which simply samples networks from the space at random (essentially the loop sketched earlier), is surprisingly difficult to beat when the search space is well designed.

Reinforcement learning is commonly used in NAS methods. An RL-based 'controller' generates and proposes child model architectures to be evaluated. This controller is a recurrent neural network, which learns to generate token-based sequential representations of potential networks. The RNN is updated with the performance of the proposed architectures and trained to output better-performing networks over time.

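A heavily simplified sketch of the controller idea follows. To keep it short, a table of per-position token preferences stands in for the actual RNN, and a naive score-weighted update stands in for policy gradients (REINFORCE); the token vocabulary is an assumption for illustration:

```python
import random
from collections import defaultdict

# Toy token vocabulary: the controller emits one token per position.
TOKENS = {
    "layer1": ["conv3x3", "conv5x5", "maxpool"],
    "layer2": ["conv3x3", "conv5x5", "maxpool"],
    "width":  ["32", "64", "128"],
}

# Preference scores play the role of the RNN controller's policy.
prefs = {pos: defaultdict(float) for pos in TOKENS}

def sample_architecture():
    # Sample one token per position, favoring higher-preference tokens.
    return {pos: random.choices(opts,
                                weights=[1.0 + prefs[pos][t] for t in opts])[0]
            for pos, opts in TOKENS.items()}

def reward(arch):
    # Stand-in for training the child model; returns a mock score.
    return random.random()

for _ in range(200):
    arch = sample_architecture()
    r = reward(arch)
    for pos, tok in arch.items():
        prefs[pos][tok] += 0.1 * r   # reinforce tokens from good networks
```
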
Source: Zoph & Le 2017, image free to share.

This, at least conceptually, is similar to what GPT-3 does: modelling a sequence. However, in NAS, the RNN is connected to a reinforcement learning system, so it learns to generate networks that perform well on the dataset. It can be thought of as automating what human researchers do: utilizing known, existing ideas while exploring new ones.

Another search algorithm is evolution-based, in which architectures are represented as 'genes'. Mutations involve individually adding connections and nodes, and two parent architectures can be 'mated' or 'crossed over' to form child structures. Through clever methods to reduce the search space and keep the population in check, such as eliminating older models, evolutionary approaches to NAS have been shown to be relatively effective, although more expensive than RL methods.

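A compact sketch of the aging-evolution loop (the architecture encoding, mutation rule, and mock fitness function are assumptions for illustration):

```python
import random
from collections import deque

WIDTHS = [32, 64, 128, 256]

def random_arch(depth=4):
    return [random.choice(WIDTHS) for _ in range(depth)]

def mutate(arch):
    # Mutation: resample one randomly chosen layer's width.
    child = list(arch)
    child[random.randrange(len(child))] = random.choice(WIDTHS)
    return child

def fitness(arch):
    # Stand-in for training the child model; returns a mock score.
    return random.random()

# Aging evolution: fixed-size population where the OLDEST member dies,
# which keeps the population fresh and implicitly regularizes the search.
population = deque(maxlen=50)  # appending past maxlen evicts the oldest
for _ in range(50):
    arch = random_arch()
    population.append((arch, fitness(arch)))

for _ in range(500):
    tournament = random.sample(list(population), 10)
    parent = max(tournament, key=lambda p: p[1])[0]  # fittest of the sample
    child = mutate(parent)
    population.append((child, fitness(child)))       # oldest is evicted

best_arch, best_fit = max(population, key=lambda p: p[1])
```
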
Source: Stanley & Miikkulainen, image free to share.

While NAS is still in its early days, with its heavy computational cost restricting widespread access, it is very likely that AI-guided AI research will become more prevalent. Especially with fluid representations like the motif (graph)-based ones discussed above, AI may make new discoveries on the order of clever mechanisms like LSTMs or Dropout. Another interesting thought experiment is to consider what would happen if the findings of NAS systems were used to improve the very systems that discovered them.

Additionally, Neural Architecture Search is being extended to deep unsupervised learning, which is tremendously important given the cost of labels in supervised learning. Furthermore, going beyond the framework of the neural network, AutoML-Zero expands NAS to the discovery of entire machine learning algorithms, such as SVMs or decision trees, using simple mathematical operations as building blocks and an aging evolutionary strategy.

NAS will not be the nail in the coffin for machine learning researchers, just as GPT-3 is not a death sentence for developers' jobs. When the automatic calculator was invented, human computers began using calculators to work more efficiently in new jobs at financial institutions and accounting firms; humans will always have something to offer. Likewise, the path of technology development will continue onward, and breakthroughs are tools that assist further growth.

The automation of AI discovery is incredibly promising, and it has the potential to take the capabilities of machine learning far beyond the limits of the human mind.

Translated from: https://towardsdatascience.com/if-youre-hyped-about-gpt-3-writing-code-you-haven-t-heard-of-nas-19c8c30fcc8a
