Deep Learning Is Not Logical

In this article I will show that deep learning is incapable of understanding logic and structure, and point to a potential solution inspired by neuroscience. This matters because most of the worthwhile problems in the world need to be solved logically, yet modern deep learning has largely failed in that department.

What does it mean to be logical?

Logic, in statistical terms, is analogous to extreme confidence in a prediction given a set of observations. In a purely logical world, 1 + 1 = 2 is always true, whereas in a more random, non-logical system, 1 + 1 = 2 might be true only 90% of the time.

Some logic arises from our understanding of the structure of the world, such as the laws of physics. For example, you can’t drop a ball and not expect it to fall to the ground. In an unstructured world, however, anything is possible, which makes the future difficult to predict. The stock market is one such unstructured world: my father has been watching Novax Pharma since May, but he couldn’t have predicted that the CEO would sell off his shares and send the price plummeting.

Statisticians might claim that we live in a random and unstructured world, but many aspects of the universe are predictable and logical, such as physics, mathematics, and the rest of science. Being logical allows us to plan far into the future and form concrete paths toward our goals. Almost all difficult problems worth solving require reasoning and logic.

Pure deep learning cannot learn logic

Can deep learning achieve the holy grail of logic? DeepMind asked this question in their 2019 paper, in which they implemented a transformer model to solve math problems [1]. The results were impressive; the model reached higher than 90% accuracy on simple addition, subtraction, division, and multiplication. But performance dropped to around 50% when the operations were mixed, which suggests that the model was guessing the answer rather than solving the problem step by step.
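
One way to appreciate why the mixed-operation gap is damning is to probe a model yourself. The sketch below is my own illustration, not DeepMind’s benchmark (they released a full mathematics dataset with the paper): `model_answer` is a placeholder for whatever question-answering model you want to test, and accuracy is just exact string match.

```python
import random

def single_op_question(rng):
    # e.g. "What is 17 * 42?" -- one operation, one step of reasoning
    a, b = rng.randint(2, 99), rng.randint(2, 99)
    op = rng.choice(["+", "-", "*"])
    return f"What is {a} {op} {b}?", str(eval(f"{a} {op} {b}"))

def mixed_op_question(rng):
    # e.g. "What is 17 + 42 * 9?" -- requires respecting operator precedence
    a, b, c = rng.randint(2, 99), rng.randint(2, 99), rng.randint(2, 99)
    return f"What is {a} + {b} * {c}?", str(a + b * c)

def exact_match_accuracy(model_answer, pairs):
    """model_answer: any callable mapping a question string to an answer string."""
    return sum(model_answer(q) == ans for q, ans in pairs) / len(pairs)

rng = random.Random(0)
single = [single_op_question(rng) for _ in range(1000)]
mixed = [mixed_op_question(rng) for _ in range(1000)]
# A model that has actually learned arithmetic should score similarly on both
# lists; a model that pattern-matches will show the gap described above, e.g.:
# print(exact_match_accuracy(my_model, single), exact_match_accuracy(my_model, mixed))
```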

There are other examples in deep learning where models or agents are so adept at their tasks that they create an illusion of logic and reasoning.

OpenAI’s GPT-2 language model can generate human-like essays, but closer inspection shows that its outputs are not logically constructed; it simply regurgitates patterns from its training data. For example, the model sometimes writes about fires happening under water.

The video game agents developed by DeepMind and OpenAI to play StarCraft 2 and DotA 2, respectively, seem to operate logically, because they managed to defeat top professional gamers in their respective games. What most people fail to appreciate, however, is that the agents defeated their human opponents by leveraging inhuman reflexes and movement, not by making smart plays. For example, in DotA 2 there is an action a player can take that instantly kills another player if the target’s health is below a threshold, but deals little to no damage otherwise. Using it above the threshold is undeniably a mistake, yet OpenAI’s DotA 2 bot does it all the time. AlphaStar, DeepMind’s AI for StarCraft 2, consistently tilts and loses games when human opponents use obscure strategies that cannot be overcome by mechanical skill alone.

You could argue that these agents might overcome the above flaws if they were trained longer. That may be true, but these mistakes do not show up in even average human players. The agents are clearly missing some ingredient that would make them as intelligent as humans.

Neuroscience gives us the answer

Back in February I stumbled upon Lex Fridman’s interview with theoretical neuroscientist Jeffrey Hawkins. In the interview, Hawkins described the hypothesis that the neural mechanism underlying spatial navigation in humans may also be responsible for our ability to navigate abstract ideas. I mean, why not? Solving a problem logically follows the same principle as spatial navigation: both require planning a route from an origin to a destination.

In 2018, DeepMind happened to implement the world’s first agent built on the neural substrate for spatial navigation, known as grid cells [2]. The agent was tasked with navigating maze-like environments, and the involvement of grid cells let it take shortcuts consistently and find new shortcuts after the original ones were blocked. This is an incredible feat, one the original discoverer of grid cells described as “notoriously difficult in robotics”. Most importantly, their experiments found that the agent’s neural network developed grid-cell-like properties simply by being trained to estimate, at every step, where the agent was located in the maze and which direction it was facing. To put it simply, their finding suggests that self-awareness (in this case spatial self-awareness) is a key ingredient for solving any kind of navigational problem. This shouldn’t come as a surprise to most of us, since assessing where we are relative to our goals is crucial for achieving them.

Here is the bombshell idea of this article. We have been training deep learning models by minimizing the objective error of their inferences, but could the models come to understand the structure of this world and navigate abstract ideas if we instead minimized their error on self-awareness?

Implementing self-aware AI

Assuming that self-awareness is one of the keys AI needs to solve problems logically, how would we even implement it? The grid cell paper offers a few takeaways that shed some light on this:

  1. The error on “self-awareness” was the difference between the ground truth and the agent’s own predicted location and orientation.
  2. The ground-truth location and orientation (the agent’s self-awareness) were represented by neural activation signatures, each unit of which fires when the agent is at a particular location and orientation.
  3. Grid cells activate at regular intervals relative to where the agent is located and how it is oriented.
  4. Grid-cell-like activations emerge just before the final linear layer as the model trains to minimize the loss on self-awareness.

To summarize, “self-awareness” simply falls out of training as long as we define the environment the agent exists in and minimize the error of its self-prediction; a rough sketch of this training setup follows below. Unfortunately, grid cell experiments have so far only been done on spatial navigation, so it is unclear whether the approach can be adapted to a non-spatial system.
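
To make the four takeaways concrete, here is a minimal sketch of what such a training setup could look like. It is a loose paraphrase of the Banino et al. architecture, not a faithful reproduction: the layer sizes, the velocity encoding, and the random targets are my own placeholder assumptions, and in the real experiments the place-cell and head-direction-cell targets come from simulated trajectories.

```python
import torch
import torch.nn as nn

class PathIntegrationNet(nn.Module):
    """An RNN integrates the agent's egocentric motion and predicts
    "where am I and which way am I facing", encoded as place-cell and
    head-direction-cell activations (takeaways 1 and 2)."""
    def __init__(self, n_place=256, n_hd=12, hidden=128, bottleneck=256):
        super().__init__()
        # inputs per step: [speed, sin(angular velocity), cos(angular velocity)]
        self.rnn = nn.LSTM(input_size=3, hidden_size=hidden, batch_first=True)
        # grid-like units reportedly emerge in this linear layer (takeaways 3 and 4)
        self.bottleneck = nn.Linear(hidden, bottleneck)
        self.dropout = nn.Dropout(0.5)
        self.place_head = nn.Linear(bottleneck, n_place)  # "where am I?"
        self.hd_head = nn.Linear(bottleneck, n_hd)        # "which way am I facing?"

    def forward(self, velocities):
        h, _ = self.rnn(velocities)            # (batch, time, hidden)
        g = self.dropout(self.bottleneck(h))   # layer just before the outputs
        return self.place_head(g), self.hd_head(g)

# The "self-awareness" loss: cross-entropy between the predicted and the
# ground-truth place/head-direction activations. Targets here are random
# placeholders standing in for activations computed from true trajectories.
model = PathIntegrationNet()
vel = torch.randn(8, 100, 3)  # 8 trajectories, 100 steps each
place_target = torch.softmax(torch.randn(8, 100, 256), dim=-1)
hd_target = torch.softmax(torch.randn(8, 100, 12), dim=-1)

place_logits, hd_logits = model(vel)
loss = -(place_target * torch.log_softmax(place_logits, -1)).sum(-1).mean() \
       - (hd_target * torch.log_softmax(hd_logits, -1)).sum(-1).mean()
loss.backward()
```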

But I do have an experiment in mind. A hot topic of research in NLP is how to teach a model to capture cause-and-effect relationships. The earlier example of GPT-2 writing about fires happening under water is an instance of the model confusing cause and effect with correlation: just because many sentences say that water extinguishes fire does not mean that fire influences water. Would grid cells that learn to navigate the vector space of word embeddings capture this relationship better?
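
As a speculative sketch (entirely my own framing, not something from either paper), here is one way that question could be made concrete: treat a sentence as a trajectory through embedding space, feed the network the displacement between consecutive word vectors as its “velocity”, and have it predict which region of embedding space it currently occupies, by analogy with the place-cell targets above. The random embeddings and the soft region assignment below are placeholders for real word vectors and a real choice of targets.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, dim, n_regions = 5000, 300, 256

# Stand-in for real word vectors (e.g. GloVe or a language model's embeddings).
embeddings = rng.normal(size=(vocab, dim)).astype(np.float32)

# "Place cells" for embedding space: soft assignment to randomly chosen anchor words.
anchors = embeddings[rng.choice(vocab, n_regions, replace=False)]

def region_code(vec, temperature=10.0):
    """Softmax over negative distances to the anchors -- the 'where am I?' target."""
    d = np.linalg.norm(anchors - vec, axis=1)
    z = np.exp(-d / temperature)
    return z / z.sum()

# A sentence becomes a trajectory; the input at each step is the displacement
# from the previous word, and the target is the region code of the current word.
sentence = rng.choice(vocab, size=12)                    # token ids of a sentence
displacements = np.diff(embeddings[sentence], axis=0)    # the "velocity" inputs
targets = np.stack([region_code(v) for v in embeddings[sentence][1:]])
# These (displacements, targets) pairs would replace the spatial trajectories
# fed to the path-integration network sketched in the previous section.
```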

Conclusion

While the above experiment is speculative and this article could be a complete false alarm, it is also undeniable that deep learning has hit a wall, and a worthwhile way forward for the research community is to explore new ideas (gee, thanks, Captain Obvious!). During my computer science thesis project I explored the potential of knowledge graphs for giving neural networks the ability to reason. The project largely failed, but at least I learned that the graph approach wouldn’t work, which helped me move on to this new idea.

As of writing this article, I am taking a break from deep learning to focus on my last year of education, so I am passing the torch to you. For those of you who are tired of optimizing existing architectures, or who are attracted to this idea, I strongly encourage you to look into DeepMind’s grid cell paper and adapt it to a non-spatial application. Who knows, you might discover a new architecture that performs better than the existing ones. But if you do, just make sure you remember that you heard it here first.

[1] D. Saxton, E. Grefenstette, F. Hill, and P. Kohli, Analysing Mathematical Reasoning Abilities of Neural Models (2019), ICLR 2019

[2] A. Banino et al., Vector-Based Navigation Using Grid-Like Representations in Artificial Agents (2018), Nature 557, 429–433

Translated from: https://towardsdatascience.com/deep-learning-is-not-logical-ce0941b74f0a
