Solving Snake with a Neural Network and a Genetic Algorithm: Read This to Understand Neural Networks from a Game-AI Perspective! (Part 1)

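This article shows how a neural network optimized by a genetic algorithm can solve the game of Snake, illustrating an AI's ability to teach itself without pre-existing training data. It covers how neural networks work, how genetic algorithms are applied, and the challenges of designing such an AI, and it stresses that even without predefined inputs and outputs, artificial intelligence still needs a designer to build the overall system.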


Original article by Peter Binggeser


Preface

Recently, I've become obsessed with artificial intelligence. Specifically, scenarios that enable AI to learn how to accomplish an abstract goal without ever being given labeled training data or explicit instructions.

Artificial intelligence can be an overhyped, misused, and often confusing term for many. Instead of another long rant about how AI will change your life (it will) or steal your job (it won't), this article centers on a concrete and familiar task:

The Game of Snake

Snake has simple rules:
The world is a grid.
The snake can only travel orthogonally along this grid.
This world has a border that kills the snake on contact.
The snake cannot stop moving.
If the snake runs into itself, it dies.
Every time the snake eats, it grows longer.
The goal is to grow as long as possible.
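The rules above can be sketched as a single update function. This is only an illustration, not the author's implementation: the 10x10 grid size, the `step` function, and the head-first `deque` representation are all assumptions made for the example.

```python
from collections import deque

WIDTH, HEIGHT = 10, 10  # assumed grid size; the article does not give one


def step(snake, direction, food):
    """Advance the snake one cell and return (snake, alive, ate).

    snake: deque of (x, y) cells, head first.
    direction: (dx, dy), one of the four orthogonal unit vectors.
    food: (x, y) cell holding the food.
    """
    head_x, head_y = snake[0]
    new_head = (head_x + direction[0], head_y + direction[1])

    # Rule: the border kills the snake on contact.
    if not (0 <= new_head[0] < WIDTH and 0 <= new_head[1] < HEIGHT):
        return snake, False, False

    ate = new_head == food
    # The tail cell is vacated this step unless the snake eats and grows.
    body = list(snake) if ate else list(snake)[:-1]

    # Rule: running into itself kills the snake.
    if new_head in body:
        return snake, False, False

    return deque([new_head] + body), True, ate
```

Calling `step` once per tick models the rule that the snake cannot stop moving: every call either moves the head one cell or ends the game.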

When playing the game, there is a decision to make each time the snake takes a step forward: continue straight, turn left, or turn right. (Translator's note: the directions are relative to the snake's head.)
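Because the three choices are relative to the snake's heading, they have to be converted into a grid direction before the snake moves. A minimal sketch of that conversion follows; the `HEADINGS` table, the `turn` function, and the screen-style coordinates (y grows downward) are assumptions for illustration.

```python
# Headings in clockwise order, using screen coordinates (y grows downward).
HEADINGS = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # up, right, down, left


def turn(heading, action):
    """Return the new heading after 'left', 'straight', or 'right'."""
    offset = {"left": -1, "straight": 0, "right": 1}[action]
    index = HEADINGS.index(heading)
    # Moving one slot through the clockwise list is a quarter turn.
    return HEADINGS[(index + offset) % 4]
```

For example, `turn((0, -1), "right")` takes a snake heading up the screen and points it to the right.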

Our goal is to create an AI that learns how to make this same decision: first assess the state of the world that the snake lives in, then choose the move that will keep it alive and growing longer.


Choosing a Method

There are many methods, algorithms, and techniques that could be used to solve Snake. Some of these could fall under the umbrella term of AI. I'm going to focus on a single method: genetic random mutation of a neural network. (Translator's note: this is essentially a genetic algorithm.)

This is because:

I don't have a dataset of high-scoring Snake playthroughs to use to train a neural network by example.
I'm personally interested in seeing whether it's possible to evolve logic that can play Snake through only random mutations.

Genetic random mutation of a neural network is likely the most unfamiliar phrase in this article for many readers, so let's break down what is happening under the hood before going any further.

What Is a Neural Network?

Neural networks are like a modular synthesizer. A key is pressed, sending an electrical signal through a configuration of circuits designed by the musician to achieve a desired output tone, like a crunchy bass or smooth echoing strings.

A neural network is a kind of algorithm that can be used to determine the abstract relationship between some input data and a desired output. Typically, this is accomplished by training a neural network on thousands of examples. Over time the network will begin to identify the aspects of the input data that are most useful to determine the desired outcome. (Translator's note: in effect, the most heavily weighted features.) To achieve this, the neural network slowly adjusts the coefficients and weights used in a series of complex formulas to process the input data as it is shown each additional example.
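To make "coefficients and weights used in a series of formulas" concrete, here is a minimal sketch of a fully connected network's forward pass in plain Python. The layer sizes, the `tanh` activation, and the names `make_network` and `forward` are assumptions for illustration; the article does not specify its network architecture.

```python
import math
import random


def make_network(layer_sizes, rng):
    """Build random weights: one list of rows per layer, one row per neuron."""
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        # Each row holds n_in weights plus one trailing bias term.
        layers.append([[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                       for _ in range(n_out)])
    return layers


def forward(layers, inputs):
    """Pass inputs through every layer: weighted sum plus bias, then tanh."""
    activations = list(inputs)
    for layer in layers:
        activations = [
            # zip stops after the n_in weights; row[-1] is the bias.
            math.tanh(sum(w * a for w, a in zip(row, activations)) + row[-1])
            for row in layer
        ]
    return activations


network = make_network([3, 4, 3], random.Random(0))
outputs = forward(network, [0.5, -0.2, 1.0])  # three values, each in [-1, 1]
```

Training, by whatever method, means changing the numbers inside `network` so that `forward` produces more useful outputs.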

Neural networks come in many shapes, sizes, and varieties: convolutional, recurrent, long short-term memory, and so on. Designing the right configuration for a given problem can be difficult, confusing, and feel like a bit of a dark art. This is where genetics comes in.

What Is a Genetic Algorithm?

Instead of picking a network type and then slowly training it based on example Snake gameplay, we are going to create a scenario for one to evolve on its own.

All changes to the neural nets will be random, not driven by direct move-by-move feedback from playing the game. Over time, small random changes to the neural networks should lead to a fully functioning AI as the top performers in each generation survive to breed the next.

Our evolutionary process is going to function like so:

1. Randomly tweak the knobs and cables driving our neural network to create an initial set of unique versions.
2. Let each of those neural nets play Snake.
3. After every neural net has finished a game, select which neural nets performed best.
4. Create a new generation of unique neural networks based on randomly tweaking those top-performing neural nets.
5. Repeat from step 2.
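The five steps above can be sketched as a short loop. This is a hedged illustration rather than the author's code: each network is represented as a flat list of numbers (a "genome"), the caller supplies a `fitness` function standing in for "play Snake and report the score", and the population size, mutation rate, and the choice to carry parents over unchanged are all assumptions.

```python
import random


def mutate(genome, rng, rate=0.1, scale=0.5):
    """Copy a genome, nudging each value with probability `rate`."""
    return [g + rng.gauss(0, scale) if rng.random() < rate else g
            for g in genome]


def evolve(fitness, genome_length, rng, population_size=50,
           survivors=10, generations=30):
    # Step 1: create an initial set of random genomes.
    population = [[rng.uniform(-1, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Steps 2 and 3: score every genome and keep the top performers.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:survivors]
        # Step 4: fill the next generation with mutated copies of parents.
        # Carrying the parents over unchanged is an assumption (elitism).
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(population_size - survivors)]
        # Step 5: the loop repeats with the new population.
        population = parents + children
    return max(population, key=fitness)


# Toy stand-in for a Snake score: genomes closer to all zeros score higher.
best = evolve(lambda g: -sum(x * x for x in g), 5, random.Random(1))
```

Note that nothing in the loop inspects why a genome scored well; selection plus random mutation is the only source of improvement.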

So now we can just relax and let our AI evolve naturally, right? Wrong.

Artificial Intelligence Still Needs a Designer

The genetic algorithm replaces the need for upfront training data, but it is up to us (the designer) to design the larger system that enables this to work. Specifically, we need to choose the input data, output data, and decide what defines good performance in Snake. To channel our synthesizer metaphor from above: we still need to make a keyboard, a speaker, and decide what kind of sound we want to hear.
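As a rough sketch of two of those design decisions, the snippet below turns three network outputs into one of the three moves and scores a finished game. Both functions are hypothetical: picking the largest output and weighting length far above survival time are assumptions for illustration, not the scoring the author describes.

```python
ACTIONS = ("left", "straight", "right")


def choose_action(outputs):
    """Pick the action whose network output value is largest."""
    best_index = max(range(len(ACTIONS)), key=lambda i: outputs[i])
    return ACTIONS[best_index]


def fitness(final_length, steps_survived, length_weight=100, step_weight=1):
    """Score one finished game.

    Length dominates because the goal is to grow as long as possible;
    the survival term is an assumed tie-breaker between equal lengths.
    """
    return length_weight * final_length + step_weight * steps_survived
```

Whatever definition is chosen, it shapes what evolves: a score that rewards only survival, for instance, could favor snakes that circle forever without eating.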

A decent first approach for our input data is to provide the neural networks with the same information that we have. We play the game by looking at the screen: the colors of the pixels that make up the game's environment. However, this would require the neural net to form connections that represent nearly every rule of Snake described earlier: where the walls are, where the snake is, which direction it is heading, what food is, and how to find it. The input data would need to be the color of every pixel of the game: hundreds or maybe even thousands of inputs. This is by no means impossible, but it is a lot more complicated than it needs to be.

Design from AI’s Perspective

Imagine playing Snake from a first-person point of view. Be the snake. Give the world some depth and picture yourself making left and right turns to avoid the giant walls of the world and the giant moving "walls" of your body and tail.
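One way to express that first-person view in code is to measure how far the snake can travel to its left, straight ahead, and to its right before hitting a wall or its own body. This part of the article does not list the actual inputs, so the three-distance encoding and the names `distance_to_obstacle` and `first_person_view` are assumptions for illustration.

```python
# Headings in clockwise order, using screen coordinates (y grows downward).
HEADINGS = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # up, right, down, left


def distance_to_obstacle(head, direction, body, width, height):
    """Count free cells from the head along `direction` before a wall or body cell."""
    x, y = head
    distance = 0
    while True:
        x, y = x + direction[0], y + direction[1]
        if not (0 <= x < width and 0 <= y < height) or (x, y) in body:
            return distance
        distance += 1


def first_person_view(snake, heading, width, height):
    """Return free distances (left, ahead, right) relative to the heading.

    snake: list of (x, y) cells, head first.
    """
    head, body = snake[0], set(snake[1:])
    index = HEADINGS.index(heading)
    return tuple(
        distance_to_obstacle(head, HEADINGS[(index + offset) % 4],
                             body, width, height)
        for offset in (-1, 0, 1)
    )
```

Compared with feeding in every pixel, this reduces the input to three numbers that already encode walls and the snake's body from the snake's own point of view.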
