Teaching Mario to play with himself: AI, machine learning, and Super Mario Bros.

Original article: http://www.extremetech.com/extreme/197886-teaching-mario-to-play-with-himself-ai-machine-learning-and-super-mario-bros

A video is available on YouTube.

One of the most challenging aspects of artificial intelligence is teaching the computer how to measure, understand, and react to the world around us. Actions that are second nature to a human must be painstakingly “taught” to a robot. A team at the University of Tübingen in Germany has created a project that turns the concept of a real-world robot AI on its head by tackling a different challenge: teaching Mario to play his own game.

The team has created a video that explains how the system works, step-by-step, but the high-level overview is that Mario’s various actions and responses can all be quantified as values. The AI appears to start with very basic information about how to navigate the world and where he is in relation to various other objects. The team also created a means of tracking Mario’s curiosity about the world (he explores his environment more when curious) and how much he focuses on collecting coins (represented by hunger).
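As a rough illustration of what "quantified as values" could look like, the sketch below models curiosity and hunger as numbers and picks a goal from whichever is stronger. The article does not describe the team's actual representation, so the names `Drives`, `choose_goal`, and `satisfy`, the 0-to-1 scale, and the decay amount are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Drives:
    """Hypothetical internal drives, each a value between 0 and 1."""
    curiosity: float  # higher means more exploring
    hunger: float     # the article's stand-in for wanting coins


def choose_goal(drives: Drives) -> str:
    """Pick the goal that serves the strongest drive (ties favor exploring)."""
    if drives.curiosity >= drives.hunger:
        return "explore"
    return "collect_coins"


def satisfy(drives: Drives, goal: str, amount: float = 0.3) -> Drives:
    """Return new drives after acting: the served drive drops, floored at 0."""
    if goal == "explore":
        return Drives(max(0.0, drives.curiosity - amount), drives.hunger)
    return Drives(drives.curiosity, max(0.0, drives.hunger - amount))


if __name__ == "__main__":
    drives = Drives(curiosity=0.8, hunger=0.6)
    for _ in range(3):
        goal = choose_goal(drives)
        print(goal)
        drives = satisfy(drives, goal)
```

With these made-up starting values, the goal alternates as each drive is partly satisfied, which is one simple way a curious Mario could end up exploring more than a hungry one.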

As it encounters enemies, the AI notes their existence. When queried with “What do you know about Goomba?” Mario responds with “I do not know anything about it.” After experimental interactions, Mario learns that he can jump or land on Goomba, and that Goomba dies when he does so.

MarioLearning

This, then, is translated back into human speech: “If I jump on Goomba, then he certainly dies.” (This last may be an artifact of German-English translation and sentence structure.) Mario learns how to navigate his environment, how to jump to higher areas to reach inaccessible locations, and how to trigger question blocks to grab power-ups or other useful items. The AI has different rules for whether Mario is small or big, and his behavior can vary depending on whether he’s got a fire flower or just a mushroom.

Reafference principle

How the reafference principle works in real life.

The slide above describes how the AI learns about its environment. Mario has an idea about how the world works — the first time he successfully jumps on Goomba, he says “If I jump on Goomba, it maybe dies.” He then tests this hypothesis on future Goombas, comparing the expected outcome with the actual result.
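The predict-then-compare loop described here can be sketched in a few lines. This is only a toy model under stated assumptions, not the team's implementation: the `Hypothesis` class, its counters, and the threshold of three confirmations before "maybe" becomes "certainly" are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """A hypothetical rule: 'if I <action> on <target>, then it <outcome>'."""
    action: str
    target: str
    outcome: str
    confirmations: int = 0
    contradictions: int = 0

    def observe(self, actual_outcome: str) -> None:
        """Compare the predicted outcome with what actually happened."""
        if actual_outcome == self.outcome:
            self.confirmations += 1
        else:
            self.contradictions += 1

    def confidence_word(self, certain_after: int = 3) -> str:
        """Map the evidence so far to a hedging word (threshold is assumed)."""
        if self.contradictions == 0 and self.confirmations >= certain_after:
            return "certainly"
        return "maybe"

    def describe(self) -> str:
        """Render the rule as a sentence in the style quoted in the article."""
        return (f"If I {self.action} on {self.target}, "
                f"then it {self.confidence_word()} {self.outcome}.")


if __name__ == "__main__":
    rule = Hypothesis("jump", "Goomba", "dies")
    rule.observe("dies")          # first successful jump
    print(rule.describe())        # still hedged with "maybe"
    rule.observe("dies")
    rule.observe("dies")          # two more confirming encounters
    print(rule.describe())        # now stated with "certainly"
```

Each new encounter either confirms or contradicts the prediction, and the wording of the belief follows the accumulated evidence.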

Mario doesn’t use scripted responses; he responds to syntax and understands a vast array of words and phrases. He can be told things such as “If you jump on Goomba, Goomba dies,” or he can learn them on his own. His complete syntax tree is shown below:

Mario's syntax tree

Humans, of course, apply these principles of learning and communication thousands of times a day, but we learn them when we’re infants. Teaching Mario that jumping on a Goomba will kill it is a fascinating example of AI, particularly since Mario simultaneously learns that jumping into a Goomba will hurt or kill himself.

Projects like this might one day be a useful step in teaching more advanced artificial intelligences rules about how to interact with humans. A properly constructed game could teach an AI the most basic rules of interacting with its environment first, then introduce more complex concepts and ideas as the game goes on. The best games already use these sorts of rules; many games will fold a basic tutorial on movement, attacks, and various player capabilities into the game itself, revealing these options as the game progresses and unlocking new abilities as the player demonstrates mastery of previous concepts.

If players can learn these rules within the relatively fixed and simple game environment, AIs may be able to learn them as well. The dangers and risks of AI have been explored a great deal of late, with multiple scientists calling for caution in our continuing research. Teaching Mario to play his own game seems relatively tame compared to the risk of societal upset and autonomous security drones.
