Constructing a Learning AI

As game developers, we quickly learn that Artificial Intelligence (AI) doesn't need to be so tough.  To reference Space Ghost: "Moltar, I have a giant brain that is able to reduce any complex machine into a simple yes or no answer."  Despite the humor of their conversation, he pegged the nature of AI: turn anything into a simple yes or no answer.  A simple AI might be, "If the player is to the left, move the enemy ship to the left," which works for interception or chase, but it is not a learning AI.  A learning AI must have the ability to decide to go right for reasons it was not specifically programmed for in the first place.
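The "move left" rule above can be written in a couple of lines. This is a minimal Python sketch (the function name and parameters are illustrative, not from any particular engine):

```python
def chase_step(player_x: float, enemy_x: float, speed: float = 1.0) -> float:
    """Move the enemy ship one step toward the player's x position."""
    if player_x < enemy_x:   # player is to the left -> move left
        return enemy_x - speed
    if player_x > enemy_x:   # player is to the right -> move right
        return enemy_x + speed
    return enemy_x           # aligned: hold position
```

Note that every branch here was decided by the programmer in advance; nothing in this function can surprise its author, which is exactly what makes it a non-learning AI.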

A learning AI is highly important for ensuring that non-player characters (NPCs) will have more interactive choices for the players, without requiring developers or AI designers to program AI for each possible option combination in the game.  (A game with 10 spells, 10 items and 10 effects would be 10*10*10 combinations, or 1000 combinations to be programmed for.)


To cover the AI, I’ll discuss two different types of learning AI, reactionary and meditative.


Reactionary Learning

Reactionary learning is where the game learns immediately.  It still has to depend on pre-programmed choices, but those choices may depend on historical data.  I.e., in Mortal Kombat, let's say the player keeps winning with one particular move when approaching as a first strike.  After a few times of this happening, the AI should change its response to a different approach, particularly if a known counter to the player's move of choice exists.

Reactionary learning requires that the game's AI is not just paying attention to the present state of things, but is also keeping certain histories of past conditions.  Normally the AI might choose a basic punch or kick against a player as they approach, but the AI could be extended to look at what the last 3 action choices were after the player started moving towards the AI NPC.  If there is a majority, i.e. 2 or 3 of the same move type, presume that is a favorite, and select a choice that counters it.  I.e., if 2 out of the last 3 attacks chosen were low kicks, the AI may instead choose to jump.

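The last-3-moves majority rule above can be sketched as follows. This is a hypothetical Python illustration; the move names and the counter table are made up for the example, not taken from any real game:

```python
from collections import Counter, deque

# Hypothetical counter table: which response beats which attack.
COUNTERS = {"low_kick": "jump", "high_punch": "duck", "sweep": "block"}

class ReactionaryAI:
    """Tracks the player's last 3 approach attacks and counters a favorite."""

    def __init__(self, default_move: str = "punch", history_len: int = 3):
        self.history = deque(maxlen=history_len)  # old entries fall off automatically
        self.default_move = default_move

    def observe(self, player_move: str) -> None:
        """Record one attack the player used while approaching."""
        self.history.append(player_move)

    def choose(self) -> str:
        """Counter the player's apparent favorite, else use the default move."""
        if self.history:
            move, count = Counter(self.history).most_common(1)[0]
            # Majority (2 of the last 3) -> presume it's a favorite and counter it.
            if count >= 2 and move in COUNTERS:
                return COUNTERS[move]
        return self.default_move
```

The `deque(maxlen=3)` is the "certain histories of past conditions" from the text: old observations expire on their own, so the AI only ever reasons about recent behavior.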

Meditative Learning AI

Unlike reactionary learning, meditative learning is not processed in real time during play.  I.e., it is not used immediately.  If it is a client-only game (no server), the game may still analyze lots of data, but typically during level completion, during saves, or always in a separate lower-priority thread.  The point of this learning is that it can look at more data and devise new strategies to be used in the game.  This is not limited to the choice of moves, but also which environmental variables to pay attention to.

To start out, we'll break the game up into abilities and metrics.  Abilities are what an NPC can do.  Metrics are data points to listen to.  (Note that metrics can also include reactionary data to pay attention to, but for simplicity I'll leave that option out.)


Abilities might include

Attack with ranged weapon


Attack with melee weapon


Change weapon


Heal myself


Run away


You can see that the abilities don't specify which weapon, but simply choose a weapon with range or melee.  These are the choices an NPC might have.  These can have simple requirements as to whether or not an NPC can even do it.  For instance, an NPC who only has a dagger cannot have "Attack with ranged weapon", nor could they have "Change weapon" as an option.


Metrics might include

Last change in opponent health from this ability

This looks at the impact on the opponent's health the last time the ability was used.

Current opponent health


Distance of opponent


Current health


Average health change per action


The metrics are simply points of data.  For each ability, you give it a handful of metrics to pay attention to; these should be only the ones you think are obviously related to the ability most of the time.  The more complex a metric's processing requirements, the fewer of them an ability can afford.  For instance, one metric might be the "total possible strength of any spell the opponent could cast".  That would require the metric to look at the opponent's mana, then through all of the spells they could execute for that mana amount, and then determine the amount of damage each might do, including checking the NPC's armor types, effects, etc., whereas "Opponent HP" is considerably less computation.


For each of the metrics on an ability, you need to supply a percentage indicating how much the value of that metric matters.  These should have a combined limit of less than 70% (or some other cap).  The AI will use the remaining percentage to make stuff up and see how it works.

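The weighting rule above can be enforced with a small helper. A Python sketch, assuming the 70% cap from the text; the metric names and weight values are illustrative:

```python
METRIC_WEIGHT_CAP = 0.70  # hand-tuned weights must stay under this cap

def free_weight(metric_weights: dict) -> float:
    """Return the unassigned percentage the AI may allocate experimentally."""
    total = sum(metric_weights.values())
    if total >= METRIC_WEIGHT_CAP:
        raise ValueError(f"metric weights sum to {total:.2f}; cap is {METRIC_WEIGHT_CAP}")
    return round(1.0 - total, 10)

# Example: metrics for "Attack with ranged weapon" (values are made up)
ranged_attack_metrics = {
    "distance_of_opponent": 0.40,
    "current_opponent_health": 0.20,
}
```

With the example weights, 40% of the decision remains free, which is the slack the learning engine will later fill with random experimental metrics.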

Creating an AI for an NPC

First, design your NPC.  Perhaps an orc with a dagger and a healing potion.  Second, choose the abilities you think are most common, such as "Attack with melee weapon" and "Heal myself".  Finally, you need to decide preference percentages for them.  For example, it will choose "Attack with melee weapon" 40% of the time and "Heal myself" 30% of the time.  That leaves 30% undetermined, which the AI will use for learning, similar to the metrics we established earlier for each ability.

Optionally, you might also give it X coins for spending on items, or upgrading spells, for the NPC to use.  A completely AI-driven character would have 0 abilities by default, and simply be given a larger amount of coins to spend on items and spells.


Now that the general setup is made, the AI needs a way to study different setups.  This part I usually refer to as a learning engine.


Learning Engine

The learning engine is where mock battles take place; it tries out different possibilities and analyzes the results.  The engine creates variations of the NPC you have chosen (typically using a database to store all the data).  For instance, say the NPC still had 30% of its action percentage unassigned.  The engine will randomly select a handful of other abilities to account for it: perhaps it chooses 30% for "Run away", or 10% for "Run away" and 20% for "Heal myself".  Since the unit already had heal at 30%, that would mean heal now has a 50% chance to be used.  Optionally, the engine may also choose to allocate only 10%, leaving 20% unused.  These are all choices the engine picks at random; it holds no preferences for what to pick.  If any of the generated NPCs are identical, it replaces one with another new random selection.

It does this same thing with the choices of metrics that each ability will pay attention to, using the available percentages to add other random metrics.  For instance, perhaps it will add "Does opponent have [X]" and check whether the player has a "water flask" or a "shield".  The water flask will probably never matter, but the AI will experiment anyway.

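The variant-generation step can be sketched as below. This is a simplified Python illustration (ability names are the ones from the list above; the 10% allocation chunk is an assumption). Note that, unlike the text, this sketch always allocates the full remainder rather than sometimes leaving some unused:

```python
import random

ALL_ABILITIES = ["Attack with ranged weapon", "Attack with melee weapon",
                 "Change weapon", "Heal myself", "Run away"]

def make_variant(base: dict, rng: random.Random, chunk: float = 0.10) -> dict:
    """Randomly distribute the undetermined percentage across abilities."""
    variant = dict(base)
    remaining = round(1.0 - sum(variant.values()), 10)
    while remaining > 1e-9:
        ability = rng.choice(ALL_ABILITIES)
        step = min(chunk, remaining)          # allocate a 10% chunk, or what's left
        variant[ability] = round(variant.get(ability, 0.0) + step, 10)
        remaining = round(remaining - step, 10)
    return variant
```

Running this many times with different seeds yields the pool of candidate NPCs the engine will pit against each other; duplicates would simply be regenerated.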

If the AI has coins to spend, it would choose extra features/items prior to applying percentages, and then choose from the increased list of available abilities.  


Next, it takes each of the NPCs it generated and pits each one against every other at least X times in a simulated, no-graphics battle.  (X could be 3 or 1000.)  Each gets a score at the end of the battle, based on how many hit points it started with vs. how many it ended with, the same for magic points, the time it took to win, etc.  Each AI is then ranked in a list, to see which AI came out best, then second best, and so on.  AIs that lost 100% of the time are established as pretty dumb and discarded.  AIs that win 100% of the time should be flagged for review, to see whether they exploited a weakness in your game balancing.

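The round-robin ranking can be sketched as a small function. This is illustrative Python: the battle simulation itself is passed in as a callable (a real engine would run the full no-graphics fight and return the hit-point/magic-point/time-based scores described above):

```python
import itertools

def rank_variants(variants, battle, rounds: int = 3):
    """Pit every variant against every other `rounds` times and rank by score.

    `battle(a, b)` must return (score_a, score_b) for one simulated fight.
    Returns variant indices sorted best-first.  A full engine would also
    discard 100%-loss variants and flag 100%-win variants for review.
    """
    totals = [0.0] * len(variants)
    for i, j in itertools.combinations(range(len(variants)), 2):
        for _ in range(rounds):
            score_i, score_j = battle(variants[i], variants[j])
            totals[i] += score_i
            totals[j] += score_j
    return sorted(range(len(variants)), key=lambda i: totals[i], reverse=True)
```

Because the battle function is a parameter, the same ranking code works whether X is 3 quick fights or 1000 thorough ones.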

When a player enters an Orc dungeon, there could be X areas in the tunnel, and as the player progresses, the game engine selects the AI to use (the set of abilities and metrics to pay attention to) from the ranked list of AIs for orcs, making it tougher the farther you proceed.


Player's Learning Engine

The player's learning engine operates in much the same way, except that you have to design some player characters as they would be at specific levels.  For example, pick your starting-level character, then a variation or two of where they might be at level 2, then more variations at level 3, etc.

Each player option then gets pitted against the different AIs established earlier in a variety of battles.  If level 1 never beats any of the AIs, that shows you probably have a balancing issue in the game: either Orcs shouldn't be found in level 1 areas, or the Orcs' abilities should be reduced.  (The learning engine shouldn't pit player vs. player unless you expect that as a common feature in your game.)


Typically, for an average level 1 player, you would want simpler AIs, ones that win only about 10% of the time against a level 1 character, right at the beginning.  Then as the player progresses, just prior to levelling up, they would find AIs that tend to win about 40% of the time, and so on.  The player could also select a difficulty level, which would simply limit the AI choices to ones that tend to win less.

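Selecting an AI by target win rate is straightforward once each AI carries its observed win rate against the relevant player tier. A hypothetical Python sketch (field names are made up):

```python
def pick_ai(ranked_ais, target_win_rate: float) -> dict:
    """Pick the AI whose observed win rate is closest to the target.

    E.g. target 0.10 for a fresh level 1 player, 0.40 just before level-up.
    A difficulty setting would simply lower the target (or filter the list).
    """
    return min(ranked_ais, key=lambda ai: abs(ai["win_rate"] - target_win_rate))
```

The same lookup also drives the dungeon progression described earlier: ramp the target win rate up area by area and the encounters get tougher on their own.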

Recap

So this essentially turns your game AI into a statistical analysis tool, so you don't have to do all the work of tuning the AI by hand.  As your game and AI evolve, you can adjust the AI metrics and abilities to focus on the features players use more often.

Despite this section being a recap, there is still more.  The next items give some ideas of how to bend and apply this to your game.


Automatic Balancing

Balancing metrics is quite easy.

Character deals X damage, defends X attacks, wields X powers, etc…  


Metrics are simple to compare from NPC to NPC or NPC to character.  The challenge is getting the AI to change over time.  Players eventually find weaknesses in the AIs and will learn to exploit them to gain abilities faster.  Guides get posted, the masses apply them, and you end up with an unbalanced MMORPG where 95% of players pick one class type, or everyone prefers one specific weapon.


To resolve this, metrics can be recorded live in-game, particularly about a specific player type, at a particular level, against particular AIs.  Suppose a particular set-up wins far more often than any other, e.g. a thief wins 98% of the time, while the wizard class wins only 80% of the time.  Common characters of both types can be run through the player learning engine to determine AIs that can help balance that out.  Then start increasing the number of times the over-performing player set-up runs into that AI type.  The monster can stay the same, i.e. each player would still run into an Orc, but a wizard would see an AI tuned to wizards, while thieves would see AIs tuned for a thief.

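The "same monster, different brain" swap above might be wired up like this. A hypothetical Python sketch; the class names, AI identifiers, and 90% threshold are all assumptions for illustration:

```python
# Hypothetical mapping from over-performing class to a class-tuned AI,
# produced by re-running the player learning engine as described above.
TUNED_AI = {"thief": "orc_anti_thief", "wizard": "orc_anti_wizard"}

def ai_for_encounter(player_class: str, win_rate: float,
                     default_ai: str = "orc_standard",
                     threshold: float = 0.90) -> str:
    """Swap in the class-tuned AI only while that class's live win rate
    (recorded in-game) is above the balance threshold."""
    if win_rate > threshold and player_class in TUNED_AI:
        return TUNED_AI[player_class]
    return default_ai
```

Because the swap keys off live win-rate data, the extra pressure on a class fades automatically once its win rate drops back under the threshold.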

Basic Application

Even if you don't take the time to build a learning engine to try out all these battles, you can still benefit from developing AIs that respond to specific metrics.  Besides being a good way to differentiate things, you can still apply a learning engine later without having to rewrite your AI.

Naturally, I've not included every detail, like the fact that some metrics need limits (e.g. health metrics would pay attention to health dropping below a certain level, and the learning engine may try experimenting with that), or the fact that abilities change if you change your equipped items mid-battle.  The point of this article is that you have a general idea of how to apply these ideas to get started.


Translated from: https://www.experts-exchange.com/articles/11183/Constructing-a-Learning-AI.html
