Reposted from: http://lcllcl987.iteye.com/blog/473256
I have been playing with artificial intelligence lately and found AIML (Artificial Intelligence Markup Language) to be genuinely useful, so here are my notes.
AIML OVERVIEW:
http://www.pandorabots.com/pandora/pics/wallaceaimltutorial.html
A Java engine for AIML:
http://www.geocities.com/phelio/chatterbean/?200931#BOTS
1: AIML OVERVIEW
First, let's see what AIML actually looks like:
- <aiml>
- <category><pattern>WHO WANTS TO KNOW</pattern><template>ALICE wants to know.</template></category>
- <category><pattern>WHY ARE YOU CALLED</pattern><template> <srai>WHAT DOES ALICE STAND FOR</srai> </template></category>
- <category><pattern>WHY ARE YOU NAMED *</pattern><template> <srai>WHAT DOES ALICE STAND FOR</srai> </template></category>
- <category><pattern>WHY DO YOU DREAM *</pattern><template>I dream about adding new code to ALICE.</template></category>
- <category><pattern>WHY SILVER</pattern><template>ALICE is competing for the Loebner Silver Medal.</template></category>
- <category><pattern>WHY WERE YOU NAMED ALICE</pattern><template><srai>WHAT DOES ALICE STAND FOR</srai></template></category>
- <category><pattern>WHY WERE YOU NAMED</pattern><template><srai>WHAT DOES ALICE STAND FOR</srai></template></category>
- <category><pattern>WHY</pattern><that>I AM SMARTER *</that><template>ALICE won an award for being the "most human" robot.</template></category>
- <category><pattern>WOULD ALICE *</pattern><template><srai>WOULD YOU <star/> </srai> </template></category>
- </aiml>
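A brief description of the commonly used AIML tags:
1: pattern tag: supports pattern matching (fuzzy matching with the wildcards * and _, not full regular expressions); the reply comes from the matching template
2: random tag: supports random answers (one-to-many)
4: think, system tags: support simple logic and memory, plus custom functions (I had planned to extend AIML with a tag for the Groovy language, but dropped the idea once I saw its <system> tag)
5: javascript tag: supports embedded JavaScript (useful for web chat development, e.g. changing a facial expression according to mood).
6: srai tag: supports many-to-one answers.
For details, see the official AIML documentation: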
http://alicebot.org/TR/2005/WD-aiml/WD-aiml-1.0.1-008.html
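To make the wildcard matching of the pattern tag more concrete, here is a minimal Java sketch. It is not how ChatterBean's Graphmaster works internally; the class name WildcardSketch and its match method are made up for illustration, and it assumes that each wildcard must absorb at least one word and that matching is case-insensitive.

```java
import java.util.Arrays;

public class WildcardSketch {
    /**
     * Returns the words captured by the first wildcard ("" if the pattern has
     * no wildcard), or null when the input does not match the pattern.
     */
    public static String match(String pattern, String input) {
        String[] p = pattern.trim().toUpperCase().split("\\s+");
        String[] w = input.trim().toUpperCase().split("\\s+");
        return match(p, 0, w, 0);
    }

    private static String match(String[] p, int pi, String[] w, int wi) {
        if (pi == p.length) {
            // Pattern exhausted: it is a match only if the input is exhausted too.
            return wi == w.length ? "" : null;
        }
        if (p[pi].equals("*") || p[pi].equals("_")) {
            // Assumption: a wildcard absorbs one or more words; try the shortest span first.
            for (int end = wi + 1; end <= w.length; end++) {
                if (match(p, pi + 1, w, end) != null) {
                    return String.join(" ", Arrays.copyOfRange(w, wi, end));
                }
            }
            return null;
        }
        if (wi < w.length && p[pi].equals(w[wi])) {
            return match(p, pi + 1, w, wi + 1);
        }
        return null;
    }

    public static void main(String[] args) {
        // The captured words are what <star/> would stand for in the template.
        System.out.println(match("WOULD ALICE *", "would Alice like some tea"));
        System.out.println(match("WHY ARE YOU NAMED *", "why are you named"));
    }
}
```

The srai tag then simply feeds a rewritten sentence (for example WOULD YOU followed by the captured words) back into the same matching step, which is how several patterns end up sharing one answer.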
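2: Following AIML, I put together a simple demo that extends the AIML date tag to support Java date format patterns and adds a self-learning feature.
The robot Alice behaves as follows; you can see how smart it is: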
- Alice>Hi ya! Welcome!
- you say>what's your name
- Alice>sorry, what?
- you say>this is not a good answer //learning starts here
- Alice>Sorry. What would be a good answer?
- you say>my name is Alice
- Alice>Alright! You can ask me again to see if I got it.
- you say>what's your name
- Alice>my name is Alice
- you say>what is your name
- Alice>my name is Alice
- you say>my name is Lichunlei
- Alice>hello, Lichunlei.
- you say>do you remember me?
- Alice>Your name is Lichunlei, seeker. //Alice's memory
- you say>what's time now?
- Alice>It is 10:59 A.M.
- you say>what date is today?
- Alice>Monday.
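If you are not satisfied with one of Alice's answers, just type a sentence containing "not" and "good answer"; under your guidance, Alice can then learn something new.
What makes it this clever is the AIML file, which serves as the robot's brain.
Here is Alice's AIML file: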
- <?xml version="1.0" encoding="ISO-8859-1"?>
- <aiml>
- <!-- Copyright (c) 2007 ALICE A.I. Foundation, Inc. -->
- <!-- Last modified Sep 21, 2009, by Lichunlei -->
- <category><pattern>WHAT IS TIME *</pattern><template>It is <date format="h:mm a"/>.</template></category>
- <category><pattern>WHAT DAY IS TODAY</pattern><template><date format="E"/>.</template></category>
- <category><pattern>WHAT IS TODAY *</pattern><template><date format="EEE"/>.</template></category>
- <category><pattern>MY NAME IS *</pattern><template><think><set name="name"><star/></set></think>hello, <get name="name"/>.</template></category>
- <category><pattern>DO YOU REMEMBER ME</pattern><template>Your name is <get name="name"/>, seeker.</template></category>
- <category><pattern>I CAN NOT *</pattern><template>Why can't you do <set name="it"><person/></set>?</template></category>
- <category><pattern>MY INPUT</pattern> <template> 1:<input index="1"/> 2:<input index="2"/> 3:<input index="3"/></template></category>
- <category><pattern>*</pattern><template>sorry, what?</template></category>
- <!-- Greeting categories. -->
- <category>
- <pattern>WELCOME</pattern>
- <template>
- <think>
- <system> <!-- Defines a method to create new categories from user input at run-time. -->
- import bitoflife.chatterbean.AliceBot;
- import bitoflife.chatterbean.Context;
- import bitoflife.chatterbean.Graphmaster;
- import bitoflife.chatterbean.aiml.Category;
- import bitoflife.chatterbean.text.Transformations;
- void learn(String pattern, String template)
- {
- /* The "match" variable represents the current matching context. */
- AliceBot bot = match.getCallback();
- Context context = bot.getContext();
- Transformations transformations = context.getTransformations();
- pattern = transformations.normalization(pattern);
- Category category = new Category(pattern, new String[] {template});
- Graphmaster brain = bot.getGraphmaster();
- brain.append(category);
- }
- </system>
- </think>
- Hi ya! Welcome!
- </template>
- </category>
- <!-- A category set to learn simple user-fed categories. -->
- <category>
- <pattern>* NOT * GOOD ANSWER</pattern>
- <template>
- Sorry. What would be a good answer?
- </template>
- </category>
- <category>
- <pattern>_</pattern>
- <that>WHAT WOULD BE A GOOD ANSWER</that>
- <template>
- <system>learn("<input index="3"/>", "<input index="1"/>")</system>
- Alright! You can ask me again to see if I got it.
- </template>
- </category>
- </aiml>
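A key reason Alice can learn is the <input/> tag. It remembers what the user said earlier, and each entry can be retrieved by index (counting backwards, so index 1 is the most recent input).
The program is fairly simple, just two classes:
The Alice factory: AliceBotMother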
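As a rough illustration of that reverse indexing, here is a small Java sketch. InputHistorySketch is a hypothetical class written for this note, not part of ChatterBean; it only assumes that index 1 means the current input and larger indexes go further back.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

public class InputHistorySketch {
    // Newest input sits at the head of the deque.
    private final Deque<String> inputs = new ArrayDeque<>();

    /** Records a new user input. */
    public void record(String input) {
        inputs.addFirst(input);
    }

    /** Returns the index-th most recent input (1 = current), or "" if there is none. */
    public String input(int index) {
        Iterator<String> it = inputs.iterator();
        String result = "";
        for (int i = 1; i <= index; i++) {
            if (!it.hasNext()) {
                return "";
            }
            result = it.next();
        }
        return result;
    }

    public static void main(String[] args) {
        InputHistorySketch history = new InputHistorySketch();
        history.record("what's your name");
        history.record("this is not a good answer");
        history.record("my name is Alice");
        // Mirrors learn(<input index="3"/>, <input index="1"/>) from the AIML file.
        System.out.println("pattern:  " + history.input(3));
        System.out.println("template: " + history.input(1));
    }
}
```

In the chat above, index 3 is the question that got a bad answer and index 1 is the answer the user just supplied, which is why the learn call pairs exactly those two.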
- package co.aiml;
- import java.io.FileInputStream;
- import java.io.ByteArrayOutputStream;
- import bitoflife.chatterbean.AliceBot;
- import bitoflife.chatterbean.Context;
- import bitoflife.chatterbean.parser.AliceBotParser;
- import bitoflife.chatterbean.util.Searcher;
- public class AliceBotMother
- {
- private ByteArrayOutputStream gossip;
- public void setUp()
- {
- gossip = new ByteArrayOutputStream();
- }
- public String gossip()
- {
- return gossip.toString();
- }
- public AliceBot newInstance() throws Exception
- {
- Searcher searcher = new Searcher();
- AliceBotParser parser = new AliceBotParser();
- AliceBot bot = parser.parse(new FileInputStream("Bots/context.xml"),
- new FileInputStream("Bots/splitters.xml"),
- new FileInputStream("Bots/substitutions.xml"),
- searcher.search("Bots/mydomain", ".*\\.aiml"));
- Context context = bot.getContext();
- context.outputStream(gossip);
- return bot;
- }
- }
The command-line chat program:
- package co.aiml;
- import java.io.BufferedReader;
- import java.io.IOException;
- import java.io.InputStreamReader;
- import bitoflife.chatterbean.AliceBot;
- public class Chat
- {
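- public static final String END = "bye";
- // One shared reader: a new BufferedReader per call can drop input it has already buffered.
- private static final BufferedReader IN = new BufferedReader(new InputStreamReader(System.in));
- public static String input()
- {
- // print (not println) keeps the prompt and the typed text on one line.
- System.out.print("you say>");
- String input = null;
- try
- {
- input = IN.readLine();
- } catch (IOException e) {
- e.printStackTrace();
- }
- // readLine() returns null at end of input; treat that (or a read error) as END so the loop in main() stops.
- return input == null ? END : input;
- }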
- public static void main(String[] args) throws Exception
- {
- AliceBotMother mother = new AliceBotMother();
- mother.setUp();
- AliceBot bot = mother.newInstance();
- System.err.println("Alice>" + bot.respond("welcome"));
- while(true)
- {
- String input = Chat.input();
- if(Chat.END.equalsIgnoreCase(input))
- break;
- System.err.println("Alice>" + bot.respond(input));
- }
- }
- }
A few notes on this call:
- AliceBot bot = parser.parse(new FileInputStream("Bots/context.xml"),
- new FileInputStream("Bots/splitters.xml"),
- new FileInputStream("Bots/substitutions.xml"),
- searcher.search("Bots/mydomain", ".*\\.aiml"));
context.xml: sets the application's properties, along with changeable ones such as the date format
- <context>
- <!-- The id is a unique string that identifies this context. -->
- <bot name="id" value="test_cases" />
- <!-- Bot predicates are set at load time, and cannot be changed at runtime. -->
- <bot name="output" value="Logs/gossip.txt" />
- <bot name="randomSeed" value="1" />
- <bot name="series" value="Alpha" />
- <bot name="version" value="0.7.5 Alpha" />
- <bot name="location" value="Atlanta" />
- <bot name="name" value="Alice" />
- <!-- Default values for predicates, can be changed later at runtime. -->
- <set name="dateFormat" value="yyyy-MM-dd HH:mm:ss" />
- <set name="name" value="dear friend" />
- <set name="me" value="Alice" />
- <set name="engine" value="ChatterBean" />
- <set name="topic" value="*" />
- </context>
All of the properties above can be read with AIML's <bot> and <get> tags.
splitters.xml: defines what counts as a sentence, i.e. the sentence terminators.
- <splitters>
- <splitter value="..." type="sentence"/>
- <splitter value="." type="sentence"/>
- <splitter value="!" type="sentence"/>
- <splitter value="?" type="sentence"/>
- <splitter value=";" type="sentence"/>
- <splitter value="," type="word"/>
- <splitter value=":" type="word"/>
- </splitters>
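The effect of the sentence splitters can be sketched in a few lines of Java. This is only my approximation of the idea, not ChatterBean's actual splitting code; SplitterSketch and its method names are made up, and the word-type splitters ("," and ":") are left out.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitterSketch {
    // Longest terminator first, so "..." is not consumed as three separate "." splits.
    private static final String[] SENTENCE_SPLITTERS = {"...", ".", "!", "?", ";"};

    /** Cuts the text at every sentence splitter and drops empty pieces. */
    public static List<String> sentences(String text) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            String hit = null;
            for (String splitter : SENTENCE_SPLITTERS) {
                if (text.startsWith(splitter, i)) {
                    hit = splitter;
                    break;
                }
            }
            if (hit == null) {
                current.append(text.charAt(i));
                i++;
            } else {
                flush(current, result);
                i += hit.length();
            }
        }
        flush(current, result);
        return result;
    }

    private static void flush(StringBuilder current, List<String> result) {
        String sentence = current.toString().trim();
        if (!sentence.isEmpty()) {
            result.add(sentence);
        }
        current.setLength(0);
    }

    public static void main(String[] args) {
        System.out.println(sentences("Hi there! What is your name? Bye..."));
    }
}
```

Each resulting sentence would then be matched against the AIML patterns on its own.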
substitutions.xml: defines error-tolerance rules, special-character mappings, and so on.
- <substitutions>
- <!-- Input substitutions correct spelling mistakes and convert "sentence"-ending characters into characters that will not be identified as sentence enders. -->
- <input>
- <correction><!--sentence correction-->
- <substitute find="=reply" replace=""/>
- <substitute find="name=reset" replace=""/>
- <substitute find=":-)" replace=" smile "/>
- <substitute find=":)" replace=" smile "/>
- <substitute find=",)" replace=" smile "/>
- <substitute find=";)" replace=" smile "/>
- <substitute find=";-)" replace=" smile "/>
- <substitute find="&quot;" replace=""/>
- <substitute find="/" replace=" "/>
- <substitute find="&gt;" replace=" gt "/>
- <substitute find="&lt;" replace=" lt "/>
- <substitute find="(" replace=" "/>
- <substitute find=")" replace=" "/>
- <substitute find=" u " replace=" you "/>
- <substitute find=" ur " replace=" your "/>
- <substitute find=" you'd " replace=" you would "/>
- <substitute find=" you're " replace=" you are "/>
- <substitute find=" you re " replace=" you are "/>
- <substitute find=" you've " replace=" you have "/>
- <substitute find=" you ve " replace=" you have "/>
- <substitute find=" what's " replace=" what is "/>
- ...
- </correction>
- <protection><!-- sentence protection -->
- <substitute find=",what " replace=". what "/>
- <substitute find=", do " replace=". do "/>
- <substitute find=",do " replace=". do "/>
- ...
- </protection>
- </input>
- <gender>
- <substitute find=" on her " replace="on him"/>
- <substitute find=" in her " replace="in him"/>
- <substitute find=" his " replace="her"/>
- <substitute find=" her " replace="his"/>
- <substitute find=" him " replace="her"/>
- ...
- </gender>
- <person>
- <substitute find=" I was " replace="he or she was"/>
- <substitute find=" mine " replace="his or hers"/>
- </person>
- <person2>
- ...
- <substitute find=" your " replace="my"/>
- </person2>
- </substitutions>
For example, in the chat demo above, typing either "what's your name" or "what is your name" gets the right answer. That is because
substitutions.xml contains the following rule:
- <substitute find=" what's " replace=" what is "/>
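Here is a minimal Java sketch of how such a substitution could normalize both spellings to the same pattern. It is an assumption-laden illustration rather than ChatterBean's Transformations class: SubstitutionSketch is a made-up name, only three rules are copied from the file above, and lowercasing before the replacement and uppercasing afterwards are my own simplifications.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SubstitutionSketch {
    // A few rules copied from substitutions.xml; insertion order is preserved.
    private static final Map<String, String> CORRECTIONS = new LinkedHashMap<>();
    static {
        CORRECTIONS.put(" what's ", " what is ");
        CORRECTIONS.put(" you're ", " you are ");
        CORRECTIONS.put(" u ", " you ");
    }

    /** Applies the corrections and returns the text in pattern form (upper case). */
    public static String normalize(String input) {
        // Pad with spaces so rules written with surrounding spaces also hit the first and last word.
        String text = " " + input.toLowerCase() + " ";
        for (Map.Entry<String, String> rule : CORRECTIONS.entrySet()) {
            text = text.replace(rule.getKey(), rule.getValue());
        }
        return text.trim().toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(normalize("what's your name"));
        System.out.println(normalize("what is your name"));
    }
}
```

Both inputs come out as the same string, so a single learned category is enough to answer either phrasing.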
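3: Extending AIML tags (based on ChatterBean, the Java AIML engine):
The package bitoflife.chatterbean.aiml holds ChatterBean's implementations of the AIML tags. So far it implements most of the commonly used ones.
The date tag, however, has only the most basic implementation and does not support Java date format patterns.
The date tag I have in mind should look like this: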
- <category><pattern>WHAT IS TIME *</pattern><template>It is <date format="h:mm a"/>.</template></category>
A tag class only needs to extend TemplateElement.
So I modified it as follows:
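- package bitoflife.chatterbean.aiml;
- // Imports assumed from the ChatterBean source layout.
- import java.text.SimpleDateFormat;
- import org.xml.sax.Attributes;
- import bitoflife.chatterbean.Match;
- public class Date extends TemplateElement
- {
- private final SimpleDateFormat format = new SimpleDateFormat();
- /** Value of the date tag's format attribute, added by lcl. */
- private String formatStr = "";
- public Date()
- {
- }
- public Date(Attributes attributes)
- {
- // Read the date format pattern; format is expected to be the tag's first attribute.
- formatStr = attributes.getValue(0);
- }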
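- public String process(Match match)
- {
- // Without a format attribute, fall back to the context's default dateFormat.
- if (formatStr == null || formatStr.isEmpty())
- {
- return defaultDate(match);
- }
- try
- {
- format.applyPattern(formatStr);
- return format.format(new java.util.Date());
- }
- catch (Exception e)
- {
- // An invalid pattern also falls back to the default format.
- return defaultDate(match);
- }
- }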
- private String defaultDate(Match match)
- {
- try
- {
- format.applyPattern((String) match.getCallback().getContext().property("predicate.dateFormat"));
- return format.format(new java.util.Date());
- }
- catch (NullPointerException e)
- {
- return "";
- }
- }
- }
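To see what the format patterns used in the AIML file produce, here is a small standalone Java sketch, independent of ChatterBean. DatePatternSketch is a made-up name, and pinning the locale to Locale.ENGLISH and the date to Sep 21, 2009 are my assumptions, chosen so the output does not depend on the machine.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

public class DatePatternSketch {
    /** Formats the given date with a SimpleDateFormat pattern in English. */
    public static String format(String pattern, Date when) {
        return new SimpleDateFormat(pattern, Locale.ENGLISH).format(when);
    }

    public static void main(String[] args) {
        // Fixed point in time (default time zone) so runs are repeatable.
        Date when = new GregorianCalendar(2009, Calendar.SEPTEMBER, 21, 10, 59).getTime();
        for (String pattern : new String[] {"h:mm a", "E", "EEEE", "yyyy-MM-dd HH:mm:ss"}) {
            System.out.println(pattern + " -> " + format(pattern, when));
        }
    }
}
```

Note that with an English locale the patterns E and EEE give the abbreviated day name (Mon), while EEEE gives the full name (Monday), so EEEE is the pattern to use if the bot should answer with the full weekday.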
4: For Alice to be smart enough, it needs a large enough body of AIML. The full collection of AIML sets is available at:
http://www.alicebot.org/downloads/sets.html
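Add them to the program and Alice can answer almost anything.
5: If you need to build a chatbot expert for a specific domain, AIML is a good basis for it.
6: The attachment contains the Alice source code and the demo above, as an Eclipse project.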