Variables in the ChatScript Dialogue Framework

Article link: https://blog.csdn.net/juanjuan1314/article/details/72957528

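Variables play an important role in ChatScript: much of the important user information and system information is held temporarily in variables. These are script variables, not variables in the underlying engine code. As the previous post explained, ChatScript consists of roughly two parts, the underlying engine and the scripts; the variables discussed here belong to the script side.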

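ChatScript has five kinds of variables:

1. User variables, which start with $ or $$. A $ variable behaves like a global variable and persists across topics and across volleys, while a $$ variable behaves like a local variable and exists only within one volley.

2. Match variables (also called wildcard variables), which start with "_" and hold the string matched by a wildcard or a pattern.

3. Fact sets, which start with @.

4. Function variables, which start with ^ and are used when calling functions in a script.

5. System variables, which start with %.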

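1. Match variables

At the natural language processing level, a dialogue system has three main parts: natural language understanding (NLU), dialogue management (DM), and natural language generation (NLG). The NLU part processes the user's text, and some words in that text are often needed later by the DM and NLG parts. These words usually sit at a specific position or belong to a certain type, so they are matched with wildcards and patterns and then stored, as in the figure below:

Here, _~meat matches a word or phrase after eat that belongs to the concept meat. Replacing _~meat with _* would match everything after eat. _0 reads back the matched word or phrase; the 0 means the first match variable. The engine allows at most 20 match variables per volley, so the variables that can be read range from _0 to _20. When the volley ends, match variables are cleared automatically. (As I understand it, a volley is one round of dialogue: the user says one or more sentences and the bot answers with one or more sentences.)

Once a match variable captures something, the system stores its original form, its canonical form, and its position in the text. To illustrate original versus canonical form: the four words (小朋友, 小孩, 小娃娃, 孩子), all roughly meaning "child", are close in meaning, so the system unifies them into the single word 小朋友. When the bot replies, it uses the canonical form by default; to reply with the original form, put an apostrophe in front, as in '_0.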

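In the rule above, _0 holds the match for _~fruit, _1 holds the match for [_~animal _bear] (the [] means choose one of the alternatives), and _2 holds the match for _~like.

When a match variable matched nothing, or was explicitly set to null, using it raises no error but also produces no output.

To match a literal number, you cannot write (_1), because _1 refers to match variable 1; use _~number _0=1 instead.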
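
The three things a match variable stores (original form, canonical form, position) can be pictured with a small Python model. This is only a sketch of the idea, not ChatScript's implementation: the CANONICAL table, the MatchVariable class, and the capture_after helper are all hypothetical names invented for this illustration.

```python
from dataclasses import dataclass

# Hypothetical concept table: each member word maps to one canonical word.
CANONICAL = {"kids": "child", "children": "child", "child": "child"}


@dataclass
class MatchVariable:
    original: str   # the word exactly as the user typed it
    canonical: str  # the normalized form shared by similar words
    position: int   # 1-based index of the word in the sentence


def capture_after(words, trigger):
    """Capture the word that follows `trigger`; return None if nothing matches."""
    for i, word in enumerate(words[:-1]):
        if word == trigger:
            target = words[i + 1]
            canonical = CANONICAL.get(target.lower(), target.lower())
            return MatchVariable(target, canonical, i + 2)
    return None


m0 = capture_after("I like Kids".split(), "like")
print(m0.canonical)  # the form a plain _0 would correspond to
print(m0.original)   # the form '_0 would correspond to
```

An unmatched capture returns None here, which mirrors the point above that an empty match variable produces no output rather than an error.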

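2. User variables

As noted above, match variables only live within one volley; user variables let content be kept for longer. There are two kinds of user variables, global and local. Global variables start with $ and persist until they are cleared, while local variables start with $$ and, like match variables, last for only one volley.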
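
The difference in lifetime can be sketched in Python. The VariableStore class and its end_volley method are hypothetical names for this illustration; the sketch only models the scoping rule described above (names starting with $$ or _ are dropped at the end of a volley, names starting with a single $ survive), not how the engine actually stores variables.

```python
class VariableStore:
    """Toy model of variable lifetimes across volleys."""

    def __init__(self):
        self.values = {}

    def set(self, name, value):
        self.values[name] = value

    def get(self, name):
        # An unset variable yields None instead of raising an error.
        return self.values.get(name)

    def end_volley(self):
        # Keep only global user variables; drop $$ locals and _ match variables.
        self.values = {
            name: value
            for name, value in self.values.items()
            if not (name.startswith("$$") or name.startswith("_"))
        }


store = VariableStore()
store.set("$name", "Bob")   # global: survives the volley
store.set("$$tmp", "draft") # local: cleared when the volley ends
store.set("_0", "apple")    # match variable: cleared when the volley ends
store.end_volley()
print(store.get("$name"), store.get("$$tmp"), store.get("_0"))
```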

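The most common use of user variables is to store match variables, as follows:

The assignment operator for user variables is =. There must be at least one space between the = and both the variable and the value; otherwise the value is often not read, which I have verified in my own tests.

User variables support some simple arithmetic, such as +=, -=, *=, /=, %=, |=, &=, ^=, and |^=. An assignment can also compute its value from other variables, but parentheses cannot be used to control the order of evaluation.

Fact sets can also be combined with these operators, although there the operations behave more like set operations.

The order of evaluation inside CS differs from that of C. For example:

Here the -= is evaluated first, and the * afterwards.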
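
To make the difference concrete, here is a Python sketch that contrasts C-style precedence with the strictly left-to-right order described above. The starting value 10 and the operands 1 and 3 are made-up numbers, and eval_left_to_right is a hypothetical helper; the sketch models the behaviour this post describes rather than the engine's actual code.

```python
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_left_to_right(start, steps):
    """Apply (operator, operand) pairs strictly in order, ignoring precedence."""
    result = start
    for op, operand in steps:
        result = OPS[op](result, operand)
    return result


x = 10

# C-style reading of "x -= 1 * 3": the multiplication binds first.
c_style = x - (1 * 3)

# Left-to-right reading: subtract 1 first, then multiply the result by 3.
left_to_right = eval_left_to_right(x, [("-", 1), ("*", 3)])

print(c_style, left_to_right)
```

Because parentheses are not available, an expression that needs a different order has to be split into several assignments.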

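User variables can also be used in a pattern, for example:

Here = is equivalent to == in C: it is a test, not an assignment. The pattern means that if gender has already been set to male and the user says "I like boys", the bot responds "Oh, dear". The gender=male test and I like boys are joined as if by &&.

Match variables and user variables can also be used with logical operators, for example:

A match variable is not limited to holding matched text; it can also be assigned another value. If it is assigned a value that did not come from a match, the variable stores only the original form, with no canonical form or position. If it is assigned from another match variable, its canonical form and position are copied from that variable:

In the engine, a dedicated module, variableSystem, handles match variables and user variables.

To clear a variable, assign null to it.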

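3. System variables

System variables start with % and are mainly used to read and store certain system values and states.

(1). System date and time

(2). User input

(3). Bot output

(4). System variables (of the bot)

(5). Build data

The user input variables include: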

variable	description
%bot	current bot responding
%revisedinput	Boolean is current input from ^input not direct from user
%command	Boolean was the user input a command
%foreign	Boolean is bulk of the sentence composed of foreign words
%impliedyou	Boolean was the user input having you as implied subject
%input	the count of the number of volleys this user has made ever
%ip	ip address supplied
%language	current dictionary language
%length	the length in tokens of the current sentence
%more	Boolean is there another sentence after this
%morequestion	Boolean is there a ? or question word in the pending sentences
%originalinput	all sentences user passed into volley, before adjusted in any way except OOB data is stripped off
%originalsentence	the current sentence after tokenization but before any adjustments
%parsed	Boolean was current input parsed successfully
%question	Boolean was the user input a question – same as ? in a pattern
%quotation	Boolean is current input a quotation
%sentence	Boolean does it seem like a sentence (subject/verb or command)
%tense	past , present, or future simple tense (present perfect is a past tense)
%user	user login name supplied
%userfirstline	value of %input that is at the start of this conversation start
%userinput	Boolean is the current input from the user (vs the chatbot)
%voice	active or passive on current input

Bot output:

variable	description
%inputrejoinder	rule tag of any pending rejoinder for input or 0 if none
%lastoutput	the text of the last generated response for the current volley
%lastquestion	Boolean did last output end in a ?
%outputrejoinder	rule tag if system set a rejoinder for its current output or 0
%response	number of committed responses that have been generated for this sentence (see Advanced User- Advanced Output: Committed Responses)

System variables:

variable	description
%all	Boolean is the :all flag on? (:all to set)
%document	Boolean is :document running
%fact	Numeric value most recent fact id
%freetext	kb of available text space
%freedict	number of unused dictionary words
%freefact	number of unused facts
%maxmatchvariables	highest number of match variables, currently 20
%maxfactsets	highest number of @factsets, currently 20
%host	name of the current host machine or "local"
%regression	Boolean is the regression flag on
%server	Boolean is the system running in server mode
%rule	get a tag to the current executing rule. Can be used in place of a label
%topic	name of the current "real" topic . if control is currently in a topic or called from a topic which is not system or nostay, then that is the topic. Otherwise the most recent pending topic is found
%actualtopic	literally the current topic being processed (system or not)
%trace	Numeric value of the trace flag (:trace to set)
%httpresponse	return code of most recent ^jsonopen call
%pid	Linux process id or 0 for other systems
%restart	You can set and retrieve a value here across a system restart.
%timeout	Boolean tells if a timeout has happened, based on the timelimit command line parameter

Build data

variable	description
%dict	date/time the dictionary was built
%engine	date/time the engine was compiled
%os	os involved (linux windows mac ios)
%script	date/time build1 was compiled
%version	engine version number

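4. Fact sets

A fact is a triple, similar to the triples in a knowledge graph, made up of a subject, a verb, and an object. The following is one such triple:

Words, numbers, and other facts can all serve as field values of a fact.

The operations available on facts include ^createfact(), ^find(), and ^query(). A fact set starts with @ and is used to store the results of ^query().

Query syntax:

^query(kind subject verb object count fromset toset propagate match)

where kind can be one of the following: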

query flag	description
direct_s	find all facts with the given subject
direct_v	find all facts with the given verb
direct_o	find all facts with the given object
direct_sv	find all facts with the given subject and verb
direct_so	find all facts with the given subject and object
direct_vo	find all facts with the given object and verb
direct_svo	find all facts given all fields (prove that this fact exists)

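Subject, verb, and object need not all be supplied; a field that is supplied must match.

count is the number of results to return; fromset names the fact set that holds the initial facts; toset names the fact set that stores the results. The remaining two parameters are rarely needed. The default value for these parameters is ?, except that count defaults to -1, meaning unlimited, and toset defaults to @0.

The shorter form is the one generally used: ^query( kind subject verb object )

^query stores its results in a fact set; fact sets are named @0, @1, and so on. A fact set is a collection of facts, and can also be read as a collection of one of the s, v, or o fields (depending on kind).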
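
The direct_* query kinds can be modelled with a short Python sketch. The Fact tuple, the query function, and the sample facts are hypothetical and exist only for this illustration; the sketch assumes that the letters after "direct_" name the fields that must match and that count=-1 means no limit, as described above. It does not model fromset, toset, propagate, or match.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    subject: str
    verb: str
    object: str


def query(facts, kind, subject=None, verb=None, obj=None, count=-1):
    """Return the facts whose fields named by `kind` equal the given values."""
    fields = kind.split("_", 1)[1]  # "direct_sv" -> "sv"
    wanted = {"s": subject, "v": verb, "o": obj}
    results = []
    for fact in facts:
        actual = {"s": fact.subject, "v": fact.verb, "o": fact.object}
        if all(actual[f] == wanted[f] for f in fields):
            results.append(fact)
    # count == -1 means unlimited; otherwise keep at most `count` results.
    return results if count == -1 else results[:count]


facts = [
    Fact("I", "own", "dog"),
    Fact("I", "own", "cat"),
    Fact("Bob", "own", "dog"),
]

owned_by_me = query(facts, "direct_sv", subject="I", verb="own")
proof = query(facts, "direct_svo", subject="I", verb="own", obj="dog")
print(len(owned_by_me), len(proof))
```

The returned list plays the role of the fact set (for example @0) that later script steps read from.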

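Using a fact set:

The bot searches all facts for (I own dog). If one is found, the rule matches and outputs "yes"; otherwise the match fails and other rules are tried.

Assigning a fact set:

Using a fact set after assignment:

The field accessors are: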

fields	description
@1subject	means use the subject field
@1verb	means use the verb field
@1object	means use the object field
@1fact	means keep the fact intact (a reference to the fact) – required if assigning to another set.
@1+	means spread the subject,verb,object onto successive match variables – only valid with match variables
@1-	means spread the object,verb,subject onto successive match variables– only valid with match variables
@1all	means the same as @1+, spread subject,verb,object,flags onto match variables. _6 = ^first(@1all) - this puts subject in _6, verb in _7, object in _8
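
The field accessors in the table can be pictured with a Python sketch. The functions field and spread, and the sample fact, are hypothetical; the sketch only mirrors the table's description, where a single field is picked by name and the + and - forms spread the fields onto successive match variables in forward or reverse order.

```python
FIELD_INDEX = {"subject": 0, "verb": 1, "object": 2}


def field(fact, name):
    """Pick one field of a (subject, verb, object) triple by name."""
    return fact[FIELD_INDEX[name]]


def spread(fact, start_index, order="+"):
    """Spread the fields onto successive match variables.

    order "+" gives subject, verb, object; order "-" gives object, verb, subject.
    """
    fields = fact if order == "+" else tuple(reversed(fact))
    return {f"_{start_index + i}": value for i, value in enumerate(fields)}


fact = ("I", "own", "dog")

print(field(fact, "object"))
print(spread(fact, 6))       # subject lands in _6, verb in _7, object in _8
print(spread(fact, 0, "-"))  # object lands in _0, verb in _1, subject in _2
```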

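The functions that operate on fact sets are:

function	description
^first(factset)	returns the first fact
^last(factset)	returns the last fact
^pick(factset)	returns a random fact
^sort(factset{more fact-sets} )	sorts the fact set
^delete( factset )	deletes the facts
^length( fact-set )	returns the number of facts
^nth(factset count)	retrieves the fact at position count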
unp