https://github.com/bwilcox-1234/ChatScript/blob/master/WIKI/ChatScript-Basic-User-Manual.md
Overview
ChatScript is a system for manipulating natural language, not just for building a chatbot.
Input and output sentences
By default these characters end sentences:
E.g. the following user phrase is split into three sentences:
What 's your name ? my is Alfred; I'm from London.
One can also disable sentence ending.
CS outputs one or more sentences back to the user.
My name is Harry!
I'm a chatbot demo, from California (USA).
Rules
A full rule has a kind, a label, a pattern, and an output. Here is a simple full rule:
?: MEAT (you like meat) I do
The rule kind is ?:, which means it only reacts to questions. The label is MEAT. The pattern is in ( ) and looks for the words you like meat in consecutive order, anywhere in the input, e.g. do you like meat. The output is I do.
...
Topics
Rules are bundled into collections called topics.
Code Script Syntax
CharSet Encoding : ASCII text files and UTF-8 files
Whitespace : ignores excessive white space.
Case : insensitive
Comments : the hash mark #
Legal declaration names : Variables must start with $ or $$ or $_, then continue with an alphabetic character followed by alpha-numerics, underscores, or hyphens.
Hello World Demo
$ BINARIES/LinuxChatScript64 local
If you want to learn about the simple Harry bot and try changing it, read the BOTHARRY document. Same if you want to build your own bot (starting by cloning Harry). Let's quickly survey what comes built in.
Fast Overview of topic files (.top)
Some people will try to dive right in, without reading the material, so here is a quick guide to what you see in this simple topic file.
Rules start with t: or ?: or u: or s:
- s: means the rule reacts to statements.
- ?: means the rule reacts to questions.
- u: means the rule reacts to the union of both.
- t: means the rule offers a topic gambit when the chatbot has control.
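As a minimal sketch of the four rule kinds side by side, the hypothetical topic below (the ~pets name, its keywords, and every sentence in it are invented for illustration) uses one rule of each kind:

```chatscript
topic: ~pets [pet dog cat]

# t: gambit, offered when the chatbot has control of the conversation
t: Do you have any pets?

# s: responder, reacts only to statements
s: (I have a dog) Dogs are good company.

# ?: responder, reacts only to questions
?: (do you have a pet) I have a cat.

# u: responder, reacts to statements and questions alike
u: (hamster) Hamsters are fun to watch.
```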
u: (run away) --> parens typically try to find specific words or sequences of words in the user's input.
u: ([ scare afraid]) --> [ ] means find anywhere in the sentence one of the words scare or afraid. And scare can be in any of its related forms: scared, scare, scaring, scares.
~animals means any of a large list of names of animals, so you do not have to write [ elephant tiger leopard ... ]
The gambits, t: lines, offer a story or expected conversation flow. For example, if you ask what someone does for a hobby, you are expected after their response to answer the question about yourself. As in:
topic: ~school [school university learn]
t: Where do you go to school?
t: I go to Harvard.
t: What is your major?
t: I am studying finance.
If he says I go to Yale.
#! gives a sample input from a user that the immediately following rule is expected to match and handle.
The special comment gives only one example of matching input, not all possible inputs that can match. It helps you understand what a responder or rejoinder is supposed to react to. It has no impact whatsoever on a user in chat.
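A short sketch of the sample-input comment in use. The first rule is the MEAT rule from the Rules section; the second rule and both sample inputs are made up for illustration:

```chatscript
#! Do you like meat?
?: MEAT (you like meat) I do.

#! I am scared of spiders
u: ([scare afraid]) Fear is natural.
```

Each #! line sits directly above the rule it documents and shows one input that rule is meant to catch.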
What Files are Where
SIMPLE TOPICS
Here is an example of a simple topic declaration:
topic: ~DEATH [dead corpse death die body]
t: I don't want to die
?: (When will you die) I don't know.
The topic declares its name, its keywords, and then its rules. It ends with the end of the file or a new top level declaration (which includes topic:, concept:, table:, tablemacro:, outputmacro:, patternmacro:, dualmacro:, bot:, data:, canon:, query:, plan:, describe:, and replace: ).
A topic name must start with a ~ followed by an alphabetic character, and the rest must be a standard legal name (containing only alpha-numeric characters, underscores, hyphens, and periods).
Keywords : allow the system to consider this topic based on matches with the user input.
Gambit Rules : create a coherent story on the topic if the bot is in control of the conversation.
Execution Order : For gambits, the order tells a story. For responders, rules are usually ordered most specific to least specific, possibly bunched by a theme.
The file RAWDATA/skeleton.top has a bunch of topics already predefined with keywords but no responders or gambits.
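The ordering advice can be sketched with a hypothetical topic (the ~going_home name, keywords, and outputs are invented here). It assumes responders are tried from top to bottom, so the narrowest pattern is placed first and the broad single-keyword rule last:

```chatscript
topic: ~going_home [home house]

# gambits: read in order, they tell a small story
t: I go home tomorrow.
t: Where is your home?

# responders: most specific pattern first
?: (when will you go home) I go home tomorrow.

# looser: up to two words may sit between the key words
?: (you *~2 go *~2 home) I often go to that home.

# least specific: any mention of home
u: (home) Home is a nice place to be.
```

If the broad rule came first, it would answer inputs that the narrower rules were written for.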
Rejoinders : If you expect the user might respond in a particular manner to the chatbot's last output, you can script rules to examine his next input and see if it matches. When it works, it makes your chatbot seem like it understands the user. These are called rejoinders and all rules can have them.
s: ( I like spinach ) Are you a fan of the Popeye cartoons?
    a: ( yes ) I used to watch him as a child. Did you lust after Olive Oyl?
        b: ( no ) Me neither. She was too skinny.
        b: ( yes ) You probably like skinny models.
    a: ( no ) What cartoons do you watch ?
        b: ( none ) You lead a deprived life.
        b: (Mickey Mouse) The Disney icon.
Rule Labels : Labels have a variety of uses. Other rules can use functions that target a particular labeled rule. You can use the debug abilities to test that rule and you can see that rule more easily in a trace. And you get a kind of documentation telling you what your rule is about.
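One way another rule can target a labeled rule is the ^reuse function, which is not covered in this document; the sketch below assumes it runs the output of the rule carrying the given label, so check the engine manuals for its exact behavior. The second rule is hypothetical:

```chatscript
# the labeled rule from the Rules section
?: MEAT (you like meat) I do.

# a differently worded question that hands off to the labeled rule's output
?: (are you a carnivore) ^reuse(MEAT)
```

The label also makes the first rule easy to spot in a trace, as described above.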
SIMPLE PATTERNS
Writing patterns is a delicate balancing act. If you are too specific, the pattern will miss all sorts of opportunities to respond to similar meanings.
?: (when will you go home) I go home tomorrow
But if your pattern is too broad, the bot responds to completely wrong meanings.
s: (home) I go home tomorrow.
In sequence ( ) : s: ( "I love you" ) Do you really?
Double quotes also help when you are trying to write words and are not sure how the system will tokenize them, or whether something is one word or a sequence of words. E.g., Bob's is actually tokenized as two words: Bob 's. And in WordNet, New_Year's_Eve is a single word.
Put things with punctuation in them in double quotes to be safe. Pattern matching a sequence is limited to 5 words in a row and will do both original and canonical forms.
Sentence boundaries < and > : use these when you need to actually know where an input begins (<) or ends (>).
Simple Indefinite Wildcards * : The wildcard * means 0 or more words in sequence.
Precise Wildcards *n : ?: ( when *1 you *1 home ) I went home yesterday
Range-restricted Wildcards *~n : ?: (you *~2 go *~2 home) I often go to that home. This responds equally to You can go home and you should not go to your home.
Unordered Matching << >> : s: ( << I birds love >> ) I love birds too.
Choices [ ] : You can match alternate words in the same position by placing those choices in brackets.
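The boundary, wildcard, and choice elements have no example of their own above, so here is a minimal sketch of each. All rules and outputs are invented for illustration:

```chatscript
# < anchors the match to the start of the sentence
s: (< I love you) Do you really?

# > requires the match to reach the end of the sentence
?: (what is your name >) My name is Harry.

# * lets any number of words, including none, sit between I and home
s: (I * home) Home is a nice place to be.

# [ ] accepts any one of the listed words in that position
s: (I [like love adore] [cat dog]) They are good company.
```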
~ Concepts : Choices are handy for synonyms, but you have to repeat them over and over in different rules.
concept: ~eat [ eat ingest "binge and purge"]
s: ( I ~eat meat ) Do you really ? I am a vegan
See the ChatScript System Variables and Engine-defined Concepts manual.
Capitalization :
Proper names : u: ( "Dr . Watson" )
Interjections, "discourse acts", and concept sets : the interjections.txt file augments this concept with discourse acts, phrases that are like an interjection. All interjections and discourse acts map to concept sets, which come through as the user input instead of what they wrote.
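Because the mapped concept set replaces what the user typed, rejoinders can test for the concept rather than for individual words. The sketch below assumes ~yes and ~no are among the concept sets that the interjection mapping produces; verify the names against the engine's concept list before relying on them. The gambit and outputs are invented:

```chatscript
t: Do you like spinach?
    # assumed to catch any input that the engine maps to ~yes
    a: (~yes) Then you would get along with Popeye.
    # assumed to catch any input that the engine maps to ~no
    a: (~no) Neither did I as a child.
```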
Canonization : LIVEDATA/ENGLISH/canonical.txt
Not ! and NotNot !! : ! means it must not be found anywhere after the current match location : u: ( !~negativeWords I * ~like * ~meat ) You like meat. !! checks just the next word from where you are.
Optional Words { } : u: ( {"be you go"} home )
Commands : to inquire about things, control things, debug things, etc. e.g. :word word
SIMPLE OUTPUT
When a rule generates output to the user, it has accomplished the goal of the topic.
Direct Output :
AutoFormat : see the ChatScript Advanced User Manual.
Literal Output \ : \n
Randomized Output [ ] : ?: (hi) [hello.][hi][hey] Are you going to [dance][swim][eat] anytime soon?
VARIABLES
CS supports several levels of memorization. See the ChatScript Fact Manual.
_ Match Variables : Just place an underscore in front of what you want memorized.
?: ( do you eat _~meat ) No, I hate _0.
$ User_Variables : s: ( I eat _*1 > ) $food = '_0 I eat oysters.
Clearing variables : $myvar = null
Long-term variables : The system normally stores variables on a per-user basis.
% System Variables : %hour, %bot and others.
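The variable kinds above can be combined in one small sketch. The topic name, keywords, and outputs are hypothetical; only the _ match, the $food assignment, the null clearing, and %hour come from the text:

```chatscript
topic: ~food_memory [eat food]

# _*1 memorizes the single word after eat; the value is then stored in $food
s: (I eat _*1 >) $food = '_0 I will remember that.

# a user variable written in the output is replaced by its stored value
?: (what do I eat) $food is what you told me you eat.

# assigning null clears the variable
s: (forget what I eat) $food = null Forgotten.

# a system variable is read the same way
?: (what hour is it) %hour is the current hour.
```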
Summary
Just remember, to start, all you need is to write a topic, with keywords, trivial gambits, responders and rejoinders with simple patterns, and output that is simply exactly what you want the bot to say.