Tiered pyAIML Chat System

From Hanson Robotics Wiki

This page describes the basic workings of the old "demo" (pre-OpenCog) chat system. It also serves as a common resource for dialog development status and for development ideas that may be implemented in this system or adapted for OpenCog or other dialog systems.

Development version on Amazon EC2

To merge our work and unify testing, and to make testing easier without installing HR software, we run a chat server on Amazon EC2.
At the time of writing, the server can be started and stopped by anyone authorized to log in via ssh.

To start:


To stop the running server:

ps -u ubuntu
(note the PID of the python process)
kill -9 <python pid>

The server runs from the update branch of character_dev. AIML git commits to the character_dev/update branch trigger a reload of the data, according to the definitions in the character.

If a reload is triggered while a chat session is in progress, the session will be interrupted.

Python client

If you have installed HEAD, you can run the Python client.
By default the HEAD client communicates with localhost, but you can use the ip command to switch to the Amazon server. The Amazon server's IP address may change later.

Browser client

Enter the following URL in a browser to test the chatbot running on the Amazon server.

Client commands

Deployment version in HEAD

The chat server (and the SOLR server, if desired) can be launched locally, so that speech recognition and chat communicate locally.

Configuring which aiml files to load and load order

 In the character_dev and HEAD versions, the configuration process is the same, but the location of the AIML files differs, as reflected in the relative paths.
 To change the files and ordering in HEAD:
 To change the files and ordering in character_dev (for the Amazon server):

Why weighted tiers?

There were at least two motivations for the tiers.

  • Adding specific responses to a large flat AIML set with many reductions (substitutions) is difficult, because other rules are likely to win the matching process. By putting a character tier ahead of more generic responses, you ensure that your responses to key questions are matched. If no response is found in the first tier, the input can pass to one or more additional tiers, such as generic.
  • Tiers make it possible to sequence through a set of different response types, with probability. In the general case, with weights that change dynamically at every conversation turn, this can support a variety of useful dialog structures: narrative sequencing, gradual changes in style, speech act sequences, and topic priorities.
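The tier fall-through described above can be sketched in a few lines of Python. This is a minimal illustration, not the actual chat server code: the respond function, the (name, weight, matcher) tuple layout, and the sample responses are all assumptions made for the example. Each weight is treated as the probability that a tier is consulted on a given turn.

```python
import random


def respond(tiers, user_input, rng=random):
    """Return (tier_name, response) from the first tier that is both
    selected by its weight and has a match; otherwise (None, None)."""
    for name, weight, match in tiers:
        # Skip this tier with probability 1 - weight (hypothetical scheme).
        if rng.random() >= weight:
            continue
        response = match(user_input)
        if response:
            return name, response
    return None, None


# Hypothetical tiers; plain dictionaries stand in for AIML kernels.
character = {"WHAT IS YOUR NAME": "I am the character."}
generic = {"WHAT IS YOUR NAME": "I am a generic chatbot.", "HELLO": "Hi there."}

tiers = [
    ("character", 1.0, character.get),  # always consulted first
    ("generic", 1.0, generic.get),      # fallback tier
]

print(respond(tiers, "WHAT IS YOUR NAME"))  # ('character', 'I am the character.')
print(respond(tiers, "HELLO"))              # ('generic', 'Hi there.')
```

Lowering the character tier's weight below 1.0 lets some turns fall through to the generic tier even when the character tier has a match, which is one way to vary response types from turn to turn.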


Tiers as Characters
 Weights on tiers 
 Sessions and transcript logging 
 Multiple response tiers raise coordination issues
  Repetition avoidance
 Topic counts
 AIML generally uses its topic tag as a tiebreaker: when identical matches are found, the category (statement-response pair) inside a topic tag is selected as the response.

Standard AIML offers no real way to 'dwell' within a particular topic. We created a strategy in which, whatever the user says while in the topic, we give a response from the topic category. This is accomplished with a catchall pattern inside the topic: <pattern> * </pattern><template><srai> topic_name </srai></template>. If there is some other match in the same tier, the robot may interrupt its monologue; if not, it will continue to hold forth on the topic, unless the weights allow it to fall through to a generic tier. We keep a single topic count that is global across all tiers.
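As an illustration of the catchall strategy, the following Python sketch generates the AIML for one topic. The helper name topic_catchall and the topic name ROBOTS are hypothetical, and the sketch assumes that the topic has an entry category whose pattern equals the topic name, so that the srai redirect has something to match.

```python
import xml.etree.ElementTree as ET


def topic_catchall(topic_name):
    """Build a <topic> element holding one catchall category that
    redirects any input (*) to the topic's own entry pattern."""
    topic = ET.Element("topic", {"name": topic_name})
    category = ET.SubElement(topic, "category")
    ET.SubElement(category, "pattern").text = "*"
    template = ET.SubElement(category, "template")
    ET.SubElement(template, "srai").text = topic_name
    return ET.tostring(topic, encoding="unicode")


# ROBOTS is a made-up topic name used only for this example.
print(topic_catchall("ROBOTS"))
```

The generated fragment can be pasted inside the aiml element of a tier's file. More specific patterns in the same tier can still match and interrupt the monologue.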

Topical material can then be organized as a first-level tier, with weighting that allows occasional responses to the user's opinions, or with a cycle of weights across turns.

Solr tier

  See the separate hr_solr repo for instructions on processing AIML or other XML with SOLR. We use the Beautiful Soup (bs4) library for ingesting AIML.

Modifications to pyAIML

 To make patterns more readable for authors, the code that reads patterns applies string.upper() to them, so patterns can be written in lower or mixed case.
 The pyAIML code handling predicates *should* communicate the topic to the global chatbot code. 
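The effect of uppercasing patterns at load time can be shown with a small sketch. The function name normalize_pattern is hypothetical, the whitespace collapsing is an added assumption, and str.upper() is used as the Python 3 counterpart of string.upper().

```python
def normalize_pattern(text):
    """Collapse runs of whitespace and uppercase an authored pattern."""
    return " ".join(text.split()).upper()


# An author can write the pattern in lower case for readability.
print(normalize_pattern("what is  your name"))  # WHAT IS YOUR NAME
```

This only helps matching if the user input is normalized the same way before it is compared against the stored patterns.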

Authoring Support Utilities

 Several utilities exist to translate AIML to a convenient spreadsheet form.   
    aiml2tsv.py is now preferred to aiml2csv, so that commas can be embedded in responses (templates) without using #Comma
    The code currently strips set commands in templates, but preserves all contents of think commands. We may revisit this, as existing ALICE AIML expects predicates to be set and gives bad answers when they are not.
    csvUtils.py has functions to translate .csv files to AIML, for either a three-column short form or the long form that attempts to preserve more AIML structure
    csvScoring.py looks at the Rate column in a transcript saved by the chat logger and accumulates various statistics, appending to a .csv file. 
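To show why a tab-separated form avoids the comma problem, here is a minimal sketch of an AIML-to-TSV conversion. It is not the actual aiml2tsv.py: the function name aiml_to_tsv, the two-column layout, and the sample category are assumptions, and it uses the standard library's xml.etree rather than bs4. Nested tags in a template are flattened to their text, which differs from the set and think handling described above.

```python
import csv
import io
import xml.etree.ElementTree as ET


def aiml_to_tsv(aiml_text):
    """Write one tab-separated row (pattern, template) per AIML category."""
    root = ET.fromstring(aiml_text)
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["pattern", "template"])
    for category in root.iter("category"):
        # itertext() flattens any nested tags to plain text.
        pattern = "".join(category.find("pattern").itertext()).strip()
        template = "".join(category.find("template").itertext()).strip()
        writer.writerow([pattern, template])
    return out.getvalue()


# Made-up sample category with a comma in the response.
sample = """<aiml version="1.0">
<category><pattern>HELLO</pattern><template>Hi, nice to meet you.</template></category>
</aiml>"""

print(aiml_to_tsv(sample))
```

Because the delimiter is a tab, the comma in the response needs no quoting or #Comma placeholder.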

Strategies to improve chat in an AIML framework

* Using tiers to enhance control

Strategies to improve chat in an AIML plus Python NLP framework