Initial OpenCog Based Dialogue System


Written initially by Ben G in mid-April 2016, this page gives some thoughts on a very first pass at a fully OpenCog-based dialogue system for the Hanson robots.

OpenPsi

The basic control framework for the OpenCog dialogue system described here is to be OpenPsi, which is an OpenCog-ized version of the Psi framework for motivated action (originally developed by Dietrich Dörner, then made more computationally oriented by Joscha Bach in his work with MicroPsi).

The current implementation of OpenPsi is here:

https://github.com/opencog/opencog/tree/master/opencog/openpsi

OpenPsi is pretty simplistic, and should be viewed as:

- a way of directly implementing simple context- and goal-dependent action-selection behaviors
- a central framework about which to build subtler approaches to action selection

Specifically, OpenPsi deals with rules of the form

Context & Procedure ==> Goal

or in other words

ImplicationLink
   AND
       Context
       Schema
   Goal

In essence, it does this:

- Maintains a set of (explicitly defined) system goals
- Ongoingly checks the action rules connected to these goals, to see which rules have their contexts fulfilled at the present moment
- Among the rules whose contexts are fulfilled, selects a rule for enaction; the rule is selected with a probability determined by, e.g., the degree to which its context is fulfilled and the weights (importances) of the goals
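
As a concrete illustration of the selection step, here is a minimal, self-contained Scheme sketch of probability-weighted rule selection. This is not the actual OpenPsi implementation; the scores are assumed to combine the degree of context fulfillment with the weight of the goal served.

(use-modules (srfi srfi-1))

;; Each candidate is a pair (score . rule-id), where score is assumed
;; to be (degree of context fulfillment) * (weight of the goal served).
(define (pick-rule candidates)
    ;; Roulette-wheel selection: probability proportional to score.
    (let ((r (* (random:uniform) (fold + 0 (map car candidates)))))
        (let loop ((cs candidates) (acc 0))
            (let ((acc (+ acc (caar cs))))
                (if (or (null? (cdr cs)) (>= acc r))
                    (cdar cs)
                    (loop (cdr cs) acc))))))

;; Example: three rules whose contexts are currently fulfilled.
(display (pick-rule '((0.9 . "chat-13") (0.3 . "smile-back") (0.6 . "chat-7"))))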

Obviously, this is a sort of “animal-like” model of action selection – e.g. it lacks any built-in notion of a narrative arc, of sequences of actions unfolding over time. However, it is compatible with such notions. For instance, if one has a series of actions A1, A2, A3, A4, A5 that should be done in order, then from an OpenPsi view the important thing is that part of the context for enacting A4 is that (A1, A2, A3) have already been enacted in that order. This is a simple, concrete example of what I mean by using OpenPsi as “a central framework about which to build subtler approaches to action selection”.
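
For concreteness, here is a hedged Atomese sketch of such a sequencing rule, in the Context & Schema ==> Goal form shown above. The predicate "steps-done-in-order" and the schema "do-A4" are invented for illustration; they are not existing OpenCog functions.

(ImplicationLink
    (AndLink
        ;; Context: A1, A2, A3 have already been enacted, in that order
        (EvaluationLink
            (GroundedPredicateNode "scm: steps-done-in-order")
            (ListLink
                (ConceptNode "A1")
                (ConceptNode "A2")
                (ConceptNode "A3")))
        ;; Schema: enact A4
        (ExecutionOutputLink
            (GroundedSchemaNode "scm: do-A4")
            (ListLink)))
    (ConceptNode "OpenPsi: PleaseUser"))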

Also, handling “action orchestration” (e.g. you can easily smile and say something at the same time, but you can't easily say two different things at the same time; though sometimes you can choose to give two different responses to the same utterance, in sequence) is not done by OpenPsi either. OpenPsi just uses the results of action orchestration in determining what the context is at a certain point in time.

Canned Dialogue Using OpenPsi

As a simple and relevant example of using OpenPsi, let's see how to most simply use a canned dialogue rule via OpenPsi. The immediate use-case for this is the importation of simplistic AIML-type rules into OpenCog. By this I mean rules like

How are you doing? ==> I am fine

Are you conscious? ==> I'm as conscious as you are, meat machine!

which specify a particular response to a particular statement.

AIML, ChatScript and other similar chatbot frameworks allow specification of rules like this, and also include markup intended to support particular methods of matching the triggers (left-hand sides) of the rules. They also allow specification of rules with fancier abstractions; e.g. in AIML one can say

Did you eat my *? ==> No I didn't eat your * you moron, I'm a robot!

Of course we could support rules like that in OpenCog using VariableNodes. But in this section I'm not considering this case; and empirically, it seems that the vast majority of chatbot rules in practical use do not involve this kind of complexity.
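
To make the VariableNode idea concrete anyway, here is one hedged sketch of how such a wildcard rule might look. The helpers "match-pattern" and "say-template" are hypothetical, and the underscore marks the wildcard slot:

(ImplicationLink
    (AndLink
        ;; Context: the input matches the pattern, binding $x to the "*" slot
        (EvaluationLink
            (GroundedPredicateNode "scm: match-pattern")
            (ListLink
                (SentenceNode "Did you eat my _?")
                (VariableNode "$x")))
        ;; Schema: substitute $x into the response template and say it
        (ExecutionOutputLink
            (GroundedSchemaNode "scm: say-template")
            (ListLink
                (SentenceNode "No I didn't eat your _ you moron, I'm a robot!")
                (VariableNode "$x"))))
    (ConceptNode "OpenPsi: Humor"))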

Suppose we have a canned rule of the form

INPUT UTTERANCE ==> OUTPUT UTTERANCE

and suppose we have reason to believe that following this rule will satisfy a certain goal in OpenCog, called DEMAND … with a certain weight WEIGHT-TO-THIS-DEMAND. Further, suppose this rule has some identifier called ID.

Then we can translate this rule into the following Scheme Atomese, for easy use with OpenPsi:

(psi-action-rule
    '()
    ;; Context: fuzzy-match the most recent input against the trigger
    (list (EvaluationLink
        (GroundedPredicateNode "scm: do-fuzzy-match")
        (ListLink (SentenceNode "<<INPUT UTTERANCE>>"))))
    ;; Action: say the canned response
    (ExecutionOutputLink
        (GroundedSchemaNode "scm: say")
        (ListLink (SentenceNode "<<OUTPUT UTTERANCE>>"))
    )
    ;; Goal (demand) served, and the direction of the rule's effect on it
    (ConceptNode "OpenPsi: <<DEMAND>>")
    "Increase"
    ;; Rule identifier and its weight toward the demand
    "chat-<<ID>>"
    <<WEIGHT-TO-THIS-DEMAND>>
)

Here the Atom

GroundedPredicateNode "scm: do-fuzzy-match"

does a bunch of work: it gets the sentence that was most recently said to the OpenCog system, and performs a fuzzy match between this sentence and INPUT UTTERANCE. (The fuzzy matcher itself currently needs some tweaking to perform effectively in this context, but that's a separate issue.)
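
Purely for intuition, here is a toy, self-contained sketch of fuzzy matching as simple word overlap. This is not the real matcher's algorithm, and in OpenCog the predicate would wrap its score in a TruthValue rather than returning a bare number:

(use-modules (srfi srfi-1) (srfi srfi-14))

;; Toy fuzzy match: score = fraction of the target's words that also
;; appear in the heard sentence (ignoring case and punctuation).
(define (tokenize s)
    (filter (lambda (w) (not (string-null? w)))
            (string-split
                (string-delete (string->char-set "?!.,") (string-downcase s))
                #\space)))

(define (fuzzy-score heard target)
    (let ((h (tokenize heard))
          (t (tokenize target)))
        (exact->inexact
            (/ (length (lset-intersection string=? t h))
               (max 1 (length t))))))

(display (fuzzy-score "hey there are you conscious" "Are you conscious?")) ; 1.0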

A script has been written that transforms CSV files into Scheme of the above form. This allows e.g. chat rule authors to code chat rules in a Google Sheet, and then export the sheet to CSV and auto-convert it to Scheme Atomese for loading into OpenCog.
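
The script itself is not reproduced here, but the transformation is mechanical. As a hedged sketch, assuming a simple format of one rule per CSV line with columns input, output, demand, id, weight and no commas inside fields (the real script's column layout may differ):

(use-modules (ice-9 format))

;; Turn one CSV line into the Scheme text of a psi-action-rule.
(define (csv-line->rule line)
    (let ((f (string-split line #\,)))
        (format #f "(psi-action-rule
    '()
    (list (EvaluationLink
        (GroundedPredicateNode \"scm: do-fuzzy-match\")
        (ListLink (SentenceNode ~s))))
    (ExecutionOutputLink
        (GroundedSchemaNode \"scm: say\")
        (ListLink (SentenceNode ~s)))
    (ConceptNode ~s)
    \"Increase\"
    ~s
    ~a)~%"
                (list-ref f 0)
                (list-ref f 1)
                (string-append "OpenPsi: " (list-ref f 2))
                (string-append "chat-" (list-ref f 3))
                (list-ref f 4))))

(display (csv-line->rule
    "Are you conscious?,I'm as conscious as you are meat machine!,Sociality,13,1"))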

For example, the Atomese for "Are you conscious? -- I'm as conscious as you are, meat machine!" would be as follows (if we assume this rule fulfills three goals: Sociality, PleaseUser and Humor):

(psi-action-rule
    '()
    (list (EvaluationLink
        (GroundedPredicateNode "scm: do-fuzzy-match")
        (ListLink (SentenceNode "Are you conscious?"))))
    (ExecutionOutputLink
        (GroundedSchemaNode "scm: say")
        (ListLink (SentenceNode "I'm as conscious as you are, meat machine!"))
    )
    (ConceptNode "OpenPsi: Sociality")
    "Increase"
    "chat-13"
    1
)

(psi-action-rule
    '()
    (list (EvaluationLink
        (GroundedPredicateNode "scm: do-fuzzy-match")
        (ListLink (SentenceNode "Are you conscious?"))))
    (ExecutionOutputLink
        (GroundedSchemaNode "scm: say")
        (ListLink (SentenceNode "I'm as conscious as you are, meat machine!"))
    )
    (ConceptNode "OpenPsi: PleaseUser")
    "Increase"
    "chat-13"
    1
)

(psi-action-rule
    '()
    (list (EvaluationLink
        (GroundedPredicateNode "scm: do-fuzzy-match")
        (ListLink (SentenceNode "Are you conscious?"))))
    (ExecutionOutputLink
        (GroundedSchemaNode "scm: say")
        (ListLink (SentenceNode "I'm as conscious as you are, meat machine!"))
    )
    (ConceptNode "OpenPsi: Humor")
    "Increase"
    "chat-13"
    1
)

Initial Dialogue Logic

For an initial, very-early-stage OpenCog dialogue system using OpenPsi as the core structural principle, I suggest the following conceptual approach.

I suggest initially we have four types of dialogic response:

- General question-answering
- Self-focused question-answering
- Canned verbal response
- Command response

(Yes, there is a lot of other stuff we can do, and will do soon. This is just intended for a relatively simple start.)

Canned Verbal Response

The “Canned verbal response” type has been covered above.

Command Response

For command responses, the basic logic should be something like the following. This is for the case of the command “smile”:

Implication
	AND
		EvaluationLink
			GroundedPredicateNode "scm: do-fuzzy-command-match"
			SentenceNode "smile"
		ExecutionLink
			GroundedSchemaNode "smile"
	ConceptNode "PleaseTheUser"
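
For comparison, the same rule could presumably also be written in the psi-action-rule style used for canned dialogue above. This is a hedged sketch: the rule id "command-smile" is invented for illustration, and "scm: smile" is assumed to name a grounded schema that triggers the robot's smile animation.

(psi-action-rule
    '()
    (list (EvaluationLink
        (GroundedPredicateNode "scm: do-fuzzy-command-match")
        (ListLink (SentenceNode "smile"))))
    (ExecutionOutputLink
        (GroundedSchemaNode "scm: smile")
        (ListLink))
    (ConceptNode "OpenPsi: PleaseUser")
    "Increase"
    "command-smile"
    1
)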

Self-focused QA

Here the basic logic should be as follows, considering the example of the question “What direction are you looking in?” ...

Implication
	AND
		EvaluationLink
			GroundedPredicateNode "scm: do-fuzzy-match"
			SentenceNode "what direction are you looking?"
		ExecutionLink
			GroundedSchemaNode "say"
			ExecutionOutputLink
				GroundedSchemaNode "direction-I-am-looking-to-wordnode"
	ConceptNode "PleaseTheUser"

Here the assumption is that

GroundedSchemaNode "direction-I-am-looking-to-wordnode"

outputs a WordNode describing what direction the robot is looking in.
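
A minimal sketch of such a backing function follows, assuming the OpenCog Scheme bindings are loaded. Here get-gaze-direction is a stub standing in for the real query to the robot's perception subsystem:

(use-modules (opencog))

;; Stub: in the real system this would read the robot's gaze state.
(define (get-gaze-direction) "left")

;; Backing function for
;;   GroundedSchemaNode "direction-I-am-looking-to-wordnode"
;; Returns a WordNode naming the current gaze direction.
(define (direction-I-am-looking-to-wordnode)
    (WordNode (get-gaze-direction)))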

A fancier way to do it would be

Implication
	AND
		EvaluationLink
			GroundedPredicateNode "scm: do-fuzzy-match"
			SentenceNode "what direction are you looking?"
		ExecutionLink
			GroundedSchemaNode "say"
			ExecutionOutputLink
				GroundedSchemaNode "word selection"
				ExecutionOutputLink
					GroundedSchemaNode "direction-I-am-looking"
	ConceptNode "PleaseTheUser"

In this approach,

GroundedSchemaNode "direction-I-am-looking"

finds the direction the robot is looking in and gives this information in the form of (say) a ConceptNode, and then the

GroundedSchemaNode "word selection"

takes on the task of transforming this ConceptNode into a WordNode. This “fancier” approach is a more “correct” way to do things but may be an unnecessary complexity at this stage.
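
A hedged sketch of the two-stage decomposition is below, again assuming the OpenCog bindings are loaded. The concept names and the trivial lookup table are invented; real lexical choice would be considerably subtler:

(use-modules (opencog))

;; Stage 1: perception query, returning a ConceptNode (stubbed here).
(define (direction-I-am-looking)
    (ConceptNode "gaze-direction-left"))

;; Stage 2: map the ConceptNode to a WordNode suitable for speech.
(define (word-selection concept)
    (let ((table '(("gaze-direction-left" . "left")
                   ("gaze-direction-right" . "right"))))
        (WordNode (assoc-ref table (cog-name concept)))))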

General (i.e. fuzzy matcher based) QA

Here the basic logic should be as follows, considering the example of the question “Who is the President of China?” ...

Implication
	AND
		AND
			EvaluationLink
				GroundedPredicateNode "scm: do-fuzzy-match"
				SentenceNode "Who is the President of China?"
			GreaterThanLink
				ExecutionLink
					GroundedSchemaNode "getConfidence"
					ExecutionLink
						GroundedSchemaNode "scm: get-fuzzy-match-answer"
						SentenceNode "Who is the President of China?"
				.7
		ExecutionLink
			GroundedSchemaNode "say"
			ExecutionOutputLink
				GroundedSchemaNode "scm: get-fuzzy-match-answer"
				SentenceNode "Who is the President of China?"
	ConceptNode "PleaseTheUser"

As written, the above example code would have the undesirable property of submitting the query to the fuzzy matcher twice, unless some caching were taking place behind the scenes and getting used by

GroundedSchemaNode "scm: get-fuzzy-match-answer"

The code could be made fancier to avoid this but I decided to keep it simple so it would be maximally clear.
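
One simple fix would be to memoize the matcher's answer per input sentence, so that the confidence test and the "say" step share a single query. In this sketch, fuzzy-match-answer is a stub standing in for the real matcher call, returning a pair of (answer . confidence):

;; Stub standing in for the real fuzzy-matcher query.
(define (fuzzy-match-answer sentence)
    (cons "Xi Jinping" 0.8))

(define answer-cache (make-hash-table))

;; Both the GreaterThanLink test and the "say" schema would call this,
;; so the fuzzy matcher itself runs only once per sentence.
(define (cached-fuzzy-match-answer sentence)
    (or (hash-ref answer-cache sentence)
        (let ((ans (fuzzy-match-answer sentence)))
            (hash-set! answer-cache sentence ans)
            ans)))

(display (car (cached-fuzzy-match-answer "Who is the President of China?")))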

Interweaving Verbal and Nonverbal Responses

It's probably an obvious point, but in this approach verbal and nonverbal responses are handled in basically the same way; e.g. one could have a rule

Implication
	AND
		AND
			EvaluationLink
				GroundedPredicateNode "one person present"
			EvaluationLink
				GroundedPredicateNode "scm: face-expression-perceived"
				ConceptNode "smile"
		ExecutionLink
			GroundedSchemaNode "smile"
	ConceptNode "PleaseTheUser"

encoding the notion “if there is only one person present and you see a smile, smile back”.

OpenPsi can then choose either a nonverbal or verbal action in any given situation.