Bootstrap Dialog System Design
The Bootstrap Dialog System is the key deliverable for the Texai project this year. Its use case is described in this previous comment. Although this application could be written without fitting it into an Albus Hierarchical Control System, I chose to do so because that is the eventual Texai architecture, and spending the extra effort now will avoid a subsequent rewrite and greatly facilitate system elaboration.
Here is a diagram of a generic Albus Hierarchical Control System. Click on the image for a full-sized version.
Here is a diagram of the constituents of an Albus Node. Click on the image for a full-sized version.
Here is a diagram of the Texai implementation of an Albus Node. Click on the image for a full-sized version.
Here is a diagram of the bootstrap dialog system. Click on the image for a full-sized version.
Key: cyan = Albus node, yellow = skill, grey = library (i.e. one or more Java objects)
These are the node descriptions:
linguistic knowledge acquisition node [top node at this stage of development]
mission
- My mission is to acquire linguistic knowledge, both vocabulary and grammar rules. I sense user speech acts, which I execute. I report the result of the execution, and issue clarifying information and queries to the user according to my knowledge acquisition scripts.
sensations handled
- user speech act
tasks commanded
- generate text for the given response to the user
skills
- vocabulary acquisition, using a library for knowledge base mapping
- grammar acquisition
dialog session node
mission
- My mission is to maintain a dialog session with each user. I sense English input text and end-of-message. I comprehend the text and obtain the user speech act. I generate text for the given response semantics.
sensations handled
- words and end-of-message
sensations sent to parent
- comprehended text and perceived user speech act
tasks handled
- generate text for the given response to the user
tasks commanded
- write output text to the user
skills
- English comprehension skill, using libraries for incremental Fluid Construction Grammar rule application, semantic entailment (heuristic elaboration of the incremental discourse context with entailed knowledge), and spreading activation (incremental pruning of the alternative semantic interpretations)
- English generation skill, using libraries for discourse planning (search for the best combination of grammar rules to accept the given semantics and given discourse context), and for grammar rule application to generate each output word
Console chat session node
mission
- My mission is to maintain a console chat session with the user at the development console. I detect input words and the end-of-message. I write given text to the console.
sensations handled
- input text characters and line termination from the console
perceptions sent to parent
- perceived words and end-of-message
tasks handled
- write words to the console
- write end-of-message to the console
GUI console chat session node
mission
- My mission is to maintain a GUI console chat session with the user at the development console, in the event that the command-line console is not available to my application. I detect input words, the end-of-message, and the exit-application event. I write given text to the console.
Behaviors of the GUI console chat session node are otherwise the same as the alternative console chat session node.
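The node descriptions above share one pattern: sensations travel up from a child node to its parent, and tasks travel down from a parent to its child. The following is a minimal Java sketch of that pattern only. The class and method names (`AlbusNodeSketch`, `Node`, `sense`, `command`) are illustrative assumptions and not the actual Texai classes, and each node here forwards messages unchanged, whereas a real node would transform them (for example, words into a perceived speech act).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of Albus-style message flow; names are illustrative, not the Texai API. */
class AlbusNodeSketch {

    /** A node that passes sensations up to its parent and tasks down to its child. */
    static class Node {
        final String name;
        Node parent;
        Node child;
        final List<String> log = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        /** Places this node directly above the given lower node. */
        Node above(Node lower) {
            this.child = lower;
            lower.parent = this;
            return this;
        }

        /** Records a sensation from below, then forwards it upward if there is a parent. */
        void sense(String sensation) {
            log.add("sensed: " + sensation);
            if (parent != null) {
                parent.sense(sensation);
            }
        }

        /** Records a task from above, then delegates it downward if there is a child. */
        void command(String task) {
            log.add("task: " + task);
            if (child != null) {
                child.command(task);
            }
        }
    }

    public static void main(String[] args) {
        Node console = new Node("console chat session");
        Node dialog = new Node("dialog session");
        Node top = new Node("linguistic knowledge acquisition");
        top.above(dialog);
        dialog.above(console);

        // A sensation enters at the bottom node and travels up to the top node.
        console.sense("words and end-of-message");
        // A task enters at the top node and travels down to the bottom node.
        top.command("generate text for the given response to the user");

        System.out.println(top.name + ": " + top.log);
        System.out.println(console.name + ": " + console.log);
    }
}
```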
Use case
Let’s pretend that the Texai dialog system for lexical acquisition exists, and that the system is to learn constructions for examples provided by Ben Goertzel. The dialog below is in Controlled English, in which the system understands and generates constrained syntax and vocabulary.
[user] Do you understand “The book is on neuroscience”?
[texai] Not entirely, neuroscience is not a physical location, how can the book be on it?
[user] The utterance means that the book is about neuroscience.
[texai] Could I generalize “on neuroscience” to any field of scientific discovery, such as “on Astronomy”, and mean that the book is about Astronomy?
[user] yes
[texai] Could I generalize field of scientific discovery to some general region, such as “on deep space”?
[user] yes, this role can be generalized to any topic.
[texai] Does the utterance “the book is on Mars” entail either the interpretation that (1) the book is physically located on Mars, or (2) the book is about the topic Mars?
[user] yes
…
[user] Do you understand “the book is on fire?”
[texai] Yes, the book is about the topic fire.
[user] There is another, more common, interpretation: There is a fire in which the book is the thing that is burning.
[texai] Could I generalize fire to any decomposition process, such as rot?
[user] no
…
[user] Do you understand “the book is on my shit-list?”
[texai] Yes, the book is about the topic shit-list.
[user] There is another, more common, interpretation: There is a shit-list, and the book is an element of the list.
[texai] I know from Wiktionary that a shitlist is “a group of people who a person holds in disregard”, but a book is not a person.
[user] The elements of a shit-list can be things.
[texai] Now I understand that “the book is on my shit-list” commonly means that the book is an element of the group of things that you hold in disregard.
Arthur T. Murray on 20 Jan 2008 at 10:37 am #
I hope that there will be a public interface for Netizens to converse with the AI over the Web.
Steve Reed on 20 Jan 2008 at 12:57 pm #
Thanks for studying my design Arthur. In fact I have already registered texai as a user at both jabber.org and at Google Chat. So when I get the console chat session operating OK, then I will plug in the Smack API as another bottom node so that anyone can chat with the system with a Jabber-compatible client.
-Steve
Joe Simone on 21 Jan 2008 at 10:34 pm #
How will texai handle conflicting knowledge, erroneous knowledge and net hooliganism? I understand the dialog will be controlled, but will it be directed so as to gather specific facts that are missing from its current KB?
Thanks.
Steve Reed on 22 Jan 2008 at 12:00 am #
Joe, during the initial use of the bootstrap dialog, I will manually review all interactions for two purposes: (1) to keep bad data out of the KB, (2) to develop scalable heuristics for automating the quality control. I expect that the system will operate with real user names, email address confirmation, and passwords, and that the system will inform the users that if vandalism is detected then they will be banned, and marked as not a friend of the system.
Regarding what to learn, the system will be motivated to fill gaps in linguistic knowledge, both vocabulary mapping (e.g. WordNet synsets to OpenCyc concepts) and grammar constructions. Furthermore, acquired knowledge will be vetted with a heuristically determined number of other users.
===========
I am writing the dialog system from the lower level nodes upwards to the highest level node, in order to begin coordinated testing as soon as possible. I discovered while writing the console chat Java class that the java.io.Console instance is not always available, e.g. during a NetBeans debugging session. Therefore I altered the design to accommodate a GUI (window) in which the text chat dialog can occur. The system will initialize automatically to either the command-line console or the GUI console as the bottom node that interacts with the real world. I also changed the ‘performative’ sensation to ‘speech act’.
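The fallback described above can be sketched in a few lines of Java. `System.console()` returns null when the JVM has no attached console, which is the situation reported for the NetBeans debugger. The class, enum, and method names below are hypothetical, and the actual Texai initialization code may differ.

```java
import java.io.Console;

/** Hypothetical sketch of choosing the bottom node; the names here are assumptions. */
class BottomNodeChooser {

    enum BottomNode { COMMAND_LINE_CONSOLE, GUI_CONSOLE }

    /** Picks the command-line console when one is attached, otherwise the GUI console. */
    static BottomNode choose(Console console) {
        return console != null ? BottomNode.COMMAND_LINE_CONSOLE : BottomNode.GUI_CONSOLE;
    }

    public static void main(String[] args) {
        // System.console() yields null when no console device is attached to the JVM.
        System.out.println(choose(System.console()));
    }
}
```

Passing the `Console` in as a parameter, rather than calling `System.console()` inside `choose`, keeps the decision testable without a real terminal.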
Mclean Edwards on 23 Jan 2008 at 5:07 am #
I am curious as to why you do not specifically limit the vocabulary for initial AI development.
Since humans are environment sensing organisms, our vocabulary is rooted in mainly visual terms and kinetic terms. The simple phrase ‘See spot run.’ is immediately understandable to a child, whereas ‘Sort this list of names alphabetically’ takes a more advanced hold of the language (i.e. the child needs to be able to read/write and understand more abstract concepts).
For an electronic intelligence, this situation is reversed. For such a system, ‘See spot run.’ is an incredibly abstract and difficult concept, and requires at the very least an operational and complex visual subsystem in order to differentiate it from any other abstract real-world concept.
My argument is that as we teach children language based on visual and kinetic clues rather than the philosophical inconsistencies in Kant’s works, we should teach vocabulary to AI ‘children’ in terms that are more tangible to them, such as code, logic, and mathematics.
Furthermore a simplified vocabulary should be much simpler to teach, even if to us this vocabulary is far from simple.
Banned vocabulary: Emotions, real world objects (besides the user), sensations, kinetic and visual references, and most adjectives and adverbs.
This would decimate the available vocabulary, and lead to a more sane knowledge base, but would unfortunately prevent training by Instant Messenger and unspecialized humans.
Steve Reed on 23 Jan 2008 at 9:50 pm #
McLean, the bootstrap dialog system vocabulary and grammar are somewhat limited. It is a Controlled English. The system’s external sensors and actuator will conduct a chat session with the user. The system, at its lowest hierarchical level, will thus have grounded perceptions of character strings. At its highest level it will have reflective knowledge of the mapping between word forms, WordNet/Wiktionary word senses and OpenCyc-derived knowledge base (KB) terms. It will also have reflective knowledge about its grammar constructions. Built-in behavior at the highest level will motivate the system to understand perceived utterances for which it currently lacks either (1) referential vocabulary (e.g. noun word form --> noun word sense --> KB term mapping), (2) relational vocabulary (e.g. verb word form --> verb word sense --> KB action term mapping & argument mapping), or (3) a grammar construction that covers the utterance. The system will have built-in behavior to conduct a clarification dialog with the user to obtain the desired knowledge, to demonstrate to the user that the knowledge has been learned correctly, and to perform a regression test ensuring that the new knowledge is not in conflict with prior knowledge.
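The two-step vocabulary mapping (word form to word sense, then word sense to KB term) and the notion of a gap can be illustrated with a small Java sketch. The identifiers in it, such as `book-noun-1` and `Book`, are invented placeholders rather than real WordNet or OpenCyc names, and the plain maps stand in for whatever lexicon storage the system actually uses.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of detecting a vocabulary gap; all identifiers are placeholders. */
class LexiconGapSketch {

    /** Returns the KB term for a word form, or empty when either mapping step is missing. */
    static Optional<String> kbTerm(String wordForm,
                                   Map<String, String> formToSense,
                                   Map<String, String> senseToTerm) {
        return Optional.ofNullable(formToSense.get(wordForm))
                .map(senseToTerm::get); // a missing KB term becomes an empty Optional
    }

    public static void main(String[] args) {
        Map<String, String> formToSense = Map.of(
                "book", "book-noun-1",
                "neuroscience", "neuroscience-noun-1");
        // Only "book" has a KB term, so "neuroscience" is a gap to ask the user about.
        Map<String, String> senseToTerm = Map.of("book-noun-1", "Book");

        for (String word : new String[] {"book", "neuroscience"}) {
            boolean mapped = kbTerm(word, formToSense, senseToTerm).isPresent();
            System.out.println(word + (mapped ? ": mapped" : ": gap, start a clarification dialog"));
        }
    }
}
```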
Although I understand your point about restricting word categories such as adjectives and adverbs, I believe that Fluid Construction Grammar and the knowledge base terms readily accommodate them. In Cyc, the convention is to represent these modifiers as class membership. For example a red object is an object that is a member of the class of red-colored things. A fast activity is an activity that is a member of the class of those activities that progress in a fast manner.
The bootstrap dialog system has two ultimate goals: (1) the acquisition of broad coverage of English vocabulary and grammar, and (2) the acquisition of skills, especially programming skills, so that its behavior can be improved without further recourse to programming it in Java.
-Steve
Steve Reed on 01 Feb 2008 at 2:31 am #
[I received the following email comment from Evgenii Philippov]
I have carefully studied your site and here are my thoughts. I did not yet check FCG and Double R materials.
About me. I am a software developer (10 years of Java experience) with a pure math education, and for the last 10-15 years I have devoted a lot of my free time to NLP and deep NLP understanding.
Overall, I am moving in exactly the same direction as you: my main areas of interest are intelligent chat-bots with precise deep NL understanding, and automatic program analysis and synthesis.
I did a project similar to yours and reached similar results. That is, I used a simple use-case (4 lines of English) and successfully parsed it into logical propositions. I used a custom-built parser with custom grammar formalism, and a toy grammar. You have moved a lot further: you use a good formalism for parsing/generation, and a good grammar formalism as a starting point.
OK, we achieved that.
What I find complex are further stages. I did not find any thoughts or details on these on your site.
Some examples of hard tasks are:
- Learning unknown grammar rules
- Learning unknown words
- Resolving word sense ambiguities
- Resolving co-reference ambiguities
- Handling ungrammatical input (imagine parsing javadocs as a use case)
What mechanisms do you want to employ in these areas?
I’ll share my use case which my system used. My ultimate goal for the first stage was to create an automatic programmer who would be capable of parsing the following specification, and capable to emit a Java program that implements this specification. The specification was:
Write a TCP echo server. (Client connects to the server, sends bytes to the server, and receives them back immediately, until client does disconnect on behalf of its own). Server must be able to serve more than one client simultaneously. The server should be implemented in Java.
My system, as I said, is able to parse this text into a set of logical predicates. It accomplishes this using a toy grammar and a toy set of verb frame predicates. (I used OpenCyc as a predicate storage, but I haven’t used its commonsense reasoning except for querying the predicate extent and things hierarchy queries.) I have some screenshots of my system’s internal structures and results, if you want.
Now I am stuck with further development. I expected to find some future directions on your site, but failed.
Here are the reasons why my system can’t move:
- I find it difficult to cope with ambiguities. I don’t really know how to reason about them.
- I don’t know how to represent programs and programming domain knowledge. This applies even to synthesis.
- I find it difficult to cope with ungrammatical input.
Actually my system is just my personal toy, not affiliated with anyone. And I would like to join forces since we try to reach exactly the same goal, and I like your approach.
One other thing that I would like to ask:
What place for (a) AGI in general and (b) Novamente/OpenCog-like AGI do you see in the TexAI architecture? I guess AGI will be necessary for language understanding, acquisition, automatic programming, bootstrapping, wikipedia parsing — nearly for any of the ultimate TexAI goals.
Evgenii Philippov
Steve Reed on 01 Feb 2008 at 2:32 am #
Hi Evgenii,
Thanks for commenting on my web site. Assuming that you do not mind, I am going to post your message as a comment on my blog to which I will reply as follows:
First, given that our approach to creating artificial intelligence is indeed very similar, I suggest that you study Fluid Construction Grammar, and Double R Theory. Tomorrow I give a talk at Cycorp about these two topics, and my partially completed presentation is stored in the Texai project code repository at http://texai.svn.sourceforge.net/viewvc/*checkout*/texai/IncrementalFCG/doc/Incremental%20Fluid%20Construction%20Grammar.ppt
> Learning unknown grammar rules
The dialog system will learn unknown grammar rules by being taught by a human teacher. I will develop knowledge acquisition scripts for this purpose. I will not use a statistical approach because I want precise understanding.
> Learning unknown words
The published Texai lexicon is already very large, thus it is likely that the dialog system will be able to find existing word forms. However, only about 11,000 word senses are mapped to concepts in the knowledge base. The dialog system will have knowledge acquisition scripts that allow the human teacher to map previously unmapped words to the knowledge base. While at Cycorp I wrote a workflow tool for mapping WordNet word senses to Cyc terms, and I have a good idea what behavior is required from the system.
> Resolving word sense ambiguities
> Resolving co-reference ambiguities
I will use the theory of Walter Kintsch named Construction/Integration. In this theory, all the possible interpretations of an utterance are put into a connected network in which the nodes are the individual propositions and in which the nodes are linked if they share concept terms. Spreading activation is employed to prune out all but the most strongly stimulated interpretation. I have written a Java module that duplicates his book example, and I am eager to see if this works. Unfortunately, there are no other parsers that use this technique.
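The pruning step can be illustrated with a deliberately simplified Java sketch. It is not a reproduction of Kintsch's book example or of the Texai module. It assumes a small hand-written link matrix, an inhibitory link between two rival interpretations, clipping of negative activation to zero, and rescaling so that the strongest node has activation 1. All of those choices, and the class and method names, are assumptions made for illustration.

```java
/** Hypothetical, simplified sketch of spreading activation over a proposition network. */
class SpreadingActivationSketch {

    /** Repeatedly spreads activation along the links until the values stop changing. */
    static double[] settle(double[][] links, double[] initial, int maxIterations, double tolerance) {
        double[] current = initial.clone();
        for (int iteration = 0; iteration < maxIterations; iteration++) {
            double[] next = new double[current.length];
            double max = 0.0;
            for (int i = 0; i < links.length; i++) {
                double sum = 0.0;
                for (int j = 0; j < links[i].length; j++) {
                    sum += links[i][j] * current[j];
                }
                next[i] = Math.max(0.0, sum); // negative activation is clipped to zero
                max = Math.max(max, next[i]);
            }
            double change = 0.0;
            for (int i = 0; i < next.length; i++) {
                if (max > 0.0) {
                    next[i] /= max; // rescale so the strongest node has activation 1
                }
                change = Math.max(change, Math.abs(next[i] - current[i]));
            }
            current = next;
            if (change < tolerance) {
                break; // the network has settled
            }
        }
        return current;
    }

    public static void main(String[] args) {
        // Node 0: "the book is physically located on Mars"
        // Node 1: "the book is about the topic Mars"
        // Node 2: discourse context "the user is discussing book topics"
        double[][] links = {
            { 1, -1, 0 },  // node 0 competes with node 1
            { -1, 1, 1 },  // node 1 competes with node 0 and is supported by the context
            { 0, 1, 1 },   // the context supports node 1
        };
        double[] settled = settle(links, new double[] {1, 1, 1}, 100, 1e-9);
        String winner = settled[1] > settled[0] ? "topic reading" : "location reading";
        System.out.println("Surviving interpretation: " + winner);
    }
}
```

In this toy network the context node shares a concept with the topic reading only, so the location reading loses its activation and is pruned.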
> Handling ungrammatical input (imagine parsing javadocs as a use case)
Construction grammar has the feature that any syntax unit can be a construction. If I can understand a Javadoc comment, then I should be able to describe a construction rule for it. I think that constructions can be generalized to accommodate non-text elements such as layout and icons.
> My ultimate goal for the first stage was to create an automatic programmer who would be capable of parsing the following specification, and capable to emit a Java program that implements this specification …
I also want my bootstrap dialog system to acquire programming skills. Regarding your use case I expect that your system should have an adequate commonsense understanding of the application domain, and relevant domain algorithms:
* TCP socket protocol
* Client/Server paradigm
* multithreading, including thread safety
* what “echo” means
It should have an understanding of the generally applicable programming algorithms that are relevant:
* object-oriented program classes and instances
* variable typing
* statement sequencing and control
* method definition and invocation
* program composition (e.g. initialization, process, finalization, and exception handling)
* test case generation
* programming safety (i.e. be Friendly and do not cause the host to crash)
And it should understand the relevant Java concepts:
* how to compose a Java class
* how to map attributes of domain objects to Java instance variables
* how to compose Java statements
* how to use applicable class libraries
* how to compile and execute a Java program
If the Texai dialog system were applied to your use case, then I expect a great deal of the effort would be to acquire the background algorithm knowledge from human teachers. Furthermore, I think that the specified program would be developed during a dialog with the teacher. It would only be after the system has adequately learned the subject, that it could program automatically from text specifications.
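For comparison, a hand-written Java program that satisfies the quoted specification is quite small. The sketch below is one possible version, assuming a thread-per-client design and an arbitrary default port of 7777. The class and method names are hypothetical, and nothing here was produced by either system discussed in this thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical hand-written TCP echo server; design choices here are assumptions. */
class EchoServerSketch {

    /** Sends every received byte straight back until the client disconnects. */
    static void echo(Socket client) {
        try (Socket socket = client;
             InputStream in = socket.getInputStream();
             OutputStream out = socket.getOutputStream()) {
            byte[] buffer = new byte[1024];
            int count;
            while ((count = in.read(buffer)) != -1) {
                out.write(buffer, 0, count);
                out.flush();
            }
        } catch (IOException e) {
            // The connection failed or was reset; there is nothing more to echo.
        }
    }

    /** Accepts clients until the server socket is closed or fails, one thread per client. */
    static void serve(ServerSocket server) {
        while (!server.isClosed()) {
            try {
                Socket client = server.accept();
                // A separate thread per client lets several clients be served at once.
                new Thread(() -> echo(client)).start();
            } catch (IOException e) {
                return; // accept fails once the server socket has been closed
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // The default port is an arbitrary choice for this sketch.
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 7777;
        serve(new ServerSocket(port));
    }
}
```

Even this short program touches most of the background knowledge listed above: sockets, the client/server paradigm, threads, and exception handling.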
> What place for (a) AGI in general and (b) Novamente/OpenCog-like AGI do you see in the TexAI architecture?
I wish to create Artificial General Intelligence, and am building the bootstrap dialog system to achieve it ultimately using a multitude of volunteers to teach it. This is in contrast to building an intelligent chatbot for its own sake, and therefore requiring AGI because that is what’s needed.
OpenCog and Novamente are closer to raw perceptions than my own work. I believe that my approach, at a higher level of abstraction, can achieve recursive self-improvement faster than beginning with low level sensations and pattern recognition. I am using the Albus Hierarchical Control System architecture because my system will ultimately have interaction with, and have its concepts grounded in, the real world.
Cheers.
-Steve
texai.org » Bootstrap Dialog System Status 2008-02-08 on 09 Feb 2008 at 2:12 am #
[…] I posted the bootstrap dialog system design. See the project roadmap to see how this component supports the creation of artificial intelligence. […]
John Bäckstrand on 26 Mar 2008 at 7:39 pm #
Wooha, jabber and AI. Sounds like pure goodness to me