Bootstrap Dialog System Status 2008-04-08
Note that the system can currently understand only a single utterance. Progress toward broader coverage of English awaits completion of the Grammar Acquisition Skill and the Vocabulary Acquisition Skill.
I’ve completed writing the English Comprehension Skill, described with graphs here, and tested it on my use case utterance “the book is on the table”. This skill contains, as a component, the parsing rule application library described in my previous system status post. In February and March I completed two additional libraries: one to perform Kintsch spreading activation, and the other to perform discourse elaboration. Both rely on the AI technique of spreading activation, so I factored out a generally useful Java library for spreading activation and released it on the SourceForge project site as a separate package.
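For readers unfamiliar with the technique, here is a minimal sketch of spreading activation over a weighted graph. It is not the released library's API: the class name `SpreadingActivationSketch`, the methods `addLink` and `spread`, and the `decay`, `threshold`, and `maxIterations` parameters are all assumptions made for illustration. Activation starts at seed nodes and is passed along weighted links, attenuated at each hop, until it falls below a threshold.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: names and parameters are hypothetical, not the Texai library API.
class SpreadingActivationSketch {

    /** A weighted, directed link to a neighboring node. */
    static final class Link {
        final String target;
        final double weight;

        Link(String target, double weight) {
            this.target = target;
            this.weight = weight;
        }
    }

    private final Map<String, List<Link>> links = new HashMap<>();

    void addLink(String from, String to, double weight) {
        links.computeIfAbsent(from, k -> new ArrayList<>()).add(new Link(to, weight));
    }

    /**
     * Spreads activation outward from the seed nodes. At each hop a node passes
     * (its incoming activation * link weight * decay) to each neighbor, and
     * contributions below the threshold are dropped.
     */
    Map<String, Double> spread(Map<String, Double> seeds, double decay,
                               double threshold, int maxIterations) {
        Map<String, Double> activation = new HashMap<>(seeds);
        Map<String, Double> frontier = new HashMap<>(seeds);
        for (int i = 0; i < maxIterations && !frontier.isEmpty(); i++) {
            Map<String, Double> next = new HashMap<>();
            for (Map.Entry<String, Double> entry : frontier.entrySet()) {
                List<Link> outgoing = links.getOrDefault(entry.getKey(), Collections.emptyList());
                for (Link link : outgoing) {
                    double passed = entry.getValue() * link.weight * decay;
                    if (passed >= threshold) {
                        next.merge(link.target, passed, Double::sum);
                    }
                }
            }
            // Accumulate what arrived this hop, then spread only the new activation next hop.
            for (Map.Entry<String, Double> entry : next.entrySet()) {
                activation.merge(entry.getKey(), entry.getValue(), Double::sum);
            }
            frontier = next;
        }
        return activation;
    }

    public static void main(String[] args) {
        SpreadingActivationSketch graph = new SpreadingActivationSketch();
        graph.addLink("book", "table", 0.8);
        graph.addLink("table", "furniture", 0.5);

        Map<String, Double> seeds = new HashMap<>();
        seeds.put("book", 1.0);
        System.out.println(graph.spread(seeds, 0.5, 0.05, 10));
    }
}
```

The decay factor and threshold together bound how far activation travels, and the iteration cap guards against cycles in the graph.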
Attention is now focused on English language generation, which will be implemented in the English Generation Skill, and will be supported by at least two new libraries:
- Discourse Planner Library - aggregates meaning propositions into sentence-sized chunks and attaches rhetorical relations (e.g. motivation, elaboration) that relate sentences
- Generation Rule Application Library - looks up and applies Fluid Construction Grammar rules to transform a set of meaning propositions into an English utterance
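To make the second library's job concrete, here is a toy sketch of turning meaning propositions into an utterance by rule lookup. It is not Fluid Construction Grammar and not the planned library's design: the `Proposition` class, the two rule tables, and the template strings are all hypothetical, and real construction grammar rules carry far richer syntactic and semantic structure than a string template.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a template lookup standing in for real grammar rules.
class GenerationRuleSketch {

    /** A meaning proposition such as on(book1, table1). */
    static final class Proposition {
        final String predicate;
        final List<String> args;

        Proposition(String predicate, String... args) {
            this.predicate = predicate;
            this.args = Arrays.asList(args);
        }
    }

    // Hypothetical rule tables: unary predicates map to noun phrases,
    // binary predicates map to sentence templates.
    private final Map<String, String> lexicalRules = new HashMap<>();
    private final Map<String, String> relationRules = new HashMap<>();

    GenerationRuleSketch() {
        lexicalRules.put("book", "the book");
        lexicalRules.put("table", "the table");
        relationRules.put("on", "%s is on %s");
    }

    String generate(List<Proposition> meaning) {
        // Pass 1: apply lexical rules to find a phrase for each entity.
        Map<String, String> phraseFor = new HashMap<>();
        for (Proposition p : meaning) {
            String phrase = lexicalRules.get(p.predicate);
            if (p.args.size() == 1 && phrase != null) {
                phraseFor.put(p.args.get(0), phrase);
            }
        }
        // Pass 2: apply the first relation rule whose arguments both have phrases.
        for (Proposition p : meaning) {
            String template = relationRules.get(p.predicate);
            if (p.args.size() == 2 && template != null) {
                String subject = phraseFor.get(p.args.get(0));
                String object = phraseFor.get(p.args.get(1));
                if (subject != null && object != null) {
                    return String.format(template, subject, object);
                }
            }
        }
        throw new IllegalArgumentException("no applicable generation rule");
    }

    public static void main(String[] args) {
        List<Proposition> meaning = Arrays.asList(
                new Proposition("book", "book1"),
                new Proposition("table", "table1"),
                new Proposition("on", "book1", "table1"));
        System.out.println(new GenerationRuleSketch().generate(meaning));
    }
}
```

With the three propositions in `main`, the sketch reproduces the use case utterance "the book is on the table".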
NL generation systems typically contain a module for content determination. I will postpone writing this until its capability can be defined as part of the Texai dialog behaviors such as lexicon acquisition.
Having a cognitively plausible system (i.e. a human-thinking-like system) is a design guideline for the Texai project. Therefore I divide English language generation into two different kinds of cognitive activities:
- dialog utterance generation
- monologue or document generation.
In the first case I expect that being cognitively plausible means that use of a pipelined architecture is precluded, and that the generation system incrementally generates a short utterance or only a few sentences that are grounded in a rich shared discourse context. In the second case I think that being cognitively plausible is much more permissive with regard to the variety of problem solving techniques that can be employed to craft an appropriate document for an expected audience. To illustrate the differences, I suppose that (deep) planning cannot be used when generating a dialog utterance, but that it is a good tool when laying out a document. Humans may perform the act of planning when creating a document.
The dialog utterance generation case is clearly easier to code, because it consists of mostly the same behavior steps as dialog utterance comprehension, performed in reverse. So I plan to code for the first case (dialog utterance generation), while keeping the second case (monologue or document generation) in mind for future compatibility.