Incremental Fluid Construction Grammar Released
On the SourceForge project site, I just released the Java library for Incremental Fluid Construction Grammar.
Fluid Construction Grammar is a natural language parsing and generation system developed by researchers at emergent-languages.org. The system features a production rule mechanism for both parsing and generation using a reversible grammar. This library extends FCG so that it operates incrementally, word by word, left to right in English. Furthermore, its construction rules are adapted from Double R Grammar. See this post for more information about Double R Grammar.
Execution scripts for a parsing benchmark and for the unit test cases are supplied in Linux and Windows versions.
The use case utterance “the book is on the table” yields the following RDF statements (i.e. subject, predicate, object triples). A yet-to-be-written discourse mechanism will resolve ?obj-4 to the known book and ?obj-18 to the known table.
Parsed statements about “the book”:
- ?obj-4 rdf:type cyc:BookCopy
- ?obj-4 rdf:type texai:FCGClauseSubject
- ?obj-4 rdf:type texai:PreviouslyIntroducedThingInThisDiscourse
- ?obj-4 texai:fcgDiscourseRole texai:external
- ?obj-4 texai:fcgStatus texai:SingleObject
Parsed statements about “the table”:
- ?obj-18 rdf:type cyc:Table
- ?obj-18 rdf:type texai:PreviouslyIntroducedThingInThisDiscourse
- ?obj-18 texai:fcgDiscourseRole texai:external
- ?obj-18 texai:fcgStatus texai:SingleObject
Parsed statements about “the book on the table”:
- ?on-situation-localized-14 rdf:type texai:On-SituationLocalized
- ?on-situation-localized-14 texai:aboveObject ?obj-4
- ?on-situation-localized-14 texai:belowObject ?obj-18
Parsed statements about the fact that the book “is” on the table (the fact that ?on-situation-localized-14 is a proper sub-situation of ?situation-localized-10 should also be stated here):
- ?situation-localized-10 rdf:type cyc:Situation-Localized
- ?situation-localized-10 texai:situationHappeningOnDate cyc:Now
- ?situation-localized-10 cyc:situationConstituents ?obj-4
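A minimal sketch of how statements like those above could be held as subject, predicate, object triples, and how a discourse mechanism might later substitute known individuals for the variables. The class and method names, and the identifiers texai:Book-1 and texai:Table-1, are hypothetical and are not part of the released library; Java 16 or later is assumed for `record`.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class TripleResolutionSketch {

    /** One RDF statement; terms that start with '?' are unresolved variables. */
    record Triple(String subject, String predicate, String object) {

        /** Replaces subject and object when a binding exists; other terms pass through. */
        Triple resolve(Map<String, String> bindings) {
            return new Triple(
                bindings.getOrDefault(subject, subject),
                predicate,
                bindings.getOrDefault(object, object));
        }
    }

    /** Applies the bindings to every statement, preserving order. */
    static List<Triple> resolveAll(List<Triple> statements, Map<String, String> bindings) {
        return statements.stream()
            .map(statement -> statement.resolve(bindings))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // A subset of the parsed statements from the use case.
        List<Triple> parsed = List.of(
            new Triple("?obj-4", "rdf:type", "cyc:BookCopy"),
            new Triple("?obj-18", "rdf:type", "cyc:Table"),
            new Triple("?on-situation-localized-14", "texai:aboveObject", "?obj-4"),
            new Triple("?on-situation-localized-14", "texai:belowObject", "?obj-18"));

        // Hypothetical identifiers for individuals already known to the discourse.
        Map<String, String> bindings = Map.of(
            "?obj-4", "texai:Book-1",
            "?obj-18", "texai:Table-1");

        resolveAll(parsed, bindings).forEach(System.out::println);
    }
}
```

Variables without a binding, such as ?on-situation-localized-14 here, are left untouched, so resolution can proceed in stages as the discourse supplies more referents.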
The next task is to integrate IFCG into the existing, but not yet released, dialog framework. The framework will heuristically guide the application of construction rules during parsing, and plan the application of rules during generation. Furthermore, the framework will incrementally prune alternate interpretations during parsing by employing Walter Kintsch’s Construction/Integration method for discourse comprehension.
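To illustrate the pruning idea, here is a much-simplified sketch of the integration phase of Construction/Integration: activation spreads repeatedly over a network of candidate interpretations until it settles, and readings whose activation decays can then be dropped. The three-node network, the link weights, and all names are assumptions chosen for illustration; this is not the dialog framework's implementation.

```java
final class ConstructionIntegrationSketch {

    /**
     * Spreads activation until it settles.
     * links[i][j] > 0 means nodes i and j support each other; < 0 means they compete.
     * After each step negative values are clipped to zero and the vector is
     * rescaled so that total activation sums to one.
     */
    static double[] integrate(double[][] links, double[] initial,
                              double tolerance, int maxIterations) {
        double[] current = initial.clone();
        for (int iteration = 0; iteration < maxIterations; iteration++) {
            double[] next = new double[current.length];
            double total = 0.0;
            for (int i = 0; i < next.length; i++) {
                double sum = 0.0;
                for (int j = 0; j < current.length; j++) {
                    sum += links[i][j] * current[j];
                }
                next[i] = Math.max(0.0, sum);
                total += next[i];
            }
            if (total == 0.0) {
                return next; // nothing remains active
            }
            double change = 0.0;
            for (int i = 0; i < next.length; i++) {
                next[i] /= total;
                change = Math.max(change, Math.abs(next[i] - current[i]));
            }
            current = next;
            if (change < tolerance) {
                break; // the network has settled
            }
        }
        return current;
    }

    public static void main(String[] args) {
        // Hypothetical nodes: 0 = "physically on", 1 = "table is a physical object",
        // 2 = "about the topic". Node 1 supports node 0; nodes 0 and 2 compete.
        double[][] links = {
            { 1.0, 0.5, -0.5},
            { 0.5, 1.0,  0.0},
            {-0.5, 0.0,  1.0}
        };
        double[] settled = integrate(links, new double[] {1.0, 1.0, 1.0}, 1e-9, 1000);
        for (double activation : settled) {
            System.out.println(activation);
        }
    }
}
```

In this toy network the "about the topic" reading loses its activation because nothing in the context supports it, which is the kind of interpretation an incremental parser could prune.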
Steve Reed on 10 Jan 2008 at 9:00 pm #
In the AGI mail list, Ben Goertzel asked:
What is the semantics of
?on-situation-localized-14 rdf:type texai:On-SituationLocalized
On-SituationLocalized is a term I created for this use case, while postponing its associated definitional assertions. What I have in mind is that On-SituationLocalized is a specialization of SituationLocalized in which some object is “on” some other object. Because the Texai KB is derived from the rule-lacking, RDF-compatible subset of OpenCyc, I am postponing for now the representation of required commonsense rules about these terms.
Ben asked:
How would your system parse
“The book is on neuroscience”
or
“The book is on the Washington Monument”
or
“The book is on fire”
or
“The book is on my shit list”
Construction Grammar (CxG) differs from other grammars in that there is much less emphasis on grammar rules. Its basic principle is that language consists of pairings between form and meaning. That is, humans acquire these pairings as young children, matching their parents’ intentions with some signs or utterances. As they develop linguistic skills, humans induce that patterns exist among these pairings, allowing novel combinations to be understood and produced. According to CxG, there is no universal grammar, there are no absolute grammar rules constraining what is well-formed or not, and there may not even be lexical categories (e.g. nouns, adverbs, etc.). Therefore, from the viewpoint of CxG, your example variations of the “on construction” have their own associated semantics, and are not necessarily covered by the rules that I developed for my sense of “on”. In my system the construction rules are adopted from Double R Grammar, which renounces traditional lexical categories in favor of functional, semantically motivated constituents, namely (1) objects, things that can be referenced and (2) relationships among referents. Double R Grammar thus facilitates the mapping of text form to semantics.
Let’s pretend that the Texai dialog system for lexical acquisition exists, and that the system is to learn constructions for your examples. The dialog below is in Controlled English, in which the system understands and generates constrained syntax and vocabulary.
[user] Do you understand “The book is on neuroscience”?
[texai] Not entirely, neuroscience is not a physical location, how can the book be on it?
[user] The utterance means that the book is about neuroscience.
[texai] Could I generalize “on neuroscience” to any field of scientific discovery, such as “on Astronomy”, and mean that the book is about Astronomy?
[user] yes
[texai] Could I generalize field of scientific discovery to some general region, such as “on deep space”?
[user] yes, this role can be generalized to any topic.
[texai] Does the utterance “the book is on Mars” entail both the interpretation that (1) the book is physically located on Mars, and (2) the book is about the topic Mars?
[user] yes
…
[user] Do you understand “the book is on fire?”
[texai] Yes, the book is about the topic fire.
[user] There is another, more common, interpretation: There is a fire in which the book is the thing that is burning.
[texai] Could I generalize fire to any decomposition process, such as rot?
[user] no
…
[user] Do you understand “the book is on my shit-list?”
[texai] Yes, the book is about the topic shit-list.
[user] There is another, more common, interpretation: There is a shit-list, and the book is an element of the list.
[texai] I know from Wiktionary that a shitlist is “a group of people who a person holds in disregard”, but a book is not a person.
[user] The elements of a shit-list can be things.
[texai] Now I understand that “the book is on my shit-list” commonly means that the book is an element of the group of things that you hold in disregard.
…
Hope this answers your questions. And thanks for advancing my use case!!!
Steve Reed on 10 Jan 2008 at 9:56 pm #
Another comment from the AGI list, in which I answer a question from Mike Dougherty:
If I understand your question correctly, it asks whether a non-expert user can be guided to use Controlled English in a dialog system. In such a system it is expected that small differences exist between the few things that the system understands and the vast number of things that the system does not understand. The differences can be morphological (e.g. spelling), lexical (e.g. vocabulary), syntactic (e.g. passive vs active), or semantic (e.g. word sense). Therefore my challenge is to (1) find a polite, non-boring, engaging manner to get the user to say things the way the system can understand, and (2) enable the system to understand new forms, such as what the user is trying to say but the system cannot yet understand. The Texai bootstrap dialog system will be an expert system on lexical knowledge acquisition, and hopefully will swiftly grow past the very-hard-to-use stage.
This is an idea that I wanted to try at Cycorp but Doug Lenat said that it had been tried before and failed, due to great resistance among users to Controlled English. Let’s see if this idea can be made to work now, or not.
Steve Reed on 10 Jan 2008 at 10:20 pm #
Comment from the AGI list in which I answer a question from Will Pearson:
[Will said]
…What I mean by it, is systems that can learn from lessons like the following
http://www.primaryresources.co.uk/english/PC_prefix2.htm …
[I answered]
Affixes are morphological constructions and my system could have rules to handle them. I plan eventually to include such rules for combinations that are new. However, the Texai lexicon will explicitly represent all common word forms and multi-word phrases that would otherwise be covered by rules, in order to accommodate exceptions. My goal is precise understanding and generation, and that goal is guided by the desire to be cognitively plausible (i.e. do as humans do). I believe that the human mental lexicon caches morphological rules in the projected word forms paired with their semantics, and invokes these rules only when comprehending a new or uncommon combination.
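The lookup order described above can be sketched as follows: stored word forms are consulted first, and a prefix rule is tried only for a form the lexicon does not already contain. The class, its methods, and the plain-English glosses standing in for semantics are all hypothetical; the Texai lexicon itself is not shown here.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

final class LexiconSketch {

    // Explicitly stored word forms paired with a gloss of their meaning.
    private final Map<String, String> wordForms = new HashMap<>();
    // Prefix rules, tried in insertion order, paired with the gloss they contribute.
    private final Map<String, String> prefixGlosses = new LinkedHashMap<>();

    void addWordForm(String form, String meaning) {
        wordForms.put(form, meaning);
    }

    void addPrefixRule(String prefix, String gloss) {
        prefixGlosses.put(prefix, gloss);
    }

    /** Stored forms win; a prefix rule applies only when the stem is itself stored. */
    Optional<String> meaningOf(String word) {
        String stored = wordForms.get(word);
        if (stored != null) {
            return Optional.of(stored);
        }
        for (Map.Entry<String, String> rule : prefixGlosses.entrySet()) {
            String prefix = rule.getKey();
            if (word.startsWith(prefix)) {
                String stemMeaning = wordForms.get(word.substring(prefix.length()));
                if (stemMeaning != null) {
                    return Optional.of(rule.getValue() + " " + stemMeaning);
                }
            }
        }
        return Optional.empty();
    }

    /** A tiny illustrative lexicon with one exception to the "in-" rule. */
    static LexiconSketch example() {
        LexiconSketch lexicon = new LexiconSketch();
        lexicon.addWordForm("complete", "finished");
        lexicon.addWordForm("flammable", "easily set on fire");
        // Exception: "inflammable" does not mean "not flammable", so it is stored explicitly.
        lexicon.addWordForm("inflammable", "easily set on fire");
        lexicon.addPrefixRule("in", "not");
        return lexicon;
    }

    public static void main(String[] args) {
        LexiconSketch lexicon = example();
        System.out.println(lexicon.meaningOf("inflammable")); // stored form, rule not consulted
        System.out.println(lexicon.meaningOf("incomplete"));  // derived by the prefix rule
        System.out.println(lexicon.meaningOf("inept"));       // no stored stem, so no reading
    }
}
```

Storing "inflammable" explicitly keeps the "in-" rule from producing the wrong reading, which is the reason for preferring stored forms over rules.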