Maintaining Consistency and Plausibility in Integrated Natural Language Understanding


Toyoaki Nishida, Xuemin Liu, Shuji Doshita, and Atsushi Yamada

Department of Information Science, Kyoto University, Sakyo-ku, Kyoto 606, Japan

phone: 81-75~751-?,lll ext. 5396 email: nishida%doshita.kuis.kyoto-u.junet%japan@relay.cs.net

In this paper, we present an inference mechanism called the integrated parsing engine, which provides a uniform abductive inference mechanism for natural language understanding. It can (1) make plausible assumptions, (2) reason with multiple alternatives, (3) switch the search process to the maximally plausible alternative, (4) detect contradictions and remove conclusions which depend on inconsistent assumptions, and (5) update the plausibility factor of each belief based on new observations. We demonstrate that a natural language understanding system using the integrated parsing engine as a subsystem can pursue a guided search for the most plausible interpretation by making use of syntactic, semantic, and contextual information.

Natural language understanding involves many hard issues, such as various types of ambiguity, indeterminacies caused by ellipses or fragmentary utterances, and ill-formedness. In the face of these difficulties, it does not seem reasonable to seek a method of logically deducing the speaker's intended meaning or plan from utterances. Instead, it is much more natural to characterize natural language understanding as an abductive process of exploring the most plausible interpretation which can explain the given utterances.

In this paper, we present an abductive inference mechanism, called the integrated parsing engine, for natural language understanding. The integrated parsing engine is able to:

• make plausible assumptions at the appropriate time

• reason with multiple alternatives based on different sets of assumptions

• switch the search process to the maximally plausible alternative

• detect contradictions resulting from inconsistent assumptions and eliminate all conclusions which depend on these assumptions

• update the plausibility factor of each belief based on new observations.

Thus, the integrated parsing engine is general enough to carry out linguistic and nonlinguistic inferences in a uniform manner, by drawing information from various sources: syntax, semantics, discourse, pragmatics, or the real world.

In the remainder of this paper, we first describe mechanisms for maintaining consistency and plausibility. We then show how these two mechanisms interact to guide the inference process. Finally, we use an implemented example to demonstrate how the integrated parsing engine is used to interpret sentences by taking contextual factors into account.

The CME (Consistency Maintenance Engine) is a component of the integrated parsing engine responsible for maintaining consistency among beliefs. The basic design principles of the CME are based on de Kleer's ATMS (Assumption-based Truth Maintenance System) [de 86]. The CME maintains a set of alternative beliefs, each of which consists of a set of assumptions and their conclusions, as follows:

                 environment                          conclusions
alternative 1    $\{A_{11}, \ldots, A_{1m_1}\}$       $B_{11}, \ldots, B_{1m'_1}$
...
alternative n    $\{A_{n1}, \ldots, A_{nm_n}\}$       $B_{n1}, \ldots, B_{nm'_n}$

An external problem solver is assumed to exist which makes assumptions, adds conclusions, and detects contradictions.

The main task of the CME is to maintain alternative beliefs by removing all alternatives whose set of assumptions has turned out to be contradictory. Like the ATMS, the CME takes advantage of the following monotonic property:

if a contradiction is derived from a set of assumptions $A$, then a contradiction is also derived from any set of assumptions $B$ such that $B \supseteq A$.

Thus, if a contradiction is derived from a set of assumptions {B, D}, alternative interpretations depending on sets of assumptions such as {B, C, D}, {A, B, D}, {A, B, C, D}, ... are removed. In addition, the CME keeps records of contradictory sets of assumptions to prevent any interpretation depending on them from being considered in the future.
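The pruning rule above can be illustrated with a small sketch. The class and function names (`NogoodStore`, `prune`) are hypothetical and not taken from the paper; the sketch only assumes that environments are plain sets of assumption labels and that a recorded contradictory set rules out every superset.

```python
class NogoodStore:
    """Keeps contradictory assumption sets (hypothetical helper, not the paper's code)."""

    def __init__(self):
        self.nogoods = []  # list of frozensets known to be contradictory

    def add(self, assumptions):
        new = frozenset(assumptions)
        # Nothing to record if a smaller (or equal) nogood already covers this set.
        if any(n <= new for n in self.nogoods):
            return
        # Recorded nogoods that are supersets of the new one are now redundant.
        self.nogoods = [n for n in self.nogoods if not new <= n]
        self.nogoods.append(new)

    def is_contradictory(self, environment):
        # Monotonic property: any superset of a contradictory set is contradictory.
        env = frozenset(environment)
        return any(n <= env for n in self.nogoods)


def prune(alternatives, store):
    """Return only the alternatives whose environment is still consistent."""
    return [env for env in alternatives if not store.is_contradictory(env)]


store = NogoodStore()
store.add({"B", "D"})
alternatives = [{"B", "C", "D"}, {"A", "B", "D"}, {"A", "B", "C", "D"}, {"A", "C"}, {"B"}]
surviving = prune(alternatives, store)
```

Keeping only minimal contradictory sets is a design choice of this sketch: it keeps the record small while still rejecting every superset on later lookups.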

Unlike the ATMS, whose control regime is breadth-first, our CME uses a tree called the environment tree, or the E-tree for short, to guide the search process. Each node of the E-tree represents an environment, that is, a set of assumptions. Each arc of the E-tree indicates that the lower node is derived from the upper node by making one more assumption. Thus in figure 1, $E_0$ is the root node, and it represents an environment without any assumption. Nodes below $E_0$ represent environments with one more assumption added to their parent node's environment. Thus, $E_1 = E_0 \cup \{A_1\} = \{A_1\}$, $E_{11} = E_1 \cup \{A_{11}\} = \{A_1, A_{11}\}$, and so on.

We assume that the assumptions made at the same parent node are mutually exclusive. Although this is a rather strong assumption, it makes sense in natural language understanding, since many of the assumptions made during the understanding process are mutually exclusive. Even if this is not the case, any set of assumptions can be transformed into a set of mutually exclusive assumptions by adding appropriate conditions. Although this is a cumbersome solution, it is seldom needed in natural language understanding and, most importantly, it reduces the amount of computation.
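As a rough illustration of the E-tree described above, the following sketch represents each node by the single assumption added on its incoming arc and recovers the environment by walking up to the root. The class `ENode` and its methods are hypothetical names chosen for this example; the paper's Lisp implementation is not shown in the text.

```python
class ENode:
    """One E-tree node: the parent's environment plus one more assumption (sketch)."""

    def __init__(self, assumption=None, parent=None):
        self.assumption = assumption  # None only for the root E_0
        self.parent = parent
        self.children = []

    def environment(self):
        # The environment is the set of assumptions on the path from the root.
        env = set() if self.parent is None else set(self.parent.environment())
        if self.assumption is not None:
            env.add(self.assumption)
        return frozenset(env)

    def extend(self, assumptions):
        # Children created together are taken to be mutually exclusive alternatives.
        new_nodes = [ENode(a, self) for a in assumptions]
        self.children.extend(new_nodes)
        return new_nodes

    def leaves(self):
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


root = ENode()                      # E_0: no assumptions
e1, e2 = root.extend(["A1", "A2"])  # E_1 = {A1}, E_2 = {A2}
(e11,) = e1.extend(["A11"])         # E_11 = {A1, A11}
```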

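[Figure 1: The E-tree. The root $E_0$ has children $E_1 = \{A_1\}$, $E_2 = \{A_2\}$, ..., $E_n = \{A_n\}$; below them are $E_{11} = \{A_1, A_{11}\}$, ..., $E_{1m_1} = \{A_1, A_{1m_1}\}$, ..., $E_{n1} = \{A_n, A_{n1}\}$, ..., $E_{nm_n} = \{A_n, A_{nm_n}\}$.]

[Figure 2: The same E-tree with each arc annotated by a probability $p_1$, ..., $p_n$, ...]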

Note that the CME alone cannot determine which way to go when there is more than one possibility of extending the set of beliefs. This information is provided by the PME, as described in the next section.

The PME (Plausibility Maintenance Engine) maintains estimates of how plausible each environment is. This information is given as conditional probabilities and is kept as annotations on the arcs of the E-tree. Thus, in figure 2, which is a slightly more precise version of figure 1, $p_i$ stands for $P(E_i)$, $p_{ij}$ for $P(E_j \mid A_i)$, $p_{ijk}$ for $P(E_k \mid A_i, A_j)$, etc.
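One way to turn the arc annotations into a score for a whole environment is to multiply the conditional probabilities along the path from the root, following the chain rule. This is an assumption of the sketch below rather than something the text states explicitly, and the function names and the sample numbers are hypothetical.

```python
from fractions import Fraction
from math import prod


def path_plausibility(arc_probs):
    """Product of the conditional probabilities on the arcs from the root to a node."""
    return prod(arc_probs, start=Fraction(1))


def best_leaf(leaves):
    """leaves maps a leaf name to the list of arc probabilities on its path."""
    return max(leaves, key=lambda name: path_plausibility(leaves[name]))


# Illustrative annotations only; they are not taken from the paper's figures.
leaves = {
    "E11": [Fraction(1, 3), Fraction(1, 2)],
    "E12": [Fraction(1, 3), Fraction(1, 2)],
    "E2": [Fraction(2, 3)],
}
```

Exact fractions are used so that the scores can be compared without floating-point rounding.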

It follows from the properties of conditional probability that

$$P(E_i \mid \ldots E_j \ldots) = 0$$

if $i \neq j$ and $E_i$ and $E_j$ are immediate children of the same parent. Furthermore,

$$P(E_i \mid \ldots \neg E_j \ldots) = 0$$

if $E_j$ is a parent node of $E_i$.

[Figure 3: A Sample E-tree with Annotation. (a) The initial E-tree. (b) The E-tree after $\neg E_4$ is observed.]

Initial values of the $p$'s are to be given by the external problem solver. The PME's role is to maintain the estimation of plausibility by taking given observations into account. Currently we only take $\neg E$, the event of environment $E$ running into contradiction, as an observation. We use Bayes' law to modify $P(A)$ into $P(A \mid \neg E)$:

$$P(E_i \mid \neg E_j) = \frac{P(\neg E_j \mid E_i) \cdot P(E_i)}{P(\neg E_j)} = \frac{(1 - P(E_j \mid E_i)) \cdot P(E_i)}{1 - P(E_j)} \qquad (1)$$

Thus, if $E_i$ and $E_j$ are brothers, $P(E_j \mid E_i) = 0$ and (1) is further simplified to:

$$P(E_i \mid \neg E_j) = \frac{P(E_i)}{1 - P(E_j)} \qquad (2)$$

For example, suppose it has turned out that environment $E_4$ is in contradiction and hence $\neg E_4$ is observed (figure 3(a)). The annotations to the E-tree are updated as in figure 3(b). Notice that the update of conditional probabilities can be done based on local information.
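The local update among brother nodes can be sketched as follows: when one brother runs into contradiction, each remaining brother's probability is divided by one minus the failed brother's probability, and the failed brother drops to zero. The function name and the sample values are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def observe_contradiction(sibling_probs, failed):
    """Update brother probabilities after the environment `failed` is contradicted.

    sibling_probs maps each brother node to its current probability.
    Each survivor E_i gets P(E_i) / (1 - P(E_failed)); the failed node gets 0.
    """
    p_failed = sibling_probs[failed]
    if p_failed == 1:
        raise ValueError("the only possible alternative was contradicted")
    updated = {}
    for name, p in sibling_probs.items():
        if name == failed:
            updated[name] = Fraction(0)
        else:
            updated[name] = p / (1 - p_failed)
    return updated


# Illustrative values, not read off the paper's figure.
before = {"Ea": Fraction(1, 3), "Eb": Fraction(1, 2), "Ec": Fraction(1, 6)}
after = observe_contradiction(before, "Eb")
```

Only the brothers of the failed node are touched, which matches the remark that the update needs local information only.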

[Figure 4: The Structure of a Natural Language Understanding System with the Integrated Parsing Engine as a Subsystem. Components shown: Knowledge Base; Problem Solving Engine (PSE); Working Memory (associative networks, previous topic); the Integrated Parsing Engine, consisting of the Consistency Maintenance Engine (CME) with the E-tree and the Plausibility Maintenance Engine (PME).]

4 Natural Language Understanding System Using the Integrated Parsing Engine as a Subsystem

The integrated parsing engine consists of the CME and the PME. The architecture of a natural language understanding system with the integrated parsing engine as a subsystem is shown in figure 4.

The knowledge base contains various types of information for language comprehension, including lexicon, morphology, syntax, semantics, discourse, pragmatics, common sense, and so on. The whole system is controlled by the problem solving engine (PSE). The PSE can access the knowledge base and use the integrated parsing engine as an aid in seeking the most plausible interpretation. Input texts are analyzed sentence by sentence. The discourse structure is maintained as a previous topic in the working memory.

When it scans a new sentence, the PSE first initializes the E-tree with only the root node. Then the PSE repeats the following cycle:

(step 1) choose a leaf node with the highest probability as the working environment

(step 2) repeatedly derive conclusions from believed propositions until either (a) the goal is achieved, (b) a contradiction is derived, or (c) no more conclusions can be derived without making further assumptions.

In case (a), the process halts.

In case (b), control is passed to the PME, which modifies the current estimation of plausibility so that this fact is reflected; then an alternative of maximum plausibility is chosen and is suggested to the CME.

In case (c), control is also passed to the PME, which assigns plausibility to the new nodes, and the working environment is chosen again.

[Figure 5: the dialog situation, with key1 for the library room, key2 for the xerox room, and key3 for the meeting room.]
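The cycle above can be sketched as a small best-first loop. This is a simplified illustration under stated assumptions, not the paper's Lisp implementation: the problem solver is modelled by a hypothetical callback `expand`, each leaf is scored by the product of the probabilities on its path, and a contradiction renormalizes the remaining leaves globally rather than updating arc annotations locally.

```python
from fractions import Fraction


def interpret(expand, max_cycles=100):
    """Best-first search over environments (sketch).

    expand(env) plays the role of the problem solver and returns one of
    ("goal", None), ("contradiction", None), or ("assume", {assumption: probability}).
    """
    leaves = {frozenset(): Fraction(1)}  # environment -> plausibility
    for _ in range(max_cycles):
        if not leaves:
            return None
        env = max(leaves, key=leaves.get)       # step 1: most plausible leaf
        outcome, options = expand(env)          # step 2: derive conclusions
        if outcome == "goal":                   # case (a): halt
            return env
        p_env = leaves.pop(env)
        if outcome == "contradiction":          # case (b): redistribute plausibility
            total = sum(leaves.values())
            if total:
                leaves = {e: p / total for e, p in leaves.items()}
        else:                                   # case (c): add new child nodes
            for assumption, p in options.items():
                leaves[env | {assumption}] = p_env * p
    return None


def toy_expand(env):
    # Hypothetical problem solver: assumption "a" looks best but fails, "b" succeeds.
    if not env:
        return ("assume", {"a": Fraction(2, 3), "b": Fraction(1, 3)})
    if "a" in env:
        return ("contradiction", None)
    return ("goal", None)


result = interpret(toy_expand)
```

In the toy run the search first follows the more plausible assumption, abandons it when the contradiction is reported, and then switches to the remaining alternative.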

The integrated parsing engine has been written in Lisp. It is running with a small experimental grammar for Japanese. The next section shows how it works.

Suppose a dialog situation in which a professor speaks to a clerk to borrow the key to some room (figure 5) and utters the following Japanese sentence:

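(3) KA GI WO KA SHI TE KU DA SA I
    (a/the) key <object> lend could you...?
    "Could you lend (me) (a/the) key?"

[Figure 6: the E-tree after word recognition, with the root branching to {@word-1} (probability 1/3) and {@word-2} (probability 2/3).]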

The referential meaning of this sentence is ambiguous if there is more than one key in a given situation. Suppose there are three keys: key1 for a library room, key2 for a xerox room, and key3 for a meeting room.

Although sentence (3) is ambiguous in normal contexts, it becomes much less so if it follows sentences like:

(4) HO N WO KO PI I SHI TA I NO DE SU GA "I'd like to xerox some books."

Even if no previous sentence is spoken, sentence (3) is acceptable in a situation where the speaker and the hearer mutually believe that the xerox room is accessed so often that "the key" is usually used to refer to key2, the one for the xerox room.

Note that the omission of the patient case does not matter in usual situations, since there is a strong default that the filler of this case is the speaker.

Now let us show how sentence (3) is analyzed in a context where sentence (4) was previously uttered. The task of analyzing the input starts with recognizing words. Many ambiguities arise in this phase. For sentence (3), 'KA' might be a single word 'KA' (a postposition marking the interrogative) or part of a longer word 'KAGI' (key). Since a longer match is generally considered more plausible in Japanese analysis, we assign a larger probability to the latter possibility. Following this analysis, the PSE passes the following assumptions to the integrated parsing engine:

@word-1 (take the sequence 'KA' as a word): => probability 1/3.

@word-2 (take the sequence 'KAGI' as a word): => probability 2/3.

Accordingly, the CME extends the initial E-tree as in figure 6. Since the environment {@word-2} has the highest plausibility, the CME chooses it as the next environment and control is returned to the PSE.

[Figure 7: the associative network. Surviving node labels: key1, key3, the library room, xeroxing, the meeting room, meeting.]

Now the PSE tries to derive further conclusions in the chosen environment. After having recognized that the part of speech of the word 'KAGI' is noun, the PSE tries to find the referent of the noun and realizes that a three-way ambiguity arises in this situation. Again, the PSE calls the CME to make assumptions. At the same time, the PSE is called upon to assign an estimated conditional probability to each assumption.

Currently, the system uses an associative network as shown in figure 7 to determine plausibility. Nodes of this network represent either a concept or an instance, and an arc means that the two concepts or instances at its ends stand in a certain relation. Items which have dense connections to previous subjects are considered plausible as referents. In our example, since the node xerox is marked as the previous subject, key2 is considered most plausible, while key1 is less plausible and key3 much less. Thus, the following assumptions are made:

@referent-1 (consider 'KAGI' to refer to key1): => probability 1/3.

@referent-2 (consider 'KAGI' to refer to key2): => probability 1/2.

@referent-3 (consider 'KAGI' to refer to key3): => probability 1/6.

In case no previous utterance is given, the PSE will consult information given as a priori measurements.
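The rank-based assignment used for the three key referents can be sketched as below, assuming the most densely connected candidate receives 1/2, the next 1/3, and the last 1/6. The function name and the link counts standing in for "connection density" are hypothetical; the text does not say how density is measured.

```python
from fractions import Fraction

# Values for the three-alternative case, by decreasing connection density.
RANK_VALUES = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]


def rank_plausibility(density):
    """Map three candidates to probabilities by rank of their connection density.

    density maps each candidate to a hypothetical count of links to the
    previous topic; only the ordering matters, not the actual counts.
    """
    if len(density) != len(RANK_VALUES):
        raise ValueError("this sketch only covers the three-alternative case")
    ranked = sorted(density, key=density.get, reverse=True)
    return dict(zip(ranked, RANK_VALUES))


# Made-up link counts that place key2 first, key1 second, key3 last.
assignment = rank_plausibility({"key1": 2, "key2": 5, "key3": 1})
```

Because only the rank is used, two candidates that are almost equally well connected still receive quite different values, which is the coarseness a more precise method would address.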

Footnote: Currently we use a very simple algorithm for assigning these values: when there are three alternatives, the densest connection receives the value 1/2, the second 1/3, and the third 1/6, regardless of how closely they are related to each other. We plan to develop a much more precise method in the near future.

[Figure: the network explored for sentence (3), with nodes word-1, word-2, referent-1, referent-2, object-case-1, object-case-2, meaning-1, meaning-2, meaning-3, meaning-4 above the input KA GI WO KA SHI TE KU DA SA I. Notice that not all parts of this network are explored in actual processing.]

The E-tree now becomes as in figure 8, and {@word-2, @referent-2}, which is the most plausible environment at this point, is chosen as the next environment. The analysis continues in this way until the semantic representation is obtained for the whole sentence. The interpretation obtained in this case is:

event = asking-for
actor = <the speaker>
object = key2

Notice that the efficiency of the analysis is significantly improved when strong expectations exist. For example, although the character 'SHI' in sentence (3) has many possible interpretations in Japanese, the system is not troubled by those ambiguities, since this part of the sentence goes just as expected. The system comes to suspect it only when most of its expectations fail.


Now suppose the above interpretation is rejected for some reason, say, by being explicitly negated by the speaker. Then the system will eventually produce an alternative interpretation taking key1 as the referent, by changing the annotations to the E-tree as in figure 10.

This paper was inspired by a number of works. The massively parallel parsing of Waltz and Pollack [WP85] demonstrated the effect of integration through a uniform computation mechanism (marker passing) in context-dependent comprehension of discourse. They pointed out the importance of non-logical, associative relations between concepts. Charniak has pointed out the abductive nature of language comprehension. Charniak's Wimp [Cha86] uses a marker passing mechanism as the basis of an abductive inference engine for language comprehension. It is not used alone, however; it is augmented by a logical process called path proof. In the parser used in Lytinen's MOPTRANS [Lyt86], a mechanism is provided to allow close interaction between syntax and semantics while keeping the modularity of the system. It is also worth noting that Lytinen's integrated parser makes use of strong semantic expectations to constrain the search.

The integrated parsing engine presented in this paper takes advantage of these preceding works. Unlike Waltz and Pollack, and like Charniak and Lytinen, our integrated parsing engine has a hybrid architecture for logical and non-logical inferences. What is novel in our integrated parsing engine is the method of integrating and maintaining logical and non-logical information obtained from various sources. In addition, the integrated parsing engine provides a concise and high-level mechanism for abductive reasoning. We have carefully chosen a set of reasonably high-level functions necessary for abductive reasoning. This makes the natural language understanding system much simpler than it would otherwise be.

We have presented an inference engine for integrated natural language understanding, based on a characterization of natural language understanding as an abductive process. The essence of our approach is connecting the consistency maintenance engine and the plausibility maintenance engine closely enough to allow their dense interaction. Although we have dealt with rather "low level" issues here, we believe the same idea is applicable to "higher level" problems such as inferring the speaker's intentions and plans.

[Cha86] Eugene Charniak. A neat theory of marker passing. In Proceedings AAAI-86, pages 584-588, 1986.

[de 86] Johan de Kleer. An assumption-based TMS. Artificial Intelligence, 28:127-162, 1986.

[Lyt86] Steven Lytinen. Dynamically combining syntax and semantics in natural language processing. In Proceedings AAAI-86, pages 574-578, 1986.

[WP85] D. Waltz and J. B. Pollack. Massively parallel parsing: a strongly interactive model of natural language interpretation. Cognitive Science, 9:51-74, 1985.
