Feasible Learnability of Formal Grammars and the Theory of Natural Language Acquisition

Naoki ABE

Department of Computer and Information Science

University of Pennsylvania

Philadelphia, PA 19104-6389

Abstract

We propose to apply a complexity theoretic notion of feasible learnability called "polynomial learnability" to the evaluation of grammatical formalisms for linguistic description. Polynomial learnability was originally defined by Valiant in the context of boolean concept learning and subsequently generalized by Blumer et al. to infinitary domains. We give a clear, intuitive exposition of this notion of learnability and what characteristics of a collection of languages may or may not help feasible learnability under this paradigm. In particular, we present a novel, nontrivial constraint on the degree of "locality" of grammars which allows a rich class of mildly context sensitive languages to be feasibly learnable. We discuss possible implications of this observation to the theory of natural language acquisition.

1. Introduction

A central issue of linguistic theory is the "projection problem", which was originally proposed by Noam Chomsky [?] and subsequently led to much of the development in modern linguistics. This problem poses the question: "How is it possible for human infants to acquire their native language on the basis of casual exposure to limited data in a short amount of time?" The proposed solution is that the human infant in effect "knows" what the natural language that it is trying to learn could possibly be. Another way to look at it is that there is a relatively small set of possible grammars that it would be able to learn, and its learning strategy, implicitly or explicitly, takes advantage of this a priori knowledge. The goal of linguistic theory, then, is to characterize this set of possible grammars, by specifying the constraints, often called the "Universal Grammar".
The theory of inductive inference offers a precise solution to this problem, by characterizing exactly what collections of (or its dual "constraints on") languages satisfy the requirement for being the set of possible grammars, i.e. are learnable. A theory of "feasible" inference is particularly interesting because the language acquisition process of a human infant is feasible, not to mention its relevance to the technological counterpart of such a problem.

In this paper, we investigate the learnability of formal grammars for linguistic description with respect to a complexity theoretic notion of feasible learnability called 'polynomial learnability'. Polynomial learnability was originally developed by Valiant [?], [?] in the context of learning boolean concepts from examples, and subsequently generalized by Blumer et al. for arbitrary concepts [?]. We apply this criterion of feasible learnability to subclasses of formal grammars that are of considerable linguistic interest. Specifically, we present a novel, nontrivial constraint on grammars called "k-locality", which enables a rich class of mildly context sensitive grammars called Ranked Node Rewriting Grammars (RNRG's) to be feasibly learnable. We discuss possible implications of this result to the theory of natural language acquisition.