<?xml version="1.0"?><!DOCTYPE article SYSTEM "/project/take/software/searchbench_offline_processing/paperxml_generator/aclextractor/src/python/../resource/dtd/paperxml.dtd"><article><header><firstpageheader><page local="1"/><title>WORKING ON THE ITALIAN MACHINE DICTIONARY: A SEMANTIC APPROACH</title><author surname="ZAMPOLLI" givenname="ANTONIO"><org  name="general framework"/></author><author surname="PECCHIA" givenname="LAURA"><org  name="general framework"/></author><author surname="CALZOLARI" givenname="NICOLETTA"><org  name="general framework"/></author></firstpageheader><frontmatter><p>Nicoletta Calzolari - Laura Pecchia - Antonio Zampolli *</p><p>WORKING ON THE ITALIAN MACHINE DICTIONARY: A SEMANTIC APPROACH</p><p>1. GENERAL FRAMEWORK</p></frontmatter><abstract></abstract></header><body><section title=""><p>1.1. <i>Foreword.</i></p><p>The work described by the two co-authors of this article is presented with a double objective: besides giving specific details of a particular project, they also wish to provide a concrete example of the type of research made possible by the Italian Machine Dictionary (DMI).</p><p>The DMI is, in fact, one of the principal projects of the Linguistics Division (DL) of CNUCE. Other articles in the first volume of the Proceedings also refer to the DMI.<footnote anchor="1"/> In this introduction I intend to indicate briefly how the DMI project and, in particular, the research described in this article fit into the framework of the activities of the DL as a whole and into our general conception of linguistic data processing (LDP).</p><p>As I have already stated in my introduction to these Proceedings,<footnote anchor="2"/> it is my conviction that, at this moment, special care should be taken to promote, on both the theoretical and the practical level, systematic and ordered interaction among the many different LDP activities.
In particular, this cooperation should be realized between those activities which focus on the construction of theoretical models and those which focus on the processing of large corpora of linguistic data. The activity of the DL, especially in recent years, has been increasingly directed towards this goal.</p><p><b>* A. Zampolli is the author of Part 1; N. Calzolari and L. Pecchia are the authors of Part 2.</b></p><footnote label="1">See Vol. I, 1, pp. 257-262 and 297-301.</footnote><footnote label="2">See Vol. I, 1, pp. xx-xxi.</footnote><page local="2" global="50"/><p>1.2. <i>Activities of the Linguistics Division </i>(DL).</p><p>For approximately 10 years all, or almost all, of the research projects in the different fields of linguistic data processing in Italy have been carried out with the collaboration of the DL on the computational side of their work.<footnote anchor="3"/></p><p>In the field of lexicography, large corpora of texts have been processed in order to produce the lexical archives necessary to construct extensive historical language dictionaries (see, for example, the <i>Tesoro della lingua italiana delle origini </i>of the <i>Accademia della Crusca</i>), or dictionaries of "languages for special purposes" (e.g. the <i>Dizionario Giuridico </i>of the <i>Istituto per la Documentazione Giuridica</i>).<footnote anchor="4"/></p><p>In both modern and classical philological research, the computer is now used with increasing frequency in Italy to automate the customary and traditionally time-consuming task of indexing texts and producing concordances from them (e.g. the project for the analysis of the corpus of <i>Grammatici Latini, </i>ed. Keil),<footnote anchor="5"/> and also for a number of more specific, complex operations, such as the automatic comparison of different editions of the same text (e.g.
the project for the 'contrastive concordances' of <i>Orlando Furioso </i>by L. Ariosto).<footnote anchor="6"/></p><p>Literary criticism and the history of literature are also beginning to make use of similar procedures, employing, in particular, statistical processing as an auxiliary tool in the study of the style of individual authors, schools, or literary genres.<page local="3" global="51"/><footnote anchor="7"/> Linguistic statistics is also adopted in psycho-linguistic studies, for example to "measure" the linguistic alterations introduced by certain nosological categories.<footnote anchor="8"/></p><footnote label="3">For a more detailed description and the relevant bibliography see Zampolli, 1973a, 1973b, 1977a. It is necessary to emphasize some important consequences of this fact. Firstly, almost all the projects underway in Italy in this sector adopt the standards introduced by the DL. In addition, an automated library containing over 5000 texts in more than 20 languages has been established. This archive may be processed with general-purpose standardized programs because all the texts have been stored using the same scientific and technical criteria. Thus it is possible to perform some linguistic research operations which would otherwise be impossible. For example, one of our projects aims at constructing a new model of the quantitative aspects of language, on the basis of the data provided by this archive. The earlier models have been falsified by the new quantitative data produced by the increasing number of text-processing projects underway in different countries. As a first step, we aim at identifying those linguistic facts which have a stable frequency in the texts of a language, those whose frequency is stable only within certain subsets of a language (literary genres, single authors, particular themes, etc.), and those whose frequency does not show appreciable regularity.
In a second stage, an attempt will be made to construct and verify quantitative models to describe the regularities actually found and to identify the contextual factors connected with such regularities.</footnote><footnote label="4">See A. Duro (1973), C. Ciampi (1973) and F. Dimitrescu (1973).</footnote><footnote label="5">See Grilli and others (1978).</footnote><footnote label="6">See Segre-Zampolli (1974).</footnote><p>A combination of statistical processing and algorithms of the "pattern recognition" type is used heuristically on traditional oral texts to identify clauses, formulae and, in general, the various elements of the popular repertory.<footnote anchor="9"/></p><p>In all the types of project mentioned above, electronic data processing essentially aims at organizing, in computer storage or in printed form, all the linguistic units of a certain level (words, syntagms, syntactical structures, etc.) occurring in a text, in order to enable their more efficient, rapid and economical retrieval. In other words, the processing basically consists of the following types of operations: to input, store and manipulate texts of different kinds (which may be considered as facts of <i>la parole</i>); to recognize and explicitly represent in the text the occurrence of linguistic units (phonemes, lemmas, affixes, syntagms, syntactical types, etc.: these units may be considered to be at the level of <i>la langue</i>); to execute some canonical operations (retrieval, ordering, counting, comparing, etc.)
on such units, in batch or conversational form.</p><p>We also cooperate with some projects in the field of full-text information retrieval, which also uses lexicographical-type processing for documentary purposes, mainly on juridical and historical texts.</p><p>All the above-mentioned activities make use of closely interrelated procedures which the DL has developed and put into operation with the collaboration of various Italian universities and CNR institutes.</p><p>More exactly, it could be said that the DL has realized, or is in the process of realizing, a certain number of basic processing "components" and that each of the procedures so far developed consists of the concatenation of some of these components.</p><p>The functions of each of these components are well known within the LDP environment: the acquisition of texts in machine-readable form; the production of the typical results of lexical analysis (different types of concordances, context-cards, etc.); the representation of the large variety of characters typical of LDP; morphological analysis and consultation of Machine Dictionaries (DMs); syntactical parsers; phonological transcription; etc.<page local="4" global="52"/></p><footnote label="7">See A. Zampolli (1975).</footnote><footnote label="8">See Castrogiovanni (1973).</footnote><footnote label="9">See Cirese (1973).</footnote><p>I feel that the following three characteristics of these components should be emphasized.</p><p><i>a) </i>They are conceived so as to be, as far as possible, generalized (i.e.
applicable to all the texts processed at the DL, whatever their nature, language, or the purpose of the processing),<footnote anchor="10"/> flexible (the user can activate, within the set of rules which constitute the "algorithmic linguistic knowledge" of the program, those rules which best respond to his particular needs),<footnote anchor="11"/> and modular (the components must be inter-compatible and open to the inclusion of any new components: the inter-compatibility is ensured by exchange interfaces between the various components; these interfaces consist of a formalism which provides structures, organizations and codes for the representation of linguistic units both at the text level and at the level of the linguistic system).</p><p><i>b) </i>These components may be used, at least in principle, with the same basic functions both in lexicographical-philological applications and in translation, documentation, question-answering, etc.<footnote anchor="12"/></p><footnote label="10">For example, the component proposed for the acquisition of texts in machine-readable form performs the following functions: it accepts, as input, texts in any natural language (as long as they can be transcribed alphabetically), of any period, literary form or genre (scientific texts, recorded dialogues, protocols, interviews, novels, inventories, etc.); stores the texts on auxiliary memory; produces listings which reproduce the text as closely as possible to its original form; and supplies text-editing facilities for checking and correcting any errors.
At the basis of this component is an encoding system which is designed to represent all the different graphemes and graphic features which can appear in printed texts or can be inserted in them in the pre-editing stage.</footnote><footnote label="11">For example, the context of a word can be constructed and delimited by activating, and ordering in different ways, a suitably chosen subset of the available rules of a general contextualisation algorithm (see Zampolli, 1971): to make the context coincide with a structural unit (verse, strophe, etc.); to delimit the context exclusively on the basis of the punctuation immediately preceding or following it; to assign a specific portion of the syntactic structure as context, etc.</footnote><footnote label="12">In particular, at the beginning of the 1960s, attempts were made to classify the different systems for LDP according to the so-called 'depth parameter' of the linguistic level of operation. Such classifications selected a certain "depth" level along this parameter, and drew at that level the demarcation line between the uses of the computer in linguistics which merit the name computational linguistics (CL) and those which do not. Our viewpoint is different. All computational systems which serve linguistic research or which operate on linguistic data belong to CL. Besides, at least in principle, the majority of those systems, regardless of whether they are considered to lie below or above an established demarcation line, have a number of components in common. For example, a procedure for lexical analysis requires: the acquisition in machine-readable form and the computer printing of a variety of texts and graphemes; a morphological analyzer and the consultation of a DM for semi-automatic lemmatization; syntactic and semantic parsers for homograph disambiguation. An automatic translation system requires all these features (in addition to the transfer and generation components).</footnote><p><page local="5" global="53"/></p><p><i>c) </i>They are, as far as possible, the result of studies which are both research-oriented and operationally oriented.</p><p>1.3. <i>The Italian Machine Dictionary </i>(DMI).</p><p>The DMI has also been realized in accordance with these criteria. It has been conceived and is used as a means for semi-automatic lemmatisation, i.e. for the recognition of the occurrences of the various units of the Italian lexical system within a text. It is used in lexicographical, statistical and philological text processing, and in full-text information retrieval systems in order to identify in the documents all the different forms which belong to the same lemma as a specific form appearing in the "question" asked by the user. It will be used to associate with the words of a text the information required by syntactical and semantical parsers (morpho-syntactical categories, syntactical "valences", semantical markers, etc.).<footnote anchor="13"/></p><footnote label="13">Of course, we have considered whether it would be possible and convenient to compile a DM without having first defined in detail the components which will use the linguistic information contained in it. As an example, let us consider the choice and the formalization of grammatical information (morpho-syntactical categories, valences, specification of possible constructs, etc.) to be coded in the dictionary as "input" to a syntactic parser. Obviously, this depends on the grammatical model and the strategy used by the parser. This does not necessarily mean, however, that once a DM has been compiled with specifically chosen grammatical information, the grammatical part of the DM must be replaced if the grammatical model changes. Although there are a number of different opinions on this important point, our experience suggests that, where necessary, the existing information will need to be extended and completed rather than replaced. In the majority of cases, independently of the definition of their theoretical status, the basic syntactical properties of a lexical unit may be formulated in a way that is neutral with respect to the models and systems which use them. This claim can be largely verified, at least for models within the same "scientific paradigm", e.g. the generative-transformational ones. Nevertheless, there is perhaps enough evidence to assert that the basic information, at the morpho-syntactical level, remains, to a large extent, valid, even when considering other paradigms such as the so-called "artificial intelligence paradigm".</footnote><p>In the lemmatisation stage, the DMI can be adapted by the user to obtain lexical analyses at different levels of complexity. We think of the definition of a lexical unit (lemma) as a set of pertinent features (morphological, syntactical, graphical, etc.). Different inflected forms<page local="6" global="54"/> of a text are considered to belong to the same lemma if and only if they have in common all the pertinent features which identify a lemma, distinguishing it from all other lemmas. We have constructed an inventory of features which may be used in the definition of a lexical unit. This inventory is based upon a survey of the features used both in lexicographic practice and in linguistic theories. Each entry of the DMI is associated with the set of all the features of the inventory which may be used in its definition. The user is allowed to deactivate those features which he does not wish to use: for example, the differences between nominal and verbal use of participles, or those between adjectival pronouns and pronouns, etc.
Obviously, if some distinctions are neutralized, the number of lexical units which constitute the DMI, as defined by the user, and very often the number of possible homographs, are reduced. In other words, if we consider the DMI a concrete representation of the Italian lexical system, in which the lexical units are defined using all the features proposed by the different lexicological and lexicographical traditions, the user can modify the structure of this system and the inventory of its lexical units in accordance with his specific linguistic requirements (Zampolli, 1973a). In this perspective, the DMI is used not only as a tool for text processing but also as an object of study and research in itself.</p><p>While in studies at the level of <i>la parole </i>the object is immediately given to LDP in the form of corpora of texts, the object in studies on <i>la langue </i>must be specifically constructed. One example is the first step in research on the functional load of the phonological oppositions of a phonematic system. This step consists in making an inventory of the minimal pairs existing in the lexicon for each opposition, and it therefore presupposes the existence of an inventory of all the different forms of the language studied, in phonological transcription. The burden of creating an inventory of this type and size, and the complexity of the operations required to discover and count all the minimal pairs, are such that these tasks are impossible without a computer. Another example could be a study of the "rendement" of the different suffixes, which requires an inventory of all the words in which each suffix appears.</p><p>In order to make research work of this type possible, the DMI has been conceived differently from most of the other DMs in existence. These have usually been realized exclusively as components of translation procedures, information retrieval systems, etc.
Such DMs almost always include only a limited number of lexical items.</p><p><page local="7" global="55"/></p><p>The DMI has a structure and dimensions that allow us to consider it as an exhaustive, automatically processable representation of the lexical component of the Italian linguistic system. The DMI is, therefore, intended as an instrument for research at the level of <i>la langue</i>, where exhaustive inventories, data and observations are necessary.</p><p>1.4. <i>Theoretical background.</i></p><p>The research project described below by N. Calzolari and L. Pecchia is an example of how the DMI can be used in this direction.</p><p>The present situation of linguistic theory is one of constant change and development. Not only are the traditional models being continuously modified, but some researchers also maintain that the debate is now between theories which belong to different scientific "paradigms". Examples usually quoted are the various generative-transformational schools (interpretive semantics, generative semantics, etc.), relational grammar and cognitive semantics. In this situation, some researchers pose the following alternative: whether the research work conducted in LDP must, of necessity, be directed towards a specific linguistic theory, or whether LDP can produce results which can be used by different linguistic schools.</p><p>For the sake of simplicity we will examine certain examples from the field of syntax. A clear example of LDP activity directed at a specific linguistic theory is, in my opinion, offered by the so-called 'grammar testers', i.e.
those computational systems which apply a lexicon and a grammar for automatic sentence generation.<footnote anchor="14"/></p><p>These systems, at least in the intention of their creators, constitute a concrete and precise specification, at the computational level, of a given linguistic theory; the grammar is considered as a program used to produce sentences; the algorithms which interpret the rules are considered as a part of the meta-theory; the production of concrete sentences serves to verify the coherence of the rules and the completeness and consistency of the formal apparatus, and to indicate, in practice, the extent of the subset of language generated by the grammar.</p><p>Evidently, these systems are intentionally and strictly connected with the corresponding linguistic theory, constituting, it could be said, its computational "transcription".<page local="8" global="56"/> The rapid evolution of the theories, the models and the formal apparatus requires a continual updating of the corresponding computational systems, which does not seem very easy to achieve, at least in practice. Furthermore, the generative-transformational schools whose theories are usually incorporated in these systems have so far described only isolated regions of linguistic structure, aiming at verifying the adequacy of descriptive methods rather than at describing a language coherently and exhaustively.
As a consequence, anyone wishing to use the results of their research in a computational system would face a set of isolated observations distributed in different regions of a language, not systematically linked to each other but separated by so far unexplored regions.</p><footnote label="14">This is not the place to enter into a discussion of the complex and well-known problem of the relation and the differences between "generation" as an abstract calculus of all the possible grammatical objects and the automatic "production" of concrete sentences.</footnote><p>On the other hand, however, the analytical methods produced by the generative-transformational theories have shown very considerable heuristic power, and have greatly increased the precision and subtlety of the observations. The number of new phenomena that have been revealed has grown notably in the last 20 years.</p><p>Faced with this situation, LDP researchers may range between two positions.</p><p>The first position is usually characterized as the rise of a linguistic "computational paradigm", which is distinct from, if not directly in contrast with, the generative-transformational paradigm, and tends to count the computational aspect among the principal characteristics of a linguistic theory. The conviction is expressed that the primary "focus" of linguistic research must be shifted from the description of competence as a set of formal abstract mechanisms towards simulation-like studies of the processes which underlie the production and comprehension of utterances.
A "natural language understanding" computational system could constitute a powerful experimental and heuristic tool for the study of the complexity and the constraints of these processes, making it easier to bring out the mechanisms of interaction between the components involved in them.</p><p>The scantiness of the results obtained so far (some of the devotees of this approach have likened the situation to that of medieval alchemy as opposed to modern chemistry) makes it impossible to formulate even a summary judgement. Nevertheless, it is quite clear that this type of research is restricted, and will probably remain restricted for some time to come, to the consideration of extremely limited language subsets.</p><p><page local="9" global="57"/></p><p>The second position seems to prefer, in the present situation of linguistic theory, a systematic examination of the data to an immediate construction of a formal global model. Obviously, this does not mean rejecting the use of abstractions or notions (e.g. those of transformation or of componential analysis) whose theoretical status may vary depending on the global evolution of the theory itself, but which have proved to be experimental devices of extraordinary efficiency in analysis. The complex formal mechanisms proposed by the generative-transformational school are not implemented in a computational system as a representation of a theory of language, but some of their characteristics (form of the rules, relationship between the rules, etc.) are used to store, handle and organize the data accumulated in the inductive phase of the research.</p><p>LDP essentially offers two complementary contributions to this approach. Firstly, it supplies techniques which permit the automatic handling of the data.
Secondly, LDP studies algorithms which permit the data to be structured conveniently, organizing them so that their regularities, diversity, correlations, etc., can be brought out without making this organisation dependent on the "a priori" choice of a general global theoretical model.</p><p>The inventories of linguistic units recorded in "machine-readable form" must be considered within this framework, in particular those lexical inventories in which each lexical unit is supplied with an explicit, suitably coded representation of its linguistic behaviour.</p><p>In addition, the use of a lexical inventory would facilitate the definition of the degree of exhaustivity of the descriptions and the evaluation of the extension of the phenomena studied. (The term 'extension' must obviously be understood here not as frequency of appearance in texts but as frequency of appearance in the system.)<footnote anchor="15"/></p><p>At the same time, it seems that the time has come to systematize and make publicly available the linguistic data accumulated in machine-readable form (texts, dictionaries, descriptions, rules, etc.<page local="10" global="58"/>) and the computational tools (software packages, integrated systems, mid-level and high-level languages for LDP, etc.) produced in different institutes of different countries in different ways, but on the basis of similar methodological assumptions and of a common body of knowledge.</p><footnote label="15">The information is often represented by binary matrices in which a row corresponds to a lexical unit and a column to a specific linguistic property. This organization obviously facilitates the identification of identical or similar configurations, the verification of the coherence between the contents of interrelated columns, etc. (see Josselson, 1969). The work of M.
Gross (1975) and his group on the construction of a grammatical lexicon of French certainly constitutes the most important example. Furthermore, the role which the lexicon and its description have assumed within the most recent developments of the generative-transformational school (Bresnan, etc.) should not be neglected.</footnote><p>It is within this framework, and not only for applied and operational purposes, that since 1968 (Zampolli, 1968) I have promoted the construction of the DMI as one of the principal projects of the newly constituted DL.</p><p>The project described in the following pages by N. Calzolari and L. Pecchia is an original development in the field of semantics along these general planning lines.</p><p>2. TOWARDS A FORMALIZATION OF LEXICAL DEFINITIONS</p><p>2.1. <i>Preliminary steps.</i></p><p>This part of the article describes an attempt to formalize all the noun-definitions in the Italian Machine Dictionary (DMI). The definitions recorded in the DMI were taken from the <i>Zingarelli </i>Dictionary (1970) after having undergone a first process of normalization and shortening.
Part of the normalization process was to classify the Zingarelli definitions into 9 different types and to mark each of these with a particular code.</p><p>The main types of definitions are:</p><p>1) the relational (coded as 1), which is composed of <i>a) </i>a fixed part representing a function, and <i>b) </i>a variable part, the basis;</p><p>2) the synonymous (coded as 2), which is made up of one or more single words to which the reader is referred for an explanation of the meaning considered;</p><p>3) the definition by 'genus et differentia' (coded as 3), which is made up of <i>a) </i>a fixed word considered as a classifier (the 'generic part' of the definition), and <i>b) </i>a phrase describing or predicated of the classifier (the 'specific part' of the definition).</p><p>The framework of our research is typical of componential analysis, according to which even that which appears to be "a list of basic irregularities" (Bloomfield, 1933, p. 162), i.e. the lexicon, could become a well-structured and therefore formalizable set, in other words, a system.<page local="11" global="59"/> The idea for the analysis first came to us from the theory of componential analysis, but we have attempted to expand its field of application which, up to now, has been restricted to well-structured sets, as shown in the work of componential anthropologists (the domain of kinship terms), or to lemmas isolated from the rest of the lexicon (the well-known example of Katz: 'bachelor'). Our intention has been to extend the application of this theory to all nouns of the Italian lexicon. We are helped in this by the great quantity of material at our disposal in the DMI.
As we are well aware of the limitations of componential analysis, we have used it only as a tool, not as an end, in achieving our purpose.</p><p>From the entire corpus of lemmas and definitions in the DMI we have excluded those lemmas and definitions which are marked as archaic or rare. We have analyzed, up to now, all the definitions classed under codes 3 and 5, i.e. those with one generic and one specific part. These are the most numerous groups of definitions. After this selection, the total number of lexical items on which we are actually working is 28,873, of which 20,453 are monosemic and 8,420 polysemic; the total number of their definitions is 44,051. We have worked on this corpus of lemmas and definitions using programs and checks of different kinds, in two main directions which will be discussed later in more detail. Firstly, we have extracted a considerable number of markers to be assigned to the highest possible number of lemmas. Secondly, we have started an analysis of prepositions, of prepositional groups and of other syntagms which can be considered as grammatical in a very generic sense. These syntagms have been chosen because they satisfy, simultaneously, the following two criteria:</p><p><i>a) </i>that of occurring with a high frequency in the definitions;</p><p><i>b) </i>that of showing well-defined semantic relations existing between noun and noun, or between verb and noun, or between noun and proposition.</p><p>2.2. <i>Markers.</i></p><p>In the first phase of our work, the aim was to extract a certain number of 'markers', starting mainly from the definitions; in other words, working in an inductive way.
We obtained the first basic working elements from an inspection of the frequency list of the forms found in the corpus of noun-definitions.<page local="12" global="60"/> This list helped us to make a first, purely provisional inventory of lemmas which might be used as 'markers'.</p><p>Then, by looking up the concordances of these definitions, we were able to test the validity of these basic elements. In fact we have ascertained that the most frequent lemmas in the set of noun-definitions (i.e. the lexical entries most likely to be used as 'semantic markers') almost always appear in the context in a generic sense and in the first position, only occasionally assuming a specific sense in different positions. It has also been relevant that, as expected, once syntactic words such as prepositions, conjunctions, articles, etc. are excluded, the highest frequency indexes pertain to the grammatical category of nouns.</p><p>We shall use the name 'markers' to refer to these most frequent lemmas; but there is a difference between our 'markers' and the markers referred to in componential analysis: although our markers function as markers usually do, i.e. they describe a meaning or part of it, they remain essentially lemmas. It is thus not necessary to use a metalanguage different from the language which is being described; the elements of the lexicon can be given a metalinguistic function.</p><p>These markers have been grouped into lists on the basis of different semantic criteria such as synonymy, antonymy, etc. We have also made a distinction between markers behaving as one-place predicates and markers behaving as two- or <i>n</i>-place predicates.</p><p>A first group of 450 semantic markers was extracted and matched by a program with the generic part of all the definitions. We have verified that 22,146 definitions out of 47,291 were covered, in their generic part, by these markers.
This first part of the work is described in more detail in Calzolari, Moretti (1976).</p><p>In the continuation of the work, through further additions or substitutions of semantic markers which were either provided by the literature on this subject, or resulted from our intuition or from other successive analyses of the corpus, we have covered 40,135 definitions with 407 markers.</p><p>We have ascertained that, in almost every case, the generic part of the definitions of the dmi (and therefore of the Zingarelli) gives the word whose level is immediately higher with respect to that of the defined lemma (considering a hierarchical classification moving from the more specific to the more general, i.e. from a greater to a smaller intension). This homogeneity in the definitions justifies the validity of the method we have adopted to refer all the lemmas back to the markers.<page local="13" global="61"/></p><p>In practice, for the lemmas not covered by markers after the first procedure of matching, i.e. for those lemmas which are defined, in their generic part, by words which are too specific to be used as markers, we have established some chains which refer back to more and more generic words until at least one marker is reached. In order to construct these chains we used a procedure to convert all the lemmas into numbers: this seemed to be the simplest way to keep in main storage the great quantity of data we had to work with. Using a program which works on these numbers, we have simulated a path for each lemma. This path starts from the lemma itself, and the program examines the generic part of the definition of the lemma. 
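</p><p>A minimal sketch of such a path, whose individual checks are spelled out in the continuation of this paragraph. It rests on simplifying assumptions that are not part of the original program: every lemma has a single definition, the path stops when no further definition is available, and a repeated lemma is taken as the sign of a circular chain. The function names and the marker set are hypothetical; the two toy chains follow Fig. 1.</p>
<p>
```python
def encode(pairs):
    """Assign an integer code to every word and recode the (lemma, generic part) pairs."""
    codes = {}
    for lemma, generic in pairs:
        for word in (lemma, generic):
            codes.setdefault(word, len(codes))
    generic_of = {codes[lemma]: codes[generic] for lemma, generic in pairs}
    return codes, generic_of


def follow_chain(start, generic_of):
    """Follow generic parts upward from a lemma code; return (chain, is_circular)."""
    chain, seen, current = [], {start}, start
    while current in generic_of:
        current = generic_of[current]
        chain.append(current)
        if current in seen:  # the same lemma has been met twice
            return chain, True
        seen.add(current)
    return chain, False


pairs = [
    ("ACCRESCIMENTO", "SVILUPPO"), ("SVILUPPO", "ATTO"),
    ("BARIO", "ELEMENTO"), ("ELEMENTO", "PARTE"),
    ("PARTE", "PEZZO"), ("PEZZO", "PARTE"),
]
markers = {"ATTO", "PARTE"}  # assumed marker set, for illustration only

codes, generic_of = encode(pairs)
names = {code: word for word, code in codes.items()}

for lemma in ("ACCRESCIMENTO", "BARIO"):
    chain, circular = follow_chain(codes[lemma], generic_of)
    words = [names[code] for code in chain]
    reached = any(word in markers for word in words)
    print(lemma, "->", " -> ".join(words), "| circular:", circular, "| marker reached:", reached)
```
</p>
<p>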
The program checks if this generic part is included in the list of markers, and in any case examines this generic part itself as a lemma to be defined and looks for its definition; the procedure continues in this way until the generic part of a definition is found to be a marker without any other more generic marker above it. By this procedure, 91% of the definitions have been traced back to the markers, i.e. 40,135 out of 44,051.</p><p>By means of these chains, we have given the noun-dictionary a resemblance to a tree-structure. This tree-structure has been formed using the definitions of the dmi for almost all the lemmas; the hierarchical structure we have given to the markers has, by contrast, been partly taken from the definitions, and partly imposed by us according to the traditional rules of class inclusion.</p><p><b>Fig. 1. <i>Examples of chains from lemmas to markers.</i></b></p><table class="main" frame="box" rules="all" border="0" regular="False"><tr class="row"><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td></tr>
<tr class="row"><td class="cell"></td><td class="cell"><p><b>400</b></p></td><td class="cell"><p><b><i>ACCORDO </i>1</b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>ARMONIA</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>CONCORDANZA</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>RELAZIONE</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>QUALITÀ</i></b></p></td><td class="cell"></td></tr>
<tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p><b>accord</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>harmony</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>concordance</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>relation</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>quality</b></p></td><td class="cell"></td></tr>
<tr class="row"><td class="cell"></td><td class="cell"><p><b>200</b></p></td><td class="cell"><p><b><i>ACCORDO </i>2</b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>UNIONE</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>ATTO</i></b></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"></td></tr>
<tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p><b>accord</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>union</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>act</b></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"></td></tr>
<tr class="row"><td class="cell"></td><td class="cell"><p><b>200</b></p></td><td class="cell"><p><b><i>ACCRESCIMENTO</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>SVILUPPO</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>ATTO</i></b></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"></td></tr>
<tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p><b>increase</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>development</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>act</b></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"></td></tr>
<tr class="row"><td class="cell"></td><td class="cell"><p><b>800</b></p></td><td class="cell"><p><b><i>BARIBAL</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>ORSO</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>MAMMIFERO</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>CLASSE</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>GRUPPO</i></b></p></td><td class="cell"></td></tr>
<tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p><b>bear</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>mammal</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>class</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>group</b></p></td><td class="cell"></td></tr>
<tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>INSIEME</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>TOTALITÀ</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>QUANTITÀ</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>ENTITÀ</i></b></p></td><td class="cell"></td></tr>
<tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p><b>set</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>totality</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>quantity</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>entity</b></p></td><td class="cell"></td></tr>
<tr class="row"><td class="cell"></td><td class="cell"><p><b>031</b></p></td><td class="cell"><p><b><i>BARIO</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>ELEMENTO</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>PARTE</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>PEZZO</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>PARTE</i></b></p></td><td class="cell"></td></tr>
<tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p><b>element</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>part</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>piece</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>part</b></p></td><td class="cell"></td></tr>
<tr class="row"><td class="cell"></td><td class="cell"><p><b>051</b></p></td><td class="cell"><p><b><i>DUCA</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>TITOLO</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>NOME</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>VOCABOLO</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>PAROLA</i></b></p></td><td class="cell"></td></tr>
<tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p><b>duke</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>title</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>name</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>item</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>word</b></p></td><td class="cell"></td></tr>
<tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>TERMINE</i></b></p></td><td class="cell"><p><b>-&gt;</b></p></td><td class="cell"><p><b><i>PAROLA</i></b></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"></td></tr>
<tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p><b>term</b></p></td><td class="cell"><p></p></td><td class="cell"><p><b>word</b></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"></td></tr>
<tr class="row"><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td></tr></table><page local="14" global="62"/><p>In setting up these chains (see Fig. 1), we discovered that some chains of <i>definienda </i>and <i>definientia </i>are circular, e.g. <i>PARTE </i>is defined in the dmi as <i>PEZZO, </i>and <i>PEZZO </i>as <i>PARTE </i>(see also Calzolari, 1977). In the example given in Fig. 1, the asterisk indicates the presence of at least one marker in the chain; the first number indicates the length of the chain; the second the length of the chain if it is circular; the third the distance between the two identical lemmas in the circular chain.</p><p>It has been possible, using these chains, to assemble the entire dictionary around some essential cores of more inclusive meanings. These cores are the tops of the trees, and from there thick branches lead off to the more particular and specific levels of the lexicon. The final data concerning the number and depth of the chains are shown in Table 1.</p><p><b>Table </b><b>1.</b></p><doubt alpha="65.5" length="29" tooSmall="False" monospace="0.0">Number of definitions: 44,051</doubt></section><section title="Number of definitions which lead to a marker: 40,135"><p>Moreover, for every marker (see Fig. 
2), we have counted the number of times it occurs in all the chains (second column), and the number of times it appears in the chains which stop at the first marker (third column). In both of these cases, we have computed separately the occurrences of the marker at all the levels (1st, 2nd, 3rd, etc. ; the 1st column indicates the levels).</p><page local="15" global="63"/><table class="main" frame="box" rules="all" border="0" regular="False"><tr class="row"><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p><b>length</b></p></td><td class="cell"><p><b>chains</b></p></td><td class="cell"><p><b>circular chains</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p><b>1</b></p></td><td class="cell"><p><b>6960</b></p></td><td class="cell"><p><b>474</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p><b>2</b></p></td><td class="cell"><p><b>4495</b></p></td><td class="cell"><p><b>2734</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p><b>3</b></p></td><td class="cell"><p><b>7960</b></p></td><td class="cell"><p><b>3576</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p><b>4</b></p></td><td class="cell"><p><b>5375</b></p></td><td class="cell"><p><b>3659</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p><b>5</b></p></td><td class="cell"><p><b>2153</b></p></td><td class="cell"><p><b>2029</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p><b>6</b></p></td><td class="cell"><p><b>1165</b></p></td><td class="cell"><p><b>1601</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p><b>7</b></p></td><td
class="cell"><p><b>422</b></p></td><td class="cell"><p><b>576</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p><b>8</b></p></td><td class="cell"><p><b>181</b></p></td><td class="cell"><p><b>306</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p><b>9</b></p></td><td class="cell"><p><b>42</b></p></td><td class="cell"><p><b>244</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p><b>10</b></p></td><td class="cell"><p><b>7</b></p></td><td class="cell"><p><b>83</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p><b>11</b></p></td><td class="cell"><p><b>0</b></p></td><td class="cell"><p><b>6</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p><b>12</b></p></td><td class="cell"><p><b>0</b></p></td><td class="cell"><p><b>3</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p><b>total</b></p></td><td class="cell"><p><b>28760</b></p></td><td class="cell"><p><b>15291</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td></tr></table><p><b>Fig. 2. <i>Examples of computations on markers' occurrences.</i></b></p><p>2.3. 
<i>Definition-Structures.</i></p><p>As far as the structure of the definitions is concerned, we wanted to start the analysis again from the definitions themselves (rather than testing preconceived structures), with a careful check of the corpus of definitions.</p><p>We have extracted prepositions, and prepositional or grammatical syntagms, on the basis of a frequency-criterion, placing together under the term ' locution ' or ' prepositional syntagm ' (even if this term is not a very exact one) expressions of this kind:</p><p><i>a forma di </i>(in the form of); <i>dal colore </i>(of colour); <i>provvisto di </i>(provided with); <i>munito di </i>(furnished with); <i>in contrasto con </i>(in opposition to); <i>consistente in </i>(consisting in); <i>simile a </i>(similar to); <i>originario di </i>(originating from); <i>che serve per </i>(which serves for/as); etc.</p><table class="main" frame="box" rules="all" border="0" regular="False"><tr class="row"><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p><b>depth</b></p></td><td class="cell"><p><b>occurrence</b></p></td><td class="cell"><p><b>occurrence</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p></p></td><td class="cell"><p><b>of the marker</b></p></td><td class="cell"><p><b>as 1st marker</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p><b><i>ESSERE</i></b></p></td><td class="cell"><p><b>1</b></p></td><td class="cell"><p><b>0</b></p></td><td class="cell"><p><b>0</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p><b>(being)</b></p></td><td class="cell"><p><b>2</b></p></td><td class="cell"><p><b>93</b></p></td><td class="cell"><p><b>90</b></p></td><td class="cell"></td></tr><tr class="row"><td
class="cell"></td><td class="cell"><p></p></td><td class="cell"><p><b>3</b></p></td><td class="cell"><p><b>135</b></p></td><td class="cell"><p><b>43</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p><b>4</b></p></td><td class="cell"><p><b>33</b></p></td><td class="cell"><p><b>10</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p><b>5</b></p></td><td class="cell"><p><b>4</b></p></td><td class="cell"><p><b>4</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p><b>total</b></p></td><td class="cell"><p><b>265</b></p></td><td class="cell"><p><b>147</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p><b><i>ANIMALE</i></b></p></td><td class="cell"><p><b>1</b></p></td><td class="cell"><p><b>1</b></p></td><td class="cell"><p><b>1</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p><b>(animal)</b></p></td><td class="cell"><p><b>2</b></p></td><td class="cell"><p><b>42</b></p></td><td class="cell"><p><b>40</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p><b>3</b></p></td><td class="cell"><p><b>139</b></p></td><td class="cell"><p><b>53</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p><b>4</b></p></td><td class="cell"><p><b>11</b></p></td><td class="cell"><p><b>2</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p><b>5</b></p></td><td class="cell"><p><b>1</b></p></td><td class="cell"><p><b>1</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p><b>total</b></p></td><td 
class="cell"><p><b>194</b></p></td><td class="cell"><p><b>97</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p><b><i>MAMMIFERO</i></b></p></td><td class="cell"><p><b>1</b></p></td><td class="cell"><p><b>0</b></p></td><td class="cell"><p><b>0</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p><b>(mammal)</b></p></td><td class="cell"><p><b>2</b></p></td><td class="cell"><p><b>126</b></p></td><td class="cell"><p><b>124</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p><b>3</b></p></td><td class="cell"><p><b>241</b></p></td><td class="cell"><p><b>52</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p><b>4</b></p></td><td class="cell"><p><b>32</b></p></td><td class="cell"><p><b>8</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p><b>5</b></p></td><td class="cell"><p><b>6</b></p></td><td class="cell"><p><b>0</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"><p></p></td><td class="cell"><p><b>total</b></p></td><td class="cell"><p><b>405</b></p></td><td class="cell"><p><b>184</b></p></td><td class="cell"></td></tr><tr class="row"><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td><td class="cell"></td></tr></table><page local="16" global="64"/><p>These phrases, which we will call, arbitrarily, ' prepositional syntagms ', have been divided into various categories. This subdivision was made possible through an introspective examination of the associations of analogous meanings. 
The criterion was the identification of the recurring semantic functions which have a similar meaning, even though these functions have been expressed lexically and/or syntactically in a completely different way.</p><p>One example of such grouped functions is the category <i>SCOPO </i>(aim), for which we have identified the following set of lexicalizations (when necessary, with the relevant inflection):</p><p><i>tendente a </i>(tending to); <i>diretto a </i>(aimed at); <i>volto a </i>(directed to); <i>con lo scopo di </i>(with the purpose of); <i>a scopo di </i>(for the purpose of); <i>che ha lo scopo di </i>(which has the purpose of); <i>che mira a </i>(which aims at); <i>chi mira a </i>(who aims at); <i>mirante a </i>(aiming at); <i>rivolto a </i>(turned to); <i>per conseguimento d* </i>(for achieving); etc.</p><p>We have grouped these lists of prepositions and prepositional syntagms into files on the basis of their affinity of meaning. This has been possible through the analysis of the functions and of the different possibilities of their expression, following inductive and deductive methods.</p><p>The validity of these associations of meaning, made intuitively, was afterwards verified empirically: various procedures for the extraction of the definitions in which each function appears provided the material to be analyzed for this check. For instance, in the analysis of various relations, such as those we called <i>ATTITUDINE </i>(aptitude), <i>COLORE </i>(colour), <i>FORMA </i>(form), <i>CONTENUTO </i>(content), <i>ORIGINE </i>(origin), <i>SCOPO </i>(aim), <i>USO </i>(use), <i>SOMIGLIANZA </i>(similarity), <i>COMPOSTO </i>(composed of), <i>MUNITO </i>(furnished with), <i>RELATIVO A </i>(relative to), the check of all the definitions in which elements of the corresponding lists appear has shown the validity (about 80-90%) of our groupings made on the basis of our intuition. 
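</p><p>The extraction of the definitions in which each function appears can be sketched as follows. This is only an illustration under stated assumptions: the lists are abridged, the assignment of <i>atto a </i>to <i>ATTITUDINE </i>follows Fig. 3, the last toy definition is invented for the example, and inflected or contracted variants (e.g. <i>atta al</i>) are not handled. The 80-90% validity reported above results from inspecting the extracted definitions, which the sketch does not attempt.</p>
<p>
```python
import re
from collections import defaultdict

# Abridged, partly assumed lists of lexicalizations for two functions.
FUNCTIONS = {
    "SCOPO": ["tendente a", "diretto a", "volto a", "che mira a", "mirante a"],
    "ATTITUDINE": ["atto a", "atta a"],
}


def group_by_function(definitions, functions):
    """Map each function name to the lemmas whose definition contains one of its lexicalizations."""
    found = defaultdict(list)
    for lemma, text in definitions:
        lowered = text.lower()
        for name, phrases in functions.items():
            # Word boundaries keep "volto a" from matching inside "rivolto a".
            if any(re.search(r"\b" + re.escape(p) + r"\b", lowered) for p in phrases):
                found[name].append(lemma)
    return dict(found)


definitions = [
    ("ACCIARINO", "Dispositivo atto a determinare l'accensione"),
    ("ARCHIPENDOLO", "Strumento atto a rendere orizzontale una retta"),
    ("SPEZZATRICE", "Macchina del panificio atta a tagliare la pasta in pezzi"),
    ("ESEMPIO", "Atto diretto a ottenere un risultato"),  # invented toy definition
]

groups = group_by_function(definitions, FUNCTIONS)
for name, lemmas in sorted(groups.items()):
    print(name, lemmas)
```
</p>
<p>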
In addition, from this careful examination of different groups of definitions, we obtained some data which made it possible for us to formulate some interesting considerations.</p><page local="17" global="65"/><p>We have observed, for instance, that the definitional structure based on the relation <i>ATTITUDINE </i>(aptitude) has a quantitatively high homogeneity of application with respect to the lemmas in whose definitions the relation is used. In fact, in 50% of the definitions in which this relation appears, it is applied to lemmas whose generic part, i.e. whose main semantic marker, is included in the list of homogeneous markers we have called <i>STRUMENTO </i>(instrument) (see Fig. 3). Examples of the recurring generic parts with a high frequency are: <i>Meccanismo </i>(mechanism); <i>Organo </i>(organ); <i>Congegno </i>(contrivance); <i>Apparato </i>(apparatus); <i>Attrezzo </i>(implement); <i>Strumento </i>(instrument); <i>Arnese </i>(tool); <i>Dispositivo </i>(device); <i>Apparecchiatura </i>(apparatus); <i>Macchina </i>(machine); <i>Attrezzatura </i>(equipment); <i>Apparecchio </i>(apparatus).</p><p><b><i>ACCIARINO = Dispositivo atto a determinare l'accensione </i>(Flint-lock = device apt to cause ignition)</b></p><p><b><i>ARCHIPENDOLO = Strumento atto a rendere orizzontale una retta </i>(Plumb-line = instrument apt to make a straight line horizontal)</b></p><p><b><i>CARICATORE = Attrezzatura atta al carico e allo scarico di materiali </i>(Loader = equipment apt to load and unload materials)</b></p><p><b><i>SPEZZATRICE = Macchina del panificio atta a tagliare la pasta in pezzi </i>(Cutter = bakery machine apt to cut the dough into pieces)</b></p><p><b>Fig. 3. 
<i>Examples in which the function ATTITUDINE (aptitude) selects the particular marker STRUMENTO (instrument).</i></b></p><p>It is interesting to point out the way in which a certain definition structure can be frequently associated with a certain kind of marker. Other definition structures linked to other functions can make it possible to delimit, within the lexicon, sufficiently homogeneous semantic fields. Since these associations between markers and functions occur in several groups of definitions, we think that this ' marker-relation ' correspondence is not random, but is established for semantic reasons of affinity at a syntagmatic level. It seems possible to assert, at this point, that some markers exert a preferential selection towards certain types of defining relations rather than others, and vice versa. If this hypothesis is tested extensively on the lexicon, it can help in reaching a formalization of the semantic information contained in the dmi.</p><p>We think that a more complete formalization than that obtained by the simple hierarchical organization of the markers can be achieved by also identifying the other kinds of relations, different from the hierarchical one. 
Functions such as those described above will allow:</p><page local="18" global="66"/><p><i>a) </i>the linking of markers: for example, the pertinence relation <i>PARTE </i>(part) makes it possible to link the markers <i>PERSONA </i>(person), <i>UOMO </i>(man), <i>DONNA </i>(woman), with a set of markers such as <i>MANO </i>(hand), <i>CAPELLI </i>(hair), <i>BOCCA </i>(mouth), <i>TESTA </i>(head), <i>CAPO </i>(head), etc.; and/or</p><p><i>b) </i>the joining of the generic to the specific part of the definitions, for example in the definition of <i>ACCHIAPPAMOSCHE = Strumento atto a catturare mosche </i>(Fly-swatter = instrument apt to catch flies) the function <i>SCOPO </i>(aim), in its lexicalization <i>ATTO A </i>(apt to), links the marker <i>STRUMENTO </i>(instrument) to its specification.</p><p>For the final structure of the definitions, we think that the markers can either be considered as n-place predicates joined to their arguments by these various types of functions, or as nodes of a semantic network linked to the specific part of the definitions, i.e. the other nodes, by arcs which express these various types of functions.</p><p>Such relations can be used as the starting point in the study of the use of prepositions and prepositional syntagms in the Italian language and, particularly, in the language of vocabulary definitions.</p><p>Unifying these functions is also of great help in structuring the definitions at a higher level of formalization, and greatly assists the extraction of all the data linked by the same function.</p><p>We have also noticed that some types of sentence-structure occur more frequently in the definitions. Besides considering the functions in isolation, we have been working on a quantitative examination of the various possible combinations of these functions among themselves; this has been done with the aim of also identifying the kinds of sentence-structures more frequently used by lexicographers in the compilation of dictionaries. 
A practical goal for us is to work further towards unifying the definitions, by reducing them, as far as possible, to the most frequent and common structures.</p><p>2.4. <i>Perspectives.</i></p><p>Our research had a number of different aims but was principally directed towards the lexicographic aspect. This aspect consists in an attempt to analyze the defining method adopted by the Italian lexicographic tradition as shown by the <i>Zingarelli. </i>This analysis has been developed in two different stages:</p><page local="19" global="67"/><p>1) An analysis of the terminology used in the definitions, through the extraction of markers. We have seen that, among the most frequent lemmas in the definitions (i.e. among those words whose extension is greater or, in other words, whose intension is smaller), we find those words which the literature on this subject considers as markers.</p><p>2) A check of the definitions considered from the point of view of their structure. This emphasized the very high frequency of certain types of functional syntagms as being more suitable in compiling definitions. It will be interesting to make a comparative examination with dictionaries of other languages.</p><p>The semantic aspect is very closely related to the lexicographic aspect of this study. Our aim was to give a hierarchical type of organization, even if provisional, to the large set of Italian nouns at our disposal. In doing so, we have taken what in our opinion is the first step towards a decomposition of a meaning into distinctive markers, i.e. the attribution as main semantic marker of the lemma which is at an immediately higher level in a hierarchical scale. 
Many hierarchical scales can be identified in the lexicon, or more precisely among the meanings of the lexical items.</p><p>We have also begun, through the study of prepositional functions, the second step in the decomposition of a meaning into markers: the linking of markers with other markers, the identification of the different kinds of relations which exist among markers, and of those relations which exist between primary and secondary markers expressed respectively by the generic and the specific part of the definitions.</p><p>There is also an important practical aspect of this work: that of making the definitions of the dmi more uniform from a semantic point of view. This is achieved by indicating the semantic uniformities which are latent under the different lexicalizations of the same markers or of identical relations, and by reducing these diversities of lexical forms to one single symbol reflecting their uniformity. This will make consulting the dmi easier.</p><p>This work should also be of relevance, at a future date, in connection with an analysis of the verb which takes into consideration the above mentioned analyses of the noun at a level of selectional restrictions at first, and, later, extends these analyses to the level of " knowledge of the world ". Thus, we feel that our work can provide a first step for a future utilization of the dmi in syntactic and semantic analyses of the Italian language.</p><page local="20" global="68"/></section><references><p><b>M. </b><b>Alinei, </b><b><i>La struttura del lessico, </i>Il Mulino, Bologna, 1974.</b></p><p><b>M. </b><b>Bierwisch, </b><b><i>On certain problems of semantic representations, </i>in « Foundations of Language », V (1969), pp. 153-184.</b></p><p><b>L. </b><b>Bloomfield, </b><b><i>Language, </i>New York, 1933.</b></p><p><b>N. </b><b>Calzolari, </b><b><i>An empirical approach to circularity in dictionary definitions, </i>in « Cahiers de Lexicologie », XXXI (1977) 2.</b></p><p><b>N. 
</b><b>Calzolari</b><b>, L. </b><b>Moretti, </b><b><i>A method for a normalization and a possible algorithmic treatment of definitions in the Italian Dictionary, </i>presented at </b><b>ICCL, </b><b>6th International Conference on Computational Linguistics (</b><b>COLING </b><b>'76), Ottawa, 1976.</b></p><p><b>P. Castrogiovanni</b><b>, A. </b><b>Telara, </b><b><i>Primi risultati di un'analisi statistica morfologica e lessicale delle risposte al test di Rorschach nella prospettiva di uno studio del rapporto tra psicologia e linguaggio, </i>in A. </b><b>Zampolli </b><b>(ed.), 1973a, pp. 307-324.</b></p><p><b>E. Charniak</b><b>, Y. </b><b>Wilks </b><b>(eds.), <i>Computational Semantics, </i>North-Holland, Amsterdam, 1976.</b></p><p><b>C. </b><b>Ciampi, </b><b><i>Les projets de recherche automatique des informations juridiques dans l'Institut pour la documentation juridique du Conseil National des Recherches, </i>in A. </b><b>Zampolli </b><b>(ed.), 1973a, pp. 249-268.</b></p><p><b>A. M. </b><b>Cirese, </b><b><i>Inventaires et répertoires lexicaux, formulaires et métriques des chants populaires italiens, </i>in A. </b><b>Zampolli </b><b>(ed.), 1973a, pp. 209-231.</b></p><p><b>P. Cole, J. Sadock </b><b>(eds.), <i>Syntax and Semantics: Grammatical Relations, </i>Academic Press, New York, 1977.</b></p><p><b>F. </b><b>Dimitrescu, </b><b><i>Projet d'un dictionnaire de la langue roumaine du XVI siècle, </i>in A. </b><b>Zampolli </b><b>(ed.), 1973a, pp. 41-48.</b></p><p><b>A. </b><b>Duro, </b><b><i>Elaborations électroniques de textes effectuées par l'Accademia della Crusca, pour la préparation du dictionnaire historique de la langue italienne, </i>in A. </b><b>Zampolli </b><b>(ed.), 1973a, pp. 33-76.</b></p><p><b>C. </b><b>Fillmore, </b><b><i>The Case for Case, </i>in </b><b>E. Bach</b><b>, R. T. </b><b>Harms </b><b>(eds.), <i>Universals in Linguistic Theory, </i>Holt, Rinehart &amp; Winston, New York, 1968, pp. 1-88.</b></p><p><b>C. 
</b><b>Fillmore, </b><b><i>Scenes-and-frames semantics, </i>in A. </b><b>Zampolli </b><b>(ed.), 1977b.</b></p><p><b>J. Greenberg, </b><b><i>Some universals of grammar with particular reference to the order of meaningful elements, </i>in J. </b><b>Greenberg </b><b>(ed.), <i>Universals of Language, </i></b><b>mit </b><b>Press, Cambridge (Mass.), 1966.</b></p><p><b>A. </b><b>Grilli</b><b>, N. </b><b>Marinone</b><b>, A. </b><b>Zampolli, D. </b><b>A. </b><b>Brogna</b><b>, V. </b><b>Lomanto</b><b>, L. Fiocchi, </b><b><i>Concordanza dei grammatici latini, </i>in <i>Supplemento agli Atti dell'Accademia delle Scienze, </i>Torino, vol. 112, 1978.</b></p><p><b>M. </b><b>Gross, </b><b><i>Méthodes en Syntaxe, </i>Paris, 1975.</b></p><p><b>R. </b><b>S. Jackendoff, </b><b><i>On some questionable arguments about quantifiers and negation, </i>in « Language », XLVII (1971) 2, pp. 282-297.</b></p><p><b>R. </b><b>S. Jackendoff, </b><b><i>Semantic Interpretation, </i></b><b>mit </b><b>Press, Cambridge (Mass.), 1972.</b></p><p><b>H. H. </b><b>Josselson, </b><b><i>The Lexicon: a System of Matrices of Lexical Units and their Properties, </i>in « ICCL », 1969.</b></p><page local="21" global="69"/><p><b>J. J. Katz, </b><b><i>Semantic Theory, </i>Harper &amp; Row, New York, 1972.</b></p><p><b>J. J. Katz, J. Fodor, </b><b><i>The structure of a semantic theory, </i>in « Language », XXXIX (1963), pp. 170-210.</b></p><p><b>G. </b><b>Leech, </b><b><i>Semantics, </i>Penguin, London, 1974.</b></p><p><b>J. D. </b><b>McCawley, </b><b><i>The role of semantics in a grammar, </i>in E. </b><b>Bach</b><b>, R. T. </b><b>Harms </b><b>(eds.), <i>Universals in Linguistic Theory, </i>Holt, Rinehart &amp; Winston, New York, 1968, pp. 125-169.</b></p><p><b>J. Macnamara, </b><b><i>Parsimony and the lexicon, </i>in « Language », XLVII (1971) 2, pp. 359-374.</b></p><p><b>J. Petöfi, </b><b><i>Lexicology, Encyclopaedic Knowledge, Theory of Text, </i>in « Cahiers de Lexicologie », XXIX (1976) 2, pp. 
25-41.</b></p><p><b>R. Rustin (ed.), <i>Natural Language Processing, </i>Algorithmics Press, New York, 1973.</b></p><p><b>C. </b><b>Segre</b><b>, A. </b><b>Zampolli, </b><b><i>La concordanza diacronica del « Furioso </i>», in <i>Atti del Convegno di Studi « Lingua, stile e tradizioni delle opere dell'Ariosto </i>», (Reggio Emilia-Ferrara), 1974.</b></p><p><b>R. </b><b>Simmons, </b><b><i>Semantic networks: their computation and use for understanding English sentences, </i>in R. </b><b>Schank, K. Colby </b><b>(eds.), <i>Computer Models of Thought and Language, </i>Freeman, San Francisco (Calif.), 1973.</b></p><p><b>D. D. Steinberg</b><b>, L. A. </b><b>Jakobovits </b><b>(eds.), <i>Semantics, </i>Cambridge University Press, Cambridge, 1971.</b></p><p><b>U. </b><b>Weinreich, </b><b><i>Explorations in semantic theory, </i>in T. A. </b><b>Sebeok </b><b>(ed.), <i>Current Trends in Linguistics, </i>vol. III, Mouton, The Hague, 1966.</b></p><p><b>A. </b><b>Zampolli, </b><b><i>Projet d'un dictionnaire italien du machine, Intervention, </i>in R. </b><b>Busa </b><b>(ed.), <i>De lexico electronico latino, </i>Pisa, 1968.</b></p><p><b>A. </b><b>Zampolli, </b><b><i>Nota tecnica, </i>in A. M. </b><b>Bartoletti Colombo </b><b>(ed.), <i>La Costituzione della Repubblica Italiana, Testi, Indici, Concordanze, </i>Firenze, 1971.</b></p><p><b>A. </b><b>Zampolli </b><b>(ed.), <i>Linguistica matematica e calcolatori, </i>Firenze, 1973a.</b></p><p><b>A. </b><b>Zampolli, </b><b><i>Humanities Computing in Italy, </i>in « Computers and the Humanities », 7 (1973b) 6, pp. 343-360.</b></p><p><b>A. </b><b>Zampolli, </b><b><i>La section linguistique du CNUCE, </i>in A. </b><b>Zampolli </b><b>(ed.), 1973a, pp. 133-199.</b></p><p><b>A. 
</b><b>Zampolli, </b><b><i>L'elaborazione elettronica dei dati linguistici: stato delle ricerche e prospettive, </i>in <i>Atti del Convegno sul tema « Le tecniche di classificazione e loro applicazione linguistica </i>», Accademia Nazionale dei Lincei, Roma, 1975.</b></p><p><b>A. </b><b>Zampolli, </b><b><i>Trattamento automatico di dati linguistici e linguistica quantitativa, </i>in </b><b>Società di Linguistica Italiana, </b><b><i>Dieci anni di linguistica italiana (1965-1975), </i>Roma, 1977a, pp. 349-370.</b></p><p><b>A. </b><b>Zampolli </b><b>(ed.), <i>Linguistic Structures Processing, </i>North-Holland, Amsterdam, 1977b.</b></p><p><b>N. </b><b>Zingarelli, </b><b><i>Vocabolario della Lingua Italiana, </i>X ed., Zanichelli, Bologna, 1970.</b></p><page local="22" global="70"/></references></body></article>