<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:ns2="http://www.tei-c.org/ns/Examples">
    <teiHeader>
        <fileDesc>
            <titleStmt>
                <title>Clustering Verbs Semantically According to their Alternation Behaviour</title>
            </titleStmt>
        </fileDesc>
    </teiHeader>
    <text>
        <front>
            <div type="abs">
                <head>Abstract</head>
                <p>Verbs were clustered semantically on the basis of their alternation behaviour, as characterised by their syntactic subcategorisation frames extracted from maximum probability parses of a robust statistical parser, and completed by assigning WordNet classes as selectional preferences to the frame arguments. The clustering was achieved (a) iteratively by measuring the relative entropy between the verbs' probability distributions over the frame types, and (b) by utilising a latent class analysis based on the joint frequencies of verbs and frame types.</p>
            </div>
        </front>
        <body>
            <div1>
                <head xml:id="sec1">Introduction</head>
                <p>This paper empirically investigates the proposition that verbs can be semantically classified according to their syntactic alternation behaviour concerning subcategorisation frames and their selectional preferences for the arguments within the frames. The idea is related to (Levin, 1993) who defined verb classes on the basis of verb alternation behaviour. For example, the semantic class of Vehicle Names contains verbs like balloon, bicycle, canoe, skate, ski which agree in the properties (1)-(4) below.</p>
                <p>(1) INTRANSITIVE USE, possibly followed by a path: a. They skated. b. They skated along the canal/over the bridge.</p>
                <p>(2) INDUCED ACTION ALTERNATION (some verbs): a. He skated Penny around the rink. (causing the action named by the verb; typical causee is an animate volitional entity) b. Penny skated around the rink.</p>
                <p>(3) LOCATIVE PREPOSITION DROP ALTERNATION (some verbs): a. They skated along the canals. b. They skated the canals.</p>
                <p>(4) RESULTATIVE PHRASE: Penny skated her skate blades blunt. (an XP describing the state achieved by the referent of the noun phrase as a result of the action named by the verb)</p>
                <p>Levin's work represents the basis for a range of recent investigations verifying (Dorr and Jones, 1996), evaluating (Stevenson and Merlo, 1999) or utilising (Lapata, 1999) the proposed classification, as well as transferring it to other languages than English (Jones et al., 1994).</p>
                <p>Generally, the definition of a verb's semantic class can be considered as part of its lexical entry, next to idiosyncratic information: the semantic class generalises as a type definition over a range of syntactic and semantic properties, to support Natural Language Processing in various areas like lexicography (Rappaport Hovav and Levin, 1998), word sense disambiguation (Dorr and Jones, 1996), or document classification (Klavans and Kan, 1998). I attempted to automatically cluster verbs into semantic classes on the basis of the verbs' alternation behaviour. The input into the automatic acquisition process was characterised by the verbs' distribution over syntactic subcategorisation frames extracted from maximum probability (Viterbi) parses of a robust statistical parser, and completed by assigning WordNet classes as selectional preferences to the frame arguments. The clustering was achieved (a) iteratively by measuring the relative entropy between the verbs' probability distributions over the frame types, and (b) by utilising a latent class analysis based on the joint frequencies of verbs and frame types. Using Levin's verb classification as evaluation basis, 61% of the verbs were classified correctly into semantic classes by method (a), and 54% by method (b).</p>
                <p>Section 2 describes the three steps in the automatic acquisition of semantic verb classes; the evaluation takes place in section 3, and section 4 discusses the results.</p>
                <div2>
                    <head xml:id="sec2">Automatic Acquisition of Semantic Verb Classes</head>
                    <p>The first step was the induction of purely syntactic subcategorisation frames for verbs from the heterogeneous British National Corpus (BNC). I used the robust statistical head-entity parser as described in (Carroll and Rooth, 1998) which utilises an English context-free grammar and a lexicalised probability model to produce parse forests, and extracted the maximum probability (Viterbi) parses, for a total of 5.5 million sentences. The trees were mapped to subcategorisation frame tokens consisting of a main verb and its arguments. Each syntactic category was accompanied by the lexical head, the prepositional phrase by the lexical prepositional head plus the head noun of the subordinated noun phrase. Proper names were accompanied by the identifier pn. The head information in the frames was lemmatised. For example, the sentence Sammut handled the plaudits during the awards ceremony would be represented by the frame token handle subj*pn*sammut obj*plaudit pp*during*ceremony.</p>
                    <p>To generalise over the verbs' usage of subcategorisation frames, I defined as 88 frame types the most frequent frames which appeared at least 2,000 times in total in the BNC sentence parses, disregarding the lexical head information. On the basis of the frame types I collected information about the joint frequencies of the verbs in the BNC and the subcategorisation frame types they appeared with. These frequency counts then represented the syntactic description of the verbs.</p>
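                    <p>The mapping from frame tokens to frame types can be illustrated with a small sketch. It assumes that a frame type is obtained by dropping the lexical heads and joining the slot labels with colons, keeping the preposition of a prepositional phrase (as in subj:obj:pp.for). The function names, the min_count parameter and the two drink tokens are hypothetical and only serve the illustration.</p>

```python
from collections import Counter


def frame_type(token):
    """Map a frame token to (verb, frame type), dropping the lexical heads."""
    verb, *slots = token.split()
    labels = []
    for slot in slots:
        parts = slot.split("*")
        if parts[0] == "pp":
            # Assumption: the preposition stays part of the frame type (pp.during).
            labels.append("pp." + parts[1])
        else:
            labels.append(parts[0])
    return verb, ":".join(labels)


def joint_frequencies(tokens, min_count=1):
    """Count verb/frame-type pairs for frame types seen at least min_count times."""
    pairs = [frame_type(t) for t in tokens]
    type_totals = Counter(ftype for _, ftype in pairs)
    return Counter(p for p in pairs if type_totals[p[1]] >= min_count)


tokens = [
    "handle subj*pn*sammut obj*plaudit pp*during*ceremony",  # token from the text
    "drink subj*pn*penny obj*coffee",                         # hypothetical
    "drink subj*pilot obj*milk",                              # hypothetical
]
counts = joint_frequencies(tokens)
print(frame_type(tokens[0]))
print(counts[("drink", "subj:obj")])
```

                    <p>In the setting described above, min_count would correspond to the threshold of 2,000 occurrences that yielded the 88 frame types.</p>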
                    <p>The next step was to refine the subcategorisation frame types by a preferential ordering on conceptual classes for the argument slots in the frames. The basis I could use for the selectional preferences was provided by the lexical heads in the frame tokens. For example, the nouns appearing in the direct object slot of the transitive frame for the verb drink included coffee, milk, beer, indicating a conceptual class like beverage for this argument slot.</p>
                    <p>I followed (Resnik, 1993)/(Resnik, 1997) who defined selectional preference as the amount of information a verb provides about its semantic argument classes. He utilised the WordNet taxonomy (Beckwith et al., 1991) for a probabilistic model capturing the co-occurrence behaviour of verbs and conceptual classes, where the conceptual classes were identified by WordNet synsets, i.e. sets of synonymous nouns within a semantic hierarchy. Referring to the above example, the three nouns coffee, milk, beer are in three different synsets (since they are not synonyms), but are all subordinated to the synset {beverage, drink, potable}. The goal in this example would therefore be to determine the relevant synset as the most selectionally preferred synset for the direct object slot of the verb drink.</p>
                    <p>Redefined for my usage, the selectional preference of a verb $v$ for a certain semantic class $c$ within a subcategorisation frame slot $s$ was determined by the association $ass$ between verb and semantic class: $ass(v_s, c_s) =_{def} p(c_s \mid v_s) \log \frac{p(c_s \mid v_s)}{p(c_s)}$ (5) with the probabilities estimated by maximum likelihood: $p(c_s \mid v_s) = \frac{f(v_s, c_s)}{f(v_s)}$ (6) and $p(c_s) = \frac{f(c_s)}{\sum_{c'} f(c'_s)} = \frac{f(c_s)}{f(s)}$ (7) and the following interpretation:</p>
                    <p>1. $f(v_s, c_s)$: number of times a semantic class appeared in a frame slot of a verb's frame type</p>
                    <p>2. $f(v_s)$: frequency of a verb regarding a specific frame type, i.e. the joint frequency of verb and frame type</p>
                    <p>3. $f(c_s)$: number of times a semantic class appeared in a frame slot of a frame type disregarding the verb</p>
                </div2>
                <div2>
                    <head xml:id="sec4.">4. $\sum_{c'} f(c'_s)$ equals $f(s)$, the frequency of the</head>
                    <p>argument slot within a certain frame type, since summing over all possible classes within a subcategorisation frame slot equals the number of times the slot appeared</p>
                    <p>5. $f(s)$: number of times the frame type appeared, since the frequency of a frame type equals the frequency of that frame with a certain slot marked</p>
                    <p>The frequencies of a semantic class concerning an argument slot of a frame type (dependent or independent of a verb) were calculated by an approach slightly different to Resnik's, originally proposed by (Ribas, 1994)/(Ribas, 1995). For each noun appearing in a certain argument position its frequency was divided by the number of senses the noun was assigned by the WordNet hierarchy,¹ to take account of the uncertainty about the sense of the noun. The fraction was allocated to each conceptual class in the hierarchy to which the noun belonged and accumulated upwards until a top node was reached. The result was a numerical distribution over the WordNet classes: $f(c_s) = \sum_{noun \in c_s} \frac{f(noun)}{|senses(noun)|}$ (8)</p>
                    <p>I restricted the possible conceptual classes within the frames' argument slots to 23 WordNet nodes,² to facilitate generalisation and comparison of the verbs' selectional preference behaviour.</p>
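                    <p>Equations (5) to (8) can be traced on toy data. The following is a minimal sketch, not the original implementation: the sense inventory SENSES, the counts and the function names are hypothetical, and the upward propagation through the WordNet hierarchy is omitted by mapping each sense directly to one conceptual class.</p>

```python
import math
from collections import defaultdict

# Hypothetical sense inventory: each noun maps to the conceptual classes of its senses.
SENSES = {
    "coffee": ["beverage", "plant", "seed"],
    "milk": ["beverage"],
    "beer": ["beverage"],
    "tree": ["plant"],
}


def class_frequencies(noun_counts):
    """Equation (8): split each noun frequency evenly over the noun's senses."""
    freq = defaultdict(float)
    for noun, count in noun_counts.items():
        classes = SENSES[noun]
        for c in classes:
            freq[c] += count / len(classes)
    return dict(freq)


def association(verb_nouns, slot_nouns, cls):
    """Equation (5): p(c|v) * log(p(c|v) / p(c)) for one frame slot."""
    f_vc = class_frequencies(verb_nouns)   # f(v_s, c_s) for every class
    f_c = class_frequencies(slot_nouns)    # f(c_s) for every class
    p_c_given_v = f_vc.get(cls, 0.0) / sum(f_vc.values())  # equation (6)
    p_c = f_c[cls] / sum(f_c.values())                      # equation (7)
    if p_c_given_v == 0.0:
        return 0.0
    return p_c_given_v * math.log(p_c_given_v / p_c)


# Hypothetical counts for the direct object slot of the transitive frame.
drink_objects = {"coffee": 3, "milk": 2, "beer": 1}
all_objects = {"coffee": 3, "milk": 2, "beer": 1, "tree": 6}

for cls in ("beverage", "plant"):
    print(cls, round(association(drink_objects, all_objects, cls), 3))
```

                    <p>With these invented counts the association of drink with beverage is positive and the association with plant is negative, because coffee contributes only a third of its frequency to each of its three classes.</p>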
                    <p>On the basis of the information about subcategorisation frame types and their arguments' conceptual classes I clustered 153 verbs from Levin's classification. I chose (i) some polysemous verbs to investigate how this phenomenon could be handled by the clustering algorithms, and (ii) high and low frequent verbs to see the influence of frequency on the algorithms: the 153 verbs had 226 verb senses which belonged to 30 different semantic classes. Four of the verbs were low-frequency verbs with a total corpus frequency below 100.</p>
                    <p>To cluster the verbs I applied two different algorithms, and each algorithm clustered the verbs both (A) according to only the syntactic information about the subcategorisation frames, and (B) according to the information about the subcategorisation frames including their selectional preferences.</p>
                    <p>1. Iterative clustering based on a definition by (Hughes, 1994): In the beginning, each verb represented a singleton cluster. Iteratively, the distances between the clusters were measured and the closest clusters merged together. For the representation of the verbs, each verb $v$ was assigned a distribution over the different types of subcategorisation frames $t$, according to the maximum likelihood estimate of (A) the verb appearing with the frame type: $p(t \mid v) =_{def} \frac{f(v,t)}{f(v)}$ (9) with $f(v,t)$ the joint frequency of verb and frame type, and $f(v)$ the frequency of the verb, and (B) the verb appearing with the frame type and a selectionally preferred class combination $C$ for the argument positions $s$ in $t$: $p(t, C \mid v) =_{def} p(t \mid v) \cdot p(C \mid v, t)$ (10) with $p(t \mid v)$ defined as in equation (9), and $p(C \mid v, t) =_{def} \frac{\prod_{s \in t} ass(v_s, c_s)}{\sum_{C'} \prod_{s \in t} ass(v_s, c'_s)}$ (11) which intuitively estimates the probability of a certain class combination by comparing its association value with the sum over all possible class combinations, concerning the respective verb and frame.</p>
                    <p>Starting out with each verb representing a singleton cluster, I iteratively determined the two closest clusters by applying the information-theoretic measure relative entropy³ (Kullback and Leibler, 1951) to compare the distributions. The nearest clusters were merged into one cluster, and their distributions were merged by calculating a weighted average. Based on test runs I defined heuristics about how many clustering iterations were performed. In addition, I limited the maximum number of verbs within one cluster to four elements because otherwise the verbs showed the tendency to cluster together in a few large clusters only; so after the overall clustering process was finished, each cluster with more than four members initialised a further clustering pass on itself.</p>
                    <p>2. Unsupervised latent class analysis as described in (Rooth, 1998), based on the expectation-maximisation algorithm: The algorithm identified categorical types among indirectly observed multinomial distributions by applying the EM-algorithm (Dempster et al., 1977) to maximise the joint probability of (A) the verb and frame type: $p(v, t)$, and (B) the verb and frame type considering the selectional preferences: $p(v, t, C)$. Input to the algorithm were absolute frequencies of the verbs appearing with the subcategorisation frames. Test runs showed that 80 clusters modelled the semantic verb classes best. To be able to compare the analysis with the iterative clustering approach, I also limited the number of verbs within a cluster to four: since generally all verbs appear within each cluster when using this approach, the verbs with the highest probabilities were chosen. For version (A) the frequencies were provided by the joint frequencies of verbs and frame types, for version (B) I used the association values of the verbs with the frame types considering selectional preferences, as described by equation (10). The unsupervised algorithm then classified joint events of verbs and subcategorisation frames with 200 iterations of the EM-algorithm into 80 clusters $\tau$, based on the iteratively estimated values $p(v,t) = \sum_{\tau} p(\tau, v, t) = \sum_{\tau} p(\tau)\, p(v \mid \tau)\, p(t \mid \tau)$ (12) for version (A).</p>
                    <note n="3" place="below">Concerning the two typical problems one has with the relative entropy measure, (i) zero frequencies were smoothed by adding 0.5 to all frequencies, and (ii) since the measure is not symmetric, the respective smaller value was used as distance.</note>
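                    <p>The iterative procedure, together with the smoothing and symmetrisation just described, can be sketched as follows. This is an illustration under stated assumptions rather than the original implementation: the merged distribution is weighted by cluster size, the size limit is enforced by skipping merges that would exceed four verbs (the paper instead re-clusters oversized clusters), and the frequencies are invented.</p>

```python
import math
from itertools import combinations


def smooth(freqs, add=0.5):
    """Turn raw frame frequencies into a distribution, adding 0.5 to every count."""
    total = sum(freqs) + add * len(freqs)
    return [(f + add) / total for f in freqs]


def relative_entropy(p, q):
    """Kullback-Leibler divergence of q from p (both strictly positive)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def distance(p, q):
    """Relative entropy is asymmetric, so the smaller direction is used."""
    return min(relative_entropy(p, q), relative_entropy(q, p))


def cluster(verb_freqs, iterations, max_size=4):
    """Merge the two closest clusters per iteration; return lists of verbs."""
    clusters = [([v], smooth(f)) for v, f in verb_freqs.items()]
    for _ in range(iterations):
        candidates = [
            (distance(a[1], b[1]), i, j)
            for (i, a), (j, b) in combinations(enumerate(clusters), 2)
            if max_size >= len(a[0]) + len(b[0])
        ]
        if not candidates:
            break
        _, i, j = min(candidates)
        (va, pa), (vb, pb) = clusters[i], clusters[j]
        wa, wb = len(va), len(vb)
        # Assumption: the weighted average uses the cluster sizes as weights.
        merged = [(wa * x + wb * y) / (wa + wb) for x, y in zip(pa, pb)]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((va + vb, merged))
    return [sorted(verbs) for verbs, _ in clusters]


# Invented frequencies over three frame types: subj:to, subj:obj, subj.
verb_freqs = {
    "need": [38, 32, 10],
    "want": [53, 15, 11],
    "roll": [2, 20, 60],
    "fly": [1, 25, 55],
}
print(cluster(verb_freqs, iterations=2))
```

                    <p>The number of iterations is passed in explicitly, mirroring the statement that it was fixed by heuristics derived from test runs.</p>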
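                    <p>For version (B) of the latent class analysis, the corresponding iteratively estimated values were $p(v,t,C) = \sum_{\tau} p(\tau, v, t, C) = \sum_{\tau} p(\tau)\, p(v \mid \tau)\, p(t, C \mid \tau)$ (13).</p>
                </div2>
                <div2>
                    <head>Evaluation</head>
                    <p>The evaluation of the resulting clusters was based on Levin's classification. Figures 1 and 2 present the success of the two clustering algorithms, considering the two different informational versions (A) and (B). They contain the total number of clusters the algorithms had formed (clusters containing between two and four verbs in the iterative algorithm, and the fixed number of 80 clusters in the latent class analysis), the proportion of correct clusters (non-singleton clusters which were subsets of a Levin class, for example the cluster containing the verbs need, like, want, desire is a subset of the Levin class Desire), and the number of verbs within those clusters. In figure 2 the number of verbs in brackets refers to the respective number of their senses, since a verb could be clustered several times according to its senses. For example, the verb want could be member of the classes Desire and Declaration.</p>
                    <table>
                        <head>Figure 1: Evaluation based on Iterative Clustering</head>
                        <row role="label"><cell>Information</cell><cell>Clusters (total)</cell><cell>Clusters (correct)</cell><cell>Verbs (total)</cell><cell>Verbs (correct)</cell><cell>Recall</cell><cell>Precision</cell></row>
                        <row><cell>SFs</cell><cell>31</cell><cell>20</cell><cell>90</cell><cell>55</cell><cell>36%</cell><cell>61%</cell></row>
                        <row><cell>SFs + Prefs</cell><cell>30</cell><cell>14</cell><cell>81</cell><cell>31</cell><cell>20%</cell><cell>38%</cell></row>
                    </table>
                    <table>
                        <head>Figure 2: Evaluation based on Latent Classes</head>
                        <row role="label"><cell>Information</cell><cell>Clusters (total)</cell><cell>Clusters (correct)</cell><cell>Verbs (senses), total</cell><cell>Verbs (senses), correct</cell><cell>Recall</cell><cell>Precision</cell></row>
                        <row><cell>SFs</cell><cell>80</cell><cell>36</cell><cell>107 (159)</cell><cell>58 (90)</cell><cell>38 (40)%</cell><cell>54 (57)%</cell></row>
                        <row><cell>SFs + Prefs</cell><cell>80</cell><cell>22</cell><cell>153 (226)</cell><cell>47 (56)</cell><cell>31 (25)%</cell><cell>31 (25)%</cell></row>
                    </table>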
                    <p>Recall was defined by the percentage of verbs (verb senses) within the correct clusters compared to the total number of verbs (verb senses) to be clustered: $rec = \frac{|\text{verbs in correct clusters}|}{153} \left( \frac{|\text{verb senses in correct clusters}|}{226} \right)$ and precision was defined by the percentage of verbs (verb senses) appearing in the correct clusters compared to the number of verbs (verb senses) appearing in any cluster: $prec = \frac{|\text{verbs in correct clusters}|}{|\text{verbs in clusters}|} \left( \frac{|\text{verb senses in correct clusters}|}{|\text{verb senses in clusters}|} \right)$</p>
                    <p>Concerning precision, the assignment of verbs into semantic classes was most successful when using the iterative distance clustering method; 61% of the verbs that were clustered ended up in correct clusters. Clustering the verbs into latent classes was less successful, with 54%. With both clustering methods the results became worse when adding information about the selectional preferences for the arguments in the subcategorisation frames.</p>
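                    <p>The two measures can be sketched for a hard clustering as follows. The gold classes and clusters in the toy example are hypothetical, and counting only non-singleton clusters as "verbs appearing in any cluster" is an assumption based on the description of the figures. The last line recomputes the values reported for the iterative clustering on syntactic frames only.</p>

```python
def evaluate(clusters, gold_classes, total_verbs):
    """Return (recall, precision) for a hard clustering.

    A cluster is correct if it has at least two verbs and is a subset
    of one gold class.
    """
    proper = [set(c) for c in clusters if len(c) > 1]
    correct = [c for c in proper
               if any(c.issubset(g) for g in gold_classes.values())]
    n_correct = sum(len(c) for c in correct)
    n_clustered = sum(len(c) for c in proper)
    return n_correct / total_verbs, n_correct / n_clustered


# Hypothetical gold classes and clusters.
gold = {
    "Desire": {"need", "like", "want", "desire"},
    "Manner of Motion": {"move", "roll", "fly"},
}
clusters = [["need", "want", "desire"], ["roll", "like"], ["fly"]]
rec, prec = evaluate(clusters, gold, total_verbs=7)
print(f"toy example: recall {rec:.0%}, precision {prec:.0%}")

# Reported counts: 55 verbs in correct clusters, 90 verbs in clusters,
# 153 verbs overall.
print(f"reported: recall {55 / 153:.0%}, precision {55 / 90:.0%}")
```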
                    <p>A baseline experiment was performed in order to determine how hard the task of verb clustering was: each verb was randomly assigned another verb as &quot;closest neighbour&quot;, which resulted in only 5% of the verbs being paired with a verb from the same Levin class. Performing the same experiment by assigning the closest neighbour on the basis of measuring the relative entropy between two verbs' distributions over subcategorisation frames resulted in 61% of the verbs pointing to a verb from the same Levin class.</p>
                </div2>
                <div2>
                    <head>Discussion</head>
                    <p>The classifications of both clustering approaches illustrate the close relationship between alternation behaviour and semantic classes.⁴ For example, the common preferences of verbs (see the five most probable frames) in the iteratively created Desire class were towards a subject followed by an infinitival phrase (subj:to). Alternatively a transitive subj:obj frame was used, partly followed by an additional infinitival phrase indicated by to:⁵</p>
                    <table>
                        <row role="label"><cell>Verb</cell><cell>Frame</cell><cell>Probability</cell></row>
                        <row><cell>need</cell><cell>subj:to</cell><cell>0.38</cell></row>
                        <row><cell/><cell>subj:obj</cell><cell>0.32</cell></row>
                        <row><cell/><cell>subj</cell><cell>0.10</cell></row>
                        <row><cell/><cell>subj:obj:to</cell><cell>0.05</cell></row>
                        <row><cell/><cell>subj:obj:pp.for</cell><cell>0.02</cell></row>
                        <row><cell>like</cell><cell>subj:to</cell><cell>0.34</cell></row>
                        <row><cell/><cell>subj:obj</cell><cell>0.34</cell></row>
                        <row><cell/><cell>subj</cell><cell>0.14</cell></row>
                        <row><cell/><cell>subj:obj:adv</cell><cell>0.04</cell></row>
                        <row><cell/><cell>subj:obj:obj</cell><cell>0.03</cell></row>
                        <row><cell>want</cell><cell>subj:to</cell><cell>0.53</cell></row>
                        <row><cell/><cell>subj:obj</cell><cell>0.15</cell></row>
                        <row><cell/><cell>subj</cell><cell>0.11</cell></row>
                        <row><cell/><cell>subj:obj:to</cell><cell>0.10</cell></row>
                        <row><cell/><cell>subj:to:adv</cell><cell>0.02</cell></row>
                        <row><cell>desire</cell><cell>subj:obj</cell><cell>0.25</cell></row>
                        <row><cell/><cell>subj</cell><cell>0.24</cell></row>
                        <row><cell/><cell>subj:to</cell><cell>0.20</cell></row>
                        <row><cell/><cell>subj:obj:to</cell><cell>0.07</cell></row>
                        <row><cell/><cell>subj:sent</cell><cell>0.02</cell></row>
                    </table>
                    <p>Adding information about the selectional preferences of the verbs' arguments helps to get a deeper idea about their lexical semantics. For example, Manner of Motion verbs preferably appeared with a subject only, sometimes with a following adverb. The subject was an inanimate object, for move it might also be a part (such as a body part like finger) or a group. roll and fly alternatively used the transitive frame type subj:obj, preferably with a living entity as subject, followed by an inanimate object:</p>
                    <table>
                        <row role="label"><cell>Verb</cell><cell>Frame</cell><cell>Probability</cell></row>
                        <row><cell>roll</cell><cell>subj(PhysObject)</cell><cell>0.24</cell></row>
                        <row><cell/><cell>subj(PhysObject):adv</cell><cell>0.10</cell></row>
                        <row><cell/><cell>subj(Agent):obj(PhysObject)</cell><cell>0.07</cell></row>
                        <row><cell/><cell>subj(LifeForm):obj(PhysObject)</cell><cell>0.07</cell></row>
                        <row><cell/><cell>subj(Agent):obj(Part)</cell><cell>0.05</cell></row>
                        <row><cell>fly</cell><cell>subj(PhysObject)</cell><cell>0.34</cell></row>
                        <row><cell/><cell>subj(PhysObject):adv</cell><cell>0.12</cell></row>
                        <row><cell/><cell>subj(LifeForm):obj(PhysObject)</cell><cell>0.07</cell></row>
                        <row><cell/><cell>subj(LifeForm):pp.to(LifeForm)</cell><cell>0.05</cell></row>
                        <row><cell/><cell>subj(LifeForm):pp.to(Agent)</cell><cell>0.04</cell></row>
                        <row><cell>move</cell><cell>subj(PhysObject)</cell><cell>0.20</cell></row>
                        <row><cell/><cell>subj(PhysObject):adv</cell><cell>0.11</cell></row>
                        <row><cell/><cell>subj(Part)</cell><cell>0.09</cell></row>
                        <row><cell/><cell>subj(Group):adv</cell><cell>0.04</cell></row>
                        <row><cell/><cell>subj(Part):adv</cell><cell>0.04</cell></row>
                    </table>
                    <p>Parallel examples created by the latent class analysis present the clusters with the most probable verbs and frames, according to cluster membership (first column). The dot indicates whether the verb-frame combination was seen in the data, the number next to the verb/frame gives the probability of the verb/frame. Some verbs of Telling were clustered mainly according to their similar transitive use combined with an infinitival phrase:</p>
                    <p>[Figure: latent class cluster for verbs of Telling. Recoverable verb probabilities: advise 0.17, teach 0.12, instruct 0.12; the frame columns are not recoverable.]</p>
                    <p>The verbs of Aspect alternate between a subject only, realised by an action, an inanimate subject followed by an infinitival phrase, and a living subject followed by a gerund:</p>
                    <p>[Figure: latent class cluster for verbs of Aspect.]</p>
                    <p>Both approaches established a relationship between alternation behaviour and semantic class by only considering information about the syntactic usage of the subcategorisation frames. The refinement by the frames' selectional preferences allowed further demarcations by identifying conceptual restrictions on the use of the frames. Since the latent class analysis is a soft clustering method, it additionally distinguishes between the different verbs' senses and the respective uses of subcategorisation frames. For example, the verb play was clustered with meet because of the common strong tendency towards a transitive frame illustrating a general meeting, and it was clustered with fight because of their common preference for an intransitive frame together with a prepositional phrase headed by against, illustrating a more aggressive meeting like a fight:</p>
                    <p>[Figure: two latent class clusters; the recoverable labels are Cluster 5 and the verbs meet and play.]</p>
                    <p>An extensive investigation of the linguistic reliability of the clustered verbs and frames showed that the characterising usages could be underlined by corpus data. For example, the above cited transitive use of the verb fly concerning the subj:obj frame type with a living subject and an inanimate object can be illustrated by the BNC sentence In March the manufacturer's test pilot flew the aircraft for its annual inspection check flight. The clusters were therefore created on a reliable linguistic basis representing (a selective part of) the verbs' properties. Comparing the two informational versions, however, showed that refining the frames with selectional preferences points to a problem caused by data sparseness in the verb description. Investigating the automatically created distribution of the verbs over the enriched frame types revealed that, for example, even the high-frequency, alternating verb move contains 97% (smoothed) zeroes within its distribution. In accordance with this finding even subtle similarities, e.g. the sole fact that two verbs have non-zero values for certain frame types, strongly correlate the two verbs. For example, a semantic cluster contained the two verbs promise and love, because both have non-zero attribute values for the subj:to frame, demanding an agent for the subject slot; in their alternation behaviour (including selectional preferences) the two verbs differ, however, so they should not be packed into one cluster. A possible suggestion to handle the problem of data sparseness could be to formulate the conceptual class types in a way which ensures an increased data potential for each type.</p>
                    <p>Concerning the polysemy of verbs, the (hard) iterative distance clustering failed to model verb senses; a polysemous verb was either not assigned to any cluster at all, or assigned to a cluster describing one of the verb's senses. The (soft) latent class analysis was able to filter the multiple senses and assign them to distinct clusters, but tended to split senses. Low-frequency verbs presented another problem, because the verbs' distributions contained mostly zeroes. They were assigned to clusters nearly randomly.</p>
                    <p>An investigation of selected WordNet conceptual classes revealed that the selectional preferences within the subcategorisation frames were dominated by a few WordNet classes, mainly LifeForm and Agent. The demarcation between these two concepts was not obvious when referring to actually appearing nouns within the frames, since both contain a large number of common subordinated nouns. In contrast, some WordNet classes were not chosen at all, e.g. Unit or Anticipation. Since the WordNet hierarchy in general turned out to define intuitively correct selectional preferences, the classes used for my conceptual classification should be replaced by finer synsets, i.e. one should consider using a different cut through the WordNet hierarchy.</p>
                </div2>
                <div2>
                    <head xml:id="sec5">Conclusion</head>
                    <p>I proposed two algorithms for automatically classifying verbs semantically, based on their alternation behaviour. Taking Levin's classification as a standard for 153 manually chosen verbs with 226 verb senses and their assignment into 30 semantic classes, the iterative distance clustering succeeded for 61% of the verbs considering the syntactic usage of the frames only, and for 38% when adding information about the frame arguments' selectional preferences. The latent class analysis succeeded for 54% and 31%, respectively.</p>
                    <p>An investigation of the resulting clusters showed that the assignment of the verbs was actually based on their shared linguistic properties: the verbs in a cluster presented common alternation behaviour, refined by adding selectional preferences to the syntactic description of the subcategorisation frames.</p>
                    <p>It is impressive that as little lexical idiosyncratic verb information as the syntactic use of subcategorisation frames like subj:to or subj:pp.against suffices as a basis for a semantic class distinction towards Levin's narrow classification system including fine concepts such as Desire or Manner of Motion. The potential is partly characterised by specific frames, but in the majority of cases by successfully combining the frames in order to define the syntactic alternation. Improving the definition and demarcation of conceptual classes should provide further potential concerning the inclusion of selectional preferences into the syntactic description.</p>
                </div2>
                <div2>
                    <head>References</head>
                    <p>Richard Beckwith, Christiane Fellbaum, Derek Gross, and George A. Miller. 1991. WordNet: A Lexical Database Organized on Psycholinguistic Principles. In Uri Zernik, editor, Lexical Acquisition - Exploiting On-Line Resources to Build a Lexicon, chapter 9, pages 211-232. Lawrence Erlbaum Associates, Hillsdale, New Jersey.</p>
                    <p>Glenn Carroll and Mats Rooth. 1998. Valence Induction with a Head-Lexicalized PCFG. In Proceedings of the 3rd Conference on Empirical Methods in Natural Language Processing, Granada, Spain.</p>
                    <p>A. P. Dempster, N. M. Laird, and D. B. Rubin. 1977. Maximum Likelihood from Incomplete Data via the EM algorithm. Journal of the Royal Statistical Society, 39(B):1-38.</p>
                    <p>Bonnie J. Dorr and Doug Jones. 1996. Role of Word Sense Disambiguation in Lexical Acquisition: Predicting Semantics from Syntactic Cues. In Proceedings of the 16th International Conference on Computational Linguistics, Copenhagen.</p>
                    <p>John Hughes. 1994. Automatically Acquiring Classification of Words. Ph.D. thesis, University of Leeds, School of Computer Studies.</p>
                    <p>Douglas A. Jones, Robert C. Berwick, Franklin Cho, Zeeshan Khan, Karen T. Kohl, Naoyuki Nomura, Anand Radhakrishnan, Ulrich Sauerland, and Brian Ulicny. 1994. Verb Classes and Alternations in Bangla, German, English, and Korean. Technical Report MIT AI MEMO 1517, Massachusetts Institute of Technology.</p>
                    <p>Judith L. Klavans and Min-Yen Kan. 1998. The Role of Verbs in Document Analysis. In Proceedings of the 17th International Conference on Computational Linguistics, Montreal, Canada.</p>
                    <p>S. Kullback and R. A. Leibler. 1951. On Information and Sufficiency. Annals of Mathematical Statistics, 22:79-86.</p>
                    <p>Maria Lapata. 1999. Acquiring Lexical Generalizations from Corpora: A Case Study for Diathesis Alternations. In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, pages 397-404.</p>
                    <p>Beth Levin. 1993. English Verb Classes and Alternations. The University of Chicago Press, Chicago, 1st edition.</p>
                    <p>Malka Rappaport Hovav and Beth Levin. 1998. Building Verb Meanings. In M. Butt and W. Geuder, editors, Lexical and Compositional Factors, pages 97-134. CSLI Publications, Stanford, CA.</p>
                    <p>Philip Resnik. 1993. Selection and Information: A Class-Based Approach to Lexical Relationships. Ph.D. thesis, University of Pennsylvania.</p>
                    <p>Philip Resnik. 1997. Selectional Preference and Sense Disambiguation. In Proceedings of the ACL SIGLEX Workshop on Tagging Text with Lexical Semantics: Why, What, and How?</p>
                    <p>Francesc Ribas. 1994. An Experiment on Learning Appropriate Selectional Restrictions from a Parsed Corpus. In Proceedings of the 15th International Conference on Computational Linguistics, pages 769-774.</p>
                    <p>Francesc Ribas. 1995. On Learning More Appropriate Selectional Restrictions. In Proceedings of the 7th Conference of the European Chapter of the Association for Computational Linguistics, Dublin, Ireland.</p>
                    <p>Mats Rooth. 1998. Two-Dimensional Clusters in Grammatical Relations. In Inducing Lexicons with the EM Algorithm, AIMS Report 4(3). Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart.</p>
                    <p>Sabine Schulte im Walde. 1998. Automatic Semantic Classification of Verbs According to Their Alternation Behaviour. Master's thesis, Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart.</p>
                    <p>Suzanne Stevenson and Paola Merlo. 1999. Automatic Verb Classification Using Distributions of Grammatical Features. In Proceedings of the 9th Conference of the European Chapter of the Association for Computational Linguistics, pages 45-52.</p>
                </div2>
            </div1>
            <note n="1)." place="below">They skated along the canal/over the</note>
            <note n="1" place="below">For example, when considering the noun coffee isolated from its context, we do not know whether we are talking about the beverage coffee, the plant coffee or a coffee bean. Therefore, a third of the frequency of the noun was assigned to each of the three classes.</note>
            <note n="2" place="below">I chose the 11 top level nodes of the 11 WordNet hierarchies as conceptual classes. The top level node Entity seemed too general as conceptual class, so it was replaced by its 13 subordinated synsets.</note>
            <note n="4" place="below">For a more detailed discussion see the original work (Schulte im Walde, 1998).</note>
            <note n="5" place="below">Note that the (wrongly chosen) intransitive frame is listed as well. This is due to underlying sentences containing an NP ellipsis, parsing mistakes and frame extraction.</note>
            <note place="below">Figure residue, latent class cluster for verbs of Aspect (verb probabilities): start 0.34, finish 0.19, stop 0.18, begin 0.16.</note>
            <note place="below">Figure residue, latent class clusters for meet, fight and play (verb probabilities): first cluster 0.49 and 0.20 (verb labels not preserved); cluster 5: fight 0.22, play 0.20.</note>
        </body>
        <back/>
    </text>
</TEI>
