<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:ns2="http://www.tei-c.org/ns/Examples">
    <text>
        <front/>
        <body>
            <div>
                <p>TRADITIONAL MEANS IN MACHINE TRANSLATION</p>
                <p>Zdeněk KIRSCHNER</p>
                <p>Matematicko-fyzikální fakulta UK, Malostranské náměstí 25, 118 00 Praha 1, Czechoslovakia</p>
                <p>Abstract: The chronic problems of machine translation cannot be solved in a fully automatic way. Human intervention is inevitable. The development of &quot;traditional&quot; means, in connexion with advances in computer technology, represents the most substantial contribution to further progress in the field of machine translation. Some of the problems are illustrated using the example of the APAC32 project.</p>
                <p>1. The hopes for a successful solution of the chronic problems of machine translation (MT) have long been set on two fruitful and mutually dependent prospects: research in artificial intelligence (AI) and advances in computing technology. The importance of the latter contribution is beyond dispute. As regards the former domain, some reservations must be voiced.</p>
                <p>1.1. It can be stated with some tolerance that the missing information required for automatic understanding (or disambiguation) of natural language (NL) is supposed to be supplied by a computer model of the knowledge corresponding to the universe of discourse. The context of the analysed message constitutes an important part of this universe. Therefore, an essential component of such a model must draw on the texts processed. Thus, irrespective of the contingent form, organisation, etc., of the whole, the model would at least partially depend on the results of the analysis for which it is supposed to provide the necessary information. This means that circularity is imminent. Even if the almost inevitable occurrence of elements not covered by any device in the system is disregarded, it is obvious that the model can be neither complete nor consistent.</p>
                <p>1.2. Since there will always remain threats of failure caused not by accidental factors but by the intrinsic inadequacy of any system of MT, human intervention is inevitable, and the ideal of &quot;fully automatic high quality translation&quot; (FAHQT) (which, we suspect, is no longer believed ever to come true anyway) is unattainable. While not denying the potential merits of the contribution of AI, the above discussion should suggest that the development of means called &quot;traditional&quot; is equally important for MT.</p>
                <p>2. An example of an approach based on such means is our experimental system APAC32.</p>
                <p>If we refer to our system here, it is not to boast that we have achieved any extraordinary success or that we have long duly appraised the above conclusions and reacted to them in an original way, etc. It is only to illustrate our conviction that there is still a fairly wide and long path open ahead of us within the confines of the traditional means. To tell the truth, it has been our material situation that forced us to rely exclusively on them and to dispense with anything more sophisticated. This had to be said to clear us of the suspicion that we are making a virtue of necessity.</p>
                <p>2.1. APAC32 is a descendant of the Montreal TAUM series. It has been implemented on computers of the IBM 370 type to translate English abstracts in microelectronics and, later, pumping machinery into Czech. Using Colmerauer's Q-systems, the main part of the program builds linearized rooted-tree-like structures, which stepwise identify and interpret elements or groups of elements of the input units, stating their character and function, dependency relations and position in the sentential context. Strings with multiple interpretations which have not been eliminated are represented by parallel structures, giving multiple parses in the final stage of the analysis, but not necessarily multiple translations at the final output.</p>
                <p>Basic or fully accomplished structures, which resemble predicate calculus patterns, have a finite verb at their root and individual participants in dependent positions. The sense (direction to the left or right) of an oriented edge (an arrow) representing a dependency relation - a piece of information pertaining to the mutual projective position of the incident nodes - as well as the function of a dependent participant are indicated in a way that simulates the marking of edges in a graph.</p>
                <p>The synthesis starts by disintegrating the structures that result from the analysis. At the output of this stage, relatively simple trees representing individual words appear, with all the information necessary for generating forms of the target language. This proceeds in steps in which additional target-language-specific information occasionally has to be derived to render the synthesized structures complete and acceptable. Such adjustments are usually connected with the operations of transfer: while the action of its general rules mostly coincides with the opening phases of the synthesis, the information concerning the particular changes is contained in the dictionaries to be exploited in the concluding parts of the program.</p>
                <p>The absence of any accomplished model of the universe of discourse and the temporary abandonment (for technical reasons) of any device allowing the involvement of hypersentential context in the analysis have, of course, endowed the system with a typical probabilistic character. In this connexion, especially the tactics occasionally referred to as &quot;preferential&quot; must be mentioned: some rules are applied repeatedly in subsequent stages, each time with less rigid conditions.</p>
                <p>The combinatorial power of the Q-systems had to be reduced by introducing several stages - partial grammars - operating before the syntactic analysis proper. Thus, e.g., a (partial) analysis of nominal complexes precedes that of verbal structures. Therefore, a special device registers schematically the context of each element in the sentence.</p>
                <p>2.3. In simulating the functions of a model of the universe of discourse, the system of dictionaries represents the most important tool.</p>
                <p>2.3.1. The basic dictionary information in APAC32 is a complex which consists of two main parts: information concerning the source language and that pertaining to the target language. These structures can be separated; they have been put together whenever possible with respect to the efficiency of the system. The internal structure of both these parts is almost the same and can be briefly described as follows: categorial information, lexical value, paradigmatic information, pointers to parallel meanings, valency frame, combinatory frame (prepositional, phrasal, special-liaison, etc., patterns), terminological specifications, special syntactic information, semantic features. Extensive though this apparatus may be, it should be stated that there are still possibilities - and a need, of course - to add further data. For lack of space, let us confine ourselves to three points only.</p>
                <p>2.3.1.1. The apparatus of semantic features consists of four classes of features: a) features concerning the text vs. metatext structure, b) general semantic features, c) domain specific features, and d) features concerning terminological status. The number of features is limited for reasons of which the most important is that excessive detailedness leads to unwanted rigidity. However, a number of potentially very useful candidate features can be added. Assigning weights to features might be a solution to this dilemma, especially in the framework of the &quot;preferential&quot; tactics.</p>
                <p>2.3.1.2. Some classes of words have been further classified to highlight their intrinsic properties in the translation environment. E.g., a special classification of verbs makes it possible to solve, at least in part, the problems of aspect in Czech in relation to English verbal adjectives (-ED, -ING forms). Much more can be done in this direction. Unfortunately, this will imply extensive empirical work including excerption and, if possible, organization of a usage-panel-like inquiry.</p>
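                <p>As a rough illustration of how weighted semantic features might serve the &quot;preferential&quot; tactics (the same rule tried again in later stages under less rigid conditions), consider the following Python sketch. It is not the APAC32 implementation, which is written in Q-systems; the feature labels, weights, thresholds and function names are all hypothetical and chosen only for the example.</p>
                <p>
```python
# Hypothetical sketch: one rule retried over successive stages, each stage
# demanding a lower weighted feature match. Not the APAC32 implementation.

# Assumed feature weights (illustrative values only).
WEIGHTS = {
    "domain:microelectronics": 3.0,
    "term:established": 2.0,
    "general:concrete": 1.0,
}

# Each later stage relaxes the condition by lowering the required score.
STAGE_THRESHOLDS = [5.0, 3.0, 1.0]


def score(required, present):
    """Sum the weights of the required features that the candidate carries."""
    return sum(WEIGHTS.get(f, 0.0) for f in required if f in present)


def apply_preferentially(required, candidates):
    """Return (stage, name) for the first stage at which some candidate passes.

    candidates maps the name of a reading to the set of features it carries.
    Returns None if no candidate passes even the most relaxed stage.
    """
    for stage, threshold in enumerate(STAGE_THRESHOLDS, start=1):
        passing = [name for name, feats in candidates.items()
                   if score(required, feats) >= threshold]
        if passing:
            # Among the readings that pass, prefer the best-scoring one.
            best = max(passing, key=lambda n: score(required, candidates[n]))
            return stage, best
    return None


required = ["domain:microelectronics", "term:established", "general:concrete"]
readings = {
    "chip (electronics)": {"domain:microelectronics", "general:concrete"},
    "chip (fragment)": {"general:concrete"},
}
result = apply_preferentially(required, readings)
print(result)
```
                </p>
                <p>With these invented values no reading satisfies the strict first stage, and the electronics reading is accepted at the second, relaxed stage. Weights of this kind would allow more candidate features to be added without making the first-stage conditions more rigid.</p>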
                <p>2.3.1.3. As concerns combinatory frames, more information will also be added on the possibilities of adverbial modification of nouns. Some changes and additions to the present organisation and contents of the dictionary entries are considered with a view to structures suggested in the Mel'čuk-Apresyan &quot;meaning - text&quot; model.</p>
                <p>2.3.2. A specific dictionary device has been introduced in the terminological section of the dictionary system. Special rules control, or rather guide, the analysis of terminological complexes, making it possible to resolve frequent ambiguous structures (e.g., INTEGRATED CIRCUIT SYSTEM as ((INTEGRATED CIRCUIT) SYSTEM) rather than (INTEGRATED (CIRCUIT SYSTEM))).</p>
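                <p>The effect of such terminological guidance on bracketing can be sketched in Python as follows. The small term list and the function names are assumptions made for illustration only; the actual APAC32 rules are not reproduced here.</p>
                <p>
```python
# Hypothetical sketch of dictionary-guided bracketing of noun complexes.
# KNOWN_TERMS is an invented stand-in for a terminological dictionary.

KNOWN_TERMS = {
    ("INTEGRATED", "CIRCUIT"),
    (("INTEGRATED", "CIRCUIT"), "SYSTEM"),  # a term built on another term
}


def bracket(tokens):
    """Group adjacent pairs listed in KNOWN_TERMS, leftmost first, until stable.

    Grouped pairs become tuples, so an already recognized term can itself
    take part in a larger term (recursive application).
    """
    items = list(tokens)
    changed = True
    while changed:
        changed = False
        for i in range(len(items) - 1):
            pair = (items[i], items[i + 1])
            if pair in KNOWN_TERMS:
                items[i:i + 2] = [pair]
                changed = True
                break
    return items


def show(item):
    """Render nested tuples in the parenthesized notation used in the text."""
    if isinstance(item, tuple):
        return "(" + " ".join(show(x) for x in item) + ")"
    return item


parsed = bracket(["INTEGRATED", "CIRCUIT", "SYSTEM"])
print(" ".join(show(x) for x in parsed))
```
                </p>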
                <p>In this way a partial quasi-model of the specific domain can be formed whose elements are capable of recursive application to new combinations.</p>
                <p>2.3.3. Another dictionary device, the so-called transducing dictionary (TD), deals with unrecognized elements. TD relies on derivational morphology, assigning categorial information and, in some cases, semantic status and other information to words hitherto &quot;unknown&quot; to the system, on the basis of their endings (e.g., -ING, -ED, -ESS, -ITY, -ION, -LY, -WISE, -FY, etc.); for some of them even successful adaptation to the target language is possible. The remaining unrecognized elements are regarded as nouns: first as proper and then, if this fails to be confirmed, as common. A more versatile practice is planned, which will take other possible interpretations into consideration as well.</p>
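                <p>A suffix-driven guess of this kind can be sketched in Python as below. The paper lists the endings but not the categories assigned to them, so the suffix-to-category table is an assumption chosen for illustration. The capitalization test is likewise only a stand-in for whatever confirmation of proper nouns the system performs.</p>
                <p>
```python
# Hypothetical sketch of a "transducing dictionary": categories are guessed
# from word endings. The suffix table below is an illustrative assumption.

SUFFIX_CATEGORIES = [
    # Longer endings first, so that -WISE is tried before shorter ones.
    ("WISE", "adverb"),
    ("ING", "verbal adjective or gerund"),
    ("ESS", "noun"),
    ("ITY", "noun"),
    ("ION", "noun"),
    ("ED", "verbal adjective or past form"),
    ("LY", "adverb"),
]


def guess_category(word, looks_proper=str.istitle):
    """Guess a category for a word missing from the main dictionary.

    looks_proper is an assumed stand-in for the confirmation step: the word
    is first tried as a proper noun and, failing that, taken as a common noun.
    """
    upper = word.upper()
    for suffix, category in SUFFIX_CATEGORIES:
        # Require some stem before the ending, so "ED" alone is not matched.
        if upper.endswith(suffix) and len(upper) > len(suffix) + 1:
            return category
    return "proper noun" if looks_proper(word) else "common noun"


for w in ["clockwise", "resistivity", "Czochralski", "wafer"]:
    print(w, "->", guess_category(w))
```
                </p>
                <p>A more versatile variant could return several ranked candidate categories instead of a single one.</p>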
                <p>2.3.4. TD, as well as some other devices and rules, can also be regarded as special fail-soft measures, though another component called &quot;emergency rules&quot; is included which performs this function as a specialized set of rules designed to reconstruct, complete or integrate into a (would-be) meaningful whole those structures that failed to reach the stage of an accomplished parse. In some respects, the role of such measures is problematic in relation to human intervention. Our system offers possibilities to introduce a special diagnostic device to recognize and classify the symptoms of a failure, so that more than the present simple marking of &quot;suspicious&quot; or &quot;underdone&quot; outputs can be presented to aid post-editing.</p>
                <p>2.4. Ambiguities are treated in the usual way. It should be pointed out that in translation between the languages in question, the principles of agreement so widely applied in Czech unmercifully reduce the chances of getting over some types of unsolved ambiguities in an &quot;imperceptible&quot;, i.e., accidental, way. These principles, as a rule, obstinately insist upon rendering implicit information explicit. That is why in some cases structures with ambiguous reference are translated by equivalents that are equally ambiguous or vague. E.g., with some classes of verbs, (clausal) participial modification with ambiguous dependence is replaced by prepositional or other constructions without any direct dependence: e.g., USING → WITH USING, CAUSING → WHICH (referring to the whole of the preceding or pertinent clause) CAUSES, etc.</p>
                <p>2.5. This concerns also contrastive ambiguities and other asymmetrical relations between the two languages. In this connexion, it should be pointed out that one of the criteria for the classification of English verbs is the classification of their Czech counterparts. Thus, e.g., the verb SUPPOSE must be assigned information that the construction SOMEONE IS SUPPOSED TO... must be transformed to IT IS SUPPOSED (ABOUT SOMEONE) THAT SOMEONE... to make it correspond to the structure acceptable in Czech. Similarly, constructions like SEAT SAT ON BY... must be transformed with the aid of corresponding relative clauses. Much remains to be done for the domain of conversion. Its productive aspects pose serious problems.</p>
                <p>3. To come back to the opening paragraphs: the advances of computer technology, while not offering an ultimate solution to the problems detrimental to the efforts to achieve the ideals of FAHQT, will undoubtedly liberate MT from the curse entailed by its usually more or less immediate subservience to various practical applications - the strict limitations of computer time and storage - which so often represented the only obstacles to introducing many a useful and, sometimes, even very necessary device, process or approach. Most of the prospective extensions, innovations and other changes require profound empirical examination and more linguistic field-work than we have been able to expend up to now.</p>
                <p>Kirschner, Z. (1982) A Dependency-Based Analysis of English for the Purpose of Machine Translation. Praha, Matematicko-fyzikální fakulta UK.</p>
                <p>Kirschner, Z. (1987) APAC3-2: An English-to-Czech Machine Translation System. Praha, Matematicko-fyzikální fakulta UK.</p>
            </div>
        </body>
        <back/>
    </text>
</TEI>
