next up previous contents
Next: NLG Methodologies Up: Lexicogrammatical realization and its Previous: Grammar-based realization   Contents

Summary: interfacing with lexicogrammars

Whereas most NLG systems agree that there should be some component corresponding to the lexicogrammatical stratum described above, the components involving more abstract information vary greatly. The input to lexicogrammatical components must therefore also vary widely. One of the earliest lines of approach is to consider NLG as the conversion of a `conceptual' representation into its natural language equivalent (e.g., [Shapiro: 1979,Simmons and Slocum: 1972,Boyer and Lapalme: 1985,Iordanskaja, Kittredge and Polguère: 1991,Nogier: 1991]). Such conceptual representations are usually networks analogous to the semantic networks of Artificial Intelligence. Finding natural language re-expressions for information drawn from such semantics involves a wide range of issues, including determining which areas of the network are to be expressed in single sentences, handling the transitions between different areas of the network, and converting subportions of the selected network areas into grammatical and lexical expressions. This task shares many similarities with that of interfacing with application knowledge described above--the only difference is that the semantic levels of representation are maintained by the NLG system among its own internal levels of representation. Since such levels of representation can be motivated on linguistic grounds and serve the function of helping the NLG system achieve its goals, more general and theoretically motivated mappings between levels can be investigated here than is the case with application/domain levels.

One significant direction here draws on the possibilities that a language provides for `covering' an area of the semantic network. This can support very flexible realization and is sometimes motivated on the grounds that inputs to a generator are unlikely to come already packaged in sentence-sized `chunks'; the generator itself must therefore be able to select appropriately sized areas for expression. A recent development in this direction removes entirely the otherwise rather common requirement that input specifications to lexicogrammatical components should already delimit the `content' to be realized lexicogrammatically. Nicolov [Nicolov, Mellish and Ritchie: 1996] has formulated in detail an algorithm by which a semantic graph is successively covered by generated syntactic units, so that the content of any particular syntactic unit can vary as required by the details of the lexicogrammar and of the communicative intent.
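The basic covering idea can be illustrated with a minimal sketch. The graph encoding, the node names, and the `propose_unit` rule below are all invented for illustration; this is not Nicolov's actual algorithm, only the general shape of successively claiming areas of a semantic graph for syntactic units:

```python
# Minimal sketch: covering a semantic graph with syntactic units.
# A real system would let the lexicogrammar decide how much of the
# graph each unit claims; here a unit simply claims a node plus its
# still-uncovered immediate dependents.

SEMANTIC_GRAPH = {
    "sell-1": {"actor": "store-1", "goods": "book-1"},
    "store-1": {"name-rel": "linguistics-1"},
    "book-1": {},
    "linguistics-1": {},
}

def propose_unit(node, graph, covered):
    """Claim `node` and its uncovered dependents as one syntactic unit."""
    span = {node}
    span.update(d for d in graph[node].values() if d not in covered)
    return span

def cover(graph):
    covered, units = set(), []
    for node in graph:                     # deterministic traversal order
        if node in covered:
            continue
        span = propose_unit(node, graph, covered)
        units.append(sorted(span))
        covered |= span
    return units

print(cover(SEMANTIC_GRAPH))
```

The point of the sketch is only that sentence-sized `chunks' fall out of the covering process itself rather than being stipulated in the input.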

A recurring strategy in a different tradition of relating levels of representation relies on the use of a `discrimination network'. This approach typically associates with each concept class a discrimination network that distinguishes between possible linguistic realizations of that class. Probably the first use of such a method was Goldman's [Goldman: 1975] lexical selection procedure for actions in semantic representations drawn in terms of Schank's conceptual dependency framework [Schank: 1972]. A similar approach, extended to apply to further concept classes, was adopted in the German generator VIE-GEN [Buchberger, Steinacker, Trappl, Trost and Leinfellner: 1982,Buchberger and Horacek: 1988]. This can now be re-interpreted in terms of the object-oriented programming paradigm: the discrimination nets in fact represent hierarchies of `realization methods' that are applied when particular concepts are to be expressed; this brings out certain similarities with the use of an Upper Model described above. However, despite occasional proposals that the grammatical realizational possibilities be `folded into' the `conceptual' or `semantic' representations in order to achieve a more economical representation, this is rarely beneficial. The organizations of lexicogrammar and semantics are sufficiently distinct, and the relation between them sufficiently complex, that more sophisticated treatments are required. A pure conceptual inheritance organization is not appropriate for lexicogrammatical organization.
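The object-oriented re-interpretation can be sketched as follows. The concept classes, the discrimination tests, and the candidate verbs are invented examples (they correspond to neither Goldman's nor VIE-GEN's actual inventories); the sketch shows only how a discrimination net becomes a hierarchy of realization methods:

```python
# Sketch: a discrimination net recast as realization methods on a
# concept-class hierarchy. Subclassing mirrors deeper branches of the net.

class Transfer:
    """Hypothetical concept class for transfer events."""
    def __init__(self, actor, obj, recipient, by_vehicle=False):
        self.actor, self.obj, self.recipient = actor, obj, recipient
        self.by_vehicle = by_vehicle

    def realize_verb(self):
        # Discrimination: successive tests select among candidate verbs.
        if self.by_vehicle:
            return "drive"
        return "give"

class InformationTransfer(Transfer):
    # A subclass overrides the realization method, just as a deeper
    # branch of the discrimination net would refine the choice.
    def realize_verb(self):
        return "tell"

print(Transfer("Kim", "book", "Lee").realize_verb())
print(Transfer("Kim", "Lee", "town", by_vehicle=True).realize_verb())
print(InformationTransfer("Kim", "news", "Lee").realize_verb())
```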

A different line of development going back to discrimination nets is the `chooser and inquiry' semantics approach adopted in the Penman text generation system (e.g. [Matthiessen: 1981,Mann: 1983a]). Here discrimination procedures (the `choosers') are no longer associated with concepts, but with grammatical alternations. The chooser thus constitutes a `grammatical choice expert' that is responsible for investigating whatever knowledge sources are necessary in order to select a situationally appropriate grammatical alternative. This approach is natural where grammatical resources are organized in terms similar to an inheritance hierarchy, as is the case with the systemic grammars for which choosers were developed, and generalizes earlier approaches such as the `syntax specialists' of Hovy [Hovy: 1988a] and McKeown's [McKeown: 1985] attachment of focus conditions to grammar rules. One principal motivation for the chooser approach was modularity: by interposing choice experts between knowledge representations and the grammatical component, it was possible to develop and maintain the grammar independently of the details of any particular knowledge organization. Such an approach is still strongly motivated in areas where the semantic representation remains less well understood--which includes most non-ideational areas. Indeed, whereas the semantic representation of the Penman system was non-existent when work on the grammar began, by 1986 there was a large-scale semantic organization (the Upper Model introduced above), which could then be integrated into the system without changes to the, by then very substantial, grammar. The chooser and inquiry interface then provided a natural home for flexible mappings between SPL semantic input specifications and the lexicogrammar. The availability of non-ideational chooser and inquiry categories (cf. Figure 2) also readily supported an enrichment of the inputs to the lexicogrammar so as to minimize the negative consequences of employing a fixed pipeline-style temporal ordering between the semantic and lexicogrammatical modules.
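A chooser for a single grammatical system can be sketched as below. The inquiry and system names are loosely modelled on Penman style, but the knowledge-base interface and all data are invented for illustration; the sketch shows only the division of labour, where the chooser poses inquiries rather than inspecting either the grammar's or the knowledge base's internals directly:

```python
# Sketch: a 'chooser' attached to a hypothetical NUMBER system.

def multiplicity_q(entity, kb):
    """Inquiry: does the semantics present this entity as one or many?"""
    return "multiple" if kb.get(entity, {}).get("count", 1) > 1 else "unitary"

def number_chooser(entity, kb):
    """Choice expert for NUMBER: selects the grammatical feature
    'singular' or 'plural' purely by posing inquiries."""
    return "plural" if multiplicity_q(entity, kb) == "multiple" else "singular"

KB = {"dog-1": {"count": 3}, "cat-1": {"count": 1}}
print(number_chooser("dog-1", KB))
print(number_chooser("cat-1", KB))
```

Because the grammar sees only the returned feature and the knowledge base sees only the inquiry, either side can be replaced without changes to the other, which is exactly the modularity argument made above.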

With increased understanding of the complexities of the semantics-lexicogrammar mapping, the chooser and inquiry interface is now itself a candidate for replacement: for example, by re-implementation within a modern typed unification formalism (e.g. [Bateman, Emele and Momma: 1992]), or by different mechanisms entirely (e.g., [O'Donnell: 1994,Zeng: 1995]). In each case the change can be undertaken without changes to grammar or semantics. It is partly this kind of large-scale modularity that has allowed continuous theoretical and descriptive development within the Penman-style model from 1980 until the present time: a degree of longevity extremely rare for large-scale systems in computational linguistics as a whole.

A further set of problems is raised by the consideration of interactions with text planning. It is not straightforward to guarantee that the results of text planning are immediately commensurate with the requirements of a level of semantic representation or of lexicogrammar. One proposal here is that of Meteer [Meteer: 1992], where a particular kind of Text Structure is proposed that mediates between application and surface generator and which is so constructed as to guarantee that the surface generator is able to create an appropriate surface expression. This also serves as a style of input enrichment whereby the possible negative consequences of fixed sequencing of the decisions from distinct modules are minimized. Meteer argues for such an enrichment on the basis of the observation that speakers are generally fluent and therefore must have techniques for creating texts on-the-fly without backtracking or expensive interaction between modules. The text structure allows the deterministic construction of appropriate sentence structures, and permits equivalences to be defined between sequences of sentences and sentence-internal realization possibilities: thereby easing the transition from text organization to sentence realization. One price paid for using this form of text structure is that the text planner must do more work; the specification of the text structure possibilities also requires fine attention to the lexicogrammatical possibilities offered by a language. While Meteer suggests that her text structure is an alternative to rhetorical organization, how this might work is actually unclear; a combination of approaches would probably be more beneficial.

The relation between lexicogrammar and semantics constitutes a very general problem requiring a properly general solution. Special-case generators might operate from `domain knowledge' of the kind shown in Figure 5, whereas general generators require more linguistically motivated specifications, ranging from semantic specifications (Penman/KPML) to strongly grammatically influenced partial structures (MUMBLE and REALPRO). In general, the less grammatical information an input contains (i.e., the more `abstract' the input), the easier it is for the rest of the generation system to produce that input, but the more work the grammatical component must do to convert it into surface strings. Relating a semantic specification to a lexicogrammatical specification can remain straightforward as long as the `input' specification (i.e., in this case, the semantics) includes enough constraints concerning the desired constituent structure of the result to be generated. An example would be if a semantic event always mapped into a grammatical clause, a semantic object into a grammatical nominal group, a semantic quality into a grammatical adjective, and so on. This is a valid simplification for certain text types, but not for language in general.
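The simplification can be made concrete with a minimal sketch (the category names are illustrative and belong to no particular system):

```python
# Sketch: the assumption that semantic categories map one-to-one
# onto grammatical units.

NAIVE_MAP = {
    "event":   "clause",
    "object":  "nominal-group",
    "quality": "adjective",
}

def realize_category(semantic_type):
    """Look up the grammatical unit for a semantic category."""
    return NAIVE_MAP[semantic_type]

print(realize_category("event"))
print(realize_category("quality"))

# The table breaks down for language in general: an event may also
# surface as a nominal group ('the destruction of the city'), a
# realization that no fixed one-to-one mapping can capture.
```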

Grammatical realization components which do not have a particularly high degree of abstraction in their input (e.g., MUMBLE or Surge/FUF) require other components to mediate between application or domain levels of abstraction and their inputs. Extensive components of this kind have been developed by Meteer [Meteer: 1992] for MUMBLE and by Elhadad and colleagues [Elhadad, McKeown and Robin: 1996] for Surge/FUF. These components fall within the area now commonly termed microplanning, following the labeling, if not the theory, of language production set out by Levelt [Levelt: 1989]. Microplanning as an area has developed in response to the increasing recognition given to the problems raised by the gap between semantic and lexicogrammatical descriptions. It is a necessary extension to the simple strategy-tactics division, which does not call sufficient attention to the complexity of the required mapping. There are not yet standardized approaches to the microplanning problem, and differing components are difficult to compare. Many of the apparent differences between these components are attributable to differences in the research emphases of the researchers concerned. The addition to Surge/FUF (called the Lexical Chooser) mostly serves to facilitate fine lexical choices involving collocational constraints with substantial lexicogrammatical consequences. The Text Structure approach of Meteer facilitates the deterministic representation of realization alternatives that span diverse grammatical `ranks' (i.e., combinations of sentences, combinations of clauses, individual clauses, nominal groups, etc.). The chooser and inquiry approach is mostly motivated by the desire to guarantee the mutual independence of the lexicogrammatical and the semantic levels of representation: one kind of organization cannot, in general, be subordinated to the other, and so their respective modularity must be maintained.

Despite these differences, there are many theoretical overlaps, and building on the commonalities would undoubtedly advance the state of the art in NLG significantly. A suggestive line of development would be to combine the broad functional semantics of an SPL-like input form from Penman with a metafunctionally refined version of Meteer's Text Structure and a more formally well-developed approach to discourse structure, such as Asher's [Asher: 1993] `Segmented Discourse Representation Theory'. Partial first steps towards such a synthesis have been taken in work that enriches Meteer's Text Structure so as to distinguish the contributions of the various metafunctions more effectively [Panaget: 1994,Huang and Fiedler: 1996], and, for example, in combinations of systemic generation and Discourse Representation Theory (DRT) [Gagnon and Lapalme: 1996]. This would be helped further by ongoing moves within semantic representations (particularly motivated by multilingual generation and machine translation applications, where lexicogrammatical divergence between possible realizations has received the most study: [Bateman: 1992b,Copestake, Flickinger, Malouf, Riehemann and Sag: 1995]) to decommit more effectively from lexicogrammatical issues by employing various kinds of underspecification. This improves modularity, simplifies text planning, and provides a more accurate representation of the linguistic stratification involved.

Finally, we should note that there is continuing debate in NLG concerning how and when words get selected during generation; there are several overviews of the treatment of the `lexicon' and `lexical choice' in NLG [Cumming: 1995,Stede: 1995,Wanner: 1996,Zock: 1996]. Many examples of `lexical entries' from a variety of systems--including some NLG systems--are given by Ingria [Ingria: 1995]. Although earlier generation work posited a `lexical choice module' responsible for this activity, we have not separated it out here as a task in its own right, since word choice depends on domain classifications, situation, context of use, interpersonal relationships, collocational patterns in a language, `underlying' metaphor systems in a language/culture, grammatical structures, technicality, and much else besides. Trying to construct a `module' to do this is poor engineering (among other things), since the complexity of the module is equivalent to that of the task being attacked as a whole: generating the correct word turns out to be scarcely more straightforward than generating an appropriate text. Lexical selection is thus not so much a `module' as a label for a collection of different activities, situated at different places in an NLG system, that must be coordinated appropriately.

More recent work in this area reflects this realization by discussing the influences on lexical selections rather than a single component with this functionality (cf. [Wanner: 1996]). In addition, while lexical organization is no doubt useful for analysis, providing a word-driven index, it is less clear to what extent this is useful for generation. Particular cases where `words' (or their immediate conceptual correlates) can be used to populate the domain or pre-generation data structures (cf., e.g., [McDonald: 1992]) would be appropriate target applications. The theoretical consequences of applying so-called `lexicalized' frameworks, as are common in parsing and natural language understanding, to NLG problems in grammar [Carroll, Copestake, Flickinger and Poznanski: 1999,Nicolov and Mellish: 1999] and discourse [Webber, Knott, Stone and Joshi: 1999] also need further clarification.


bateman 2002-09-21