
Natural language generation: Setting the scene

The concrete task of NLG can be characterized very simply: representations of some information maintained in some computationally accessible form are taken as input and appropriate natural language re-expressions of that information are produced as output. From the perspective of NLG we can speak of an `application system' that requires natural language output of some kind, and the `generation system' that is responsible for producing that output. The originating information is the responsibility of the application system; the final natural language produced is the responsibility of the generation system. Thus, NLG concerns itself with the construction of systems, of modules of such systems, or of theories for such systems.
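The division of labour just described can be illustrated with a minimal sketch, assuming the simplest kind of input, a small numeric record, and a single fixed sentence pattern. The names `WeatherReading` and `generate_report` and all of the values are hypothetical and chosen only for illustration; they do not correspond to any of the systems discussed in this review.

```python
from dataclasses import dataclass


@dataclass
class WeatherReading:
    """Hypothetical application-side record: minimally structured numeric data."""
    region: str
    high_celsius: float
    low_celsius: float
    rain_probability: float  # between 0.0 and 1.0


def generate_report(reading: WeatherReading) -> str:
    """Toy 'generation system': re-express the reading as one English sentence."""
    if reading.rain_probability >= 0.5:
        outlook = "rain is likely"
    else:
        outlook = "rain is unlikely"
    return (
        f"In {reading.region}, temperatures will range from "
        f"{reading.low_celsius:.0f} to {reading.high_celsius:.0f} degrees Celsius, "
        f"and {outlook}."
    )


# The application system owns the data; the generation system owns the wording.
reading = WeatherReading(
    "the coastal region", high_celsius=18.2, low_celsius=9.6, rain_probability=0.7
)
report = generate_report(reading)
print(report)
```

Even in so small a sketch the generation side makes decisions that the data alone do not determine: the threshold at which rain counts as `likely', and the rounding of the temperatures.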

Generation systems are accordingly often described as `natural language front-ends' for application systems. However, this simplicity dissolves somewhat when we consider the extreme variability in both the forms of the required (or available) `input' and those of the desired `output'. For example, the `computer-internal information' to be treated as input can be minimally structured numeric data--e.g., for weather reports [Kittredge, Polguère and Goldberg: 1986], statistical reports [Iordanskaja, Kittredge, Lavoie and Polguère: 1992; Fasciano and Lapalme: 1996], etc.; the contents of databases [Androutsopoulos, Ritchie and Thanisch: 1995; Dale, Green, Milosavljevic, Paris, Verspoor and Williams: 1998]; the results of natural language analysis components--as in, e.g., machine translation [Hutchins and Somers: 1992, Chapter 7]; internal representations derived from a dialogue model--as in natural language interfaces with meta-information concerning the dialogue or task at hand [Guida and Tasso: 1983; Dilley, Bateman, Thiel and Tissen: 1992; Haller and Shapiro: 1996; Jokinen: 1996; Hagen and Stein: 1996]; knowledge bases such as are maintained by, for example, expert systems [Moore and Swartout: 1991] or information systems [Horacek: 1990; Dobeš and Novak: 1991; Bateman and Teich: 1995]; representations of visual scenes produced by and for robotic visual processing [Herzog, Sung, André, Enkelmann, Nagel, Rist, Wahlster and Zimmermann: 1989]; and many more. The places where the responsibilities of an application system end and those of an NLG system start thus appear different in different contexts of use.

This is symptomatic of one of the most problematic areas of discussion in all of NLG: in contrast to parsing, where it is generally assumed that the input is some string of characters or list of presumed recognized phonemes, in generation there is no established general, externally motivated source of generation. Different systems can have quite different views of what is to function as their input, and discussions of some systems might not even mention `input' at all. Formal approaches to generation often assume the input to be a `logical form' representing the semantic content of a sentence (cf. [van Noord and Neumann: 1997]), whereas other systems simply take the output or internal representations of their intended application systems as input, whatever those might be; both of these kinds of input can, as we shall see, lie a considerable distance away from the information that is necessary to produce appropriate surface strings. The input for a system may also be constructed piecemeal `on-the-fly' as required during so-called `incremental' generation (cf., e.g., [de Smedt: 1990; Reithinger: 1991; Harbusch, Finkler and Schauder: 1991; Abb, Günther, Herweg, Lebeth, Maienborn and Schopp: 1996]), which in extreme cases may mean that an `input' does not even exist as an isolable level of representation in a system. Given this range of variation, we need to provide an abstract specification for the range of possibilities so that we are able to evaluate particular cases and situations within that range as they occur.
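The idea that the input may arrive piecemeal can be made concrete with a deliberately simplified sketch. It assumes that content reaches the generator as a stream of (role, value) fragments and that each fragment can be worded on its own, in arrival order. The roles, phrase patterns and function names are hypothetical illustrations and are not taken from any of the incremental generators cited above.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical input fragments: (role, value) pairs that arrive one at a time.
Fragment = Tuple[str, str]


def incremental_generate(fragments: Iterable[Fragment]) -> Iterator[str]:
    """Yield a piece of output for each fragment as soon as it arrives.

    No complete input structure is ever assembled, so there is no single
    'input' level of representation to point to.
    """
    templates = {
        "agent": "{}",
        "action": "is {}",
        "object": "the {}",
        "location": "in the {}",
    }
    for role, value in fragments:
        template = templates.get(role)
        if template is None:
            continue  # this toy version silently skips roles it does not know
        yield template.format(value)


def fragment_stream() -> Iterator[Fragment]:
    """Stand-in for an application that produces content piecemeal."""
    yield ("agent", "the robot")
    yield ("action", "moving")
    yield ("object", "box")
    yield ("location", "hallway")


# Collected here only for display; a consumer could emit each piece on arrival.
words = list(incremental_generate(fragment_stream()))
sentence = " ".join(words).capitalize() + "."
print(sentence)
```

The sketch works only because the fragments happen to arrive in an order that suits English word order. Deciding what can safely be uttered before later content is known is exactly the kind of problem that a toy of this kind leaves out.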

The situation is similar when we consider the desired outputs, i.e., the `texts' to be generated. Such texts may be written (the majority) or spoken (cf. [Danlos, Laporte and Emerard: 1986; Fawcett: 1990; Abb, Günther, Herweg, Lebeth, Maienborn and Schopp: 1996; Grote, Hagen and Teich: 1996; Kay, Gawron and Norvig: 1994]); they may be extended in length [Granville: 1994], be single paragraphs (the majority), or consist instead of single phrases or sentences--as often in database responses, diagram caption generation (e.g., [Mittal, Roth, Moore, Mattis and Carenini: 1995]), machine translation, etc.; they may end up as simple character strings (the majority) or invoke more sophisticated text formatting, punctuation or page layout (e.g., [Sefton: 1990; Hovy and Arens: 1991; White: 1995; Pascual: 1996; Bouayad-Agha, Scott and Power: 1996; Bateman, Kamps, Kleinz and Reichenberger: 2001]); they may exhibit a linear structure (the majority), be part of a dialogue or interactive setting [Fawcett, van der Mije and van Wissen: 1988; Cawsey: 1990; Youd and McGlashan: 1992; Fischer, Maier and Stein: 1994; Hagen and Stein: 1996; Maier, Mast and Luperfoy: 1997], or be (partially) organized as hypertext [Reiter, Mellish and Levine: 1992; Stock: 1993; Gruber, Vemuri and Rice: 1995; Milosavljevic, Tulloch and Dale: 1996; Oberlander, Mellish and O'Donnell: 1997; André, Rist and Müller: 1998; de Carolis, de Rosis, Andreoli, Cavallo and De Cicco: 1998]; they may be aimed at audiences differing in expertise, knowledge, interest, or cognitive load [Paris: 1993; Peter and Rösner: 1994; Zuckerman and McConachy: 1995; Wahlster, Jameson, Ndiaye, Schäfer and Weis: 1995; Jameson, Schäfer, Weis, Berthold and Weyrath: 1999]; they may be required in a variety of natural languages [Kittredge, Iordanskaja and Polguère: 1988; Matthiessen, Nanri and Zeng: 1991; Rösner and Stede: 1994; Kittredge: 1995; Bateman, Matthiessen and Zeng: 1999; Coch, Dycker, García-Moya, Gmoser, Stranart and Tardieu: 1999]; they increasingly combine natural language (sentences, paragraphs) with non-linguistic material such as graphs, pictures, or diagrams [Feiner and McKeown: 1990; Wahlster, André, Finkler, Profitlich and Rist: 1993; Fasciano and Lapalme: 1996; Kerpedjiev, Carenini, Green, Moore and Roth: 1998; Bateman, Kamps, Kleinz and Reichenberger: 1998]; and they may even target non-verbal languages such as sign language (e.g., [Veale and Conway: 1994]). Again, for a general view, we need to propose generic specifications that allow us to position and describe any particular case.

Faced with this wide range of goals and contexts, there is considerable diversity of opinion concerning the components and architectures best adopted for building an NLG system. Attempts to provide overall systematicity or standardization in the field are very recent and still of arguable generality; we return to one of them below. The approach taken in this review is to view all NLG systems from the perspective of the range of language phenomena that they deal with: this allows us to address the variability found in a general way without being drawn into less central discussions of implementational or application-driven differences. In particular, wherever information is expressed by sequences of linguistic elements such as `sentences' (however they are realized: i.e., as strings of characters, sign language sequences, or even as composite graphics), there are some fundamental properties that stand as prerequisites for successful communication. Many of these can be subsumed under the general property of textuality: that is, what it is that makes a text a text rather than a sequence of unrelated signs. The component tasks of NLG can then be seen in this light--that is, as attempts to support the basic linguistic phenomena that give rise to textuality. Unless appropriate textuality is achieved in the output generated by an NLG system, that system cannot be considered to be achieving the fundamental goal of producing `natural' language. We therefore consider a linguistically sophisticated view of the product of generation--i.e., a view which describes the textuality of produced text--as already constraining and defining the basic problems that NLG as a field must address.



bateman 2002-09-21