
Which text? Which methods?

As indicated above, the benefits of generating informational texts automatically are now widely appreciated, and systems producing texts on the basis of some set of input parameters are firmly established. But, also as mentioned above, few of the systems actually fielded to date adopt NLG components or techniques. Semi-personalized letters, for example, have been with us for a long time; such personalization ranges from the insertion of the addressee's name in the salutation (``Dear Ms Smith'') to the conditionalized selection of paragraphs depending on some directly specified input parameters (e.g., new customer vs. old customer; first payment of bill vs. reminder letter vs. final demand; etc.). Such uses of `natural language', falling under the `mail merge' rubric mentioned above, are quite inflexible; the language produced is a direct result of some specified condition arising in the application program or provided by a user. And, indeed, as long as the text to be produced by such an application remains fixed, there is little need for the `generation' component to be more sophisticated. However, and crucially for NLG, there is now an increasing number of applications, or at least potential application situations, where it is desirable to produce text exhibiting considerably more flexibility. For these cases, even natural extensions of mail-merge techniques are inherently unsuitable.
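The conditionalized paragraph selection just described can be sketched in a few lines. The customer categories, letter stages and wording below are invented for illustration and are not taken from any real system; the point is only that the output is a direct, rigid function of the input conditions:

```python
# A minimal sketch of 'mail merge'-style text production: each paragraph
# is selected directly by an input condition, with no linguistic
# knowledge involved.  All categories and wording are illustrative.

def merge_letter(name, customer_type, letter_stage):
    salutation = f"Dear {name},"
    if letter_stage == "first_bill":
        body = "Please find enclosed your first bill."
    elif letter_stage == "reminder":
        body = "This is a reminder that your bill remains unpaid."
    else:  # final demand
        body = "This is a final demand for payment."
    if customer_type == "new":
        closing = "Welcome, and thank you for choosing us."
    else:
        closing = "Thank you for your continued custom."
    return "\n".join([salutation, body, closing])

print(merge_letter("Ms Smith", "new", "first_bill"))
```

As long as only these fixed paragraphs are ever needed, this is perfectly adequate; the limitations appear as soon as the texts themselves must vary.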

This need for flexibility can be motivated by at least two factors. First, the information to be presented may be sufficiently complex that a simple re-rendering of its content and structure in `natural language' may produce text that is difficult to understand, even though it may, in some formal sense, be `correct'. An early example of this from the history of NLG was the addition of natural language generation functionality to expert systems (e.g., [Swartout: 1977,Swartout: 1983]). It has been convincingly argued that the recommendations made by an expert system are more `believable' if the system is able to `explain' how it came to its decisions. Producing a straightforward rendition of the internal deduction process responsible for a decision, however, rarely resulted in usable text; system-internal deduction processes do not usually correspond, in detail or in level of abstraction, to those employed in human reasoning. A far more sophisticated approach to explanation is therefore required, with explicit models of the process of human explanation and of the corresponding text structures appropriate for achieving successful explanation. Whenever the data to be expressed in natural language is complex, more sophisticated techniques, such as those pursued in NLG, are going to be necessary.

The second factor arises not necessarily because of complexity in the input, but because of desired complexity in the output. Since this is the more common situation for most of the application systems that currently attempt to render information in natural language, we illustrate it in more detail. Consider the example of data gathered from a network of meteorological stations--generally in numerical form and made up of measurements of windspeed, wind direction, atmospheric pressure, and so on; this `raw data' forms the basis for a range of different kinds of texts, among them: weather forecasts. While it is possible to produce a weather forecast by concatenating a sequence of conditionalized print statements of the form ``It will rain from ?hh:mm at ?location with probability ?p%'', such sequences will never produce a fluent weather report. The reason for this is that a weather report is not just a sequence of unrelated statements; it is, instead, a natural language text and as such it must necessarily exhibit textuality. Natural language achieves this by deploying a variety of linguistic phenomena whose role is precisely to bind texts together into coherent structured wholes: for example, information from different sources is combined into single sentences or groups of sentences, continuation or change of reference is signalled by pronominalization and definite reference, temporal relations and textual structure (e.g., foreground/background) are communicated by the selection of appropriate tenses, and basic grammatical patterns and lexical choices indicate particular perspectives selected according to intended context. In order to produce such texts, a generation program must represent explicit linguistic features of the elements it is using to compose its texts. 
Simple unanalyzed patterns of the ``Dear Ms Smith'' or ``it will rain...'' varieties do not provide access to the points of linguistic variability that need to be controlled, and so much currently produced text appears stereotyped or clumsy.
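The contrast can be sketched concretely. In the fragment below, the data format and all phrasing are invented for illustration: the first rendering concatenates one conditionalized statement per data record, exactly as described above, while the second performs a single hand-coded aggregation step of the kind that an NLG system would orchestrate systematically across a whole text:

```python
# Naive conditionalized-statement output vs. one step of aggregation.
# Data records and phrasing are invented for illustration.

events = [
    {"time": "14:00", "location": "Hamburg", "prob": 80},
    {"time": "16:00", "location": "Hamburg", "prob": 60},
]

# Naive: one isolated sentence per data record -- no textuality.
naive = " ".join(
    f"It will rain from {e['time']} at {e['location']} "
    f"with probability {e['prob']}%." for e in events
)

def aggregate(events):
    """Combine records sharing a location into a single sentence."""
    first, rest = events[0], events[1:]
    text = (f"Rain is expected in {first['location']} from "
            f"{first['time']} ({first['prob']}% probability)")
    for e in rest:
        text += f", easing to {e['prob']}% by {e['time']}"
    return text + "."

print(naive)
print(aggregate(events))
```

Even this single aggregation step is hard-wired here; a generation system must instead decide, on linguistic grounds, when and how such combinations apply across arbitrary data.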

We can make this more explicit with reference to the following simple `text':

Anni Albers is American, and she is a textile designer, a draughtsman and a printmaker. She was born in Berlin on 12 June 1899. She studied art in 1916 - 1919 with Brandenburg. Also, she studied art at the Kunstgewerbeschule in Hamburg in 1919 - 1920 and the Bauhaus at Weimar and Dessau in 1922 - 1925 and 1925 - 1929. In 1933 she settled in the USA. In 1933 - 1949 she taught at Black Mountain College in North Carolina.

This is not a sophisticated text; indeed, it could be improved in many ways. Its sole point of interest for us here is that it was generated fully automatically; that is, the text has no human author. Instead it was produced by an NLG system [Bateman and Teich: 1995] on the basis of information acquired by a prototype information system concerning art history. The NLG system contains information about words and grammar, but no pre-built phrases or templates. Each sentence is fully generated, piecing together the necessary structures to express the intended content. This particular output text was intended as a short biographical note that offers an overview of the education and career of the Bauhaus artist Anni Albers as background to the fact that, at some point, she moved to the U.S. to continue her work. As such, it represents an answer produced automatically in response to a query from a user asking about the role of designated artists in the spread of the Bauhaus movement to the U.S. [Kamps, Hüser, Möhr and Schmidt: 1996]. Since the information system in question had acquired information on a substantial number of artists (around 10,000), and the precise questions raised by a user could call for a range of answers--even, in later experimental work, in several languages--automating the production of these texts was a natural line to follow.
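To make the contrast with mail merge concrete, the kind of structured input such a system generates from might look roughly as follows. This is a hypothetical sketch only, not the actual representation used by the system of [Bateman and Teich: 1995]; even realizing the profession list as a coordinated noun phrase already requires a small piece of linguistic knowledge that a fixed template does not have:

```python
# Hypothetical structured input for a biographical note.  This is NOT
# the representation of any cited system; it only illustrates that the
# input is data, not text.

albers = {
    "name": "Anni Albers",
    "nationality": "American",
    "professions": ["textile designer", "draughtsman", "printmaker"],
    "born": {"place": "Berlin", "date": "12 June 1899"},
}

def professions_np(record):
    """Realize a list of professions as a coordinated noun phrase."""
    ps = record["professions"]
    return ", ".join(ps[:-1]) + " and " + ps[-1]

print(f"{albers['name']} is {albers['nationality']}, and she is a "
      f"{professions_np(albers)}.")
print(f"She was born in {albers['born']['place']} "
      f"on {albers['born']['date']}.")
```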

Given the text alone, however, it is not possible to know which techniques were used by the NLG component in order to produce it. The entire text could have been the result of a single print statement; or the text as a whole may have been fixed with only the names, times and locations being fitted in dynamically. As we shall see again below, there is nowadays considerable interest in presenting flexible versions of information in the context of the World-Wide Web: the use of style sheets and transformations of underlying `presentation-neutral' representations in languages such as the Extensible Markup Language (XML: [Consortium: 1997]) invites precisely this kind of approach to information presentation. But, although the utility and desirability of flexible presentation is now clear, the appropriate techniques for achieving it when natural language is concerned are not. The pasting together of a few sentence fragments can give an impression that `natural language' has been produced dynamically. But it is actually the human readers of the provided text fragments that are doing the work--readers are generally very skillful at providing interpretations of almost any text fragments and so perform `robustly' even with clumsy, unnatural or just deficient `texts'. This `success' can lead to a drastic oversimplification in the conceptualization of the task of producing real natural language texts on the part of application or interface designers; and this in turn leads to a lack of consideration or awareness of the further technologies available for producing texts in more sophisticated ways. 
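The `presentation-neutral representation plus transformation' idea can be sketched with Python's standard XML library. The element names below are invented for illustration; note that each transformation still hard-codes its sentence patterns, which is exactly where the approach stops short of genuine generation:

```python
# Two fixed transformations of one presentation-neutral XML record.
# Element and attribute names are invented for illustration.
import xml.etree.ElementTree as ET

source = """<artist>
  <name>Anni Albers</name>
  <born place="Berlin" date="12 June 1899"/>
</artist>"""

root = ET.fromstring(source)

def render_prose(a):
    """Full-sentence rendering of the record."""
    born = a.find("born")
    return (f"{a.findtext('name')} was born in "
            f"{born.get('place')} on {born.get('date')}.")

def render_note(a):
    """Telegraphic note-form rendering of the same record."""
    born = a.find("born")
    return (f"{a.findtext('name')}. Born in "
            f"{born.get('place')} {born.get('date')}.")

print(render_prose(root))
print(render_note(root))
```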
There are several significant drawbacks to this apparent `success'; to mention just two: first, when the onus of interpreting non-generated text fragments is placed on the reader, it also becomes more likely that those interpretations differ from those intended by the designer, leading to a general sense that the information offered is confusing or not maximally `relevant'; second, as an application grows, or as the users of an application become more diverse, more varied language behavior can be required--and in this case the technological basis provided by simple text production methods does not provide a suitable foundation for scaling up.

We can only get a sense of the actual techniques used in the construction of a generation system, and of their generality and flexibility, by considering the range of texts that a given generation component can, or will need to, produce. The techniques to be employed for producing text therefore need to be matched and evaluated against the linguistic diversity that is required. And the necessary linguistic diversity can itself only be evaluated against the targeted users of the texts to be produced. The step of explicitly considering the role and form of the language to be produced by an application is still very often overlooked, or handled intuitively or impressionistically, according to the mistaken assumption that `anyone' can understand language and so is automatically in a position to evaluate an application's textual offerings. This naive approach to language leaves much to be desired; but it will no doubt be tempered as more comparative evaluations become available of systems that deploy language in sophisticated ways and systems that do not.

Returning to our Bauhaus example, we now consider the following text, which was produced automatically from the same set of data by the same generation system with the same lexicon and grammar:

Anni Albers. American textile designer, draughtsman and printmaker. Born in Berlin 12 June 1899. 1916 - 1919 studied art with Brandenburg. 1919 - 1920 studied art at the Kunstgewerbeschule in Hamburg. 1922 - 1925 and 1925 - 1929 studied at the Bauhaus at Weimar and Dessau. Albers settled in the USA in 1933. 1933 - 1949 taught at Black Mountain College in North Carolina.

The intended purpose of this text is slightly different from that of the previous example. Here we are producing a biographical sketch in note form; it might be more appropriately laid out as a table, or as additional information to a diagram or `info-graphic'. This difference in function is most appropriately accompanied by linguistic variation. Thus, in this text we see a range of linguistic variation when compared with the first text example: elements are in different places, the connections between sentences are different, and so on. Whereas the production of an alternative text in response to differing communicative goals and user requirements would represent a major overhead for a non-NLG-based text production system, this fine-grained matching of text and function is precisely what an NLG system undertakes to provide. The variations shown in this second text could be produced by some transformation of the first version, but it should be clear that making such a transformation function reliably for a wide range of artists with differing amounts and kinds of information held about them would be an endeavor requiring careful attention.

When we change the intended function slightly again and go on to consider a further variant, we see another different set of `transformations' again:

Anni Albers, the American textile designer, draughtsman and printmaker who was born in Berlin on the 12 June 1899, taught at Black Mountain College in North Carolina from 1933 until 1949 after settling in the USA in 1933. Previously, she had studied art with Brandenburg in 1916 - 1919, at the Kunstgewerbeschule in Hamburg in 1919 - 1920 and at the Bauhaus at Weimar and Dessau in 1922 - 1925 and 1925 - 1929.

Any one of the three example texts could probably be constructed (although not particularly easily) with carefully handcrafted transformations of some `underlying' information, but this approach does not naturally lend itself to the creation of the alternatives. For example, a set of templates providing a chronological biography as in our Albers texts above will not support the generation of a comparison of the places where two artists studied and taught should a user ask that they be contrasted (for a generation system providing such explicit contrasts and a discussion of the techniques employed, see [Milosavljevic and Dale: 1996 a]); neither will a set of templates for a public weather report provide a suitable text for the generation of a marine forecast. Even less will a set of templates explaining some problem in language appropriate for a computer novice be appropriate for describing the same problem to a software engineer. Hence, when variable text production is an aim, it is particularly important that system builders become aware of the more sophisticated NLG techniques that are available. And, as many of these dimensions of variability in fact correspond precisely to directions currently being investigated in the pursuit of message/text `personalization', the requirement that text be produced flexibly is by no means an unlikely one.
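Why per-artist templates break down on a comparison request can be sketched as follows. The record format, the second (entirely fictional) artist and all phrasing are invented for illustration; the essential point is that a contrast text must aggregate over two data records at once, which no single-record template anticipates:

```python
# Hypothetical sketch of comparison generation over two records.
# Record format, the fictional second artist and phrasing are invented.

def studied_at(record):
    """Collect the institutions at which an artist studied."""
    return {e["at"] for e in record["events"] if e["type"] == "study"}

def contrast(a, b):
    """Generate a one-sentence contrast of where two artists studied."""
    shared = studied_at(a) & studied_at(b)
    if shared:
        return (f"Both {a['name']} and {b['name']} studied at "
                f"{sorted(shared)[0]}.")
    return (f"Unlike {b['name']}, {a['name']} studied at "
            f"{sorted(studied_at(a))[0]}.")

albers = {"name": "Anni Albers",
          "events": [{"type": "study", "at": "the Bauhaus"}]}
doe = {"name": "Jane Doe",  # fictional second artist
       "events": [{"type": "study", "at": "the Slade School"}]}

print(contrast(albers, doe))
```

Even this toy version must decide between `both' and `unlike' framings on the basis of the combined data--a decision that lies entirely outside any per-artist template.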

The range of variation shown in our three example texts is a very small selection from the linguistic variations possible. This is, however, precisely the kind of variation that is targeted by full NLG systems; each of the texts consists equally of grammatically and semantically correct sentences arranged so as to exhibit textuality. The variations are therefore by no means random; they are orchestrated so as to establish the textuality of the resulting product; below we will contrast this situation with a `text' where this orchestration has not taken place. In much of NLG work, then, the point of interest has accordingly moved away from what kinds of variation are possible--large-scale generation systems should in any case be able to produce the grammatical variation exhibited here--and towards a close consideration of when and why any one variation is to be selected rather than another. Generation is fundamentally a matter of choice, and of uncovering the reasons why one choice may be better than another in any particular context of use.

This establishes a good criterion for the evaluation of candidate techniques for particular applications that may require natural language output: the more flexibility demanded of the texts for a particular application, the more general linguistic knowledge is going to be required in order to get from the specified `input' to the range of desired linguistic products. Conversely, the more variability exhibited in the texts to be generated, the less likely it is that simple pattern-filling is going to be a useful strategy and the more motivation there is for turning to NLG techniques proper.


bateman 2002-09-21