
Registerial and discourse semantic consistency

The appropriate control of a lexicogrammar requires many additional constraints over and above the propositional content to be expressed. Providing such information enables broad coverage lexicogrammars to produce textually appropriate lexicogrammatical structures. The next--and many researchers in NLG would consider the main--task in NLG is then to guarantee that sequences of such constraint specifications are produced in such a way that they are mutually consistent and together combine to create recognizable text. Thus, as always in NLG, it is not sufficient that a full NLG system offers the means for controlling lexicogrammars to produce textually varied constructions; such a system must also be able to select from the theoretically available possibilities those that are most appropriate for the instantial text being produced. Although early attempts to organize texts relied on the intrinsic organization of the knowledge being expressed, this is inappropriately rigid and precludes effective presentation for differing purposes. More flexible presentation methods are essential.

Linguistically, we can say that any text reveals many `predispositions' of the speaker that pervasively influence both the information selected for expression and its form of expression. This consistency in selected semantic options is a common issue across each of the shades of meaning introduced above: ideational, interpersonal, and textual selections all exhibit a unity that is characteristic of textuality. One way of specifying this consistency more precisely in theoretical terms has been suggested by Jacobs [Jacobs: 1987]. This builds on a similarity between such predispositions and the systemic-functional notion of register [Halliday: 1978], which can be used for further structuring the range of predispositions that are responsible for the kinds of selections illustrated above. Essentially the theory of register states that language varies systematically according to its situation of use: consistency in situation therefore calls for consistency in the take-up of semantic options. This also provides a theoretical interpretation of those NLG systems which do not support variations in the predispositions that their texts exhibit: they are single-register systems, creating texts that are (at best) appropriate for a single situation of use. Register may be considered as a further, more abstract stratum of the linguistic system. This permits statements of communicative intent to be made that are even further removed from particular linguistic realization, thereby supporting more flexibility of expression--such as, for example, tailoring texts according to varying degrees of expertise.
Early attempts to build registerial modelling explicitly into the generation process include Bateman and Paris [Bateman and Paris: 1989] and Cross [Cross: 1992]; this has now also been carried out in larger-scale systems, such as in the `Agile' system for Computer-Aided Design instruction texts in Russian, Czech and Bulgarian [Kruijff, Teich, Bateman, Kruijff-Korbayová, Skoumalová, Sharoff, Sokolova, Hartley, Staykova and Hana: 2000].

Moving beyond the descriptions of semantics (even enriched in the ways described above) allows a system to motivate orientations in its deployment of semantic resources that would be difficult to motivate from within the semantics itself. For example, in the area of ideational meaning, the very predicates that are selected for expression do not vary randomly from one text-contribution to the next: consistency in `conceptualizations' and their granularity is maintained or developed across the life of a text. In a technical field, technical predicates may be selected whereas, when describing the same state of affairs to a novice or in a non-technical situation, everyday predicates may be selected. Moreover, the allocation of `content' to semantic configurations is not predetermined and is equally subject to control: that is, different linguistic units can include more or less of the content. Some consequences of this for sentences appropriate to our Bauhaus texts are illustrated in the following examples.

(k) She studied art with Brandenburg in 1916--1919, and at the Kunstgewerbeschule in Hamburg in 1919--1920, the Bauhaus in Weimar in 1922--1925 and the Bauhaus in Dessau in 1925--1929.
(l) She studied art with Brandenburg. That was in 1916--1919.
(m) Albers studied with Brandenburg in 1916--1919. She studied at the Kunstgewerbeschule in Hamburg in 1919--1920. She studied at the Bauhaus in 1922--1925. ...
(n) Anni Albers was a textile designer, draftsman, and printmaker, who was born in Berlin on 12 June 1899.

Here, sentence (k) includes both the original sentence and further content closely related to that of the original, while sentence (l) divides the original content over two sentences and uses `discourse deixis' (cf. [Webber: 1988, Martin: 1992]) to bind the sentences together into a text fragment. Sentence (m) then goes to the extreme and separates out the events grouped together in (k): each studying period is realized in a separate sentence. Sentence (n) shows a similar decision, where the first two sentences of the example text have been combined into one. The phenomenon at issue in all these sentences is termed aggregation in NLG, and detailed rules and heuristics for controlling this grouping have been suggested by a number of researchers (cf., e.g., [Gabriel: 1988, Horacek: 1990, McKeown, Robin and Kukich: 1995, Dalianis and Hovy: 1996, Shaw: 1998, Bateman: 1999]). A similar phenomenon applies at all linguistic levels: sentences (k)-(m) could also be expressed as (o), while the first sentence of the first example text above would read better as (p), in line with the similar variations found in the latter two example texts.

(o) Anni Albers was an art student from 1916 to 1929.
(p) Anni Albers is an American textile designer, draftsman and printmaker.

In (o) many events have been combined and covered in a single property attribution: being an art student; similarly in (p) the nationality and professions have been combined in a single complex property. The decisions for these presentational forms depend again on the particular granularity goals of the text being produced, which in turn depend on the situation and hearers/readers involved.
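The effect of a granularity goal on aggregation can be made concrete with a small sketch. The following Python fragment is only an illustration under simplifying assumptions, not a reconstruction of any of the cited aggregation algorithms: the class `StudyEvent`, the function `realize` and the three granularity labels are hypothetical names introduced here, and surface realization is reduced to string templates. It produces output comparable to (m), (k) and (o) from one and the same list of events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudyEvent:
    """One period of study (hypothetical representation for this sketch)."""
    setting: str  # prepositional phrase naming the teacher or institution
    start: int
    end: int


EVENTS = [
    StudyEvent("with Brandenburg", 1916, 1919),
    StudyEvent("at the Kunstgewerbeschule in Hamburg", 1919, 1920),
    StudyEvent("at the Bauhaus in Weimar", 1922, 1925),
    StudyEvent("at the Bauhaus in Dessau", 1925, 1929),
]


def join_phrases(phrases):
    """Coordinate phrases as 'a, b and c'."""
    if len(phrases) == 1:
        return phrases[0]
    return ", ".join(phrases[:-1]) + " and " + phrases[-1]


def realize(agent, events, granularity):
    """Return a list of sentences for the events at the requested granularity.

    'fine'   -> one sentence per event, as in (m)
    'medium' -> one coordinated sentence, as in (k)
    'coarse' -> a single property attribution, as in (o)
    """
    if granularity == "fine":
        return [f"{agent} studied art {e.setting} in {e.start}-{e.end}."
                for e in events]
    if granularity == "medium":
        phrases = [f"{e.setting} in {e.start}-{e.end}" for e in events]
        return [f"{agent} studied art {join_phrases(phrases)}."]
    if granularity == "coarse":
        first = min(e.start for e in events)
        last = max(e.end for e in events)
        return [f"{agent} was an art student from {first} to {last}."]
    raise ValueError(f"unknown granularity: {granularity}")


for level in ("fine", "medium", "coarse"):
    print(level, realize("Anni Albers", EVENTS, level))
```

In a fuller system the granularity value would not be passed in by hand but derived from the situation and the intended readers, which is exactly the dependency described above.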

The consequences of interpersonal positionings on the part of a speaker/writer also go well beyond grammatical and lexical selections. An interpersonal slant can be the motivation for including or not including some information in a text at all. The most extensive treatment of such interpersonal variation in a generation system is the system PAULINE built by Hovy [Hovy: 1988a]; PAULINE can generate short texts describing states of affairs from various types of interpersonally classified communicative situations, including: `partiality' (i.e., biasedness during argumentation and description), `formality', etc. A contrasting pair of examples generated by PAULINE is shown in Figure 3.


AS A REMINDER TO YALE UNIVERSITY TO DIVEST FROM COMPANIES DOING
BUSINESS IN SOUTH AFRICA, A LARGE NUMBER OF CONCERNED STUDENTS ERECTED A
SHANTYTOWN --- NAMED WINNIE MANDELA CITY --- ON BEINECKE PLAZA IN EARLY
APRIL. THE LOCAL COMMUNITY EXPRESSED SUPPORT FOR THE STUDENTS' ACTION.
THE UNIVERSITY TOLD THE STUDENTS TO ERECT THE SHANTYTOWN ELSEWHERE.
LATER, AT 5:30 AM ON APRIL 14, THE SHANTYTOWN WAS DESTROYED BY OFFICIALS;
ALSO, AT THAT TIME, THE POLICE ARRESTED 76 STUDENTS. THE STUDENTS
REQUESTED THAT YALE UNIVERSITY GIVE THEM PERMISSION TO REASSEMBLE IT ON
BEINECKE PLAZA; ALSO, AT THAT TIME, SEVERAL LOCAL POLITICIANS AND FACULTY
MEMBERS EXPRESSED CRITICISM OF YALE'S ACTIONS. FINALLY, YALE PERMITTED
THEM TO RECONSTRUCT THE SHANTYTOWN AND, CONCURRENTLY, THE UNIVERSITY
ANNOUNCED THAT A COMMISSION WOULD GO TO SOUTH AFRICA TO EXAMINE THE
SYSTEM OF APARTHEID IN JULY.


Time: much, Depth of acquaintance: strangers
Tone of interaction: formal, Goal to affect hearer's opinions: switch
Speaker's opinions: for protesters

IN EARLY APRIL, A SMALL NUMBER OF STUDENTS WERE INVOLVED IN A
CONFRONTATION WITH YALE UNIVERSITY OVER YALE'S INVESTMENT IN COMPANIES
DOING BUSINESS IN SOUTH AFRICA. THE STUDENTS CONSTRUCTED A SHANTYTOWN ---
NAMED WINNIE MANDELA CITY --- ON BEINECKE PLAZA IN ORDER TO FORCE THE
UNIVERSITY TO DIVEST FROM THOSE COMPANIES. YALE REQUESTED THAT THE
STUDENTS ERECT IT ELSEWHERE, BUT THEY REFUSED TO LEAVE. THE UNIVERSITY
INTENDED TO BE REASONABLE. THE UNIVERSITY GAVE IT PERMISSION TO EXIST
UNTIL THE MEETING OF THE YALE CORPORATION, BUT EVEN AFTER THAT THE
STUDENTS STILL REFUSED TO MOVE. AT 5:30 AM ON APRIL 14, OFFICIALS HAD TO
DISASSEMBLE THE SHANTYTOWN. FINALLY, YALE, BEING CONCILIATORY TOWARD THE
STUDENTS, NOT ONLY PERMITTED THEM TO RECONSTRUCT IT, BUT ALSO ANNOUNCED
THAT A COMMISSION WOULD GO TO SOUTH AFRICA IN JULY TO EXAMINE THE SYSTEM
OF APARTHEID.

Time: much, Depth of acquaintance: strangers
Tone of interaction: formal, Goal to affect hearer's opinions: switch
Speaker's opinions: for university

Figure 3: Interpersonally contrasting texts generated fully automatically by PAULINE from a single underlying representation of the situation (Hovy, 1988)
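The way in which an interpersonal setting can decide whether information is included at all may be sketched as follows. This is a deliberately minimal illustration and makes no claim to reflect how PAULINE is actually implemented: the classes `Situation` and `Fact`, the function `select_content` and the `favours` labels attached to each fact are assumptions made for this sketch. Only the setting names are taken from the labels shown with the two texts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Situation:
    """Hypothetical bundle of interpersonal settings, named after the figure labels."""
    time: str
    acquaintance: str
    tone: str
    goal: str
    sympathy: str  # "protesters" or "university"


@dataclass(frozen=True)
class Fact:
    text: str
    favours: str  # "protesters", "university" or "neutral" (assumed labels)


FACTS = [
    Fact("Students erected a shantytown on Beinecke Plaza in early April.",
         "neutral"),
    Fact("The local community expressed support for the students' action.",
         "protesters"),
    Fact("The university gave the shantytown permission to exist until "
         "the meeting of the Yale Corporation.", "university"),
    Fact("Officials took the shantytown down at 5:30 am on April 14.",
         "neutral"),
]


def select_content(facts, situation):
    """Keep neutral facts and those that favour the side the speaker supports."""
    return [f.text for f in facts
            if f.favours in ("neutral", situation.sympathy)]


pro_protesters = Situation("much", "strangers", "formal", "switch", "protesters")
pro_university = Situation("much", "strangers", "formal", "switch", "university")

for situation in (pro_protesters, pro_university):
    print(situation.sympathy, select_content(FACTS, situation))
```

The two situations differ only in the speaker's sympathy, yet they select different content from the same underlying facts. A realistic system would let the same setting also influence wording and sentence organization, as the two texts in the figure show.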

Finally, it is equally necessary to enforce consistent selections of textual possibilities over sequences of semantic specifications. Here we are back to the question as to what kind of control of linguistic variation is necessary; being able to produce grammatically varied sentences is not enough--a generation system must also know how to orchestrate the decisions that are, in theory, possible. This is demonstrated in the following `pseudo-text', which shows what can happen if appropriate control of the resources of a lexicogrammar is not provided.

It was Anni Albers that is American, and a textile designer, a draughtsman and a printmaker she is. Also, in Berlin she was born on 12 June 1899. Also, art was studied by Anni Albers in 1916 - 1919 with Brandenburg. It was at the Kunstgewerbeschule in Hamburg in 1919 - 1920 and the Bauhaus at Weimar and Dessau in 1922 - 1925 and 1925 - 1929 that art was studied by Anni Albers. In 1933 - 1949 that person was teaching at Black Mountain College in North Carolina. In the USA she settled in 1933.

Each sentence is grammatically correct but the individual choices made in each sentence do not combine to form a text: in short, `textuality' has not been achieved. It is textually inappropriate, for example, to vary the theme choice randomly (although just how inappropriate depends on text type); there must therefore be a plan for which semantic entities are going to be made thematic. Similarly, one cannot freely vary the anaphoric status of elements; even some word selections are textually motivated: e.g., the anaphoric use of general nouns such as `animal', `person', `creature'; this shows again that word selection cannot be decoupled from the general meaning constraints holding, and these constraints include interpersonal and textual meanings in addition to ideational. The tenses selected may also not vary randomly between generally `past' forms such as ``studied'', ``was studying'', ``had been studying'', etc. Neither, however, is it a `safe' strategy always to pick the most `neutral' form; this results equally in disfluent, unnatural texts that are less communicatively effective. Hearers and readers strongly expect both that their texts will unfold the information they are presenting in particular ways and that the lexicogrammatical forms selected will indicate precisely those textual developments intended; when this does not occur, the text is judged as difficult, unhelpful, misleading or clumsy.
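A very small sketch can show what it means to plan such textual choices once for a text rather than sentence by sentence. The names `TextPlan`, `Clause` and `realize_text` are hypothetical, and the policy used (one constant theme, full name on first mention and pronoun afterwards, one tense throughout) is just one simple assumption chosen for illustration. Real texts need thematic development and tense shifts that are motivated by the discourse structure.

```python
from dataclasses import dataclass


@dataclass
class Clause:
    """A clause whose subject is the text's theme (hypothetical representation)."""
    verb_forms: dict  # tense label -> verb form
    rest: str


@dataclass
class TextPlan:
    """Textual decisions fixed once for the whole text."""
    theme_name: str
    theme_pronoun: str
    tense: str


def realize_text(plan, clauses):
    """Realize all clauses with the same theme and tense.

    The theme is named in full on first mention and pronominalized afterwards,
    so its anaphoric status develops in a controlled way instead of at random.
    """
    sentences = []
    for index, clause in enumerate(clauses):
        subject = plan.theme_name if index == 0 else plan.theme_pronoun
        verb = clause.verb_forms[plan.tense]
        sentences.append(f"{subject} {verb} {clause.rest}.")
    return " ".join(sentences)


CLAUSES = [
    Clause({"simple_past": "was born", "past_perfect": "had been born"},
           "in Berlin on 12 June 1899"),
    Clause({"simple_past": "studied", "past_perfect": "had studied"},
           "art with Brandenburg in 1916-1919"),
    Clause({"simple_past": "settled", "past_perfect": "had settled"},
           "in the USA in 1933"),
]

plan = TextPlan(theme_name="Anni Albers", theme_pronoun="She", tense="simple_past")
print(realize_text(plan, CLAUSES))
```

The point of the design is that theme, reference form and tense are read from a single plan, so they cannot drift independently from sentence to sentence as they do in the pseudo-text.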

Since higher level (upstream) meanings expressing the textual development of texts are such a crucial component of textuality and hence of NLG, their control has been subject to more investigation than that of ideational and interpersonal meanings and there is now an extensive body of work in the area. Here it is usual to appeal to notions of `discourse structure' and the account most commonly used is Rhetorical Structure Theory (RST: [Mann and Thompson: 1988]). We give more details of RST below in our discussion of Techniques.


bateman 2002-09-21