
BUILDING AN SPL FOR GENERATION

An introductory guide by Juan Rafael Zamorano Mansilla
May 2003

What’s an SPL?

An SPL is the form of non-linguistic input fed into KPML in order to obtain a sentence in a natural language as output.
In a more general way we can say that an SPL is the semantic specification of a sentence. 'SPL' stands for Sentence Planning Language, and was originally devised by Bob Kasper for the Penman text generation system; nowadays we commonly talk of an SPL or some SPLs to refer to the specifications themselves.

Writing an SPL

When testing a grammar or a generator, an SPL is normally placed inside an example; this is the standard place where an SPL occurs for generation with KPML. The example specification shown below generates the sentence "Men are not happy."
(EXAMPLE
  :NAME           A-G1
  :GENERATEDFORM  "Men are not happy."
  :GLOSS          ( :ENGLISH "Men are not happy.")
  :TARGETFORM     "Men are not happy."
  :LOGICALFORM
    (L / PROPERTY-ASCRIPTION
       :DOMAIN (M / PERSON :LEX man :number plural )
       :RANGE (C / QUALITY :LEX happy )
       :TENSE present
       :POLARITY negative
       :LEX be )
  :SET-NAME       ADJECTIVAL-GROUP
)
The words :NAME, :GENERATEDFORM, :GLOSS, :TARGETFORM, :LOGICALFORM and :SET-NAME are slots that contain information required by the generator. The information that is inserted after each slot is explained in the following sections. The SPL proper is the expression found under the :LOGICALFORM slot. A program that is itself driving generation might just produce SPLs without worrying about example structures; if you are writing the SPLs yourself, it is usually more advisable to wrap them up as examples so that they have names. KPML also provides extensive tools for working with sets of examples for testing grammars and exploring linguistic resources: all of the information in the Grammar Generation Bank is created from sets of examples.

:NAME

Here you must write the name of the sentence ("A-G1" in the example). This name can be a number or a longer piece of text, but it must not contain spaces. Many sentences are usually saved together in one example set, so it is important that they have different names. If two sentences have the same name, one of them will be discarded by the program.
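The two naming rules above (no spaces, no repeated names) can be checked before an example set is loaded. The following is only a minimal sketch in Python, not part of KPML: the function name check_example_names is hypothetical, and comparing names without regard to case is an assumption made here to stay on the safe side.

```python
from collections import Counter


def check_example_names(names):
    """Return a list of problems found in a collection of :NAME values."""
    problems = []
    for name in names:
        # A name must be a single token: no spaces or other whitespace.
        if not name or any(ch.isspace() for ch in name):
            problems.append(
                f"invalid name (empty or contains whitespace): {name!r}"
            )
    # Assumption: names are compared case-insensitively, so that
    # "a-g1" and "A-G1" are reported as a clash.
    counts = Counter(name.upper() for name in names)
    for name, count in sorted(counts.items()):
        if count > 1:
            problems.append(f"duplicate name: {name} ({count} occurrences)")
    return problems


if __name__ == "__main__":
    for problem in check_example_names(["A-G1", "A-G2", "a-g1", "my sentence"]):
        print(problem)
```

An empty result means that no name in the collection breaks either rule.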

:GENERATEDFORM

This slot holds the output sentence generated by the program. Filling it in is not obligatory, and leaving a space between inverted commas (" ") will not affect the generation. If you save an example set after generating all the sentences, the program will automatically put the generated result here.

:GLOSS

This is an English translation of the sentence you want to generate. Again, leaving this slot empty will not affect the generation process, but other users will appreciate a translation, particularly if you are generating in a language they can’t speak!

:TARGETFORM

This slot contains the sentence you intend to generate. It differs from :GENERATEDFORM in that the program uses what you include here to evaluate how accurate the result is. If the generated sentence does not match what you include in :TARGETFORM, KPML will notice and indicate it by displaying the sentence in red (instead of green).
It is also important to fill in this slot because what you write here is presented in the menu where users select the sentence they want to generate from among all the sentences in the same example set.

:LOGICALFORM

This is where the semantics of the sentence is specified. We begin by specifying what type of process we wish to generate. In the example presented above we had the following SPL specification:
 
(L / PROPERTY-ASCRIPTION
   :DOMAIN (M / PERSON :LEX man :number plural )
   :RANGE (C / QUALITY :LEX happy )
   :TENSE present
   :POLARITY negative
   :LEX be )
 
The process type can be seen in the first line: here we have a relational process, in which a property is ascribed to an entity. KPML knows this kind of process as PROPERTY-ASCRIPTION. To find out which concepts and keywords refer to the other process types, consult the Semantics Guide or, once that is clear, see the many examples maintained in the Generation Bank.
We also need to give this process a name in case we need to refer back to it later. In the example the process has the name "L", but it could be something else of your choice, like "BE", "VERB-1", etc. The name and the type of process are separated by a slash in this way:
 
[NAME] / [PROCESS TYPE]

Examples

help / material-process
process-1 / verbal-process
verb-a / mental-process
Once the process type is defined, KPML needs some information about the participants that are involved as well as some grammatical information about the sentence. Each participant is introduced by a keyword that begins with a colon, like this:
:DOMAIN  (M / PERSON :LEX man :NUMBER plural )
:RANGE  (C / QUALITY :LEX happy )
In the example above there are two participants: DOMAIN and RANGE. To learn which participants are inherent in each process type and how they are referred to in KPML, consult the Semantics Guide or the Generation Bank.
After each participant we usually want to give more information about it; we include this information within its brackets. We start by giving the participant a name. In the example, DOMAIN gets the name "M", whereas RANGE gets the name "C", but you are free to choose the names you prefer. After that, we insert a slash and then specify semantic information about the participant. In the example, DOMAIN is a person, so we write PERSON. RANGE, on the other hand, is a quality, so we write QUALITY.
:DOMAIN (M / PERSON :LEX man :NUMBER plural )
:RANGE (C / QUALITY :LEX happy )
For a list of the possible choices here, consult the Generation Bank. In general terms, though, it suffices to say that participants are either PERSON or THING. If you want to be more precise, you can specify whether the person is MALE or FEMALE. If, on the other hand, you want to be less precise, you can write CONSCIOUS-BEING. Examples:
(participant-1 / PERSON :LEX people )
(poli / MALE :LEX policeman )
(a / CONSCIOUS-BEING  :LEX whale )
THING is what you use for all other entities. If you want to be more precise, you can also write NONCONSCIOUS-OBJECT.
(war / THING :LEX war )
(M / NONCONSCIOUS-OBJECT :LEX table )
Qualities always receive the term QUALITY. Circumstances, on the other hand, usually involve complex semantics, so it is better to consult the Semantics Guide or the Generation Bank, where you will find examples of all the circumstance types available in KPML.
Once this has been done, we can include more specific information if we want. In principle only the name and the type (M / PERSON) are obligatory, but with the keyword :LEX we can say exactly which word will realise the participant.
:DOMAIN (M / PERSON :LEX man :number plural )
:RANGE (C / QUALITY :LEX happy )
If we don’t include :LEX, the program will select any lexical item from its lexicon that fits the semantics of the participant (PERSON, QUALITY, etc.). Other things such as number (:number plural), determination, etc., can also be specified. The best way to learn what else you can specify is to study the Generation Bank, where you will find sentences exemplifying a good variety of constructions.
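Once the participants have been specified, it is useful to say something about the properties of the sentence as a whole. In the example these are the tense (:TENSE present) and the polarity (:POLARITY negative), while :LEX be names the verb that realises the process.
To learn how to express other sentence properties, consult the Semantics Guide.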
Finally, you may also want to include circumstantials in your sentence, such as time locatives, spatial locatives, etc. For instance, we can change the philosophical message of the example above into something more mundane by adding a time locative: "Men are not happy on Mondays." An example containing the appropriate SPL would look like this:
(EXAMPLE
  :NAME           A-G1
  :GENERATEDFORM  "Men are not happy on Mondays."
  :GLOSS          ( :ENGLISH "Men are not happy on Mondays.")
  :TARGETFORM     "Men are not happy on Mondays."
  :LOGICALFORM
    (L / PROPERTY-ASCRIPTION
       :DOMAIN (M / PERSON :LEX man :number plural )
       :RANGE (C / QUALITY :LEX happy )
       :TEMPORAL-LOCATING (D / ONE-OR-TWO-D-TIME :LEX monday :number plural )
       :TENSE present
       :POLARITY negative
       :LEX be )
  :SET-NAME       ADJECTIVAL-GROUP
)
It's easy to see that circumstantials are specified following the same principles as participants. You only need to know the keyword that introduces them (here :TEMPORAL-LOCATING) and their semantic type (here ONE-OR-TWO-D-TIME). You will find a list of them along with their realizations in the Semantics Guide.
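Because processes, participants and circumstantials all share the same shape (a name, a slash, a semantic type, then keyword and value pairs), a program that produces SPLs can build them from one nested data structure. The following Python sketch is only an illustration under that assumption: render_inline, render_spl and the tuple layout are hypothetical helpers, not part of KPML, and the data simply restates the "Mondays" example.

```python
def render_inline(term):
    """Render a (name, type, features) tuple as a one-line SPL term."""
    name, semantic_type, features = term
    parts = [f"{name} / {semantic_type}"]
    for keyword, value in features:
        # A value is either a plain word or another nested term.
        rendered = render_inline(value) if isinstance(value, tuple) else value
        parts.append(f":{keyword} {rendered}")
    return "(" + " ".join(parts) + ")"


def render_spl(term):
    """Render the top-level term with one keyword per line."""
    name, semantic_type, features = term
    lines = [f"({name} / {semantic_type}"]
    for keyword, value in features:
        rendered = render_inline(value) if isinstance(value, tuple) else value
        lines.append(f"   :{keyword} {rendered}")
    return "\n".join(lines) + ")"


# The SPL for "Men are not happy on Mondays." as nested data.
MONDAYS = (
    "L", "PROPERTY-ASCRIPTION",
    [
        ("DOMAIN", ("M", "PERSON", [("LEX", "man"), ("number", "plural")])),
        ("RANGE", ("C", "QUALITY", [("LEX", "happy")])),
        ("TEMPORAL-LOCATING",
         ("D", "ONE-OR-TWO-D-TIME", [("LEX", "monday"), ("number", "plural")])),
        ("TENSE", "present"),
        ("POLARITY", "negative"),
        ("LEX", "be"),
    ],
)

if __name__ == "__main__":
    print(render_spl(MONDAYS))
```

Keeping the features as an ordered list of pairs preserves the order in which the keywords are written; the sketch does not check that a keyword or semantic type is one that KPML actually knows.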

:SET-NAME

SPLs are stored in files called Example Sets. Each Example Set must have a name the program can recognise, and this name is given under :SET-NAME. In the example presented above the sentence belongs to a set called ADJECTIVAL-GROUP. Of course you are free to name your sets as you please.
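Putting the slots together, a small script can wrap an SPL in an example form like the ones shown in this guide. The Python sketch below is an assumption-laden illustration, not a KPML tool: make_example and lisp_string are hypothetical names, the slot order simply copies the examples above, and whether an Example Set file needs anything beyond such forms is not covered here.

```python
def lisp_string(text):
    """Quote text as a Lisp string, escaping backslashes and double quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def make_example(name, target, gloss, spl, set_name, generated=" "):
    """Return the text of an (EXAMPLE ...) form for one sentence.

    `generated` defaults to a single space, which this guide says
    does not affect generation.
    """
    if not name or any(ch.isspace() for ch in name):
        raise ValueError("example name must be non-empty and contain no spaces")
    # Indent the SPL so that it sits visibly under :LOGICALFORM.
    spl_lines = "\n".join("    " + line for line in spl.strip().splitlines())
    return "\n".join([
        "(EXAMPLE",
        f"  :NAME {name}",
        f"  :GENERATEDFORM {lisp_string(generated)}",
        f"  :GLOSS (:ENGLISH {lisp_string(gloss)})",
        f"  :TARGETFORM {lisp_string(target)}",
        "  :LOGICALFORM",
        spl_lines,
        f"  :SET-NAME {set_name}",
        ")",
    ])


if __name__ == "__main__":
    spl = """(L / PROPERTY-ASCRIPTION
   :DOMAIN (M / PERSON :LEX man :number plural)
   :RANGE (C / QUALITY :LEX happy)
   :TENSE present
   :POLARITY negative
   :LEX be)"""
    print(make_example("A-G1", "Men are not happy.", "Men are not happy.",
                       spl, "ADJECTIVAL-GROUP"))
```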
