
Text planning

 

Natural language systems need to generate not merely text, but coherent text. It is not enough to decide on a collection of facts and string a description of those facts together. The facts must be organized so as to signal the causal, logical and intentional relationships between them. Often, these relationships should even be explicitly indicated in the text, using special connectives. [Hov88] provides a convincing example of the importance of presenting information linked together in the right order and with the right connectives. He observes that the following discourse is difficult to understand:

The system performs the enhancement. Before that, the system resolves conflicts. First, the system asks the user to tell it the characteristic of the program to be enhanced. The system applies transformations to the program. It confirms the enhancement with the user. It scans the program in order to find opportunities to apply transformations to the program.

By contrast, the following discourse, which conveys exactly the same propositions, is relatively clear:

The system asks the user to tell it the characteristic of the program to be enhanced. Then the system applies transformations to the program. In particular, the system scans the program in order to find opportunities to apply transformations to the program. Then the system resolves conflicts. It confirms the enhancement with the user. Finally, it performs the enhancement.

Researchers have argued that the difference between examples such as these arises because natural discourses have an intentional structure [GS86]. Discourses can be broken up into nested blocks of contiguous material, called segments, on the basis of how the material contributes to the speaker's (or writer's) plans for presenting information. Each segment is associated with a discourse segment purpose, which describes this contribution and which the speaker expects the hearer to recognize as part of understanding the discourse. Cue words, like finally or in particular above, facilitate this recognition by making explicit the argumentative relationships between segments: finally marks the concluding step in an argument, and in particular introduces a more detailed description of a previously mentioned process or generalization [Coh87].
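
The nesting of segments and the role of cue words can be made concrete with a small data structure. The following is only a minimal sketch under stated assumptions: the class name Segment, the purpose labels and the function linearize are hypothetical and are not taken from [GS86] or [Coh87], and the sentences are shortened paraphrases of the example above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Segment:
    """A discourse segment: a purpose, an optional cue word, and nested content."""
    purpose: str                     # hypothetical label for the segment purpose
    cue: Optional[str] = None        # cue word that signals the segment's role
    parts: List[Union[str, "Segment"]] = field(default_factory=list)


def linearize(seg: Segment) -> List[str]:
    """Flatten a segment tree into sentences, prefixing each segment's
    first sentence with that segment's cue word (if it has one)."""
    out: List[str] = []
    for part in seg.parts:
        if isinstance(part, Segment):
            out.extend(linearize(part))
        else:
            out.append(part)
    if seg.cue and out:
        first = out[0]
        out[0] = f"{seg.cue}, {first[0].lower()}{first[1:]}"
    return out


# Illustrative tree only; the purposes are made-up labels.
discourse = Segment(
    purpose="describe-enhancement-process",
    parts=[
        "The system asks the user for the characteristic to be enhanced.",
        Segment("apply-transformations", cue="Then", parts=[
            "The system applies transformations to the program.",
            Segment("elaborate-step", cue="In particular", parts=[
                "The system scans the program for opportunities.",
            ]),
        ]),
        Segment("conclude", cue="Finally", parts=[
            "It performs the enhancement.",
        ]),
    ],
)

for sentence in linearize(discourse):
    print(sentence)
```

The sketch deliberately ignores cases a real generator must handle, such as a cued segment whose first element is itself a cued segment.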

There are two somewhat competing approaches that bring this idea to bear in natural language generation: schema-based and plan-based. The earlier work on text planning, pioneered by [McK85], used schemata, which represent naturally occurring patterns of discourse. For example, McKeown's TEXT system served as the interface to a database containing knowledge about different kinds of ships, and one of its schemata for describing a concept includes instructions to identify its superclass, to name its parts, and to list its attributes. Schemata in TEXT were implemented as augmented transition networks (ATNs): TEXT traverses a schema and sequentially instantiates rhetorical predicates with propositions from the knowledge base. Subsequent work (in particular [MP93,Caw92,Moo95]) pointed out that a major problem with schema-based text planners is that they cannot reason about the structure, content and goals of explanations. This is a very important issue for systems that must be able to answer follow-up questions, or to replan how a certain goal is to be achieved when the user is not satisfied with, or does not understand, the previous explanation. Plan-based systems, which include [Hov88,MP93,WAB91], instead use planning operators that reason explicitly about a system's intentions in presenting content and how those intentions can be achieved.
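
The schema-based idea described above (identify the superclass, name the parts, list the attributes) can be sketched in a few lines. This is an illustration under stated assumptions, not a reconstruction of TEXT: the knowledge base entries are invented toy data, the function and variable names are hypothetical, and the schema is a fixed list of predicates rather than an ATN, so it has no optional or repeated steps.

```python
# Toy knowledge base; the entity and its values are invented for illustration.
KB = {
    "destroyer": {
        "superclass": "ship",
        "parts": ["hull", "engine"],
        "attributes": {"draft": "15 ft"},
    }
}


def identification(kb, entity):
    """Rhetorical predicate: state the superclass of the entity."""
    return [("isa", entity, kb[entity]["superclass"])]


def constituency(kb, entity):
    """Rhetorical predicate: name the parts of the entity."""
    return [("has-part", entity, part) for part in kb[entity]["parts"]]


def attributive(kb, entity):
    """Rhetorical predicate: list the attributes of the entity."""
    return [("has-attribute", entity, name, value)
            for name, value in kb[entity]["attributes"].items()]


# A schema here is just an ordered list of predicates (a simplification).
DESCRIBE_SCHEMA = [identification, constituency, attributive]


def instantiate(schema, kb, entity):
    """Traverse the schema in order, filling each predicate with
    propositions drawn from the knowledge base."""
    plan = []
    for predicate in schema:
        plan.extend(predicate(kb, entity))
    return plan


for proposition in instantiate(DESCRIBE_SCHEMA, KB, "destroyer"):
    print(proposition)
```

Note that nothing in the resulting list records why a proposition was included, which is the limitation that the plan-based work criticizes.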

As commonly happens, there is a trade-off between the two approaches. Schemata are less powerful, but they are easier to write than plan operators, and planners using schemata are generally more efficient than plan-based text planners [LP95]. Plan-based planners, on the other hand, are more principled, but they are still at the prototype stage. Moreover, as [YM94] points out, plan-based text generators have rarely if ever been formally assessed in terms of soundness and completeness, and the basis for writing plan operators has often been researchers' intuitions rather than more solid motivations. Finally, note that, although the two approaches are clearly different, they are related in that schemata can be seen as precompiled text (sub)plans.
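
The relationship between the two approaches can be illustrated with a toy top-down planner. The sketch below is an assumption-laden simplification: the goal names, the OPERATORS table and the expand function are hypothetical and do not reproduce the operators of any cited system. Each communicative goal maps to alternative decompositions; expanding a goal yields a sequence of primitive acts, and excluding a strategy that failed forces the planner to pick another one.

```python
# Hypothetical operator library: goal -> list of alternative decompositions.
OPERATORS = {
    "make-user-know-concept": [
        ["inform-superclass", "inform-parts"],  # first strategy
        ["inform-analogy"],                     # fallback strategy
    ],
    "inform-parts": [
        ["inform-part-list"],
    ],
}


def expand(goal, tried=frozenset()):
    """Expand a goal top-down into primitive acts.

    Goals without an operator are treated as primitive. `tried` holds
    (goal, index) pairs for decompositions that should not be reused,
    for example because the user did not understand the result."""
    alternatives = OPERATORS.get(goal)
    if alternatives is None:
        return [goal]
    for index, body in enumerate(alternatives):
        if (goal, index) in tried:
            continue
        acts = []
        for subgoal in body:
            acts.extend(expand(subgoal, tried))
        return acts
    raise ValueError(f"no untried operator for {goal}")


first_attempt = expand("make-user-know-concept")
second_attempt = expand("make-user-know-concept",
                        tried=frozenset({("make-user-know-concept", 0)}))

# Freezing one expansion gives something schema-like: fast to reuse,
# but with no record of the goals behind each step.
PRECOMPILED_SCHEMA = tuple(first_attempt)

print(first_attempt)
print(second_attempt)
```

In this picture the planner can replan because it still has the goals and the alternatives, while the frozen tuple can only be replayed.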

A common characteristic of the two approaches just described is that the structure of the text is planned top-down. There are also local approaches, such as that of [Sib92], that can be seen as bottom-up. In fact, Sibun takes an even more radical approach to text planning: she does away with building a global structure for the text altogether. Sibun argues that the hierarchical structure present in some texts, in particular descriptions of highly structured domains such as house layouts, family trees, etc., just reflects the domain itself and does not necessarily need to be planned by a text planner; rather, such a text can be generated with local strategies. While Sibun's local strategies seem appropriate for the kinds of texts her system generates, it is not clear how they could be applied to texts that don't reflect the domain structure so closely. On the other hand, her approach is related to the need to encode domain communication knowledge (DCK for short). [KKR91] argues that DCK is a third level of knowledge that a NL generator should encode, intermediate between domain knowledge and communication knowledge. We will come back to the different approaches to text planning and to domain communication knowledge in a later section, when we discuss these topics in relation to generating technical orders.
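
To make the contrast with top-down planning concrete, the following sketch orders the content of a description using only local choices over the domain structure. It is not Sibun's algorithm: the function describe_locally, the house layout and the rule "talk next about an unmentioned neighbor of the current topic" are assumptions chosen for illustration.

```python
def describe_locally(adjacent, start):
    """Order topics by repeatedly moving to an unmentioned neighbor of the
    current topic; no global text structure is built in advance."""
    said = {start}
    order = [start]
    trail = [start]          # topics we can step back to
    while trail:
        current = trail[-1]
        nxt = next((n for n in adjacent[current] if n not in said), None)
        if nxt is None:
            trail.pop()      # nothing new nearby: return to the previous topic
        else:
            said.add(nxt)
            order.append(nxt)
            trail.append(nxt)
    return order


# Invented house layout: each room lists the rooms it connects to.
HOUSE = {
    "hall": ["kitchen", "living room"],
    "kitchen": ["hall", "pantry"],
    "pantry": ["kitchen"],
    "living room": ["hall"],
}

print(describe_locally(HOUSE, "hall"))
```

Any hierarchy visible in the resulting order comes from the layout itself, which is the point of the argument above; for a domain without such structure the same local rule gives no guidance.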






DTOG Group