Marcu, Daniel (2000) The Theory and Practice of Discourse Parsing and
Summarization, The MIT Press, A Bradford Book, ISBN: 0-262-13372-5
Reviewed by Catalina Barbu, School of Humanities, Languages and European
Studies, University of Wolverhampton, UK
It is an acknowledged fact that discourse exhibits internal structure that
contributes to its cohesiveness and texture. It has also been observed
that discourse structure can be used in a range of natural language
processing applications, such as summarization, machine translation, natural
language generation, and anaphora and coreference resolution.
Although research has been conducted in the theory of discourse parsing,
previous attempts at building automatic discourse parsers have failed.
This book represents the first serious attempt at tackling the discourse
parsing problem from both a theoretical and a practical point of view. In
contrast with other attempts at deriving discourse structure (see
[Kurohashi&Nagao 1994], [Asher&Lascarides 1993]), Marcu's model employs only
surface-form methods for determining the discourse markers and the textual
units, without the need for deep syntactic and semantic analysis. It also
employs a shift-reduce model of constructing discourse structures, which comes
closer to the way humans construct discourse trees than the incremental
method proposed by other researchers ([Polanyi 1988], [Cristea&Webber 1997]).
The book is organised in three main parts: Part I describes the linguistic and
mathematical theories behind discourse representation, Part II tackles the
discourse parsing problem from a computational point of view, and Part III
discusses the applications of discourse parsing in text summarisation.
Part I - Theoretical Foundations
This first part introduces the concept of discourse structure as a factor of
cohesion in the text. Most linguistic theories of discourse structure rely on
the following assumptions: that text can be split into sequences of elementary
units, that some units are more important than others, that discourse relations
hold between units of different sizes and that trees can be used for modelling
the structure of a discourse.
The author investigates the problem of discourse parsing from two points of
view: linguistic and mathematical formalization.
Chapter 2 describes one of the most popular theories of discourse structure,
RST, analysing the mechanism one employs to build a valid representation of a
discourse. The author formulates two compositionality criteria of valid text
structures that explain the relationship between discourse relations that hold
between large spans of text and discourse relations that hold between
elementary discourse units.
Chapter 3 introduces a mathematical formalization of valid discourse trees,
expressed in the language of first-order logic. The author then proposes a
proof theory that provides support for deriving the valid text structures. He
proves that the application of the proof theory is sound and complete with
respect to the axiomatization of text structures.
Part II - The Rhetorical Parsing of Free Texts
This part concentrates on the problem of identifying the rhetorical
relations that hold between two units of text.
Two methods of rhetorical parsing are presented: one based on manually derived
rules and one based on a machine learning approach.
1. The cue-phrase based rhetorical parsing algorithm
The first approach is based on the assumption that cue phrases can be used as a
sufficiently accurate indicator of the boundaries between elementary textual
units and of the rhetorical relations that hold between them.
The algorithm for discourse parsing based on discourse markers grew out of an
extensive corpus analysis of cue phrases.
The algorithm first identifies all the potential discourse markers in a text,
then determines the elementary units of the text, and finally builds the valid
discourse structures. Three levels of granularity are considered: sentence,
paragraph and section. Rhetorical relations holding between elementary units
are hypothesised on the basis of the corpus analysis of the cue phrases, and
the trees for each level of granularity are built using one of the algorithms
previously described. The final discourse trees are built by merging the trees
corresponding to each level.
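The first two steps can be illustrated with a minimal sketch. It assumes a
tiny hand-picked marker list and a naive rule that a new unit starts at every
marker that is not sentence-initial; the names CUE_PHRASES, find_markers and
split_units are hypothetical. The book's actual inventory and marker-specific
rules come from its corpus analysis and are not reproduced here.

```python
import re

# Hypothetical, tiny marker list; the book derives its inventory from a corpus.
CUE_PHRASES = ["although", "because", "but", "however", "for example"]

# Longest phrases first so multi-word markers win over shorter overlaps.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(p) for p in sorted(CUE_PHRASES, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def find_markers(sentence):
    """Return (character offset, lower-cased marker) for each potential cue phrase."""
    return [(m.start(), m.group(1).lower()) for m in _PATTERN.finditer(sentence)]


def split_units(sentence):
    """Naively start a new unit at every marker that is not sentence-initial."""
    cuts = [pos for pos, _ in find_markers(sentence) if pos > 0]
    bounds = [0] + cuts + [len(sentence)]
    units = [sentence[a:b].strip(" ,;") for a, b in zip(bounds, bounds[1:])]
    return [u for u in units if u]


sentence = "Although the parser is simple, it works because cue phrases are frequent."
markers = [name for _, name in find_markers(sentence)]
units = split_units(sentence)
```

A realistic segmenter would treat each marker differently (a sentence-initial
"Although", for instance, normally closes its unit at the following comma),
which is exactly the kind of knowledge the corpus analysis supplies.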
The ambiguity of discourse
Discourse is inherently ambiguous: more than one correct structure can
usually be produced for a single text.
One method for disambiguation is to give preference to trees that are skewed to
the right, following the assumption that human readers tend to interpret
new textual units as continuations of the topic of previous units.
In practice, this preference is expressed as weights associated with trees, the
weight of a tree growing proportionally with the development of its right
The parser is evaluated against human-built trees and with respect to its
suitability for text summarization. Precision and recall are calculated for
each step in the building of the tree: identification of elementary units,
spans, nuclearity and rhetorical relations. The results show that the parser's
performance is consistently below that of humans. However, the author shows
that it is still useful in text summarisation for selecting the most salient
units of discourse.
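As a reminder of how such figures are obtained, here is a minimal sketch of
precision and recall computed over sets of items, for example text spans
written as (first unit, last unit) pairs. The function name and the toy spans
are illustrative assumptions, not data from the book.

```python
def precision_recall(predicted, gold):
    """Precision and recall of a predicted set of items against a gold set."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Toy spans over three elementary units; the two trees group them differently.
gold_spans = {(1, 1), (2, 2), (3, 3), (1, 2), (1, 3)}
parser_spans = {(1, 1), (2, 2), (3, 3), (2, 3), (1, 3)}
p, r = precision_recall(parser_spans, gold_spans)
```

The same function applies at each level of the evaluation if the items are
enriched accordingly, e.g. (span, nuclearity) or (span, relation) pairs.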
2. Rhetorical Parsing by means of automatically derived rules
This second method for discourse parsing presented in the book uses decision-
tree classifiers for deriving the discourse structure of unrestricted texts.
The corpus used for training contains 90 manually built discourse trees for
texts extracted from 3 corpora.
The first task is discourse segmentation. A C4.5 classifier is used for
classifying lexemes as boundaries of sentences, elementary discourse units,
parenthetical discourse units or non-boundaries. The features used in learning
are:
- the local context (characteristics of the lexemes surrounding the one under
consideration: the part-of-speech tags of the lexemes in a window of size 5,
the potential of a lexeme to be a discourse marker, an estimate of how likely
a lexeme is to be an abbreviation) and
- the global context (existence of certain punctuation marks before the
estimated end of sentence, existence of a verb in the current unit).
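The local-context features might be extracted along the following lines. This
is only a sketch under stated assumptions: the window of size 5 is read as the
lexeme plus two neighbours on each side, the tags are Penn Treebank style, and
the marker list and abbreviation heuristic are hypothetical stand-ins for
whatever the book actually uses.

```python
# Hypothetical marker list; tag names follow the Penn Treebank convention.
MARKERS = {"although", "because", "but", "however"}
PAD = "<PAD>"  # placeholder for window positions outside the text


def local_features(tokens, tags, i):
    """Features for the lexeme at index i, using a 5-token POS window around it."""
    feats = {}
    for offset in range(-2, 3):
        j = i + offset
        feats[f"pos[{offset:+d}]"] = tags[j] if 0 <= j < len(tags) else PAD
    word = tokens[i]
    feats["is_marker"] = word.lower() in MARKERS
    # Crude abbreviation estimate: a short token ending in a period, e.g. "Dr."
    feats["maybe_abbrev"] = word.endswith(".") and 1 < len(word) <= 4
    return feats


tokens = ["Dr.", "Smith", "left", "because", "it", "rained", "."]
tags = ["NNP", "NNP", "VBD", "IN", "PRP", "VBD", "."]
because_feats = local_features(tokens, tags, 3)
title_feats = local_features(tokens, tags, 0)
```

Each feature dictionary, paired with the boundary label of its lexeme, would
form one learning case for a decision-tree learner; the learner itself is not
shown here.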
The evaluation of the discourse segmenter shows impressive results, with an
accuracy in the range of 92.4-97.87% (depending on the corpus). Sentence
boundaries were identified with a precision of 98.55%, which is similar to
that obtained by specialised sentence splitters.
For modelling the parsing of discourse trees, a shift-reduce parsing model is
employed, where elementary discourse trees are processed at each step, either
by promoting them through a shift operation or by combining them through a
reduce operation. One shift and six reduce operations are implemented for
enabling the derivation of any discourse tree.
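The general shift-reduce idea can be sketched as follows. This is a
simplification, not the book's operation set: it assumes a single shift and
only three reduce variants, distinguished by the nuclearity of the two
subtrees (NS, SN, NN), and all names are hypothetical. The last function goes
in the opposite direction and reads the action sequence back off a finished
tree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    label: str  # unit text for leaves, nuclearity pattern for internal nodes
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def run(edus, actions):
    """Apply a sequence of shift-reduce actions to a list of elementary units."""
    stack, queue = [], list(edus)
    for action in actions:
        if action == "SHIFT":
            stack.append(Node(queue.pop(0)))
        elif action in ("REDUCE-NS", "REDUCE-SN", "REDUCE-NN"):
            right = stack.pop()
            left = stack.pop()
            stack.append(Node(action.split("-")[1], left, right))
        else:
            raise ValueError(f"unknown action: {action}")
    if queue or len(stack) != 1:
        raise ValueError("action sequence does not yield a single tree")
    return stack[0]


def bracket(node):
    """Render a tree as a bracketed string."""
    if node.left is None:
        return node.label
    return f"({node.label} {bracket(node.left)} {bracket(node.right)})"


def actions_for(node):
    """Decompose a tree into the action sequence that rebuilds it (post-order)."""
    if node.left is None:
        return ["SHIFT"]
    return actions_for(node.left) + actions_for(node.right) + [f"REDUCE-{node.label}"]


actions = ["SHIFT", "SHIFT", "REDUCE-NS", "SHIFT", "REDUCE-NN"]
tree = run(["e1", "e2", "e3"], actions)
```

With only adjacent-subtree reductions like these, every binary tree over the
units has exactly one action sequence, which is what makes it possible to turn
a manually built tree into a series of training decisions.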
A C4.5 program is used for learning decision trees and rules that specify how
discourse segments should be assembled into trees, i.e. what action is taken at
each step in building the tree. The learning cases are generated by decomposing
the training trees into sequences of shift-reduce actions and associating a
learning case to each action. Four classes of features are used for learning:
- structural features (relating to the structures of the trees in focus)
- lexical and syntactic features (regarding the lexemes delimiting the text span
subsumed by the trees in focus)
- operational factors (regarding the operation previously performed on the trees
in focus)
- semantic-similarity factors (similarity between the text segments
subsumed by the trees in focus and similarity between words contained in the
The functioning of the classifier is explained by examples from the MUC corpus.
Unfortunately, the low results reported show that the parser in many cases
fails to identify the elementary discourse units and the rhetorical relations
holding between discourse segments.
Chapter 8 is a discussion of previous research on empirical discourse analysis.
Apart from briefly reviewing work in discourse segmentation, cue-phrase
disambiguation and the discourse function of cue phrases, the author refers to
previous attempts at building discourse parsers, comparing them with the
discourse parsers presented in the book. This is followed by a discussion of
possible further developments that would allow the performance of the discourse
parser to approach human performance levels.
Part III - Summarization
This chapter shows how discourse parsers such as those described previously in the
book can be used for selecting the most salient units of discourse in order to
The idea behind discourse-based summarizers, which has been hinted at earlier,
is that the nuclei of a discourse tree correlate with what human judges consider
to be important in a text, and should therefore appear in a summary. The
discourse parser provides a way of computing the importance of textual units
algorithmically, by associating weights according to the depth in the discourse
tree at which the node containing the unit first occurs as a promotion unit. The
author evaluates the suitability of using the discourse structure for selecting
the most important units in a text, reporting accuracies close to the level of
human-constructed summaries. Furthermore, the performance of a summariser that
uses the cue-phrase based discourse parser for building the discourse structure
is evaluated. Interestingly, although the overall performance of the parser is
quite low, the performance of the summarizer (which uses only part of the
information supplied by the parser) is still close to human performance.
Moreover, the discourse-based summarizer outperforms two baseline models and
the Microsoft Office summarizer.
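The depth-based weighting can be sketched as follows, assuming that the
promotion units of a leaf are the leaf itself and that the promotion units of
an internal node are those of its nucleus children. The class and function
names are hypothetical, and the tie-break by unit name is an arbitrary choice
made here, not something taken from the book.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    unit: str = ""  # set for leaves only
    children: list = field(default_factory=list)  # (Span, is_nucleus) pairs


def promotion(node):
    """Units promoted to this node: its own unit, or those of its nuclei."""
    if not node.children:
        return {node.unit}
    promoted = set()
    for child, is_nucleus in node.children:
        if is_nucleus:
            promoted |= promotion(child)
    return promoted


def first_promoted_depth(node, depth=0, out=None):
    """Map each unit to the shallowest depth at which it is a promotion unit."""
    out = {} if out is None else out
    for unit in promotion(node):
        out.setdefault(unit, depth)  # parents are visited before children
    for child, _ in node.children:
        first_promoted_depth(child, depth + 1, out)
    return out


def rank_units(root):
    """Order units from most to least important (shallowest promotion first)."""
    depths = first_promoted_depth(root)
    return sorted(depths, key=lambda u: (depths[u], u))


# Toy tree: the left subtree is the nucleus of the root.
left = Span(children=[(Span("u1"), False), (Span("u2"), True)])
right = Span(children=[(Span("u3"), True), (Span("u4"), False)])
root = Span(children=[(left, True), (right, False)])
depths = first_promoted_depth(root)
ranking = rank_units(root)
```

A summary of a given length would then be read off the front of the ranking;
in the toy tree, a two-unit summary would consist of u2 and u3.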
The following chapters describe ways of combining traditional indicators of
textual importance, like word frequency, with indicators given by the discourse
[Kurohashi&Nagao 1994] Sadao Kurohashi and Makoto Nagao. "Automatic detection
of discourse structure by checking surface information in sentences". In
Proceedings of the 15th International Conference on Computational Linguistics
(Coling94), Kyoto, Japan, 1994
[Asher&Lascarides 1993] Alex Lascarides and Nicholas Asher. "Temporal
interpretation, discourse relations and common sense entailment". Linguistics
and Philosophy, 16(5), 1993
[Polanyi 1988] Livia Polanyi. "A formal model of the structure of discourse".
Journal of Pragmatics, 12, 1988
[Cristea&Webber 1997] Dan Cristea and Bonnie Webber. "Expectations in
incremental discourse processing". In Proceedings of ACL/EACL-97, Madrid,
Catalina Barbu is a PhD student in Computational Linguistics at the University
of Wolverhampton, UK. Her field of research is multilingual anaphora