Academic Paper

Title: Interlingual annotation of parallel text corpora: a new framework for annotation and evaluation
Author: Bonnie J. Dorr
Institution: University of Maryland
Author: Rebecca J. Passonneau
Institution: Columbia University
Author: David Farwell
Institution: New Mexico State University
Author: Rebecca Green
Institution: Online Computer Library Center
Author: Nizar Habash
Institution: Columbia University
Author: Stephen Helmreich
Institution: New Mexico State University
Author: Eduard Hovy
Institution: University of Southern California
Author: Lori S. Levin
Institution: Carnegie Mellon University
Author: Keith J. Miller
Institution: MITRE Corporation
Author: Teruko Mitamura
Institution: Carnegie Mellon University
Author: Owen Rambow
Institution: Columbia University
Author: Advaith Siddharthan
Institution: University of Aberdeen
Linguistic Field: Applied Linguistics; Computational Linguistics; Text/Corpus Linguistics
Abstract: This paper focuses on an important step in the creation of a system of meaning representation and the development of semantically annotated parallel corpora, for use in applications such as machine translation, question answering, text summarization, and information retrieval. The work described below constitutes the first effort of any kind to annotate multiple translations of foreign-language texts with interlingual content. Three levels of representation are introduced: deep syntactic dependencies (IL0), intermediate semantic representations (IL1), and a normalized representation that unifies conversives, nonliteral language, and paraphrase (IL2). The resulting annotated, multilingually induced, parallel corpora will be useful as an empirical basis for a wide range of research, including the development and evaluation of interlingual NLP systems and paraphrase-extraction systems as well as a host of other research and development efforts in theoretical and applied linguistics, foreign language pedagogy, translation studies, and other related disciplines.


This article appears in Natural Language Engineering, Vol. 16, Issue 3.
