LINGUIST List 15.206

Tue Jan 20 2004

Calls: Computational Ling/Spain; Computational Ling/UK

Editor for this issue: Andrea Berez <>

As a matter of policy, LINGUIST discourages the use of abbreviations or acronyms in conference announcements unless they are explained in the text. To post to LINGUIST, use our convenient web form at


  1. Anna Korhonen, ACL 2004 Workshop on Multiword Expressions: Integrating Processing
  2. Magali Jeanmaire, Cross-Language Information Retrieval

Message 1: ACL 2004 Workshop on Multiword Expressions: Integrating Processing

Date: Wed, 14 Jan 2004 11:32:37 +0000
From: Anna Korhonen <>
Subject: ACL 2004 Workshop on Multiword Expressions: Integrating Processing


ACL-2004 Workshop on
Multiword Expressions: Integrating Processing

26th July 2004, Barcelona, Spain


Workshop website:

ACL website:


In recent years, there has been a growing awareness in the NLP
community of the problems that Multiword Expressions (MWEs) pose and
the need for their robust handling.

MWEs include a large range of linguistic phenomena, such as phrasal
verbs (e.g. "add up"), nominal compounds (e.g. "telephone box"), and
institutionalized phrases (e.g. "salt and pepper"). These
expressions, which can be syntactically and/or semantically
idiosyncratic in nature, are used frequently in everyday language,
usually to precisely express ideas and concepts that cannot be
compressed into a single word.

Most real-world applications tend to ignore MWEs or address them
simply by listing. However, it is clear that successful applications
will need to be able to identify and treat them appropriately. This
particularly applies to the many applications which require some
degree of semantic interpretation (e.g. machine translation,
question-answering, summarisation, generation) and require tasks such
as parsing and word sense disambiguation.

A considerable amount of research has lately been conducted in this
area, some within large research projects dedicated to MWEs. In this
context, a successful workshop on MWEs was held at ACL 2003, with papers
presenting a cross section of research on MWEs. There is some research
on MWEs in general. Some is very computational, examining detection
and extraction using a variety of methods. Some is more linguistic,
focusing on classification of the various types. There is also a lot
of research on particular subtypes of MWEs, especially English phrasal

In this workshop the focus is on papers that integrate analysis,
acquisition and treatment of various kinds of multiword expressions
(MWEs) in NLP. For example,

(1) research that combines a linguistic analysis with a method of
automatically acquiring the classes described

(2) work that combines the computational treatment of a class of MWEs
with a solid linguistic analysis

(3) research that extracts MWEs and either classifies them or uses
them in some task.

These combinations of research will help to bridge the gap between the
needs of NLP and the descriptive tradition of linguistics.


The workshop will be of interest to anyone working on MWEs, e.g. in
the areas of computational grammars, computational lexicography,
automatic lexical acquisition, machine translation, information
retrieval, text mining, and computer-assisted language teaching and
learning. The objective is to summarise what has been achieved in the
area, to establish common themes between different approaches, and to
discuss future trends.


Papers are invited on, but not limited to, the following topics:

* Theoretical research on MWEs, including corpus based analysis
* MWE taxonomies, classifications and databases
* Cross-lingual analysis of MWE types, use, and behaviour
* Methods for identification and extraction of MWEs (machine learning,
statistical, example- or rule-based, or hybrid)
* Evaluation of MWE extraction methods
* Methods for determining the compositionality of MWEs
* Integration of MWE data into grammars and NLP applications
(e.g. machine translation and generation)

Papers can cover one or more of these areas, but research that
combines different topics is especially encouraged.


Papers should be submitted electronically in Postscript or PDF format
to: . Submissions should conform to the
two-column format of ACL proceedings and should not exceed eight (8)
pages, including references. We strongly recommend the use of ACL-2004
style files, also available from the ACL-2004 website.

The subject line of the submission email should be "ACL2004 WORKSHOP
PAPER SUBMISSION". As reviewing will be blind, the body of the paper
should not include the names or affiliations of the authors. The
following identification information should be sent in a separate
email with the subject line "ACL2004 WORKSHOP ID PAGE":

Title: title of paper
Authors: list of all authors
Keywords: up to five topic keywords
Contact author: email address of author of record (for correspondence)
Abstract: abstract of paper (not more than 10 lines)

Notification of receipt will be emailed to the contact author.


Submission deadline: 1 April 2004
Acceptance notification: 1 May 2004
Final version deadline: 15 May 2004
Workshop date: 26 July 2004


Takaaki Tanaka (NTT Communication Science Laboratories, Japan)
Aline Villavicencio (University of Cambridge, UK)
Francis Bond (NTT Communication Science Laboratories, Japan)
Anna Korhonen (University of Cambridge, UK)


Timothy Baldwin (Stanford University, USA)
Colin Bannard (University of Edinburgh, UK)
Ann Copestake (University of Cambridge, UK)
Gael Dias (Beira Interior University, Portugal)
James Dowdall (University of Zurich, Switzerland)
Dan Flickinger (Stanford University, USA)
Matthew Hurst (Intelliseek, USA)
Stephan Oepen (Stanford University, USA; University of Oslo, Norway)
Kyonghee Paik (ATR Spoken Language Translation Research Laboratories,
Japan)
Scott Piao (University of Lancaster, UK)
Beata Trawinski (University of Tuebingen, Germany)
Kiyoko Uchiyama (Keio University, Japan)


Workshop registration information will be posted at a later date. The
registration fee will include attendance at the workshop and a copy of
workshop proceedings.

Message 2: Cross-Language Information Retrieval

Date: Tue, 20 Jan 2004 09:20:25 +0100
From: Magali Jeanmaire <>
Subject: Cross-Language Information Retrieval

ELRA is happy to announce that registration for the CLEF 2004
evaluation campaign is now open.

16-Sep-2004 - 17-Sep-2004
Bristol, United Kingdom


The CLEF series of system evaluation campaigns aims at promoting
research and development in Cross-Language Information Retrieval.

Registration is now open for CLEF 2004.

The objective of CLEF 2004 will be to test different aspects of mono-
and cross-language information retrieval system performance. There
will be eight tracks this year:

a/ Multilingual Information Retrieval
b/ Bilingual Information Retrieval
c/ Monolingual (non-English) Information Retrieval
d/ Mono- and Cross-Language IR for Scientific Collections (GIRT)
e/ Interactive Cross-Language Information Retrieval (iCLEF)
f/ Multiple Language Question Answering (QAatCLEF)
g/ Cross-language Retrieval in Image Collections (ImageCLEF)
h/ Cross-Language Spoken Document Retrieval (CL-SDR)

- Data Release - from 15 February 2004
- Topic Release - from 15 March 2004
- Submission of Runs by Participants - 15 May 2004 (may vary slightly
for some tracks)
- Release of relevance assessments and individual results - from 15 July 2004
- Submission of paper for Working Notes - 15 August 2004
- Workshop - 16-17 September 2004 (in conjunction with ECDL 2004)

For full details on the CLEF Agenda and Task Description for 2004 and
instructions on How to Participate, see

For further information, contact:
Carol Peters - ISTI-CNR
Tel: +39 050 315 2987
Fax: +39 050 315 2810


55-57, rue Brillat-Savarin
75013 Paris FRANCE
Tel: (+33) 1 43 13 33 33 / Fax: (+33) 1 43 13 33 30
URL: or

LREC conference:
LangTech forum:
