LINGUIST List 14.646

Thu Mar 6 2003

FYI: Fr/Eng Glossary, Basque Endangered, It. Corpus

Editor for this issue: Karen Milligan <>


  1. Michael Cahill, French/English Glossary of Linguistic Terms
  2. Eneko Agirre, Today Basque is "an even more" endangered language
  3. Schneider, Stefan, New site with Italian corpus

Message 1: French/English Glossary of Linguistic Terms

Date: Tue, 04 Mar 2003 10:25:36 +0000
From: Michael Cahill <>
Subject: French/English Glossary of Linguistic Terms

SIL International is pleased to announce the French/English Glossary
of Linguistic Terms is now posted at 

This glossary contains thousands of linguistic concepts in both French
(6,702) and English (6,791). As a glossary, it does not define the
terms but simply gives the equivalent(s) in the other
language. Lexical and semantic relationships are displayed for many of
the terms in both languages. Domains or schools of thought are also
shown for many terms. All the qualified terms are indexed to major
linguistic dictionaries and major works that represent specific
domains or schools of thought in both languages. The Glossary is still
a work in progress; new entries will be added regularly as the editor
and compiler continue their work. We invite the participation of any
linguists who would like to contribute to this work.

Message 2: Today Basque is "an even more" endangered language

Date: Wed, 05 Mar 2003 10:29:45 +0000
From: Eneko Agirre <>
Subject: Today Basque is "an even more" endangered language

Dear colleagues

We know that this kind of message is not common on this mailing list,
but we would like to inform you about a direct attack on Basque
culture, which directly affects our research efforts.

The only Basque language newspaper in the world, "Egunkaria", was
temporarily closed on February the 20th and 10 top representatives of
Basque culture were arrested by a Spanish judge, under allegations of
collaboration with terrorists. We want to stress that there has not
been any trial yet; they have been held in protective custody. Before
even finding the newspaper employees guilty, the judge decided to
close down the newspaper. The closing of the newspaper is a preventive
temporary measure, but Spanish law allows the closing to go on for
five years. Even after a few weeks the newspaper becomes financially

It is worth mentioning that Egunkaria has the support of different
political sensibilities in the Basque Society, and it is also well
known in the International Community. The vast majority of Basque
society does not agree with the closing of Egunkaria (list of
supporters in The
International Federation of Journalists, Reporters Without
Borders and the
president of the European Bureau of Lesser Used Languages, among
others, have also criticized the measure.

Since Basque is an endangered language (around 800,000 speakers) undergoing a
normalization process, currently available corpora are small in size,
and one of the most promising sources for our research efforts was
Egunkaria. There is also an English version of it that would allow us
to do research on parallel corpora. One of the biggest linguistic corpora
available for Basque is the compilation of the daily issues since
2000. Language technology was being used to search in their online
news database (unfortunately, their internet edition was also
closed). A document classification research project was underway, as
well as a research project on a pragma-rhetorical analysis of the
contents of EGUNKARIA.

We do not want to initiate a debate. If you want more information or
to express your sympathy, please refer to

Today Basque is "an even more" endangered language.

Research groups and companies working on Human Language Technology
from the Basque Country supporting this message:

 AHOLAB group
 DELi group
 ILCLI group on semantics, pragmatics and rhetoric
 IXA NLP group

 Code & Syntax
 Diana Teknologia
 Eleka
 Elhuyar
 Hizkia Informatika
 UZEI

Message 3: New site with Italian corpus

Date: Thu, 6 Mar 2003 07:28:41 +0100
From: Schneider, Stefan <>
Subject: New site with Italian corpus

Dear linguists!

I want to inform those interested in the analysis of spoken Italian
that now there is a new database called BADIP (Banca dati
dell'italiano parlato) containing an online edition of the 500,000
word LIP-Corpus. The edition is being enriched with POS-tags and
lemmata, and more data are being added continuously. Other corpora of
spoken Italian will be included in the database as soon as
possible. The database is part of the Language Server of the
University of Graz (Austria). Access to BADIP is free:

Univ.-Ass. Mag. Dr. Stefan Schneider
Karl-Franzens-Universität Graz
Institut für Romanistik
Merangasse 70/3, A-8010 Graz
Tel. ++43/316/380-2509
Fax ++43/316/380-9770
