From: Jeremy Jancsary <jeremy.jancsaryofai.at>
Subject: Algorithms and Resources for Modelling of Dialects and Language Varieties
Full Title: Algorithms and Resources for Modelling of Dialects and Language Varieties
Short Title: DIALECTS2011
Date: 31-Jul-2011 - 31-Jul-2011
Location: Edinburgh, United Kingdom
Contact Person: Jeremy Jancsary
Web Site: http://www.ofai.at/~dialects2011/
Linguistic Field(s): Computational Linguistics
Call Deadline: 02-May-2011
The currently prevailing statistical paradigm has made possible major achievements in many areas of natural language processing. But since the methods employed critically depend on the availability of large training corpora, the applicability of these methods is generally limited to major languages / standard varieties, to the exclusion of dialects or varieties that substantially differ from the standard.
However, language varieties (and specifically dialects) are a primary means of expressing a person's social affiliation and identity. Hence, computer systems that can adapt to the user by displaying a familiar socio-cultural identity are expected to dramatically raise user acceptance within certain contexts and target groups. But current systems are far from achieving the fidelity required to realize these benefits.
The crucial obstacle is scarcity of data. Most important of all, substantial corpora of language varieties or dialects are rare. Moreover, authoritative orthographic conventions usually do not exist. As a result, the notation of written texts can vary widely and there are no obvious conventions for the annotation of speech corpora.
This situation calls for novel approaches, methods and techniques to overcome or circumvent the problem of data scarcity, but also to enhance and strengthen the standing that language varieties and dialects have in natural language processing technologies as well as in interaction technologies that build upon the former.
While there will be a clear focus on machine learning applied to the aforementioned problems, this workshop aims to bring together researchers with expertise in various disciplines.
Call for Papers:
Due to popular request and the clash with the Easter holidays, we have decided to extend the submission deadline to May 2, 23:59 GMT-11. Please see below for the updated schedule.
We invite submissions to the First Workshop on Algorithms and Resources for Modelling of Dialects and Language Varieties. The workshop will be held in conjunction with EMNLP 2011 on July 31, in Edinburgh, Scotland, UK. It will consist of one day of oral presentations and poster sessions.
Workshop URL: http://www.ofai.at/~dialects2011/
Topics of interest:
- Machine learning algorithms operating in the regime of data scarcity
- Bootstrapping and active learning schemes for principled acquisition, annotation or generation of training data
- Methods to acquire resources by exploiting the proximity between varieties and standard language
- Issues of orthography and annotation
- Machine translation between language varieties or dialects
- Speech synthesis of dialects with limited corpora
- Interaction technologies dealing with social identity in speech and text
- Novel approaches transcending the paradigm of statistical modelling
Progress on the topics listed above requires an interdisciplinary approach: machine learning, machine translation, speech synthesis and automatic speech recognition, but also linguistics and interaction technologies, will have to contribute. We invite researchers with a genuine interest in the modelling of language varieties and the advancement of natural language processing in this area.
Important Dates:
Paper submission deadline: May 2, 2011
Acceptance notification: May 30, 2011
Camera-ready copy due: June 13, 2011
Workshop meeting: July 31, 2011
All deadlines refer to 23:59 GMT-11 (Samoa time) on the indicated day.
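Because GMT-11 is far behind most authors' local time, it may help to convert the deadline before planning a submission. The following is a minimal sketch, assuming a fixed UTC-11 offset with no daylight saving adjustment; the function name deadline_in_utc is purely illustrative and not part of any workshop tooling.

```python
from datetime import datetime, timedelta, timezone

# GMT-11 modelled as a fixed offset (assumption: no daylight saving applies).
DEADLINE_ZONE = timezone(timedelta(hours=-11))


def deadline_in_utc(year, month, day):
    """Return 23:59 GMT-11 on the given day, expressed in UTC."""
    local = datetime(year, month, day, 23, 59, tzinfo=DEADLINE_ZONE)
    return local.astimezone(timezone.utc)


# Paper submission deadline from the schedule above: May 2, 2011.
submission_deadline = deadline_in_utc(2011, 5, 2)
print(submission_deadline.isoformat())  # 2011-05-03T10:59:00+00:00
```

Under this assumption, the submission deadline corresponds to 10:59 UTC on May 3, 2011.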
We invite high-quality submissions on original, unpublished work in areas relating to the aforementioned topics. Both significant theoretical advances and descriptions of successful practical systems involving processing or generation of language varieties are welcome. Submission of work that is only incremental in nature or describes minor progress is explicitly discouraged.
Two paper categories will be distinguished:
- Long papers are expected to report on contributions of lasting value and will be presented orally in the plenary session of the workshop. Submissions should not exceed a length of 9 pages, excluding references.
- Short papers are ideally suited for exciting new work that is not yet mature enough for a long paper, but has substantial merit. The work will be presented during the poster session and, depending on the type of work, a system demonstration can be given. The length of short papers is restricted to 4 pages, excluding references.
Reviewing will be double-blind, so please ensure your submission is properly anonymized. In particular, the paper should not reveal the authors' identities, and it should not include acknowledgments or references to project names, websites, software or the like that might give those identities away.
Submissions should follow the two-column format of the ACL 2011 proceedings. The official style files can be obtained at http://www.acl2011.org/call.shtml. Submission is handled using the START system at https://www.softconf.com/emnlp/DIALECTS2011/.
Papers must be uploaded by May 2, 23:59 GMT-11.
Policy regarding submission to multiple workshops/conferences: It is acceptable to submit the same paper to another workshop or conference. However, in this case we request that you inform the organizers in a separate e-mail in advance, so that we know the paper might be withdrawn. In addition, if you do decide to withdraw, we request that you notify us by May 26 at the very latest.
Organizers:
Jeremy Jancsary - Austrian Research Institute for Artificial Intelligence
Friedrich Neubarth - Austrian Research Institute for Artificial Intelligence
Harald Trost - Section for AI, Medical University of Vienna, Austria
Program Committee:
Gérard Bailly - Speech & Cognition department, CNRS Grenoble, France
Nick Campbell - CLCS, Trinity College Dublin, Ireland
Martine Grice - IfL, Phonetik Köln, Germany
Gholamreza Haffari - BC Cancer Research Center, Vancouver, Canada
Inmaculada Hernaez Rioja - Univ. of the Basque Country (UPV/EHU), Spain
Philipp Koehn - ILCC, Univ. of Edinburgh, UK
Michael Pucher - ftw, Vienna, Austria
Milan Rusko - SAS, Slovak Academy of Sciences, Slovakia
Kevin Scannell - Department of Mathematics and Computer Science, Saint Louis University, USA
Yves Scherrer - LATL, Université de Genève, Switzerland
Beat Siebenhaar - Institut für Germanistik, Univ. of Leipzig, Germany