Academic Paper


Title: Sintéiseoir 1.0: a multidialectical TTS application for Irish
Author: Mícheál Mac Lochlainn
Institution: National University of Ireland, Galway
Linguistic Field: Applied Linguistics; Computational Linguistics
Subject Language: Irish
Abstract: This paper details the development of a multidialectical text-to-speech (TTS) application, Sintéiseoir, for the Irish language. This work is being carried out in the context of Irish as a lesser-used language, where learners and other L2 speakers have limited direct exposure to L1 speakers and speech communities, and where native sound systems and vocabularies can be seen to be receding even among L1 speakers – particularly the young.
Sintéiseoir essentially implements the diphone concatenation model, albeit augmented to include phones, half-phones and, potentially, other phonic units. It is based on a platform-independent framework comprising a user interface, a set of dialect-specific tokenisation engines, a concatenation engine and a playback device.
The tokenisation strategy is entirely rule-based and does not refer to dictionary look-ups. Provision has been made for prosodic processing in the framework but has not yet been implemented. Concatenation units are stored in the form of WAV files on the local file system.
Sintéiseoir’s user interface (UI) provides a text field that allows the user to submit a grapheme string for synthesis and a prompt to select a dialect. It also filters input to reject graphotactically invalid strings, restrict input to alphabetic characters and certain punctuation marks found in Irish orthography, and ensure that a dialect has, indeed, been selected.
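The filtering step described above might look roughly like the following Python sketch. It is illustrative only and not taken from Sintéiseoir: the permitted letters and punctuation, the dialect keys and the function name validate_input are all assumptions, and the graphotactic validity check is not modelled because the abstract does not state its rules.

```python
# Assumed character inventory: basic Latin letters plus the long vowels.
ALLOWED_LETTERS = set("abcdefghijklmnopqrstuvwxyzáéíóú")
# Assumed punctuation: space, hyphen, straight and curly apostrophe.
ALLOWED_PUNCTUATION = set(" -'’")
# Hypothetical dialect keys; the real application may name them differently.
DIALECTS = {"connacht", "munster", "ulster"}


def validate_input(text, dialect):
    """Return a list of problems; an empty list means the input is accepted."""
    problems = []
    if dialect not in DIALECTS:
        problems.append("no valid dialect selected")
    if not text.strip():
        problems.append("empty input")
    allowed = ALLOWED_LETTERS | ALLOWED_PUNCTUATION
    # Collect each disallowed character once, in sorted order.
    bad = sorted({ch for ch in text.lower() if ch not in allowed})
    if bad:
        problems.append("disallowed characters: " + "".join(bad))
    return problems
```

Returning a list of problems rather than a single boolean lets a UI report every reason for rejection at once.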
The UI forwards the filtered grapheme string to the appropriate tokenisation engine. This searches for specified substrings and maps them to corresponding tokens that themselves correspond to concatenation units.
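A rule-based substring-to-token mapping of this kind can be sketched in Python as below. The rule table, the token names and the longest-match-first strategy are assumptions made for illustration; they do not reflect Sintéiseoir's actual rules or any real Irish dialect.

```python
# Hypothetical rule table for one dialect: grapheme substring -> token name.
# The entries are placeholders, not real Irish grapheme-to-sound rules.
RULES = {
    "ch": "CH",
    "c": "C",
    "h": "H",
    "a": "A",
    "t": "T",
}


def tokenise(text, rules):
    """Scan left to right, always taking the longest matching substring."""
    keys = sorted(rules, key=len, reverse=True)  # try longer substrings first
    tokens = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                tokens.append(rules[key])
                i += len(key)
                break
        else:
            # No rule covers the text at this position.
            raise ValueError(f"no rule matches at position {i}: {text[i:]!r}")
    return tokens
```

With this table, tokenise("chat", RULES) yields the tokens CH, A, T, whereas tokenise("cat", RULES) yields C, A, T, showing how the longer substring takes precedence.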
The resultant token string is then forwarded to the concatenation engine, which retrieves the relevant concatenation units, extracts their audio data and combines them in a new unit. This is then forwarded to the playback device.
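As a minimal sketch of the concatenation step, the following Python code uses the standard-library wave module to join units stored as WAV files. The file naming scheme (one file per token, named after the token), the function name concatenate and the requirement that all units share one audio format are assumptions, not details taken from the paper.

```python
import wave
from pathlib import Path


def concatenate(tokens, unit_dir, out_path):
    """Join the WAV units for `tokens` into one file at `out_path`.

    Returns the list of tokens whose unit files are missing (empty on success).
    """
    if not tokens:
        raise ValueError("no tokens to concatenate")
    unit_dir = Path(unit_dir)
    # Assumed naming scheme: one file per token, e.g. "CH.wav".
    missing = [t for t in tokens if not (unit_dir / f"{t}.wav").is_file()]
    if missing:
        return missing  # the caller can report these to the user
    fmt = None
    frames = []
    for t in tokens:
        with wave.open(str(unit_dir / f"{t}.wav"), "rb") as unit:
            unit_fmt = (unit.getnchannels(), unit.getsampwidth(), unit.getframerate())
            if fmt is None:
                fmt = unit_fmt
            elif unit_fmt != fmt:
                raise ValueError(f"unit {t!r} has a different audio format")
            frames.append(unit.readframes(unit.getnframes()))
    with wave.open(str(out_path), "wb") as out:
        out.setnchannels(fmt[0])
        out.setsampwidth(fmt[1])
        out.setframerate(fmt[2])
        out.writeframes(b"".join(frames))
    return []
```

Plain concatenation of raw frames ignores smoothing at unit boundaries; a fuller implementation would probably need to address that, but the abstract gives no details.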
The terms of reference for the initial development of Sintéiseoir specified that it should be capable of uttering, individually, the 99 most common Irish lemmata in the dialects of An Spidéal, Músgraí Uí Fhloínn and Gort a’ Choirce, which are internally consistent dialects within the Connacht, Munster and Ulster regions, respectively, of the dialect continuum. Audio assets to satisfy this requirement have already been prepared, and have been found to produce reasonably accurate output. The tokenisation engine is, however, capable of processing a wider range of input strings and, when required concatenation units are found to be unavailable, returns a report via the user interface.


This article appears in ReCALL Vol. 22, Issue 2, which you can read on Cambridge's site or on LINGUIST.