LINGUIST List 19.410
Mon Feb 04 2008
Disc: Automatical Metrical Markup
Editor for this issue: Ann Sawyer
<sawyer linguistlist.org>
Directory
1. Klemens Bobenhausen, Automatical Metrical Markup
Message 1: Automatical Metrical Markup
Date: 01-Feb-2008
From: Klemens Bobenhausen <klemens.bobenhausen germanistik.uni-freiburg.de>
Subject: Automatical Metrical Markup
All,

Automatic metrical markup (AMM) of written (not spoken) poetry aims at a fully computer-based analysis of a poem's metrical information. It begins with identifying poems (not yet achieved), strophes, verse lines, words and syllables, and ends with distinguishing stressed (+) from unstressed (-) syllables and determining the rhyme scheme. All of these analyses are then stored in an XML document (TEI P5 compatible).

Sample output for one strophe:

Strophe 1:
Silbenzerlegung:
(Fried|lich) (be|käm|pfen)
(Nacht) (sich) (und) (Tag) .
(Wie) (das) (zu) (däm|pfen) ,
(Wie) (das) (zu) (lö|sen) (ver|mag) !
Metrik:
Silben=5, Betonung=''+--+-''
Silben=4, Betonung=''+--+''
Silben=5, Betonung=''+--+-''
Silben=7, Betonung=''+--+--+''
Reim:
Endreim=''abab'' (Kreuzreim)

(Silbenzerlegung = syllabification, Metrik = meter, Silben = syllables, Betonung = stress pattern, Reim = rhyme, Endreim = end rhyme, Kreuzreim = alternating rhyme.)

Having collected a large number of prosodic predictions for written German, I am now able to analyse regular poems (with a regular sequence of stressed and unstressed syllables in each verse/strophe) for about 100% of their syllables, and irregular poems (with an irregular sequence of stressed and unstressed syllables in each verse/strophe) for about 98% of their syllables.

That 98% breaks down by how each syllable is determined:
a) via stressed syllables (60%)
b) via euphonic rules (25%)
c) via analogies to other verses (7%)
d) via unstressed syllables (5%)
e) via rhymes (1%)

I am not using any kind of POS or morphological tagging, because the system should also work with historical texts and their orthography.

The missing 2% come from foreign or non-Germanic words (like 'Musik' or 'Natur') and from compounds. German compounds are mostly stressed on the part that modifies the other part: 'Biergarten', for example, is stressed on the first syllable, because 'Bier' describes which kind of 'Garten' a 'Biergarten' is.

And now I am out of ideas and need assistance. Is anyone interested in work like this? The algorithm will not work for languages other than German, but the ideas may.

Klemens (+-)

Linguistic Field(s): Computational Linguistics; Ling & Literature; Phonology; Text/Corpus Linguistics
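
For readers who want to experiment with the output format shown in the message, here is a minimal Python sketch of how the sample analysis could be checked and written out as XML. It is not the author's algorithm: the stress patterns are copied from the post rather than computed, the function names (`split_words`, `syllable_counts`, `stanza_to_tei`) are hypothetical, and the choice of the TEI elements `lg` and `l` with the attributes `met` and `rhyme` is only one assumed way of producing TEI P5 compatible markup.

```python
import re
import xml.etree.ElementTree as ET

# Sample analysis copied from the post: one (syllabified line, stress pattern)
# pair per verse line. Nothing here is computed from raw text.
STANZA = [
    ("(Fried|lich) (be|käm|pfen)", "+--+-"),
    ("(Nacht) (sich) (und) (Tag) .", "+--+"),
    ("(Wie) (das) (zu) (däm|pfen) ,", "+--+-"),
    ("(Wie) (das) (zu) (lö|sen) (ver|mag) !", "+--+--+"),
]
RHYME = "abab"  # end-rhyme scheme reported in the post


def split_words(line):
    """Return the words of a syllabified line, each as a list of syllables."""
    return [word.split("|") for word in re.findall(r"\(([^)]*)\)", line)]


def syllable_counts(stanza):
    """Count syllables per verse line from the bracketed syllabification."""
    return [sum(len(w) for w in split_words(text)) for text, _ in stanza]


def stanza_to_tei(stanza, rhyme):
    """Build an <lg> element with one <l> per verse line (assumed encoding)."""
    lg = ET.Element("lg", {"type": "stanza", "rhyme": rhyme})
    for text, stress in stanza:
        words = split_words(text)
        n_syllables = sum(len(w) for w in words)
        # Each syllable needs exactly one stress mark (+ or -).
        if n_syllables != len(stress):
            raise ValueError(f"{n_syllables} syllables but pattern {stress!r}")
        line = ET.SubElement(lg, "l", {"met": stress})
        # Punctuation outside the brackets is dropped in this sketch.
        line.text = " ".join("".join(w) for w in words)
    return lg


if __name__ == "__main__":
    print(syllable_counts(STANZA))
    print(ET.tostring(stanza_to_tei(STANZA, RHYME), encoding="unicode"))
```

The length check is the useful part: it makes sure that the syllabification and the stress pattern of every verse line agree before anything is serialized.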