Moderators: Carlos Subirats (U. Autónoma Barcelona), Mar Cruz (U. Barcelona)
Editors: Paloma Garrido (U. Rey Juan Carlos), Laura Romero (UB)
Programming, development: Marc Ortega (UAB)
Review editors: Alexandra Álvarez (U. Los Andes, Venezuela), Yvette Bürki (U. Bern, Switzerland), María Luisa Calero (U. Córdoba, Spain), Luis Cortés (U. Almería)
Advisors: Isabel Verdaguer (UB), Gerd Wotjak (U. Leipzig, Germany)
Contributors: Julia Bernd (Cause Data Collective, USA), Antonio Ríos (UAB), Danica Salazar (UB)

Sponsored by:
Arco Libros

Infoling 1.34 (2013)
ISSN: 1576-3404

© Infoling 1996-2012. All rights reserved

Call for papers (event): Compilation and annotation of spoken corpora: Towards best practice? (ICAME34)
Santiago de Compostela (Spain), 22 May 2013
(1st circular)
URL: http://www.usc.es/export/sites/default/en/congresos/icame34/descargas/WS-Andersenetal.pdf
Information from: Gisle Andersen <[log in to unmask]>

This workshop provides a meeting ground for scholars involved in the creation of corpora of spoken language or with a more general interest in the representation of spoken data based on audio/video recordings. The workshop addresses the need to harmonise corpus-building methods by developing or utilising internationally recognised standards in corpus linguistics or best practice guidelines for the transcription and annotation of audio/video data.

The aim is to facilitate the exchange of experience from large-scale, coordinated corpus-building efforts as well as small-scale, local initiatives. This includes accounts of, on the one hand, the practicalities encountered in corpus compilation, transcription and annotation and, on the other hand, how annotation decisions are grounded in linguistic theory. We hope this will stimulate a fruitful discussion of whether and how cross-corpus comparison is hampered by a lack of uniformity in annotation schemes and procedures, which solutions corpus builders recommend at different annotation levels, practical experience with existing standards or de facto standards (e.g. COBUILD/NERC, TEI, XCES), and methods for testing and improving inter-annotator agreement. Relevant topics include, but are not restricted to:
- Corpus design (techniques for capturing and linking text and audio/video data; ensuring consistency in transcription; ensuring inter-annotator agreement)
- Orthographic transcription (transcription of non-standard vocabulary, slang, swearing, neologisms; standardised vs. idiosyncratic orthography; standardised representation of pauses, backchannels and hesitation phenomena)
- Annotation of syntactic features (the relevance and reliability of part-of-speech tagging for (informal/messy) conversational data; syntactic parsing of speech; parsers’/taggers’ capability of handling non-standard forms and neologisms)
- Annotation of prosodic, phonetic, or acoustic features (standardised vs. in-house annotation schemes, simple vs. detailed prosodic annotation; the relevance and reliability of phonetic annotation)
- Pragmatic or gestural annotation (standardised/in-house systems for annotation of speech act information, discourse functions, pragmatic markers, quotatives, anaphora and deixis; gestural annotation schemes)
We invite papers that discuss specific corpus initiatives dealing with any of the above topics, or that report on corpus-based case studies which illustrate or problematise the need for methodological harmonisation and standardisation in the field.

The workshop will be organised as a series of thematic slots consisting of 15-minute papers followed by joint discussions.

Abstracts of 300-400 words should be submitted by e-mail to all three convenors: [log in to unmask], [log in to unmask] and [log in to unmask]. Notification of acceptance will be sent out in late February 2013.

Workshop convenors: Gisle Andersen (NHH-NO), John Kirk (QUB-UK), Susan Lee Nacey (HiHm-NO)

Subject area: Corpus linguistics

Organising institution: NHH Norwegian School of Economics

Contact: Gisle Andersen <[log in to unmask]>

Deadline for submission of proposals: 31 January 2013

Official language(s) of the event: English

Information no.: 1

Information on the Infoling website: