A multi-software integration platform and support for multimedia transcripts of language

Christophe Parisse; Aliyah Morgenstern

Conference paper, Year: 2010


Christophe Parisse

Role: Author
PersonId: 9317
IdHAL: christophe-parisse
ORCID: 0000-0002-0010-3363
IdRef: 069504245

Modèles, Dynamiques, Corpus

Aliyah Morgenstern

Role: Author
PersonId: 7815
IdHAL: aliyah-morgenstern
ORCID: 0000-0001-8440-2186
IdRef: 028583507

PRISMES - Langues, Textes, Arts et Cultures du Monde Anglophone - EA 4398

Abstract

Using and sharing multimedia corpora is vital for research on language, but the number of different tools available, many of them not easily compatible, makes this difficult. Because the COLAJE project aims to use multimodal linguistic data on language development in oral and sign languages, it was necessary to create a system (VICLO) that allows sharing and using data from at least three different sources: CLAN (CHILDES), ELAN (MPI) and Praat (U. of Amsterdam). For this reason, a multi-purpose storage format based on the TEI was created, which allowed us to store information from all of these sources and to include every type of tool-specific information. When part of the information is processed by a specific software tool, the changes are later integrated into the system without losing information specific to other software. It is thus possible to store both the information shared between the different corpus editing tools and the information that is not. This common base allowed us to implement complementary features such as fine-grained participant and metadata information and common visualisation and data-retrieval tools. VICLO is based on XML technology, and all data can be displayed in general-purpose web browsers.
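The merge behaviour described in the abstract (changes made in one tool are integrated without losing information specific to the others) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Utterance` record, the `merge_tool_edit` function, the field names and the sample values are hypothetical and do not reflect VICLO's actual TEI-based schema or code.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Utterance:
    """One transcript unit in a hypothetical pivot store (not VICLO's real schema)."""
    # Fields that every editing tool is assumed to understand.
    shared: Dict[str, str] = field(default_factory=dict)
    # Fields only one tool understands, keyed by tool name.
    tool_specific: Dict[str, Dict[str, str]] = field(default_factory=dict)


def merge_tool_edit(stored: Utterance, tool: str,
                    shared_edit: Dict[str, str],
                    specific_edit: Dict[str, str]) -> Utterance:
    """Return a new record with one tool's edits applied.

    Shared fields are overwritten by the edit; data belonging to
    other tools is copied through unchanged.
    """
    merged_shared = {**stored.shared, **shared_edit}
    merged_specific = {name: dict(values)
                       for name, values in stored.tool_specific.items()}
    merged_specific[tool] = {**merged_specific.get(tool, {}), **specific_edit}
    return Utterance(shared=merged_shared, tool_specific=merged_specific)


# Invented sample data, for illustration only.
stored = Utterance(
    shared={"speaker": "CHI", "text": "more juice", "start": "1.20", "end": "2.05"},
    tool_specific={
        "clan": {"mor": "qn|more n|juice"},
        "elan": {"gesture_tier": "points at cup"},
    },
)

# An edit made in Praat: the start time is refined and a pitch note is added.
updated = merge_tool_edit(
    stored,
    tool="praat",
    shared_edit={"start": "1.25"},
    specific_edit={"pitch_note": "rising"},
)
```

In this sketch the CLAN and ELAN entries pass through the Praat edit untouched, which is the property the abstract emphasises. A real implementation would operate on TEI/XML documents rather than in-memory dictionaries.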

Keywords

multimedia, transcription, format, CLAN, ELAN

Domains

Linguistics; Information and communication sciences

Full metadata list

Deposit format	File
Deposit type	Conference paper
Title	en A multi-software integration platform and support for multimedia transcripts of language
Author(s)	Christophe Parisse (1), Aliyah Morgenstern (2)
Affiliation 1	MoDyCo - Modèles, Dynamiques, Corpus (1057), Université Paris Nanterre, Bâtiment A, Bureau 402 A, 200 avenue de la République, 92001 Nanterre Cedex, France; Université Paris Nanterre UMR7114 (116205); Centre National de la Recherche Scientifique UMR7114 (441569)
Affiliation 2	PRISMES - Langues, Textes, Arts et Cultures du Monde Anglophone - EA 4398 (106107), Université Sorbonne Nouvelle, Maison de la Recherche, Bureau A110, 4 rue des Irlandais, 75005 Paris, France; Université Sorbonne Nouvelle - Paris 3 (52995)
Popular science	No
Peer-reviewed	Yes
Proceedings	Yes
Invited	No
Document language	English
Book title	LREC 2010 Proceedings: Workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality
Audience	Not specified
Publication date	2010
Pages/Identifier	106-110
Conference title	LREC 2010: Workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality
Conference start date	2010-05
City	Valletta
Country	Malta
Domain(s)	Humanities and Social Sciences/Linguistics; Humanities and Social Sciences/Information and communication sciences
Keywords	ro multimedia, transcription, format, CLAN, ELAN

Main file

2010-3-Parisse-Morgenstern-LREC.pdf (307.87 KB)

Origin: Files produced by the author(s)


https://shs.hal.science/halshs-00495648

Submitted on: Monday, 28 June 2010 at 14:14:06

Last modified on: Thursday, 21 December 2023 at 17:18:03

Long-term archiving on: Thursday, 30 September 2010 at 17:53:41

Dates and versions

halshs-00495648, version 1 (28-06-2010)

Identifiers

HAL Id: halshs-00495648, version 1

Cite

Christophe Parisse, Aliyah Morgenstern. A multi-software integration platform and support for multimedia transcripts of language. LREC 2010: Workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality, May 2010, Valletta, Malta. pp. 106-110. ⟨halshs-00495648⟩


Collections

CNRS UNIV-PARIS3 MODYCO CAMPUS-AAR AAI PRISMES UNIV-PARIS-LUMIERES UNIV-PARIS-NANTERRE

172 views

149 downloads

Last updated on 20/04/2024
