Research Projects

  • CorTypo (2013-2017)

    Designing spoken corpora for cross-linguistic research

    CorTypo (Designing spoken corpora for cross-linguistic research) links a typological database to annotated spoken corpora in twelve lesser-described languages belonging to different families and phyla.
    The aim is to compare functions across languages, starting from language-internal categories, with a view to providing an empirical, bottom-up typology.

  • LABEX EFL (2012-2022)

    Typology and Corpus Annotation of Information Structure and Grammatical Relations

    TCA ISGR (Typology and Corpus Annotation of Information Structure and Grammatical Relations) aims to develop innovative schemes for the corpus annotation of information structure phenomena and grammatical relations, so that the spoken corpora created by the project members are comparable with one another.

  • CorpAfroAs (2006-2011)

    A Corpus for Spoken AfroAsiatic Languages

    CorpAfroAs (Spoken Corpora in AfroAsiatic Languages: Prosodic and Morphosyntactic Analysis) is a pilot project which created corpora in thirteen different lesser-described spoken AfroAsiatic languages, with a view to making them searchable through a query engine.
    Each corpus is prosodically segmented into minor and major intonation units, and morphosyntactically annotated.

  • IUF Junior (2004-2009)

    Designing spoken corpora for lesser-described languages

    IUF Junior: "Designing spoken corpora for lesser-described languages" was the theme of the individual research project submitted to the IUF. It consisted of conceptualizing a framework for the creation of corpora in languages with no written tradition, incorporating the latest advances in the treatment of spoken corpora for major languages.
    The central aspects of the research were transcription and segmentation.