Tundra Nenets linguistic research

Elicitation questionnaire for Tundra Nenets interrogatives

Two elicitation questionnaires on Tundra Nenets interrogatives are currently open for responses. You can participate by filling out Questionnaire 1 and Questionnaire 2. Your contributions are highly valuable for our research.

Corpora of the Tundra Nenets language

During the project, we gathered both published and unpublished sources in the Tundra Nenets and Forest Nenets languages. These materials were then digitised for further analysis. For a detailed overview of the digitisation process, please refer to Mus & Metzger (2021a) and Mus & Metzger (2021b).

The project also includes several corpora, all currently unpublished:

A Tundra Nenets monolingual corpus (OCR-ed and unified),
A Forest Nenets monolingual corpus (OCR-ed and unified),
A Tundra Nenets – Russian – English parallel corpus (sentence-level aligned),
A Forest Nenets – Russian – English parallel corpus (sentence-level aligned).

To describe our resources, we collected metadata for each source. The metadata categories follow established standards: IMDI (ISLE Meta Data Initiative), the CLARIN Metadata Standard (Common Language Resources and Technology Infrastructure), FIMS (Fieldwork Information Management System), and MARC (Machine-Readable Cataloging). The metadata are organised into a catalog in which each category has its own column. Data formats are kept consistent across all fields, for example through a uniform date format (YYYY-MM-DD) and standardised categorical values, which streamlines processing and analysis. The catalog includes the following information:

Language Information
Corpus Information
Data Context
Speaker(s) Information
Data Information
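
As an illustration of the format conventions described above (a uniform YYYY-MM-DD date format and standardised categorical values), the following is a minimal Python sketch of how a single catalog row might be normalised. It is not the project's actual tooling: the column names `language` and `recording_date`, the accepted input date layouts, and the label table are all assumptions made for the example.

```python
from datetime import datetime

# Hypothetical input date layouts; the layouts found in the real sources are not documented here.
DATE_FORMATS = ("%Y-%m-%d", "%d.%m.%Y", "%d/%m/%Y")

# Hypothetical controlled vocabulary for an assumed "language" column.
LANGUAGE_LABELS = {
    "tundra nenets": "Tundra Nenets",
    "forest nenets": "Forest Nenets",
}


def normalise_date(raw):
    """Return the date as YYYY-MM-DD; raise ValueError if no known layout matches."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # try the next layout
    raise ValueError(f"unrecognised date: {raw!r}")


def normalise_language(raw):
    """Map a free-text language label onto the controlled vocabulary."""
    key = " ".join(raw.lower().split())  # lower-case and collapse whitespace
    if key not in LANGUAGE_LABELS:
        raise ValueError(f"unknown language label: {raw!r}")
    return LANGUAGE_LABELS[key]


def normalise_record(record):
    """Return a copy of one catalog row with the two assumed columns normalised."""
    cleaned = dict(record)
    cleaned["language"] = normalise_language(record["language"])
    cleaned["recording_date"] = normalise_date(record["recording_date"])
    return cleaned
```

Raising an error on unknown values, rather than passing them through, is one possible design choice: it surfaces inconsistent entries for manual review instead of letting them enter the catalog silently.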

Tundra Nenets question structures: a linguistic database

-- Under construction --