DoReCo - Language Totoli

Language: Totoli

Corpus creator(s):	Maria Bardají, Christoph Bracks, Claudia Leto, Datra Hasan, Sonja Riesberg, Winarno S. Alamudi and Nikolaus P. Himmelmann
Archive:	TLA; LAC
Annotation files license:
Audio files license:	Audio at TLA; LAC
Translation:	English

The Totoli DoReCo dataset was compiled by Maria Bardají, Christoph Bracks, Claudia Leto, Datra Hasan, Sonja Riesberg, Winarno S. Alamudi, and Nikolaus P. Himmelmann based on recordings made in 2007, 2017, and 2018, and was further processed for DoReCo by Aleksandr Schamberger, Michelle Throssell, and Ludger Paschen between 2022 and 2024. The files on which the Totoli DoReCo dataset is based belong to two Totoli collections:

Leto, Claudia, Winarno S. Alamudi, Jani Kuhnt-Saptodewo, Sonja Riesberg, Hasan Basri & Nikolaus P. Himmelmann. 2005-2010. DoBeS Totoli Documentation. DoBeS Archive MPI Nijmegen (https://dobes.mpi.nl/). From this collection, the following files were taken:
- chicken_eagle: https://hdl.handle.net/1839/00-0000-0000-0014-C80B-1
- Nahres_life: https://hdl.handle.net/1839/00-0000-0000-0014-C7F9-A
- Yulins_life: https://hdl.handle.net/1839/96A31A44-08CA-4730-892A-8451E77E468C

Bracks, Christoph A., Datra Hasan, Maria Bardají i Farré, Sumitro Pogi & Nikolaus P. Himmelmann. 2023. Totoli documentation corpus 2. Cologne: Data Center for the Humanities. https://doi.org/10.18716/dch/a.00000014
From this collection, the following files were taken (all available via https://dch.phil-fak.uni-koeln.de/bestaende/datensicherung/totoli):
- explanation-lelegesan_SYNO
- explanation-making-red-sugar_IS
- explanation-wedding-tradition_ZBR
- lifestory_RDA_1
- lifestory_TS-IA
- pearstory_14_SP
- pearstory_36_SELP
- spacegames_sequence1_KSR-SP
- story-comedy_RSTM
- story-monkey-butterfly_RSM
- story-monkey-crocodile_RSM
- story-monkey-turtle_RSM
- story-two-obedient-children_RSM

A set of files with further information on the Totoli DoReCo dataset, including metadata and PIDs, is automatically included in each bulk download of files from this dataset.

The Totoli DoReCo dataset should be cited as follows:

Bardají, Maria, Christoph Bracks, Claudia Leto, Datra Hasan, Sonja Riesberg, Winarno S. Alamudi and Nikolaus P. Himmelmann. 2024. Totoli DoReCo dataset. In Seifart, Frank, Ludger Paschen and Matthew Stave (eds.). Language Documentation Reference Corpus (DoReCo) 2.0. Lyon: Laboratoire Dynamique Du Langage (UMR5596, CNRS & Université Lyon 2). https://doreco.huma-num.fr/languages/toto1304 (Accessed on 13/05/2026). DOI:10.34847/nkl.c8b6ei29

Please note that when actual data from any number of DoReCo datasets is used, the full reference for each individual dataset must be provided, including the name(s) of the creator(s) of each dataset. It is NOT sufficient to refer to DoReCo as a whole. We are aware that this may result in very long lists of references, but it is only in this way that corpus creators get due recognition for their work. The default is to include the full set of bibliographical references in the reference section of the main text of a paper or abstract. If this is absolutely impossible (because of page limit restrictions, for instance), then inclusion of the full list of references in an appendix is acceptable, or - as a last resort - in supplementary material published separately, e.g. on Zenodo or OSF, in which case the main text of the paper or the abstract must explicitly refer to this list and provide its URL or PID.

Language Information

Family:	Austronesian (aust1307)
Macro-area:	Papunesia

Core set
Extended set

Name	Speaker(s) Age(s)	Speaker(s) Gender(s)	Genre	Gloss	Word tokens