PI-YALLI Nawatl Corpus

PI-YALLI nawatlahtolkorpus / Corpus PI-YALLI de documents nawatl

The PI-YALLI corpus of Nahuatl documents | El corpus PI-YALLI de documentos nahuatl

NAWATL

The PI-YALLI Nawatl corpus contains a set of Nawatl documents and three different static embedding models. The corpus is split into 16 topics.

The corpus was manually compiled and processed by a team from Université d'Avignon (France), Universidad Veracruzana (Mexico), and independent researchers.

This corpus is suitable for training and testing systems that work on the Nawatl language. New versions, containing more texts, will be released periodically.

The PI-YALLI corpus and its embeddings are distributed by the Laboratoire Informatique d'Avignon (France) under the LGPL license.

Updated PI-YALLI corpus, version 1.8 (available soon!)
- Download the Nahuatl corpus PI-YALLI
Updated PI-YALLI embeddings, version 0.001 of 01.07.2025
- Download the Nawatl Word2Vec model
- Download the Nawatl FastText model
- Download the Nawatl GloVe model
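
Once downloaded, static embeddings of this kind are typically queried for nearest neighbours. The sketch below is only an illustration and makes several assumptions not stated on this page: that a model is stored as plain text with one "word v1 v2 ..." entry per line, optionally preceded by a "vocab_size dim" header (the usual word2vec and GloVe text layouts). The function names and the toy vectors are hypothetical and invented for the example; they are not taken from the PI-YALLI models.

```python
import io
import math


def load_text_vectors(handle):
    """Read word vectors from a text stream, one 'word v1 v2 ...' entry per line.

    Assumption: an optional first line 'vocab_size dim' (word2vec text style)
    may be present; it is detected and skipped.
    """
    vectors = {}
    for i, line in enumerate(handle):
        parts = line.rstrip().split(" ")
        if i == 0 and len(parts) == 2 and all(p.isdigit() for p in parts):
            continue  # header line, not a vector
        if len(parts) < 2:
            continue  # blank or malformed line
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(vectors, word, topn=3):
    """Return the topn (word, similarity) pairs closest to `word`."""
    query = vectors[word]
    scored = [(other, cosine(query, vec))
              for other, vec in vectors.items() if other != word]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]


# Toy 2-dimensional vectors, invented for illustration only.
TOY = """3 2
atl 1.0 0.0
tletl 0.0 1.0
kalli 1.0 0.1
"""

vectors = load_text_vectors(io.StringIO(TOY))
print(most_similar(vectors, "atl", topn=1))
```

To use it on a real file, replace the io.StringIO(TOY) stream with open(path, encoding="utf-8"), where path points to a downloaded model; if the distributed files use a binary format instead, a dedicated library loader would be needed.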

How to cite this corpus?

Guzman-Landa et al. (2025) PI-yalli: un nouveau corpus pour le nahuatl / Yankuik nawatlahtolkorpus pampa tlahtolmachiotl. TALN'25, Marseille, 2025. [article] [bib] [abstract]

Torres-Moreno JM., et al. (2024) NAHU2: Un nouveau corpus pour le Nahuatl. HAL, 2024

Torres-Moreno JM., et al. (2024) PI-yalli: un nouveau corpus pour le nahuatl. ArXiv, arXiv:2412.15821, https://doi.org/10.48550/arXiv.2412.15821

Contact: Juan-Manuel Torres-Moreno
http://lia.univ-avignon.fr / Université d'Avignon, France
juan *-* manuel *dot* torres *at* univ-avignon *dot* fr

Updated 20.06.25