Preservation of low resource languages through natural language processing: challenges, opportunities, and the case of Tiv
DOI:
https://doi.org/10.4314/jobasr.v4i2.21
Keywords:
Low-resource Languages, Natural Language Processing, Tiv language, NLP strategies, Computational Linguistics
Abstract
Despite recent progress in Natural Language Processing (NLP), low-resource languages remain heavily underrepresented in computational research and poorly served by NLP tools. Tools such as machine translation, morphological analysis, and speech recognition offer an opportunity to document and revitalize these languages. However, applying them successfully requires adequate digital resources and linguistic datasets. Current NLP for low-resource languages leans heavily on methods such as transfer learning, multilingual pre-trained models, and data augmentation, but these methods still fall short because data and linguistic resources are scarce; in addition, existing models are not suited to the morphological and tonal nature of Tiv. This work reviews research on NLP for low-resource Nigerian languages to identify the key NLP challenges associated with the Tiv language. Fifty-four (54) publications were selected through a search-and-filter procedure that applied inclusion/exclusion rules to results from large scholarly databases and other NLP dataset sources. The analysis reveals that NLP resources for the Tiv language are nearly non-existent. This work contributes a consolidated overview of the challenges in Tiv NLP research, a context-specific analytical framework, and a roadmap for the development of Tiv NLP spanning resource development, method adaptation, and application use.
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.