Bias Amplification in AI Models Trained on Low-Resource Data: A Sociolinguistic Audit
DOI: https://doi.org/10.38124/ijsrmt.v4i12.1075

Keywords: Bias Amplification, Low-Resource Languages, Sociolinguistics, AI Ethics, Language Equity, Natural Language Processing (NLP)

Abstract
This paper explores how the rapid advancement of artificial intelligence (AI) has shaped global communication while simultaneously creating linguistic inequalities. AI systems trained on high-resource languages perform markedly better than those trained on low-resource languages with limited digital presence or datasets. The study reveals that data scarcity not only reduces model accuracy but also exacerbates existing social and linguistic inequalities. Bias amplification is driven by processes such as statistical over-fitting, representational gaps, and the normalization of standardized language forms in training corpora. As a result, speakers of marginalized or minoritized languages face heightened risks of misclassification, exclusion, and digital discrimination. The study introduces a sociolinguistic audit framework encompassing community involvement, dataset documentation, and fairness-based modeling methods. Overall, the review highlights the need for equitable AI development grounded in collaborative, justice-oriented approaches that respect the linguistic rights and agency of underrepresented language communities.
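The accuracy gap described above can be made concrete with a simple disparity metric. The sketch below is purely illustrative: the counts are hypothetical, not drawn from the paper's data, and the ratio shown is one minimal way to express how much more often a low-resource-language group is misclassified relative to a high-resource one.

```python
# Hypothetical sketch of the accuracy disparity the abstract describes.
# All evaluation counts below are invented for illustration only.

def error_rate(errors: int, total: int) -> float:
    """Fraction of misclassified examples for one language group."""
    return errors / total

def amplification_ratio(low_resource_err: float, high_resource_err: float) -> float:
    """How many times more often low-resource speakers are misclassified."""
    return low_resource_err / high_resource_err

# Illustrative counts: one model evaluated on two language groups.
high = error_rate(errors=50, total=1000)   # high-resource group: 5% error
low = error_rate(errors=200, total=1000)   # low-resource group: 20% error

print(amplification_ratio(low, high))      # → 4.0
```

A ratio of 1.0 would indicate parity; values above 1.0 quantify the disparity that a sociolinguistic audit would flag for further investigation.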
Copyright (c) 2025 International Journal of Scientific Research and Modern Technology

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.