In recent years, the way people learn languages has dramatically evolved thanks to technological advancements—most recently, generative artificial intelligence (AI). Tools like ChatGPT, Gemini, Copilot, and others have revolutionized access to information, content creation, and multilingual communication. But while these technologies offer immense potential, their impact is far from equal across all languages.
A new report by communication firm Llorente y Cuenca, in collaboration with BID Lab and Microsoft, opens up a critical conversation about this issue by analyzing the performance of generative AI models in seven Indigenous American languages. The results highlight both unexplored opportunities and urgent concerns.
AI Falls Short in Indigenous Language Performance
The report, titled The Performance of Artificial Intelligence in the Use of Indigenous American Languages, reveals that today's most advanced AI systems significantly underperform when interacting in native languages. According to the study, AI-generated responses in Indigenous languages are up to four times shorter than their Spanish equivalents. Even more concerning are the quality ratings: expressive quality scored just 2.4 out of 10, and comprehension 2.3 out of 10.
Furthermore, the report points out a troubling bias: AI models tend to redirect answers toward Western cultural references, even when questions are posed in Indigenous languages. This means that these systems often function as if the world were monolingual and monocultural, ignoring the rich diversity of Indigenous identities.
In a world where digital technologies are increasingly mediating interactions between people and institutions, this lack of cultural sensitivity can reinforce colonial dynamics—systems that historically silenced native voices.
As Ayuujk linguist and activist Yásnaya Aguilar emphasizes, “Language not only communicates but constructs the world.” When AI fails to represent these cultural viewpoints, it excludes entire communities from participating in the digital future.
A Race Against Time: Indigenous Languages at Risk
According to UNESCO, one Indigenous language disappears every two weeks, putting nearly 3,000 unique languages at risk of extinction by the end of the century. This looming loss underscores the urgency of creating technologies that not only support but also preserve linguistic and cultural diversity.
A Vision for Inclusive AI
Despite these deficiencies in current AI systems, the report presents an optimistic path forward. Rather than rejecting AI, it outlines 21 actionable strategies to create more inclusive and representative models. Key recommendations include:
- Increasing the volume of digital content in Indigenous languages
- Developing voice recognition and translation technologies for native tongues
- Promoting Indigenous influencers in digital spaces
- Preserving digital archives of oral traditions
Crucially, these efforts must be developed in collaboration with Indigenous communities, who should take part not as passive recipients but as active agents in the design of technological solutions.
During a recent discussion organized by Motorola with Zapotec artists and authors, writer Irma Pineda emphasized the importance of safeguarding Indigenous identities. “Beyond language,” she noted, “it’s about protecting a worldview.”
This perspective aligns with the International Decade of Indigenous Languages promoted by UNESCO, which highlights the role of technology as a means for intergenerational transmission and cultural preservation.
Global and Local Inspirations
This call for community-led innovation is not without precedent. The Masakhane project in Africa successfully trained multilingual AI models in dozens of African languages through decentralized collaboration networks. In Latin America, similar goals are being pursued by Colmena Lab and Mexico’s Instituto Nacional de Lenguas Indígenas (INALI), albeit with less support from major tech companies.
According to César Buenadicha, Head of Ecosystem Building at BID Lab, “AI can be a transformative tool if it’s trained with culturally diverse data and if the perspectives of speakers are incorporated into its design.”
One of the most powerful findings in the report is a 91% correlation between the volume of digital content in a given language and the performance of AI in that language. In simple terms: if data doesn’t exist, AI can’t learn.
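For readers unfamiliar with the statistic, here is a minimal sketch of how such a figure could be computed, assuming the 91% refers to a Pearson correlation coefficient of about 0.91 (the article does not say which measure the report used). The `pearson` helper and both lists of numbers are hypothetical and invented purely for illustration; they are not data from the report.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical values, one pair per language; NOT figures from the report.
content_volume = [1.0, 2.0, 3.0, 5.0, 8.0, 13.0, 21.0]  # relative amount of digital text
quality_score = [1.5, 2.0, 2.4, 3.1, 5.0, 6.2, 9.0]     # AI output rating on a 0-10 scale

r = pearson(content_volume, quality_score)
print(f"Pearson r = {r:.2f}")
```

A value close to 1 means that languages with more digital content tend to get better AI output. On its own, a correlation does not prove that one causes the other.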
The Future Is Multilingual—If We Make It So
The emergence of generative AI marks a pivotal moment in language learning and communication. However, its benefits will only be truly universal if the technology reflects the full spectrum of human culture and language. This includes not only global power languages but also the Indigenous tongues that hold unique knowledge, identities, and ways of seeing the world.
In this new era, the question is not whether AI can support Indigenous languages—it’s whether we will choose to design AI systems that do.