Google has expanded its African language efforts by adding Kikuyu, Dholuo and Luganda to its WAXAL speech dataset, a move aimed at making voice-based technology more accessible across East Africa.
The announcement was made in Nairobi on Tuesday, February 2, 2026, during the launch of the updated dataset. The dataset is designed to help developers and researchers build tools that can understand and respond in African languages, an area long overlooked in global technology.
By including the three languages, Google hopes to widen access to services such as voice assistants, speech-to-text tools, learning platforms and digital public services, especially for users who are not fluent in English.
Filling a long-standing gap
According to Google, WAXAL now contains 1,250 hours of transcribed natural speech and more than 20 hours of studio-quality recordings, collected over a three-year period.
“The ultimate impact of WAXAL is the empowerment of people in Africa,” said Aisha Walcott-Bryant, Head of Google Research Africa.
She said the dataset gives local innovators a foundation to build technology that reflects how people actually speak.
“This dataset provides the foundation for students, researchers and entrepreneurs to build technology on their own terms, in their own languages, reaching more than 100 million people,” she added.
Why language matters
In many parts of Africa, limited English proficiency remains a barrier to digital access. Tools that work in local languages could help close that gap in areas such as education, agriculture and healthcare, where clear communication is critical.
The data was collected with the support of African universities and community groups, including Makerere University in Uganda, the University of Ghana, and Digital Umuganda in Rwanda, working alongside Google researchers.
Open access for developers
WAXAL is published under a Creative Commons licence, allowing developers to use the data freely to create new products and services suited to local needs.
Alongside Kikuyu, Dholuo and Luganda, the dataset includes Swahili, which is widely spoken across Kenya and the wider region.
Other languages covered range from Hausa and Yoruba in West Africa to Lingala and Shona in Central and Southern Africa, reflecting a growing push to ensure African languages are part of the digital future.
For millions of users, the update signals a step towards technology that listens, and speaks, in familiar voices.