Africa: African Languages for AI – the Project That's Gathering a Huge New Dataset

Artificial intelligence (AI) tools like ChatGPT, DeepSeek, Siri or Google Assistant are developed by the global north and trained in English, Chinese or European languages. In comparison, African languages are largely missing from the internet.
A team of African computer scientists, linguists, language specialists and others have been working on precisely this problem for two years already. The African Next Voices project, primarily funded by the Gates Foundation (with other funding from Meta) and involving a network of African universities and organisations, recently released what’s thought to be the largest dataset of African languages for AI so far. We asked them about their project, with sites in Kenya, Nigeria and South Africa.
Why is language so important to AI?
Language is how we interact, ask for help, and hold meaning in community. We use it to organise complex thoughts and share ideas. It’s the medium we use to tell an AI what we want – and to judge whether it understood us.
We are seeing an upsurge of applications that rely on AI, from education to health to agriculture. These applications are built on models trained on large volumes of (mostly) language data. Such models are called large language models, or LLMs, but they exist for only a few of the world’s languages.
Languages also carry culture, values and local wisdom. If AI doesn’t speak our languages, it can’t reliably understand our intent, and we can’t trust or verify its answers. In short: without language, AI can’t communicate with us – and we can’t communicate with it. Building AI in our languages is therefore the only way for AI to work for people.
If we limit whose language gets modelled, we risk missing out on the majority of human cultures, history and knowledge.
Why are African languages missing and what are the consequences for AI?
The development of language is intertwined with the histories of people. Many of those who experienced colonialism and empire have seen their own languages being marginalised and not developed to the same extent as colonial languages. African languages are not as often recorded, including on the internet.
So there isn’t enough high-quality, digitised text and speech to train and evaluate robust AI models. That scarcity is the result of decades of policy choices that privilege colonial languages in schools, media and government.
Language data is just one of the things that’s missing. Do we have dictionaries, terminologies, glossaries? Basic tools are scarce, and many other gaps raise the cost of building datasets. These include the lack of African language keyboards, fonts, spell-checkers and tokenisers (which break text into smaller pieces so a language model can process it), as well as orthographic variation (differences in how words are spelled across regions), tone marking and rich dialect diversity.
The result is AI that performs poorly and sometimes unsafely: mistranslations, poor transcription, and systems that barely understand African languages.
In practice this denies many Africans access – in their own languages – to global news, educational materials, healthcare information, and the productivity gains AI can deliver.
When a language isn’t in the data, its speakers aren’t in the product, and AI cannot be safe, useful or fair for them. They end up missing the necessary language technology tools that could support service delivery. This marginalises millions of people and increases the technology divide.
What is your project doing about it – and how?
Our main objective is to collect speech data for automatic speech recognition (ASR). ASR is an important tool for languages that are largely spoken. This technology converts spoken language into written text.
The bigger ambition of our project is to explore how data for ASR is collected and how much of it is needed to create ASR tools. We aim to share our experiences across different geographic regions.
The data we collect is diverse by design: spontaneous and read speech, across various domains – everyday conversations, healthcare, financial inclusion and agriculture. We are collecting data from people of diverse ages, genders and educational backgrounds.
Every recording is collected with informed consent, fair compensation and clear data-rights terms. We transcribe using language-specific guidelines and apply a wide range of other technical checks.
In Kenya, through Maseno Centre for Applied AI, we are collecting voice data for five languages. We’re capturing the three main language groups: Nilotic (Dholuo, Maasai and Kalenjin), Cushitic (Somali) and Bantu (Kikuyu).
Through Data Science Nigeria, we are collecting speech in five widely spoken languages – Bambara, Hausa, Igbo, Nigerian Pidgin and Yoruba. The dataset aims to accurately reflect authentic language use within these communities.
In South Africa, working through the Data Science for Social Impact lab and its collaborators, we have been recording seven South African languages. The aim is to reflect the country’s rich linguistic diversity: isiZulu, isiXhosa, Sesotho, Sepedi, Setswana, isiNdebele and Tshivenda.
Importantly, this work does not happen in isolation. We are building on the momentum and ideas from the Masakhane Research Foundation network, Lelapa AI, Mozilla Common Voice, EqualyzAI, and many other organisations and individuals who have been pioneering African language models, data and tooling.
Each project strengthens the others, and together they form a growing ecosystem committed to making African languages visible and usable in the age of AI.
How can this be put to use?
The data and models will be useful for captioning local-language media, for voice assistants in agriculture and health, and for call-centre and customer support in these languages. The data will also be archived for cultural preservation.
Larger, balanced, publicly available African language datasets will allow us to connect text and speech resources. Models will not just be experimental, but useful in chatbots, education tools and local service delivery. The opportunity is there to go beyond datasets into ecosystems of tools (spell-checkers, dictionaries, translation systems, summarisation engines) that make African languages a living presence in digital spaces.
In short, we are pairing ethically collected, high-quality speech at scale with models. The aim is for people to be able to speak naturally, be understood accurately, and access AI in the languages they live their lives in.
What happens next for the project?
This project only collected voice data for certain languages. What of the remaining languages? What of other tools like machine translation or grammar checkers?
We will continue to work on multiple languages, ensuring that we build data and models that reflect how Africans use their languages. We prioritise building smaller language models that are both energy efficient and accurate for the African context.
The challenge now is integration: making these pieces work together so that African languages are not just represented in isolated demos, but in real-world platforms.
One of the lessons from this project, and others like it, is that collecting data is only step one. What matters is making sure that the data is benchmarked, reusable, and linked to communities of practice. For us, the “next” is to ensure that the ASR benchmarks we build can connect with other ongoing African efforts.
We also need to ensure sustainability: that students, researchers, and innovators have continued access to compute (computer resources and processing power), training materials and licensing frameworks (like NOODL or Esethu). The long-term vision is to enable choice, so that a farmer, a teacher, or a local business can use AI in isiZulu, Hausa, or Kikuyu, not just in English or French.
If we succeed, built-in AI in African languages won’t just be catching up. It will be setting new standards for inclusive, responsible AI worldwide.
Vukosi Marivate, Chair of Data Science, Professor of Computer Science, Director AfriDSAI, University of Pretoria
Ife Adebara, Assistant Professor, University of Alberta
Lilian Wanzare, Lecturer and chair of the Department of Computer Science, Maseno University
This article is republished from The Conversation Africa under a Creative Commons license. Read the original article.
AllAfrica publishes around 500 reports a day from more than 110 news organizations and over 500 other institutions and individuals, representing a diversity of positions on every topic. We publish news and views ranging from vigorous opponents of governments to government publications and spokespersons. Publishers named above each report are responsible for their own content, which AllAfrica does not have the legal right to edit or correct.
Articles and commentaries that identify allAfrica.com as the publisher are produced or commissioned by AllAfrica. To address comments or complaints, please Contact us.
AllAfrica is a voice of, by and about Africa – aggregating, producing and distributing 500 news and information items daily from over 110 African news organizations and our own reporters to an African and global public. We operate from Cape Town, Dakar, Abuja, Johannesburg, Nairobi and Washington DC.

Africa: Climate Science and Early Warnings Key to Saving Lives

No country is safe from the devastating impacts of extreme weather — and saving lives means making early-warning systems accessible to all, UN chief António Guterres said on Wednesday.
“Early-warning systems work,” he told the World Meteorological Organization (WMO) in Geneva. “They give farmers the power to protect their crops and livestock. Enable families to evacuate safely. And protect entire communities from devastation.”
“We know that disaster-related mortality is at least six times lower in countries with good early-warning systems in place,” the UN chief said.
He added that just 24 hours’ notice before a hazardous event can reduce damage by up to 30 per cent.
In 2022, Mr. Guterres launched the Early Warnings for All initiative aiming to ensure that “everyone, everywhere” is protected by an alert system by 2027.
Progress has been made, with more than half of all countries now reportedly equipped with multi-hazard early-warning systems. The world’s least developed countries have nearly doubled their capacity since official reporting began, “but we have a long way to go,” the UN chief acknowledged.
At a special meeting of the World Meteorological Congress earlier this week, countries endorsed an urgent Call to Action aiming to close the remaining gaps in surveillance.
Extreme weather worsens
WMO head Celeste Saulo, who has been urging a scale-up in early-warning system adoption, warned that the impacts of climate change are accelerating, as “more extreme weather is destroying lives and livelihoods and eroding hard-won development gains”.
She spoke of a “profound opportunity to harness climate intelligence and technological advances to build a more resilient future for all.”
Weather, water, and climate-related hazards have killed more than two million people in the past five decades, with developing countries accounting for 90 per cent of deaths, according to WMO.
Mr. Guterres emphasized that a ramp-up in funding will be key for countries to “act at the speed and scale required”.
Surge in financing
“Reaching every community requires a surge in financing,” he said. “But too many developing countries are blocked by limited fiscal space, slowing growth, crushing debt burdens and growing systemic risks.”
He also urged action at the source of the climate crisis, to try to limit fast-advancing global warming to 1.5 degrees Celsius above pre-industrial era temperatures – even though we know that this target will be overshot over the course of the next few years, he said.
“One thing is already clear: we will not be able to contain global warming below 1.5 degrees in the next few years,” Mr. Guterres warned. “The overshooting is now inevitable. Which will mean that we’re going to have a period, bigger or smaller, with higher or lower intensity, above 1.5 degrees in the years to come.”
Still, “we are not condemned to live with 1.5 degrees” if there is a global paradigm shift and countries take appropriate action.
At the UN’s next climate change conference, where states are expected to commit to reducing greenhouse gas emissions over the next decade, “we need to be much more ambitious,” he said. COP30 will take place on 10-21 November, in Belém, Brazil.
“In Brazil, leaders need to agree on a credible plan in order to mobilize $1.3 trillion per year by 2035 for developing countries, to finance climate action,” Mr. Guterres insisted.
Developed countries should honour their commitment to double climate adaptation funding to $40 billion this year and the Loss and Damage Fund needs to attract “substantial contributions,” he said.
Mr. Guterres stressed the need to “fight disinformation, online harassment and greenwashing,” referring to the UN-backed Global Initiative on Climate Change Information Integrity.
“Scientists and researchers should never fear telling the truth,” he said.
He expressed his solidarity with the scientific community and said that the “ideas, expertise and influence” of the WMO, which marks its 75th anniversary this week, are needed now “more than ever”.
Read the original article on UN News.

Africa: Insecurity Is Threatening Africa's Ability to Finance Its Own Development, Warns New Mo Ibrahim Foundation Research Brief

London — The Mo Ibrahim Foundation has released a new research brief, Africa’s natural resources and conflicts: a vicious cycle, examining how growing competition over natural resources is fuelling conflicts across the continent – and how these conflicts are, in turn, undermining Africa’s ability to leverage its own wealth for development.

The Foundation warns of a vicious cycle in which resources fuel conflict, while insecurity erodes governments’ capacity to manage those resources effectively, deters investment, and reinforces perceptions of Africa as a high-risk destination.

The new research brief highlights that the security situation in Africa has worsened sharply, with security incidents increasing by 87% between 2019 and 2024. Drawing on data from the 2024 Ibrahim Index of African Governance (IIAG), it notes that Security & Safety is the most deteriorated of all 16 governance sub-categories, declining by 5.0 points between 2014 and 2023 at the continental average level.

While this surge is seen as reflective of a wider international rise in conflict, the brief highlights the enormous economic cost of insecurity in Africa. Between 1996 and 2022, intense conflict was associated with an average 20% reduction in annual economic growth. National-level impacts are also stark: in Sudan, GDP is projected to shrink by up to 42% under current conflict conditions.
The research identifies an emerging trend across the continent, where struggles over resource control are intensifying insecurity and weakening governance. The brief includes three case studies:
Sudan: The war has deepened an already complex illicit financial flows (IFFs) landscape, with an estimated 57% of gold production smuggled in 2023. Both the Sudanese Armed Forces (SAF) and the Rapid Support Forces (RSF) are funding operations through the gold sector, as international actors compete for influence.
The Sahel: Conflicts are increasingly driven by local grievances over land, climate stress, and control of resources such as gold, uranium, and oil. Armed groups, criminal networks, and foreign actors exploit these resources to finance violence, further eroding state authority in Mali, Burkina Faso, Niger, and Chad.
DR Congo: Foreign powers and armed groups continue to fight over the country’s mineral wealth, especially cobalt, for which the DRC accounts for 75% of global supply. Corruption and underreporting remain rampant, with mining companies failing to declare an estimated $16.8 billion in revenue between 2018 and 2023.
The research underscores the urgent need to address the links between security and resource management to ensure that Africa can leverage its own resources and take ownership of its development agenda.

Africa: Powering Africa's First Solar AI Research Hub

The Namibia University of Science and Technology (Nust) is partnering with international and local institutions to develop Africa’s first solar-powered artificial intelligence (AI) research cluster.
The university is in advanced discussions with the Fraunhofer Institute for Solar Energy Systems and Karibu Kwetu Trading to establish micro-concentrated photovoltaic technology.
Micro-concentrated photovoltaic technology is a high-efficiency solar technology that uses lenses to focus sunlight onto highly efficient solar cells to achieve high concentration ratios.
Fraunhofer’s technology delivers up to 43% higher conversion efficiency, and the project will be aligned with Namibia’s growing research and innovation ecosystem.
This will be supported by Karibu Kwetu’s renewable energy expertise and Nust’s academic leadership in digital transformation.
The Namibian uses AI tools to assist with improved quality, accuracy and efficiency, while maintaining editorial oversight and journalistic integrity.
Read the original article on Namibian.

Copyright © 2024 an24.africa