Lacuna Fund - Open Data Sets & Use Cases

Lacuna Fund is the world’s first collaborative effort to provide data scientists, researchers, and social entrepreneurs in low- and middle-income contexts globally with the resources they need to produce labeled datasets that address urgent problems in their communities.

[Learn more about the Lacuna Fund]
An additional 18 datasets and use cases are in the pipeline.
SDG 13

Combatting Air Pollution and GHG Emissions in India through hyperlocal AI-powered mapping

India
SDG 10
SDG 8

Facilitating access to financial applications in informal settings in four Ghanaian dialects: Akuapem Twi, Ashante Twi, Fante and Ga.

SDG 10
SDG 5
SDG 2

Enabling machine translation from Kiswahili into the indigenous Kenyan languages Kidaw'ida, Kalenjin, and Dholuo, helping to preserve these languages and supporting crowd-sourced voice recognition for them via Mozilla Common Voice

SDG 13

Helping to measure solar energy adoption across Madagascar via AI - Labelled Open solar panel data for Madagascar

SDG 10

Detecting sentiments and combatting hate speech in Hausa, Igbo, Nigerian-Pidgin and Yorùbá - NaijaSenti: a Nigerian Corpus for Multilingual Sentiment Analysis

Nigeria
SDG 13
SDG 2

Enabling cashew, cocoa, and coffee farmers to make good business decisions - Drone-based Agricultural Dataset for Crop Yield Estimation in Ghana and Uganda

Ghana, Uganda
SDG 13
SDG 10
SDG 5

Preserving privacy and avoiding gender bias of AI systems in Luganda, Lumasaba, Hausa, and Kanuri - The Lacuna personally identifiable information Text Dataset

SDG 13
SDG 7

Promoting energy conservation and market analysis in Pakistan through Residential Energy and Weather Data (REWD)

Pakistan
Lahore University of Management Sciences (LUMS)
SDG 10
SDG 2

Datasets for transportation impact evaluation

Colombia
Fundación Despacio, World Resources Institute
SDG 13
SDG 15

Monitoring the impact of palm oil monoculture, shrimp aquaculture & mining in continental Ecuador and the Galapagos using AI

Ecuador
Fundacion Ecociencia

Indigenous Knowledge Meets AI: Monitoring Elephants and Rodents in Kenya and Ecuadorian Amazon with Biodiversity AI-Datasets

Ecuador, Kenya
Space4Innovation, Diana Mastracci diana@space4innovation.com

AI for Mangrove Carbon Credits: Turning Forest Data into Climate Action in Côte d’Ivoire

Côte d’Ivoire
data354
SDG 15

Mapping Cocoa Landscapes in Ghana: Reference Data for Tracking Land Use Change

Ghana
Center for Remote Sensing and Geographic Information Services
SDG 15

African Trees for Climate Resilience: A Comprehensive Database

Angola, DRC, Kenya, Mozambique, Nigeria, South Africa, Tanzania, Zambia
Professor Guy F Midgley gfmidgley@sun.ac.za University of Stellenbosch
SDG 13
SDG 15

Inclusive MRV for India's Eastern Himalayas

India
Vertify.earth, Michael Anthony michael@vertify.earth, Alsisar Impact, Saurabh Singhavi saurabh@alsisarimpact.com
SDG 2

Data-enabled climate shock absorbance through agroforestry (Agrof4resilience)

Kenya
International Center of Insect Physiology and Ecology (ICIPE)
SDG 15

Quantifying Colombian mangroves aboveground biomass and carbon content

Colombia
María Cuevas (mcuevas@cttc.es / Centre Tecnològic de Telecomunicacions de Catalunya, CTTC, Spain), Cristian Montes (cristian.montes@invemar.org.co / Instituto de Investigaciones Marinas y Costeras José Benito Vives de Andreis, INVEMAR, Colombia)
SDG 15

Phenological Dataset for Ecological Forecasting (PheDEF Project)

Ghana
Bismark Ofosu-Bamfo (bismark.ofosu-bamfo@uenr.edu.gh; bofosubamfo@gmail.com) and Daniel Yawson, School of Science, University of Energy and Natural Resources, Sunyani, Ghana/Raul Zurita-Milla, Faculty ITC, University of Twente, The Netherlands. Primary contact/Maintenance of datasets: Bismark Ofosu-Bamfo
SDG 2

Eyes on the Ground Image Data

Kenya
Lilian Waithaka, Koen Hufkens, Berber Kramer and Benson Njuguna
SDG 2

High-Accuracy Maize Plot Location and Yield Dataset in East Africa

Kenya, Rwanda, Tanzania
One Acre Fund
SDG 2
SDG 6

Sensor Based Aquaponics Fish Pond Datasets: IoT Fish Pond Monitoring Datasets

Nigeria
Udanor Collins, Blessing Ogbuokiri, and Nweke Onyiny
SDG 2

Machine Learning Datasets for Crop Pest and Disease Diagnosis: Crop Imagery and Spectrometry Data

Uganda, Tanzania, Ghana
Joyce Nakatumba-Nabende, Andrew Katumba, Claire Babirye, Jeremy Francis Tusubira, Godliver Owomugisha, Neema Mduma, Darlington Akogo, Blessing Sibanda
SDG 2
SDG 16

A Decision-Supporting Tool for Developing Community-led Land Use Plans

SDG 2
SDG 1

Enhanced Agriculture Datasets for Remote Crop Monitoring to Provide Access to Essential Social and Financial Services to Smallholder Farmers in Zimbabwe

Zimbabwe
SDG 2

Continental Crop Field Boundary Detection

SDG 2

CropHarvest: Informing decision-making around agricultural development, early warning systems, and trade in Sub-Saharan Africa

Kenya, Mali, Togo, Rwanda, Uganda, Ethiopia, Malawi, Zambia, Tanzania, Namibia, Sudan, Nigeria
SDG 2
SDG 1

Improving livelihoods in Ghana and Uganda: Drone-based Agricultural Dataset for Crop Yield Estimation of cashew, cocoa, and coffee

Ghana, Uganda

Machine Learning Dataset for Rabies Diagnosis and Outbreak Prediction

Global
Asa Emmanuel: asakalonga@gmail.com, Kennedy Lushasi: klushasi@ihi.or.tz

Childhood Malnutrition in Chile

Chile
Maria Paz Hermosilla: paz.hermosilla@uai.cl

Lacuna Malaria Datasets

Uganda, Ghana
Rose Nakasi: g.nakasi.rose@gmail.com or rose.nakasi@mak.ac.ug

Intraoperative Anesthesia and Outcomes Dataset: Improving patient outcomes by predicting risk of mortality and post-operative recovery

Sub-Saharan Africa
Bhiken Naik: bin4n@uvahealth.org

Brain Tumor Segmentation Africa (BraTS-Africa) Dataset

Nigeria
Udunna Anazodo: udunna.anazodo@mcgill.ca

AI-Assisted Smartphone Microscopy for detection of Diarrhea Parasites

Nepal
Bishesh Khanal: bishesh.khanal@naamii.org.np
SDG 13

Project Climate Change, Health, and Artificial Intelligence (CCHAIN): Public Health Data Insights for the Philippines

Philippines
Thinking Machines Data Science | data-for-development@thinkingmachin.es
SDG 13

Air quality dataset of abattoir centers in Southern Nigeria

Nigeria
Emmanuel Chukwuma | emmanuel.chukwuma@apse-ngo.org
SDG 13

Global Horizontal Irradiance Dataset for Mauritius, Rodrigues, and Agalega Islands

Mauritius, Rodrigues, and Agalega Islands
Not specified
SDG 13

Labelled Open Solar Panel Data to measure solar energy adoption in Madagascar

Madagascar
Fabienne Rafidiharinirina | f.rafidiharinirina@association-maidi.mg or assomaidi@gmail.com
SDG 13

Climate Energy Dataset for Off-Grid Electricity Infrastructure

Pakistan
Dr. Zeeshan Shafiq | zeeshanshafiq@uetpeshawar.edu.pk
SDG 4

A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis

Nigeria
Shamsuddeen Hassan Muhammad, David Ifeoluwa Adelani, Sebastian Ruder, Ibrahim Said Ahmad, Idris Abdulmumin, Bello Shehu Bello, Monojit Choudhury, Chris Chinenye Emezue, Saheed Salahudeen Abdullahi, Anuoluwapo Aremu, Alipio Jeorge, and Pavel Brazdil
SDG 4

Machine Translation Benchmark Dataset for Languages in the Horn of Africa

Horn of Africa (Ethiopia, Eritrea)
Asmelash Teka Hadgu, Gebrekirstos G. Gebremeskel, Abel Aregawi
SDG 4

Kencorpus: Kenyan Languages Corpus for Machine Learning and Natural Language Processing

Kenya
Owen McOnyango (Maseno University), Florence Indede (Maseno University), Lilian D.A. Wanzare (Maseno University), Barack Wanjawa (University of Nairobi), Edward Ombui (Africa Nazarene University), Lawrence Muchemi (University of Nairobi)
SDG 4

KenPos: Kenyan Languages Part of Speech Tagged dataset

Kenya
Florence Indede (Maseno University), Owen McOnyango (Maseno University), Lilian D.A. Wanzare (Maseno University), Barack Wanjawa (University of Nairobi), Edward Ombui (Africa Nazarene University), Lawrence Muchemi (University of Nairobi)
SDG 4

KenSpeech: Swahili Speech Transcriptions

Kenya
Dorcas Awino (University of Nairobi), Lawrence Muchemi (University of Nairobi), Lilian D.A. Wanzare (Maseno University), Edward Ombui (Africa Nazarene University), Barack Wanjawa (Maseno University), Owen McOnyango (Maseno University), Florence Indede (Maseno University)
SDG 4

KenTrans: A Parallel Corpora for Swahili and local Kenyan Languages

Kenya
Lilian D.A. Wanzare (Maseno University), Florence Indede (Maseno University), Owen McOnyango (Maseno University), Edward Ombui (Africa Nazarene University), Barack Wanjawa (University of Nairobi), Lawrence Muchemi (University of Nairobi)
SDG 4

KenSwQuAD – A Question Answering Dataset for Swahili Low Resource Language

Kenya
Barack Wanjawa (University of Nairobi), Lilian D.A. Wanzare (Maseno University), Florence Indede (Maseno University), Owen McOnyango (Maseno University), Lawrence Muchemi (University of Nairobi), Edward Ombui (Africa Nazarene University)
SDG 4

MasakhaNER 2.0: Named Entity Recognition datasets for 20 African languages

Africa (Multi-country)
David Ifeoluwa Adelani, D.ADELANI@UCL.AC.UK
SDG 4

MAFAND-MT: Masakhane Anglo & Franco African News Corpus for Machine Translation

Africa (Multi-country)
David Ifeoluwa Adelani, D.ADELANI@UCL.AC.UK
SDG 4

MasakhaPOS: Part-of-Speech Tagging Dataset for 20 African Languages

Africa (Multi-country)
David Ifeoluwa Adelani, D.ADELANI@UCL.AC.UK
SDG 4

Financial Inclusion Speech Dataset for some Ghanaian Languages

Ghana
Dennis Asamoah Owusu, DOWUSU@ASHESI.EDU.GH
SDG 4

IgboSynCorp: Dataset for Igbo Natural Language Processing Tasks

Nigeria
SDG 4

Bayelemabaga Aligned Bambara-French Corpus for Machine Translation

Mali/France
Christopher Homan, christopher.m.homan.phd@gmail.com
SDG 4

Makerere University NLP Datasets

Uganda, Tanzania, Kenya
Andrew Katumba | andrew.katumba@mak.ac.ug
SDG 4

BIG-C: A Multimodal Multi-Purpose Dataset for Bemba

Zambia
Claytone Sikasote | claytonsikasote@gmail.com
SDG 4

KALLAAMA

Senegal
Aminata Ndiaye | amina.ndiaye@jokalante.com and Elodie Gauthier | elodie.gauthier@orange.com
SDG 4

NaijaVoices: Our Language is Our Strength

Nigeria
info@naijavoices.com
SDG 4

AFRIDOC-MT: Document-level MT Corpus for African Languages

Multiple African Countries
Jesujoba O. Alabi | jalabi@lsv.uni-saarland.de
SDG 4

Masakhane-NLU: Conversational AI & Benchmark datasets for African languages

Multiple African Countries
David Adelani | david.adelani@mila.quebec
SDG 4

Lacuna PII Multilingual Dataset

Multiple African Countries
Andrew Katumba | katumba@mak.ac.ug, Milena Haykowska | milena.haykowska@clearglobal.org, Peter Nabende | nabende@gmail.com
SDG 4

Building Parallel Corpora for Kenya's Indigenous Languages and Kiswahili

Multiple African Countries
Audrey Mbogho | ambogho@usiu.ac.ke
SDG 4

Expanding a parallel corpus of Portuguese and the Bantu language Emakhuwa

Multiple African Countries
Felermino D. M. A. Ali | felermino.ali@unilurio.ac.mz or felerminoali@gmail.com
