Data Science for Social Impact

Research Group @ University of Pretoria

Advancing Data Science and Natural Language Processing for African languages and societal impact

Leading Research in Data Science for Social Impact

We bridge the gap between cutting-edge data science research and real-world societal challenges, with a special focus on African language technologies.

Our Impact Areas

Research that drives meaningful change in society

🌍 Data Science for Society

Developing methods and tools that improve decision-making while keeping data science research user-centered.

🗣️ African Language NLP

Advancing natural language processing for African languages through innovative tools, datasets, and methodologies.

🤖 Machine Learning Research

Developing ML approaches that address real-world challenges while considering ethical and societal implications.

Recent Research Highlights

Our latest contributions to the scientific community

Our research spans multiple domains, from developing language technologies for under-resourced African languages to creating data-driven solutions for societal challenges. We publish in top-tier venues and actively contribute to open science through our datasets, models, and tools.

Open Resources & Tools

We believe in open science and making our work accessible to all

Datasets

Curated datasets for African language research and social impact studies

Software & Models

Open-source tools and pre-trained models for researchers worldwide

Documentation

Comprehensive guides and tutorials for using our resources

Featured Projects

Masakhane Translate

Machine translation platform for African languages, making communication across language barriers more accessible.

Masakhane NLP

Grassroots organization focused on advancing NLP research and applications for African languages.

COVID-19 ZA Data

Comprehensive data repository tracking COVID-19 in South Africa, supporting research and policy decisions.

Latest News

25 Jun 2025

DSFSI at DSLL Panel: Copyright and Data Science—Legal Challenges in African Data Projects

The Data Science Law Lab (DSLL) recently hosted a dynamic panel discussion focused on the intersection of copyright, law, and data science in Africa. DSFSI was proud to participate through lab members Prof. Vukosi Marivate and Dr. Tsosheletso Chidi, who joined moderator Prof. Chijioke Okorie for a wide-ranging conversation. Held as part of our ongoing collaboration with DSLL, the event explored real-world legal, ethical, and community challenges faced by researchers working with African language data—and why these issues matter for innovation and inclusion in AI.

Join Our Research Community

Interested in contributing to cutting-edge research that makes a difference? We welcome collaboration and new members.
