These are resources from the class.
Slides [Updated for 2025]
Note: Slides are provided under Creative Commons 4.0 Share-Alike. You are encouraged to make comments on the slides so that they can be improved.
- Block 1: Introduction to NLP [Slides]
- Block 2: Modern NLP Approaches [Slides]
- Block 3: Data Science + NLP [Slides]
African Language NLP
- Masakhane African Machine Translation URL
Data Augmentation
- TextAugment Library URL
Datasets
Misinformation/Disinformation
- Credibility Corpus (Twitter, Web) in French and English
- Fake News Challenge
- Fake News (Kaggle)
- Hyperpartisan News Detection
- RumourEval
Hate Speech
- hatEval: Multilingual hate speech detection
- OffensEval: Offensive Language in Social Media
- HateSpeech Dataset
African Languages
Other
Presentations/Readings from other Researchers
- “How to do good research, get it published in SIGKDD and get it cited!”, Eamonn Keogh, SIGKDD 2009 Tutorial. URL
- “Heuristics for Scientific Writing (a Machine Learning Perspective)”, Zachary C. Lipton. URL
- “Developing Language Annotation for Machine Learning Algorithms”, Marie Meteer. URL