Dataset for Inclusive Fintech Software Development

- Citation Author(s):
- Nagwovuma Margaret (Makerere University), Nansamba Barbara (Makerere University), Ggaliwango Marvin (Makerere University)
- Submitted by:
- Belinda Kobusingye
- Last updated:
- DOI:
- 10.21227/jp32-an80
- Data Format:
- Categories:
- Keywords:
Abstract
This study presents an English-Luganda parallel corpus of over 2,000 sentence pairs focused on financial decision-making and financial products. The dataset draws on social media (TikTok comments and Twitter posts from authoritative accounts such as Bank of Uganda and Capital Markets Uganda) and on fintech blogs (Chipper Cash and Xeno). The corpus covers a range of financial topics, including bonds, loans, and unit trust funds, making it a resource for financial language processing in both English and Luganda.
Instructions:
- Load the dataset using pandas.
- Inspect the data to understand its structure and identify potential issues.
- Handle missing values by filling the 'source' column with 'Unknown' and dropping rows with missing values in 'english' or 'luganda' columns.
- Normalize text in both 'english' and 'luganda' columns by converting to lowercase, removing extra whitespace, and removing special characters.
- Adjust these steps as needed based on your specific dataset characteristics and project requirements.
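The steps above can be sketched in pandas roughly as follows. This is a minimal sketch, not the authors' own pipeline: the column names 'source', 'english', and 'luganda' come from the instructions, while the file name, the helper names `load_corpus`, `normalize_text`, and `clean_corpus`, and the sample rows are assumptions made for illustration. The "special characters" rule here keeps letters, digits, underscores, and apostrophes (Luganda spellings such as ng' rely on the apostrophe); adjust the pattern if your project needs something different.

```python
import re

import pandas as pd


def load_corpus(path: str) -> pd.DataFrame:
    """Load the corpus from a CSV file (the file layout is an assumption)."""
    return pd.read_csv(path)


def normalize_text(text: str) -> str:
    """Lowercase, replace special characters with spaces, collapse whitespace."""
    text = str(text).lower()
    # Keep word characters (Unicode letters, digits, underscore), whitespace,
    # and apostrophes; everything else becomes a space.
    text = re.sub(r"[^\w\s']", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def clean_corpus(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the missing-value and normalization steps from the instructions."""
    df = df.copy()
    # Missing sources are filled rather than dropped.
    df["source"] = df["source"].fillna("Unknown")
    # A sentence pair is only usable if both sides are present.
    df = df.dropna(subset=["english", "luganda"])
    for col in ("english", "luganda"):
        df[col] = df[col].map(normalize_text)
    return df.reset_index(drop=True)


# Made-up placeholder rows (NOT taken from the dataset) to exercise the steps.
sample = pd.DataFrame(
    {
        "english": ["  What is a  Treasury Bond? ", None, "Loans!"],
        "luganda": ["LUGANDA  text #1", "luganda text 2", "Luganda text 3."],
        "source": ["Twitter", "TikTok", None],
    }
)

# Inspect structure and missing values before cleaning.
sample.info()
print(sample.isna().sum())

cleaned = clean_corpus(sample)
print(cleaned)
```

With the real data, replace the sample frame with something like `clean_corpus(load_corpus("your_file.csv"))`, where the path is whatever name the downloaded file has.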