Dataset for Assessing Water Quality for Drinking and Irrigation Purposes using Machine Learning Models
- Submitted by: Olasupo Ajayi
- Last updated: Tue, 07/19/2022 - 06:54
- DOI: 10.21227/trcf-1s03
Abstract
Access to potable water is a critical requirement for human survival. Beyond drinking, water is also necessary for animal consumption, irrigation, and domestic and commercial uses. Laboratory assessment of water samples to determine their fitness for use is a vital step in water quality assurance. However, laboratory assessments require adherence to stringent procedures, which can be difficult to comply with. Machine learning (ML) has emerged in recent years as a viable and cheaper way to complement (or replace) lab-based assessments, with the caveat that sufficient data must be available to train the ML models. Unfortunately, such data are often unavailable or only sparsely available, especially in less developed countries. This work attempts to fill that gap by creating amply sized datasets that can be used to train and test ML models. Two datasets are curated in this work, one for drinking water and the other for irrigation water. The datasets were curated by aggregating data from smaller datasets on related concepts, then processed and labelled to make them suitable for supervised ML models. To demonstrate the applicability of the curated datasets, they were used to train ML models in a related work and yielded good results.
The datasets are in CSV format and contain physico-chemical parameters of water from different sources. Each data entry is labelled 0 (usable) or 1 (not usable). This label field is the last column in each dataset. Two datasets are uploaded: the first is for drinking (potable) water, and the second is for water usable for irrigation. An algorithm for calculating the label value is included in the documentation, along with the Python scripts used to calculate the values.
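As a rough illustration of the layout described above (feature columns followed by a 0/1 label in the last column), the sketch below separates features from labels using only the Python standard library. It assumes each CSV has a header row and that every feature value is numeric; neither is confirmed by the description. The function name `load_labelled_csv`, the file name in the usage comment, and the column names in the sample are hypothetical.

```python
import csv
from typing import Iterable, List, Tuple


def load_labelled_csv(
    lines: Iterable[str],
) -> Tuple[List[str], List[List[float]], List[int]]:
    """Split a labelled CSV into feature names, feature rows, and labels.

    Assumptions (not confirmed by the dataset description): the first row
    is a header and all feature values are numeric. The label is taken
    from the last column, where 0 = usable and 1 = not usable.
    """
    reader = csv.reader(lines)
    header = next(reader)
    features: List[List[float]] = []
    labels: List[int] = []
    for row in reader:
        if not row:  # skip blank lines
            continue
        features.append([float(value) for value in row[:-1]])
        labels.append(int(float(row[-1])))
    return header[:-1], features, labels


# Hypothetical usage; the actual file names inside the archive may differ.
# with open("drinking_water.csv", newline="") as f:
#     names, X, y = load_labelled_csv(f)
```

The function accepts any iterable of lines, so it works with an open file as well as an in-memory string buffer, which makes it easy to try on a few rows before running it on the full dataset.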
Dataset Files
- WaterNet_Dataset.zip (26.70 kB)
- Python script for labelling data: WaterNet_Dataset_Python_Script.py (5.79 kB)
Documentation
Attachment | Size |
---|---|
Dataset documentation | 192.53 KB |