Abstract

Access to potable water is a critical requirement for human survival. Beyond drinking, water is also needed for animal consumption, irrigation, and domestic and commercial uses. Laboratory assessment of water samples to determine their fitness for use is a vital step in water quality assurance. However, laboratory assessments require adherence to stringent measures, which can be difficult to comply with. Machine learning (ML) has emerged in recent years as a viable and cheaper way to complement (or replace) lab-based assessments, with the caveat that sufficient data must be available to train the ML models. Unfortunately, such data are not always available, or are only sparsely available, especially in less developed countries. This work attempts to fill this gap by creating amply sized datasets that can be used to train (and test) ML models. Two datasets are curated in this work, one for drinking water and the other for irrigation water. The datasets were curated by aggregating data from smaller datasets on related concepts, and were then processed and labelled to make them usable for supervised ML models. To demonstrate the applicability of the curated datasets, they were used to train ML models in a related work and yielded good results.

Instructions:

The datasets are in CSV format and contain physico-chemical parameters of water from different sources. Each data entry is labelled 0 (usable) or 1 (not usable). This label field is the last column in each dataset. Two datasets are uploaded: the first is for drinking (potable) water, and the second is for water usable for irrigation. An algorithm for calculating the label value is included in the documentation, along with the Python script used to calculate the values.
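As a minimal sketch of how the layout described above might be read, the following Python snippet splits each row into feature values and the final-column label. It assumes each CSV has a header row, which the description does not confirm. The function name `load_labelled_csv`, the column names, and the sample values are all hypothetical and used only for illustration; they are not taken from the actual files.

```python
import csv
import io


def load_labelled_csv(source):
    """Split a labelled CSV into feature column names, feature rows and labels.

    Assumptions (not confirmed by the dataset page): the file has a header
    row. The label is taken from the last column, as the instructions state.
    """
    reader = csv.reader(source)
    header = next(reader)
    features, labels = [], []
    for row in reader:
        if not row:  # skip blank lines
            continue
        features.append(row[:-1])  # feature values are kept as strings
        labels.append(int(float(row[-1])))  # accepts "1" as well as "1.0"
    return header[:-1], features, labels


# Hypothetical two-row sample standing in for one of the CSV files;
# the column names and values are invented for illustration only.
sample = io.StringIO("pH,TDS,label\n7.2,310,0\n9.8,1500,1\n")
columns, X, y = load_labelled_csv(sample)
usable = y.count(0)  # per the instructions, 0 means usable
```

To read a real file, the same function could be called on a file object opened with `open(path, newline="")`, where `path` points to a CSV extracted from the archive. The file names inside the archive are not listed on this page, so none is assumed here.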

Comments

Submitted by RASHMITHA CHALLA on Sun, 02/05/2023 - 20:47

good

Submitted by jiayi liu on Fri, 04/07/2023 - 03:05

good

Submitted by jiayi liu on Fri, 04/07/2023 - 03:06

NEEDED THIS DATA SET FOR WATER QUALITY ANALYSIS.

Submitted by Payala Krishnan... on Thu, 08/15/2024 - 22:54

Please, I need this dataset for my master's thesis, and I want to know whether it is available for free download for some universities in Germany.

Your swift response will be appreciated.

Submitted by Bartholomew Aifuwa on Tue, 02/18/2025 - 10:30

Dataset Files

WaterNet_Dataset.zip (26.70 kB)
Python script for labelling data: WaterNet_Dataset_Python_Script.py (5.79 kB)

Documentation

Attachment	Size
Dataset documentation	192.53 KB

Dataset for Assessing Water Quality for Drinking and Irrigation Purposes using Machine Learning Models
