Moringa Leaf Extraction

Citation Author(s): Kurnianingsih Kurnianingsih (Department of Electrical Engineering, Politeknik Negeri Semarang, Indonesia)
Submitted by: Kurnianingsih K
Last updated: Mon, 01/06/2025 - 05:24
DOI: 10.21227/g858-5w12
Data Format: *.csv

Categories:

Agriculture

Keywords:

artificial intelligence; deep learning; real-time system; unsupervised learning

Abstract

This dataset provides turbidity measurements collected during a Moringa oleifera leaf water treatment process for compound extraction. The extraction process was conducted over a 15-minute duration, capturing key changes in turbidity to reflect the dynamics of the process. The raw data has been preprocessed, upsampled, and annotated for time series analysis, enabling detailed investigation of extraction patterns. Additionally, the dataset has been optimized using the ForGAN (Forecasting GAN) algorithm to enhance data granularity and support predictive modeling. Researchers can utilize this dataset to explore the efficiency of extraction methods, develop predictive models, and further investigate the application of Moringa oleifera in water treatment technologies.

Instructions:

# Data Description: Kelor Turbidity Dataset
## Overview

This dataset contains turbidity measurements from a kelor (Moringa oleifera) water treatment process. The data has been upsampled and annotated for time series analysis using the ForGAN (Forecasting GAN) algorithm.
## File Structure

- **Format**: CSV (comma-separated values)
- **Columns**:
  1. `timestamp`: Unix timestamp representing the time of measurement
  2. `value`: Turbidity measurement value (float)
  3. `event`: Binary flag for anomaly labeling (0 = normal, 1 = anomaly)
  4. `dx`: Sequential index of the measurement (0-based)
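
The column layout above can be read with the Python standard library. This is a minimal sketch, assuming the file has a header row with exactly these four column names and that `timestamp` is expressed in seconds; the `Reading` type, the `read_turbidity` helper, and the sample rows are hypothetical and are not part of the dataset.

```python
import csv
import io
from datetime import datetime, timezone
from typing import NamedTuple, TextIO


class Reading(NamedTuple):
    timestamp: datetime  # measurement time, parsed from the Unix timestamp
    value: float         # turbidity measurement
    event: int           # 0 = normal, 1 = anomaly
    dx: int              # 0-based sequential index


def read_turbidity(stream: TextIO) -> list[Reading]:
    """Parse rows with the columns timestamp, value, event, dx."""
    rows = []
    for row in csv.DictReader(stream):
        seconds = int(float(row["timestamp"]))  # assumes seconds, not milliseconds
        rows.append(Reading(
            timestamp=datetime.fromtimestamp(seconds, tz=timezone.utc),
            value=float(row["value"]),
            event=int(row["event"]),
            dx=int(row["dx"]),
        ))
    return rows


# Hypothetical sample rows for illustration only; not taken from the dataset.
SAMPLE = """timestamp,value,event,dx
1603177927,1001.5,0,0
1603177928,1001.2,0,1
1603177929,998.7,1,2
"""

readings = read_turbidity(io.StringIO(SAMPLE))
print(readings[0].timestamp.isoformat(), readings[0].value)
```

To read the real file, pass an open file handle instead of the `io.StringIO` wrapper.
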
## Data Characteristics

- **Sampling Rate**: 1 second
- **Value Range**: Approximately 975-1006 turbidity units
- **Temporal Coverage**: Starting from Unix timestamp 1603177927
- **Data Points**: Over 1,000 measurements
- **Preprocessing**: Data has been upsampled using interpolation
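
Two of the characteristics above, the 1-second sampling rate and the 0-based sequential index, are easy to verify after loading. The sketch below is one possible sanity check; `check_series` is a hypothetical helper and the values are made up for illustration, apart from the documented starting timestamp.

```python
def check_series(timestamps: list[int], indices: list[int]) -> dict[str, bool]:
    """Check 1-second spacing and a gap-free 0-based index."""
    one_second = all(b - a == 1 for a, b in zip(timestamps, timestamps[1:]))
    sequential = indices == list(range(len(indices)))
    return {"one_second_spacing": one_second, "sequential_index": sequential}


# Hypothetical values; the last step deliberately contains a 2-second gap.
ts = [1603177927, 1603177928, 1603177929, 1603177931]
result = check_series(ts, [0, 1, 2, 3])
print(result)
```
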
## Purpose

This dataset is used in the ForGAN implementation for:

1. Training a probabilistic forecasting model using Generative Adversarial Networks
2. Time series analysis with sliding window approach
3. Anomaly detection using KL divergence between predicted and actual distributions
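
The third item can be illustrated with a histogram-based KL divergence. This is a sketch under stated assumptions, not the ForGAN implementation: the source does not say how the distributions are estimated or in which direction the divergence is taken, so the shared-bin histogram, the smoothing constant `eps`, the bin count, and the choice of KL(actual || predicted) are all assumptions. The sample values are hypothetical.

```python
import math


def histogram(samples, lo, hi, bins, eps=1e-6):
    """Return smoothed bin probabilities for samples over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in samples:
        i = int((x - lo) / width)
        counts[min(max(i, 0), bins - 1)] += 1  # clamp out-of-range samples
    total = len(samples) + eps * bins
    return [(c + eps) / total for c in counts]  # eps avoids log(0) and division by zero


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Hypothetical samples; the bin range follows the documented value range.
LO, HI, BINS = 975.0, 1006.0, 31
actual = histogram([1000.1, 1000.3, 999.8, 1000.0, 1000.2], LO, HI, BINS)
close = histogram([1000.0, 1000.2, 999.9, 1000.1, 1000.3], LO, HI, BINS)
far = histogram([990.0, 990.5, 989.8, 990.2, 990.1], LO, HI, BINS)

score_close = kl_divergence(actual, close)
score_far = kl_divergence(actual, far)
print(score_close < score_far)
```

A time step would be flagged as anomalous when its score exceeds a threshold; the source does not state how that threshold is chosen.
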
## Usage in ForGAN

- The data is split into training (50%), validation (10%), and test (40%) sets
- Used with a sliding window approach where previous values (condition_size) predict the next value
- The turbidity values are normalized using mean and standard deviation before model training
- The model generates multiple predictions (default: 200) for each time step to capture uncertainty

This dataset serves as a real-world example for testing ForGAN's capabilities in probabilistic forecasting and anomaly detection.
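
The split, normalization, and windowing steps listed under Usage in ForGAN can be sketched as follows. Several details are assumptions because the source does not specify them: the split is taken in chronological order, the mean and standard deviation come from the training portion only, and the population standard deviation is used. The function names and the synthetic series are hypothetical; only `condition_size` comes from the description.

```python
from statistics import fmean, pstdev


def split_series(values, train=0.5, val=0.1):
    """Split chronologically into train, validation, and test (the remainder)."""
    n = len(values)
    i = int(n * train)
    j = i + int(n * val)
    return values[:i], values[i:j], values[j:]


def normalize(values, mean, std):
    """Standardize values with a given mean and standard deviation."""
    return [(v - mean) / std for v in values]


def sliding_windows(values, condition_size):
    """Pair each run of condition_size values with the value that follows it."""
    return [
        (values[k:k + condition_size], values[k + condition_size])
        for k in range(len(values) - condition_size)
    ]


# Synthetic series for illustration only; not taken from the dataset.
series = [1000.0 + (k % 7) * 0.5 for k in range(20)]

train, val, test = split_series(series)
mean, std = fmean(train), pstdev(train)  # statistics from the training split (assumption)
train_n = normalize(train, mean, std)
windows = sliding_windows(train_n, condition_size=4)
print(len(train), len(val), len(test), len(windows))
```

Each element of `windows` is a `(condition, target)` pair; a forecasting model would be trained to predict `target` from `condition`.
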

Funding Agency

Ministry of Education, Culture, Research and Technology of the Republic of Indonesia

Grant Number

Grant No. /S54PK/D.D4/PPK.01.APTV/III/2024
