Moringa Leaf Extraction
- Submitted by: Kurnianingsih K
- Last updated: Mon, 01/06/2025 - 00:24
- DOI: 10.21227/g858-5w12
Abstract
This dataset provides turbidity measurements collected during a Moringa oleifera leaf water treatment process for compound extraction. The extraction process was conducted over a 15-minute duration, capturing key changes in turbidity to reflect the dynamics of the process. The raw data has been preprocessed, upsampled, and annotated for time series analysis, enabling detailed investigation of extraction patterns. Additionally, the dataset has been optimized using the ForGAN (Forecasting GAN) algorithm to enhance data granularity and support predictive modeling. Researchers can utilize this dataset to explore the efficiency of extraction methods, develop predictive models, and further investigate the application of Moringa oleifera in water treatment technologies.
# Data Description: Kelor Turbidity Dataset
## Overview

This dataset contains turbidity measurements from a kelor (Moringa oleifera) water treatment process. The data has been upsampled and annotated for time series analysis using the ForGAN (Forecasting GAN) algorithm.
## File Structure

- **Format**: CSV (Comma Separated Values)
- **Columns**:
  1. `timestamp`: Unix timestamp representing the time of measurement
  2. `value`: Turbidity measurement value (float)
  3. `event`: Binary flag for anomaly labeling (0 = normal, 1 = anomaly)
  4. `dx`: Sequential index of the measurement (0-based)
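
A minimal sketch of reading a file in this layout with the Python standard library. It assumes the CSV has a header row with the four column names listed above. The sample rows and the `parse_rows` helper are hypothetical and exist only for illustration.

```python
import csv
import io

# Hypothetical sample in the documented column layout (values are made up).
SAMPLE = """timestamp,value,event,dx
1603177927,1001.5,0,0
1603177928,1001.2,0,1
1603177929,985.0,1,2
"""


def parse_rows(handle):
    """Parse CSV rows into dicts with typed fields."""
    rows = []
    for rec in csv.DictReader(handle):
        rows.append({
            "timestamp": int(float(rec["timestamp"])),  # Unix seconds
            "value": float(rec["value"]),               # turbidity reading
            "event": int(rec["event"]),                 # 0 = normal, 1 = anomaly
            "dx": int(rec["dx"]),                       # 0-based index
        })
    return rows


rows = parse_rows(io.StringIO(SAMPLE))
anomalies = [r for r in rows if r["event"] == 1]
print(len(rows), len(anomalies))
```

To read a real file, pass an open file handle (for example `open(path, newline="")`) in place of the `io.StringIO` wrapper.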
## Data Characteristics

- **Sampling Rate**: 1 second
- **Value Range**: Approximately 975-1006 turbidity units
- **Temporal Coverage**: Starting from Unix timestamp 1603177927
- **Data Points**: Over 1,000 measurements
- **Preprocessing**: Data has been upsampled using interpolation
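
As a quick sanity check on these characteristics, the sketch below converts the stated starting timestamp to a UTC datetime and flags any consecutive timestamps that are not one second apart. The `to_utc` and `sampling_gaps` helpers and the short `timestamps` list are hypothetical. Only the starting value comes from the description above.

```python
from datetime import datetime, timezone

START = 1603177927  # first Unix timestamp stated in the description


def to_utc(ts):
    """Convert Unix seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


def sampling_gaps(timestamps, expected=1):
    """Return (index, gap) for consecutive pairs not spaced by `expected` seconds."""
    pairs = zip(timestamps, timestamps[1:])
    return [(i, b - a) for i, (a, b) in enumerate(pairs) if b - a != expected]


print(to_utc(START).isoformat())  # 2020-10-20T07:12:07+00:00

# Hypothetical timestamps with one deliberate two-second gap.
timestamps = [START, START + 1, START + 2, START + 4]
print(sampling_gaps(timestamps))  # [(2, 2)]
```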
## Purpose

This dataset is used in the ForGAN implementation for:

1. Training a probabilistic forecasting model using Generative Adversarial Networks
2. Time series analysis with a sliding window approach
3. Anomaly detection using the KL divergence between predicted and actual distributions
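
The KL divergence comparison in the last item can be sketched as follows. This is an illustration only, not the ForGAN code: the shared histogram binning, the additive smoothing constant, the direction KL(actual || predicted), and all sample values are assumptions made for the example.

```python
import math


def histogram(samples, lo, hi, bins):
    """Return smoothed bin probabilities for samples over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in samples:
        k = int((x - lo) / width)
        k = min(max(k, 0), bins - 1)  # clamp out-of-range values into edge bins
        counts[k] += 1
    eps = 1e-6  # additive smoothing so that no bin has zero probability
    total = len(samples) + eps * bins
    return [(c + eps) / total for c in counts]


def kl_divergence(p, q):
    """KL(P || Q) = sum_i p_i * log(p_i / q_i), in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Hypothetical samples within the documented value range (975-1006).
lo, hi, bins = 975.0, 1006.0, 31
actual = [1000.0, 1000.5, 1001.0, 1000.2]
predicted_close = [1000.1, 1000.4, 1001.2, 1000.3]
predicted_far = [980.0, 981.0, 979.5, 980.5]

p = histogram(actual, lo, hi, bins)
kl_close = kl_divergence(p, histogram(predicted_close, lo, hi, bins))
kl_far = kl_divergence(p, histogram(predicted_far, lo, hi, bins))
print(kl_close < kl_far)  # True
```

A time step whose divergence exceeds some chosen threshold could then be flagged as anomalous. The threshold itself would need to be tuned on validation data.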
## Usage in ForGAN

- The data is split into training (50%), validation (10%), and test (40%) sets
- A sliding window approach is used, in which the previous `condition_size` values predict the next value
- The turbidity values are normalized using their mean and standard deviation before model training
- The model generates multiple predictions (default: 200) for each time step to capture uncertainty
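
The preprocessing steps above can be sketched as follows. This is a rough illustration under stated assumptions, not the ForGAN implementation: it assumes the split is chronological, that the normalization statistics come from the training partition only, and that the population standard deviation is used. The helper names and the synthetic `series` are hypothetical.

```python
from statistics import mean, pstdev


def split_series(values, train=0.5, val=0.1):
    """Split chronologically into train, validation, and test parts."""
    n = len(values)
    i = int(n * train)
    j = i + int(n * val)
    return values[:i], values[i:j], values[j:]


def normalize(values, mu, sigma):
    """Standardize values with a given mean and standard deviation."""
    return [(v - mu) / sigma for v in values]


def sliding_windows(values, condition_size):
    """Pair each run of `condition_size` values with the value that follows it."""
    return [
        (values[i:i + condition_size], values[i + condition_size])
        for i in range(len(values) - condition_size)
    ]


# Hypothetical stand-in for the turbidity column.
series = [1000.0 + (i % 5) for i in range(20)]

train, val, test = split_series(series)
mu, sigma = mean(train), pstdev(train)  # statistics from the training part only
train_n = normalize(train, mu, sigma)
windows = sliding_windows(train_n, condition_size=3)
print(len(train), len(val), len(test), len(windows))  # 10 2 8 7
```

Each `(condition, target)` pair would then be one training example: the generator is conditioned on the window and learns a distribution over the target value.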
This dataset serves as a real-world example for testing ForGAN's capabilities in probabilistic forecasting and anomaly detection.