Standard Dataset
Hydrocarbon Spill Hyperspectral Dataset (HSHD)
- Citation Author(s):
- Submitted by: David Rivas
- Last updated: Thu, 12/05/2024 - 15:39
- DOI: 10.21227/4etm-h961
- Data Format:
- License:
- Categories:
- Keywords:
Abstract
Hyperspectral imaging (HSI) has become a pivotal tool for environmental monitoring, particularly in identifying and analyzing hydrocarbon spills. This study presents an Internet of Things (IoT)-based framework for the collection, management, and analysis of hyperspectral data, employing a controlled experimental setup to simulate hydrocarbon contamination. Using a state-of-the-art hyperspectral camera, a dataset of 116 images was generated, encompassing temporal and spectral variations of gasoline, thinner, and motor oil spills. The data were transmitted and stored in a cloud-based repository using an IoT-enabled architecture, ensuring accessibility and scalability. This work addresses the lack of specialized hyperspectral datasets for hydrocarbon analysis, offering valuable resources for predictive modeling and decision-making. Ethical and sustainable database practices were also prioritized to enhance long-term usability for environmental research and mitigation strategies.
Instructions for Using the Dataset
1. Accessing the Dataset
- Repository Location:
  - The dataset is organized in a cloud repository. Ensure you have access to the provided repository link (e.g., Google Drive, IEEE DataPort, or equivalent platform).
- Repository Structure:
  - Each file contains information on specific samples, hierarchically organized into folders:
    - Clean: Reference samples without contamination (`m`).
    - Gasoline: Samples contaminated with gasoline (`g`).
    - Thinner: Samples contaminated with thinner (`d`).
    - Motor Oil: Samples contaminated with used motor oil (`a`).
  - Each file is labeled following the scheme `ContaminantType_Number` (e.g., `g0001` for the first gasoline sample).
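The labeling scheme can be decoded programmatically when building an index of the samples. The sketch below is illustrative only: the function and class names are hypothetical, and the pattern assumes a one-letter prefix, an optional underscore (the scheme shows one, the `g0001` example does not), a run of digits, and an optional file extension.

```python
import re
from typing import NamedTuple

# Prefix letters and folder names as listed in the repository structure.
CONTAMINANT_BY_PREFIX = {
    "m": "Clean",
    "g": "Gasoline",
    "d": "Thinner",
    "a": "Motor Oil",
}


class SampleLabel(NamedTuple):
    contaminant: str
    number: int


# Assumption: one-letter prefix, optional underscore, digits, optional
# extension (e.g. "g0001" or "g_0001.hdr"). Adjust to the actual file names.
_LABEL_RE = re.compile(r"^([mgda])_?(\d+)(?:\.\w+)?$")


def parse_sample_label(name: str) -> SampleLabel:
    """Map a file label such as 'g0001' to its contaminant class and number."""
    match = _LABEL_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized sample label: {name!r}")
    prefix, digits = match.groups()
    return SampleLabel(CONTAMINANT_BY_PREFIX[prefix], int(digits))
```

Labels that do not match the assumed pattern raise a `ValueError` rather than being silently assigned to a class.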
2. Data Format Description
- File Format:
  - The data is stored in the ENVI format, commonly used for remote sensing applications.
  - `.hdr` files: Contain metadata such as wavelength ranges, pixel dimensions, and exposure conditions.
  - `.dat` files: Contain the hyperspectral data cubes.
- Resolution and Specifications:
  - Spatial resolution: 1024 × 1024 pixels.
  - Spectral bands: 20 bands ranging from 400 nm to 1000 nm.
  - Bit depth: 12 bits per pixel.
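ENVI headers are plain text: a first line reading `ENVI`, followed by `key = value` pairs, where values wrapped in braces may span several lines. A minimal reader using only the standard library might look like the sketch below. The function name is hypothetical, and the exact fields present in this dataset's headers are not assumed; check the attached ENVI documentation for the fields the camera actually writes.

```python
def parse_envi_header(text: str) -> dict:
    """Parse ENVI header text into a dict.

    Plain values are kept as strings; brace-delimited values are split on
    commas and returned as lists of strings.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "ENVI":
        raise ValueError("not an ENVI header: first line must be 'ENVI'")

    fields = {}
    i = 1
    while i < len(lines):
        line = lines[i]
        i += 1
        if "=" not in line:
            continue  # skip blank or unrecognized lines
        key, value = line.split("=", 1)
        key, value = key.strip().lower(), value.strip()
        if value.startswith("{"):
            # Brace values may continue on following lines until '}'.
            while "}" not in value and i < len(lines):
                value += " " + lines[i].strip()
                i += 1
            end = value.index("}") if "}" in value else len(value)
            inner = value[1:end]
            fields[key] = [item.strip() for item in inner.split(",") if item.strip()]
        else:
            fields[key] = value
    return fields
```

For a header stored on disk, the text can be read with `pathlib.Path(path).read_text()` and passed to the function; numeric fields such as `samples`, `lines`, and `bands` then need converting with `int()`.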
3. Software Requirements
- Hyperspectral Processing Software:
  - Use tools like ENVI, QGIS, or any software compatible with the ENVI format to visualize and analyze the data.
- Custom Programming:
  - If you prefer using scripts, libraries such as Spectral Python (SPy), OpenCV, or NumPy can be used to manipulate hyperspectral data.
- System Requirements:
  - Processor: Intel Core i5 or higher.
  - RAM: Minimum 8 GB.
  - Disk Space: At least 5 GB free.
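The disk-space figure can be sanity-checked from the stated specifications. The estimate below is a rough sketch that assumes each 12-bit sample is padded to two bytes on disk, which is common for raw cubes but is not confirmed by the dataset description; header files and any compression are ignored.

```python
import math

# Values taken from the dataset description.
WIDTH, HEIGHT, BANDS = 1024, 1024, 20
BIT_DEPTH = 12
IMAGE_COUNT = 116  # number of images stated in the abstract

# Assumption: 12-bit samples are stored in whole bytes (2 bytes each).
BYTES_PER_SAMPLE = math.ceil(BIT_DEPTH / 8)


def cube_bytes(width=WIDTH, height=HEIGHT, bands=BANDS,
               bytes_per_sample=BYTES_PER_SAMPLE):
    """Uncompressed size in bytes of one hyperspectral data cube."""
    return width * height * bands * bytes_per_sample


per_cube = cube_bytes()
total = per_cube * IMAGE_COUNT

print(f"{per_cube / 2**20:.0f} MiB per cube")       # 40 MiB per cube
print(f"{total / 2**30:.2f} GiB for all cubes")     # 4.53 GiB for all cubes
```

Under that assumption the full set of cubes occupies roughly 4.9 GB (decimal), which is in line with the 5 GB recommendation, and a single cube fits comfortably in 8 GB of RAM.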
4. Analysis Workflow
- Load the Data:
  - Download the files from the repository and save them to your local system.
  - Maintain the folder structure and metadata to ensure proper traceability during the analysis.
- Data Preprocessing:
  - Use the `.hdr` files to extract metadata and calibrate the spectral data.
  - Apply preprocessing techniques such as noise reduction or normalization to enhance data quality.
- Data Visualization:
  - Use hyperspectral visualization tools to explore the spectral and spatial characteristics of the samples.
  - Compare clean samples (`m`) with contaminated samples (`g`, `d`, `a`) to identify unique spectral signatures.
- Advanced Analysis:
  - Implement machine learning models or statistical tools to classify contaminants based on their spectral signatures.
  - Develop predictive models using the temporal evolution of the contaminant data.
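As one possible starting point for the comparison and classification steps, the sketch below normalizes 12-bit values, averages pixel spectra into a reference signature per class, and assigns an unknown spectrum to the reference with the smallest spectral angle. The spectral angle measure is a generic technique chosen here for illustration, not a method prescribed by the dataset authors, and all function names are hypothetical.

```python
import math

MAX_DN = 2**12 - 1  # largest value representable at 12-bit depth


def normalize(spectrum, max_value=MAX_DN):
    """Scale raw digital numbers to the range [0, 1]."""
    return [value / max_value for value in spectrum]


def mean_spectrum(pixels):
    """Average a collection of per-pixel spectra band by band."""
    count = len(pixels)
    return [sum(band_values) / count for band_values in zip(*pixels)]


def spectral_angle(a, b):
    """Angle in radians between two spectra; 0 means identical shape."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("spectral angle is undefined for an all-zero spectrum")
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def classify(spectrum, references):
    """Return the label of the reference spectrum closest in angle."""
    return min(references, key=lambda label: spectral_angle(spectrum, references[label]))
```

In practice, `references` would map the class letters (`m`, `g`, `d`, `a`) to mean spectra computed from labeled samples. Because the spectral angle ignores overall brightness scaling, it compares spectral shape rather than intensity; a trained machine learning model may be preferable when intensity carries information.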
5. Citation
- Please cite this dataset in your publications or projects using the provided citation format in the dataset metadata.
Documentation
Attachment | Size |
---|---|
ENVI Standard in Case of SENOP HSI Application_27.06.2019.pdf | 1.28 MB |