Hydrocarbon Spill Hyperspectral Dataset (HSHD)

Citation Author(s):
David Rivas-Lalaleo, Universidad de las Fuerzas Armadas ESPE
Carlos Hernandez, Universidad Don Bosco
Submitted by:
David Rivas
Last updated:
Thu, 12/05/2024 - 15:39
DOI:
10.21227/4etm-h961

Abstract 

Hyperspectral imaging (HSI) has become a pivotal tool for environmental monitoring, particularly in identifying and analyzing hydrocarbon spills. This study presents an Internet of Things (IoT)-based framework for the collection, management, and analysis of hyperspectral data, employing a controlled experimental setup to simulate hydrocarbon contamination. Using a state-of-the-art hyperspectral camera, a dataset of 116 images was generated, encompassing temporal and spectral variations of gasoline, thinner, and motor oil spills. The data were transmitted and stored in a cloud-based repository using an IoT-enabled architecture, ensuring accessibility and scalability. This work addresses the lack of specialized hyperspectral datasets for hydrocarbon analysis, offering valuable resources for predictive modeling and decision-making. Ethical and sustainable database practices were also prioritized to enhance long-term usability for environmental research and mitigation strategies.

Instructions: 

Instructions for Using the Dataset

1. Accessing the Dataset

  1. Repository Location:

    • The dataset is organized in a cloud repository. Ensure you have access to the provided repository link (e.g., Google Drive, IEEE DataPort, or equivalent platform).
  2. Repository Structure:

    • Files are grouped into folders by sample type. The letter in parentheses is the code used in file names:
      • Clean: Reference samples without contamination (m).
      • Gasoline: Samples contaminated with gasoline (g).
      • Thinner: Samples contaminated with thinner (d).
      • Motor Oil: Samples contaminated with used motor oil (a).
    • Each file name consists of the sample-type letter followed by a sample number
      (e.g., g0001 for the first gasoline sample).
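
To keep labels traceable in scripts, the naming scheme above can be decoded programmatically. The following is a minimal sketch, not part of the dataset tooling: the function name `parse_sample_name` is hypothetical, and the pattern assumes one code letter followed only by digits, as in the g0001 example.

```python
import re
from pathlib import Path

# Letter codes as listed in the repository structure above.
FOLDER_BY_CODE = {
    "m": "Clean",
    "g": "Gasoline",
    "d": "Thinner",
    "a": "Motor Oil",
}

# Assumption: one code letter followed by digits only, e.g. "g0001".
_NAME_PATTERN = re.compile(r"^([mgda])(\d+)$")


def parse_sample_name(filename):
    """Return (folder_label, sample_number) for a name such as 'g0001.hdr'."""
    stem = Path(filename).stem.lower()  # drop the extension, if any
    match = _NAME_PATTERN.match(stem)
    if match is None:
        raise ValueError(f"unrecognized sample name: {filename!r}")
    code, number = match.groups()
    return FOLDER_BY_CODE[code], int(number)
```

For example, `parse_sample_name("g0001.hdr")` yields the folder label "Gasoline" and the sample number 1. Names that do not match the assumed pattern raise a `ValueError` rather than being silently mislabeled.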

2. Data Format Description

  • File Format:
    The data is stored in the ENVI format, commonly used for remote sensing applications.

    • .hdr Files: Contain metadata such as wavelength ranges, pixel dimensions, and exposure conditions.
    • .dat Files: Contain the hyperspectral data cubes.
  • Resolution and Specifications:

    • Spatial resolution: 1024 × 1024 pixels.
    • Spectral bands: 20 bands ranging from 400 nm to 1000 nm.
    • Bit depth: 12 bits per pixel.
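
ENVI headers are plain text made of `key = value` pairs, so the metadata can be inspected without specialized software. The sketch below parses such text and estimates the size of one data cube from the dimensions listed above. The header text and its wavelength values are made up for illustration, the function names are hypothetical, and the 2 bytes per sample is an assumption (12-bit values are commonly stored in 16-bit words, but the dataset description does not state this).

```python
# Hypothetical header text for illustration only; the wavelength list is
# shortened and its values are invented. Real .hdr files may differ.
SAMPLE_HEADER = """ENVI
samples = 1024
lines = 1024
bands = 20
interleave = bsq
wavelength = {400.0, 431.6,
 463.2}
"""


def parse_envi_header(text):
    """Parse 'key = value' pairs from ENVI header text into a dict.

    Brace-delimited values (such as the wavelength list) may span several
    lines and are returned as lists of strings.
    """
    fields = {}
    lines = iter(text.splitlines())
    for line in lines:
        if "=" not in line:
            continue  # skips the leading "ENVI" marker and blank lines
        key, value = (part.strip() for part in line.split("=", 1))
        if value.startswith("{"):
            # Keep reading until the closing brace appears.
            while "}" not in value:
                value += " " + next(lines, "}").strip()
            inner = value[value.index("{") + 1 : value.rindex("}")]
            fields[key.lower()] = [v.strip() for v in inner.split(",") if v.strip()]
        else:
            fields[key.lower()] = value
    return fields


def expected_cube_bytes(fields, bytes_per_sample=2):
    """Estimate the .dat size; bytes_per_sample=2 is an assumption."""
    count = int(fields["samples"]) * int(fields["lines"]) * int(fields["bands"])
    return count * bytes_per_sample
```

Under these assumptions, a 1024 × 1024 × 20 cube occupies 41,943,040 bytes (40 MiB). Comparing this estimate with the actual size of a downloaded .dat file is a quick way to check whether the assumed storage type is right.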

3. Software Requirements

  1. Hyperspectral Processing Software:
    • Use tools like ENVI, QGIS, or any software compatible with the ENVI format to visualize and analyze the data.
  2. Custom Programming:
    • If you prefer using scripts, libraries such as Spectral Python (SPy), OpenCV, or NumPy can be used to manipulate hyperspectral data.
  3. System Requirements:
    • Processor: Intel Core i5 or higher.
    • RAM: Minimum 8 GB.
    • Disk Space: At least 5 GB free.
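
For the scripting route listed under Custom Programming, a cube can be read with NumPy alone. This is a sketch under stated assumptions: band-sequential (BSQ) interleave, little-endian unsigned 16-bit samples, and no embedded header offset. None of these is confirmed by the dataset description, so check the `interleave`, `data type`, `byte order`, and `header offset` fields of the .hdr file first. The function names are hypothetical. Spectral Python's `open_image`, given the .hdr path, reads these fields automatically and may be the more convenient choice.

```python
import numpy as np


def load_bsq_cube(dat_path, lines, samples, bands, dtype="<u2", offset=0):
    """Read a raw cube into an array shaped (bands, lines, samples).

    Assumes band-sequential layout; the default dtype "<u2" means
    little-endian unsigned 16-bit integers (an assumption).
    """
    data = np.fromfile(dat_path, dtype=dtype, offset=offset)
    expected = bands * lines * samples
    if data.size != expected:
        raise ValueError(f"expected {expected} samples, found {data.size}")
    return data.reshape(bands, lines, samples)


def pixel_spectrum(cube, row, col):
    """Return the values of all bands at one pixel."""
    return cube[:, row, col]
```

With the specifications listed earlier, a call would look like `load_bsq_cube("g0001.dat", lines=1024, samples=1024, bands=20)`. The size check fails early if the assumed layout does not match the file.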

4. Analysis Workflow

  1. Load the Data:

    • Download the files from the repository and save them to your local system.
    • Maintain the folder structure and metadata to ensure proper traceability during the analysis.
  2. Data Preprocessing:

    • Use the .hdr files to extract metadata and calibrate the spectral data.
    • Apply preprocessing techniques such as noise reduction or normalization to enhance data quality.
  3. Data Visualization:

    • Use hyperspectral visualization tools to explore the spectral and spatial characteristics of the samples.
    • Compare clean samples (m) with contaminated samples (g, d, a) to identify unique spectral signatures.
  4. Advanced Analysis:

    • Implement machine learning models or statistical tools to classify contaminants based on their spectral signatures.
    • Develop predictive models using the temporal evolution of the contaminant data.
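
The preprocessing and comparison steps above can be sketched with NumPy as follows. This is an illustration, not the dataset authors' method: scaling by the 12-bit range is simple normalization rather than radiometric calibration, and the spectral angle is a common similarity measure chosen here as one possible way to compare a clean reference with a contaminated sample. The function names are hypothetical, and cubes are assumed to be shaped (bands, lines, samples).

```python
import numpy as np


def scale_to_unit(cube, bit_depth=12):
    """Scale raw values to [0, 1] using the stated 12-bit depth."""
    return cube.astype(np.float64) / (2**bit_depth - 1)


def mean_spectrum(cube):
    """Average each band over all pixels; returns one value per band."""
    return cube.reshape(cube.shape[0], -1).mean(axis=1)


def spectral_angle(a, b):
    """Angle in radians between two spectra; 0 means identical shape."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Clip to guard against rounding slightly outside [-1, 1].
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))
```

A typical use is to compute `mean_spectrum(scale_to_unit(cube))` for a clean (m) sample and for a contaminated (g, d, or a) sample, then pass both to `spectral_angle`. Larger angles indicate a greater difference in spectral shape, independent of overall brightness. Per-band mean spectra could also serve as input features for the classification models mentioned in the last step.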

5. Citation

  • Please cite this dataset in your publications or projects using the citation format provided in the dataset metadata (DOI: 10.21227/4etm-h961).