Terahertz Images for Grass Seed Detection

Citation Author(s):
Shaghik Atakaramians, Deepak Mishra, Qigejian Wang, Amus Goay, Ewa Goldys
Submitted by:
Amus Goay
Last updated:
Tue, 01/21/2025 - 05:17
DOI:
10.21227/4hs2-rn27
Research Article Link:
License:

Abstract 

This dataset comprises terahertz (THz) images collected to support the research presented in the IEEE Access paper "Diagnosing Grass Seed Infestation: Convolutional Neural Network Based Terahertz Imaging." It is intended for detecting and classifying grass seeds embedded in biological samples, specifically ham, covered with wool of varying thickness. The images were captured at different frequencies within the THz spectrum and can be used to develop deep-learning models for seed detection. Each image is labeled by the frequency at which it was taken and, where applicable, by the specific seed within the sample. For instance, an image labeled "20" is a THz image taken at 0.2 THz of a sample containing four seeds. The individual seeds within that sample are labeled "20s1," "20s2," and so on, denoting specific seed locations in the 0.2 THz image. This labeling supports training and validation of machine learning models that detect and classify seeds even when they are obscured by wool layers.

Instructions: 

This dataset is structured to support terahertz (THz) imaging-based seed detection and classification. It is organized into three primary subfolders, each corresponding to a different experimental condition: "1cm wool, Oblique incidence, 45degree," "2cm wool, Normal incidence," and "2cm wool, Oblique incidence, 45degree." Each subfolder is further divided into categories based on frequency range, seed presence, and sample orientation, so that each experimental variation has its own location. The folder and file names follow fixed conventions, described below, that make the relevant data easy to locate.

Within the "1cm wool, Oblique incidence, 45degree" folder, the data are categorized by frequency and consist of frequency-domain data from 0.2 to 0.5 THz. The "2cm wool, Normal incidence" folder contains subdirectories covering variations in sample orientation, including horizontal and vertical rotations of ±5 degrees, as well as measurements conducted without seeds. The filenames indicate the experimental setup, including details such as the step sizes used in the frequency measurements. Similarly, the "2cm wool, Oblique incidence, 45degree" folder includes both frequency-domain and time-domain amplitude data, showing the sample response under that condition in both domains.
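As an illustration of the folder naming described above, the following minimal Python sketch turns a top-level folder name into structured fields. The function name `parse_condition` and the `Condition` type are hypothetical and are not part of the dataset. Reporting normal incidence as an angle of 0 degrees assumes the angle is measured from the surface normal.

```python
import re
from typing import NamedTuple


class Condition(NamedTuple):
    wool_cm: int      # wool thickness in centimetres
    incidence: str    # "normal" or "oblique"
    angle_deg: int    # incidence angle; 0 is assumed for normal incidence


def parse_condition(folder_name: str) -> Condition:
    """Parse a name such as '1cm wool, Oblique incidence, 45degree'."""
    parts = [part.strip() for part in folder_name.split(",")]
    wool = re.fullmatch(r"(\d+)cm wool", parts[0])
    if wool is None or len(parts) < 2:
        raise ValueError(f"unrecognized folder name: {folder_name!r}")
    # First word of the second field: "Oblique" or "Normal".
    incidence = parts[1].split()[0].lower()
    angle = 0  # assumption: normal incidence means 0 degrees from the normal
    if len(parts) > 2:
        angle_match = re.fullmatch(r"(\d+)degree", parts[2])
        if angle_match is None:
            raise ValueError(f"unrecognized angle field: {parts[2]!r}")
        angle = int(angle_match.group(1))
    return Condition(int(wool.group(1)), incidence, angle)


if __name__ == "__main__":
    for name in (
        "1cm wool, Oblique incidence, 45degree",
        "2cm wool, Normal incidence",
        "2cm wool, Oblique incidence, 45degree",
    ):
        print(parse_condition(name))
```

Names of the nested subdirectories are not specified here, so the sketch handles only the three top-level folders.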

Each image file follows a consistent naming convention to ensure clarity. The numerical prefix, such as "20" or "22," represents the THz frequency at which the image was captured, with "20" corresponding to 0.2 THz, "22" to 0.22 THz, and so forth. The suffix appended to the filename, such as "s1," "s2," etc., indicates the specific seed within the sample, where "20s1" corresponds to the first seed in the 0.2 THz image, and "20s4" represents the fourth seed. This structured labeling facilitates straightforward identification and segmentation of the dataset.
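The naming convention above can be decoded programmatically. The sketch below is a minimal example, assuming the label is the filename without its extension and that the numerical prefix always gives the frequency in hundredths of a terahertz, as the "20" and "22" examples suggest. The names `ThzLabel` and `parse_label` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class ThzLabel(NamedTuple):
    frequency_thz: float        # e.g. 0.2 for the prefix "20"
    seed_index: Optional[int]   # None for a whole-sample image


# Digits for the frequency, optionally followed by "s" and a seed number.
_LABEL_RE = re.compile(r"(\d+)(?:s(\d+))?")


def parse_label(label: str) -> ThzLabel:
    """Decode a label such as '20' or '20s1' (filename without extension)."""
    match = _LABEL_RE.fullmatch(label)
    if match is None:
        raise ValueError(f"unrecognized label: {label!r}")
    prefix, seed = match.groups()
    # Assumption: the prefix is the frequency in units of 0.01 THz.
    return ThzLabel(int(prefix) / 100, int(seed) if seed else None)


if __name__ == "__main__":
    print(parse_label("20"))    # whole-sample image at 0.2 THz
    print(parse_label("20s4"))  # fourth seed in the 0.2 THz image
```

Labels that do not match the pattern raise a `ValueError` rather than being guessed at, which helps surface files that follow a different convention.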

Researchers can split the dataset into training and validation subsets by THz frequency or by seed position to assess how well machine learning models, particularly convolutional neural networks (CNNs), generalize. Pre-processing techniques such as normalization and contrast enhancement can be applied to aid feature extraction and improve classification accuracy. The dataset is particularly useful for developing automated detection systems for agricultural applications, such as monitoring grass seed infestation in livestock wool, and for evaluating the effectiveness of THz imaging for non-invasive inspection. Users are encouraged to cite our paper when using the dataset in their research to ensure proper attribution.
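A frequency-based split and a simple normalization step might look like the following sketch. It is one possible approach, not the procedure used in the paper. The function names are hypothetical, labels are assumed to follow the "20" / "20s1" convention, and images are represented as nested lists of numbers so that the example needs only the standard library.

```python
import re
from typing import Dict, Iterable, List, Sequence, Set


def frequency_prefix(label: str) -> str:
    """Return the leading digits of a label, e.g. '20' for '20s1'."""
    match = re.match(r"\d+", label)
    if match is None:
        raise ValueError(f"label has no frequency prefix: {label!r}")
    return match.group(0)


def split_by_frequency(
    labels: Iterable[str], held_out: Set[str]
) -> Dict[str, List[str]]:
    """Send every label whose prefix is in held_out to validation."""
    split: Dict[str, List[str]] = {"train": [], "val": []}
    for label in labels:
        key = "val" if frequency_prefix(label) in held_out else "train"
        split[key].append(label)
    return split


def min_max_normalize(image: Sequence[Sequence[float]]) -> List[List[float]]:
    """Rescale pixel values linearly to the range [0, 1]."""
    flat = [value for row in image for value in row]
    low, high = min(flat), max(flat)
    span = high - low
    if span == 0:
        # A constant image carries no contrast; map it to all zeros.
        return [[0.0 for _ in row] for row in image]
    return [[(value - low) / span for value in row] for row in image]


if __name__ == "__main__":
    labels = ["20", "20s1", "22s1", "50s2"]
    print(split_by_frequency(labels, held_out={"22"}))
    print(min_max_normalize([[0, 5], [10, 5]]))
```

Holding out entire frequencies, rather than sampling images at random, keeps every image from a given frequency on the same side of the split, which gives a stricter test of generalization across frequencies.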

 

Funding Agency: 
Keysight

Dataset Files

    Files have not been uploaded for this dataset