Standard Dataset
OES_FDC_Data
- Submitted by: Dahoon Seol
- Last updated: Sun, 03/31/2024 - 10:30
- DOI: 10.21227/n5xt-z623
Abstract
Plasma-based semiconductor processing is highly sensitive: even minor changes in the procedure can have serious consequences. Fault detection and classification (FDC) can be used to monitor and classify the resulting equipment anomalies. However, class imbalance in semiconductor process data is a significant obstacle to introducing FDC into semiconductor equipment. Machine learning models can overfit because the normal and abnormal data are diverse and imbalanced. In this study, we propose a preprocessing method that addresses class imbalance in semiconductor process data. We compare existing oversampling models for reducing class imbalance and then propose an appropriate sampling strategy. We confirmed that a SMOTE-based model combined with an undersampling technique such as Tomek links is effective at improving classification performance on plasma-based semiconductor process data. SMOTE-TOMEK, which removes the samples that form Tomek links and thereby sharpens the class boundary, is suitable for FDC that must classify minute changes in plasma-based semiconductor equipment data.
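The combination of SMOTE oversampling and Tomek-link cleaning described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the two-dimensional synthetic points, the class sizes (40 normal, 8 abnormal), the neighbour count `k=3`, and the helper names `nearest`, `smote`, and `remove_tomek_links` are all assumptions made for illustration, and the code handles the binary case only. This variant drops both members of each Tomek link; other variants drop only the majority-class member.

```python
import math
import random


def nearest(idx, points, candidates, k=1):
    """Return up to k indices from `candidates` closest to points[idx], excluding idx."""
    others = [j for j in candidates if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def smote(X, y, minority, k=3, rng=None):
    """Add synthetic minority points by interpolating between minority neighbours
    until both classes have the same size (binary labels assumed)."""
    rng = rng or random.Random(0)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_new = (len(y) - len(min_idx)) - len(min_idx)
    X_out, y_out = list(X), list(y)
    for _ in range(max(n_new, 0)):
        i = rng.choice(min_idx)                   # random minority sample
        j = rng.choice(nearest(i, X, min_idx, k)) # one of its minority neighbours
        t = rng.random()                          # position on the segment i -> j
        X_out.append(tuple(a + t * (b - a) for a, b in zip(X[i], X[j])))
        y_out.append(minority)
    return X_out, y_out


def remove_tomek_links(X, y):
    """Drop both members of every Tomek link: a pair of mutual nearest
    neighbours that carry different labels."""
    all_idx = range(len(X))
    nn = [nearest(i, X, all_idx)[0] for i in all_idx]
    drop = {i for i in all_idx if nn[nn[i]] == i and y[i] != y[nn[i]]}
    keep = [i for i in all_idx if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical imbalanced data: label 0 = normal, label 1 = abnormal.
rng = random.Random(42)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
X += [(rng.gauss(1.5, 1), rng.gauss(1.5, 1)) for _ in range(8)]
y = [0] * 40 + [1] * 8

X_bal, y_bal = smote(X, y, minority=1, rng=rng)
X_clean, y_clean = remove_tomek_links(X_bal, y_bal)

print("after SMOTE:", y_bal.count(0), "normal,", y_bal.count(1), "abnormal")
print("after Tomek:", y_clean.count(0), "normal,", y_clean.count(1), "abnormal")
```

Because every Tomek link pairs one sample from each class, this cleaning step removes equal numbers from both classes, so the balance obtained by oversampling is preserved while ambiguous boundary pairs are discarded.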