Optimization Deep Neural Network Test Case through Abductive Learning

- Submitted by: Ding Rui
- DOI: 10.21227/v8ag-gp51
Abstract
The rapid advancement of deep neural network (DNN) models has enabled their widespread application across various domains, including face recognition and natural language processing. However, data-driven DNN models are prone to erroneous behavior when inadequately trained, so their predictions on large amounts of test data must be labeled to identify and mitigate defects. Manual labeling is both labor-intensive and inefficient. Several automated methods for labeling test predictions have been proposed to address this limitation, but they often produce a high rate of mislabeling, which limits the improvement in classification performance gained from model retraining.
To address these challenges, this study proposes a Deep Neural Network Test Case Optimization Method based on Abductive Learning. This method integrates domain knowledge with logical reasoning to optimize the labeling of incorrectly predicted samples and utilizes abductively labeled data to retrain DNN models, thereby reducing both human effort and time requirements. Experiments were conducted on four deep learning test sets using four classical neural network models and compared against eight baseline methods. The proposed method achieved a fault detection rate of 96.77%, surpassing baseline methods by up to 13.46%. Furthermore, the labeling accuracy within the selected test case subset reached a maximum of 98.99%.
Instructions:
The MNIST dataset consists of simple grayscale images of handwritten digits, which are easy to preprocess and train on. It is frequently used to benchmark the performance of machine learning and DNN models. The Fashion dataset comprises ten categories of fashion items, offering richer visual features that allow a more rigorous assessment of classification performance. The SVHN dataset contains images of house numbers extracted from street view imagery. Complex backgrounds and significant noise make this dataset particularly challenging for digit recognition tasks. The CIFAR10 dataset includes ten object categories, each containing 6,000 images, providing a well-balanced dataset for both training and testing phases.
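The class-balance claim for CIFAR10 can be checked with a few lines of Python once the labels are loaded. The sketch below is only an illustration: the helper name `class_balance` is hypothetical, and the label list is a synthetic stand-in (ten classes with 6,000 entries each), because the file format of this dataset is not stated on this page.

```python
from collections import Counter


def class_balance(labels):
    """Return (per-class counts, True if every class has the same count)."""
    counts = Counter(labels)
    # Balanced means all classes share a single count value.
    balanced = len(set(counts.values())) <= 1
    return dict(counts), balanced


# Synthetic stand-in for CIFAR10 labels (assumption, not the real files):
# 10 classes x 6,000 images each.
labels = [c for c in range(10) for _ in range(6000)]

counts, balanced = class_balance(labels)
print(len(labels), balanced)  # prints: 60000 True
```

With real data, `labels` would be replaced by the label array read from the downloaded files; the same check applies to the other three datasets, which are not necessarily balanced.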